agents.document_modifiers.tnt.agent

Taxonomy generation agent implementation.

This module implements an agent that generates taxonomies from conversation histories through an iterative process of document summarization, clustering, and refinement. It uses LLM-based processing at each step to generate high-quality taxonomies.

The agent follows these main steps:

1. Document summarization
2. Minibatch creation
3. Initial taxonomy generation
4. Iterative taxonomy refinement
5. Final taxonomy review
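Conceptually, the five steps compose into a single pipeline. The sketch below shows that composition with stand-in callables; none of these names come from the haive API, and each stand-in replaces an LLM-backed step in the real agent:

```python
def run_taxonomy_pipeline(documents, summarize, batch, generate, refine, review):
    """Illustrative composition of the five steps above."""
    summaries = summarize(documents)      # 1. Document summarization
    minibatches = batch(summaries)        # 2. Minibatch creation
    taxonomy = generate(minibatches[0])   # 3. Initial taxonomy from first batch
    for mb in minibatches[1:]:            # 4. Iterative refinement on the rest
        taxonomy = refine(taxonomy, mb)
    return review(taxonomy)               # 5. Final review pass
```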

Examples

Basic usage of the taxonomy agent:

config = TaxonomyAgentConfig(
    state_schema=TaxonomyGenerationState,
    visualize=True,
    name="TaxonomyAgent"
)
agent = TaxonomyAgent(config)
result = agent.run(input_data={"documents": [...]})

Classes

TaxonomyAgent

Agent that generates a taxonomy from a conversation history.

TaxonomyAgentConfig

Agent configuration for generating a taxonomy from conversation history.

Module Contents

class agents.document_modifiers.tnt.agent.TaxonomyAgent(config)

Bases: haive.core.engine.agent.agent.Agent[TaxonomyAgentConfig]

Agent that generates a taxonomy from a conversation history.

Initialize the taxonomy agent.

Parameters:

config (TaxonomyAgentConfig)

generate_taxonomy(state, config)

Generates an initial taxonomy from the first document minibatch.

Parameters:
  • state (TaxonomyGenerationState) – The current state of the taxonomy process.

  • config (RunnableConfig) – Configuration for the taxonomy generation.

Returns:

Updated state with the initial taxonomy.

Return type:

TaxonomyGenerationState

get_content(state)

Extracts document content for processing.

Parameters:

state (haive.agents.document_modifiers.tnt.state.TaxonomyGenerationState)

get_minibatches(state, config)

Splits documents into minibatches for iterative taxonomy generation.

Parameters:
  • state (TaxonomyGenerationState) – The current state containing documents.

  • config (RunnableConfig) – Configuration object specifying batch size.

Returns:

Dictionary with a ‘minibatches’ key containing grouped document indices.

Return type:

dict
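As a rough illustration of the batching described above: the real method reads the batch size from its `RunnableConfig`, while this standalone sketch takes it as a plain argument and returns the same `{'minibatches': ...}` shape:

```python
def get_minibatches(num_documents: int, batch_size: int) -> dict:
    """Group document indices into minibatches of at most batch_size each."""
    minibatches = [
        list(range(start, min(start + batch_size, num_documents)))
        for start in range(0, num_documents, batch_size)
    ]
    return {"minibatches": minibatches}
```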

invoke_taxonomy_chain(chain_config, state, config, mb_indices)

Invokes the taxonomy LLM to generate or refine taxonomies.

Parameters:
  • chain_config (haive.core.engine.aug_llm.AugLLMConfig) – Configuration for the LLM pipeline used to generate or refine taxonomies.

  • state (TaxonomyGenerationState) – Current taxonomy state.

  • config (RunnableConfig) – Configurable parameters.

  • mb_indices (List[int]) – Indices of documents to process in this iteration.

Returns:

Updated state with new taxonomy clusters.

Return type:

TaxonomyGenerationState

reduce_summaries(combined)

Reduces summarized documents into a structured format.

Parameters:

combined (dict)

Return type:

haive.agents.document_modifiers.tnt.state.TaxonomyGenerationState

review_taxonomy(state, config)

Evaluates the final taxonomy after all updates.

Parameters:
  • state (TaxonomyGenerationState) – The current state with completed taxonomies.

  • config (RunnableConfig) – Configuration settings.

Returns:

Updated state with reviewed taxonomy.

Return type:

TaxonomyGenerationState

setup_workflow()

Sets up the taxonomy generation workflow in LangGraph.

Return type:

None

update_taxonomy(state, config)

Iteratively refines the taxonomy using new minibatches of data.

Parameters:
  • state (TaxonomyGenerationState) – The current state containing previous taxonomies.

  • config (RunnableConfig) – Configuration settings.

Returns:

Updated state with revised taxonomy clusters.

Return type:

TaxonomyGenerationState
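The refinement loop can be sketched with plain dictionaries standing in for `TaxonomyGenerationState`, and a `refine_step` callable standing in for the LLM invocation the real agent makes (both are illustrative, not the haive API):

```python
def update_taxonomy(state: dict, refine_step) -> dict:
    """Fold each remaining minibatch into the taxonomy built so far.

    The first minibatch is skipped because it seeded the initial
    taxonomy; refine_step(taxonomy, batch) returns a revised taxonomy.
    """
    taxonomy = state["taxonomy"]
    for batch in state["minibatches"][1:]:
        taxonomy = refine_step(taxonomy, batch)
    return {**state, "taxonomy": taxonomy}
```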

class agents.document_modifiers.tnt.agent.TaxonomyAgentConfig

Bases: haive.core.engine.agent.agent.AgentConfig

Agent configuration for generating a taxonomy from conversation history.