agents.document_modifiers.summarizer.map_branch.agent
Map-Reduce Summarizer Agent for document summarization.
This module provides the SummarizerAgent class, which implements a map-reduce approach to document summarization. It handles large documents by splitting them into manageable chunks, summarizing each chunk, and then combining the chunk summaries into a single coherent summary.
The agent handles token limit constraints and provides automatic fallback mechanisms for oversized documents.
- Classes:
SummarizerAgent: Main agent for map-reduce document summarization
Examples
Basic usage:
from haive.agents.document_modifiers.summarizer.map_branch import SummarizerAgent
from haive.agents.document_modifiers.summarizer.map_branch.config import SummarizerAgentConfig
config = SummarizerAgentConfig(
    token_max=1000,
    name="document_summarizer"
)
agent = SummarizerAgent(config)
documents = ["Long document text 1...", "Long document text 2..."]
result = agent.run({"contents": documents})
summary = result["final_summary"]
With custom token limits:
config = SummarizerAgentConfig(
    token_max=2000,  # Allow longer intermediate summaries
    engines={
        "map_chain": custom_map_config,
        "reduce_chain": custom_reduce_config
    }
)
agent = SummarizerAgent(config)
See also
SummarizerAgentConfig: Configuration class
SummaryState: State management
Classes
SummarizerAgent | Agent that summarizes documents using a map-reduce approach.
Functions
build_agent() | Create a default SummarizerAgent instance.
Module Contents
- class agents.document_modifiers.summarizer.map_branch.agent.SummarizerAgent(config=SummarizerAgentConfig())
Bases: haive.core.engine.agent.agent.Agent[haive.agents.document_modifiers.summarizer.map_branch.config.SummarizerAgentConfig]
Agent that summarizes documents using a map-reduce approach.
This agent implements a document summarization workflow that handles large documents and multiple documents in a single run. It uses a map-reduce pattern: each document is first summarized individually (map phase), and the resulting summaries are then combined and reduced to a final summary (reduce phase).
The agent automatically handles token limit constraints by:
1. Splitting oversized documents into manageable chunks
2. Summarizing chunks individually
3. Collapsing intermediate summaries when they exceed token limits
4. Producing a coherent final summary
- Parameters:
config (haive.agents.document_modifiers.summarizer.map_branch.config.SummarizerAgentConfig) – Configuration object containing token limits, LLM settings, and workflow parameters.
- token_max
Maximum token limit for intermediate summaries
- map_chain
Runnable for individual document summarization
- reduce_chain
Runnable for combining and reducing summaries
- text_splitter
Utility for splitting oversized documents
Examples
Basic document summarization:
config = SummarizerAgentConfig(token_max=1000)
agent = SummarizerAgent(config)
docs = ["First document content...", "Second document content..."]
result = agent.run({"contents": docs})
print(result["final_summary"])
Handling large documents:
# Agent automatically splits and processes large documents
large_doc = "Very long document content..." * 1000
result = agent.run({"contents": [large_doc]})
# The agent will chunk, summarize, and combine automatically
With custom configuration:
config = SummarizerAgentConfig(
    token_max=2000,
    engines={
        "map_chain": custom_map_config,
        "reduce_chain": custom_reduce_config
    }
)
agent = SummarizerAgent(config)
Note
The agent uses recursive text splitting to handle documents that exceed token limits. Chunk summaries are automatically combined using the reduce chain to maintain coherence.
- Raises:
ValueError – If no documents are provided for summarization
RuntimeError – If summarization fails after all retry attempts
See also
SummarizerAgentConfig: Configuration options
SummaryState: State management for the workflow
setup_workflow(): Workflow construction details
Initialize the SummarizerAgent with configuration.
Sets up the map and reduce chains for document processing and initializes the text splitter for handling oversized documents.
- Parameters:
config (haive.agents.document_modifiers.summarizer.map_branch.config.SummarizerAgentConfig) – Agent configuration with token limits and LLM settings. Defaults to a new instance with default values.
- async collapse_summaries(state)
Collapse summaries that exceed token limits.
When intermediate summaries collectively exceed the token limit, this method splits them into groups and reduces each group to a more concise summary.
- Parameters:
state (haive.agents.document_modifiers.summarizer.map_branch.state.SummaryState) – Current state containing Document objects to collapse. Must have a ‘collapsed_summaries’ key.
- Returns:
Command updating the state with reduced summaries.
- Return type:
langgraph.types.Command
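The grouping step described above can be sketched in plain Python. This is a minimal illustration rather than the agent's actual implementation: `group_by_token_budget` and the whitespace-based `count_tokens` are hypothetical names introduced here, and the real agent counts tokens through the reduce chain's tokenizer.

```python
from typing import Callable, List


def count_tokens(text: str) -> int:
    """Stand-in token counter (assumption): one token per whitespace-separated word."""
    return len(text.split())


def group_by_token_budget(
    summaries: List[str],
    token_max: int,
    length_fn: Callable[[str], int] = count_tokens,
) -> List[List[str]]:
    """Greedily pack summaries into groups whose total length stays within token_max."""
    groups: List[List[str]] = []
    current: List[str] = []
    current_len = 0
    for summary in summaries:
        n = length_fn(summary)
        # Start a new group when adding this summary would exceed the budget.
        if current and current_len + n > token_max:
            groups.append(current)
            current, current_len = [], 0
        current.append(summary)
        current_len += n
    if current:
        groups.append(current)
    return groups


groups = group_by_token_budget(["a b c", "d e", "f g h i", "j"], token_max=5)
print(groups)
```

Each resulting group would then be passed to the reduce step on its own. Note that a single summary longer than `token_max` still ends up in a group by itself in this sketch.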
- collect_summaries(state)
Collect individual summaries into document objects.
Transforms the list of summary strings into Document objects for further processing in the collapse phase.
- Parameters:
state (haive.agents.document_modifiers.summarizer.map_branch.state.SummaryState) – Current state containing individual summaries. Must have a ‘summaries’ key with a list of summary texts.
- Returns:
Command updating the state with collapsed_summaries as Document objects.
- Return type:
langgraph.types.Command
- async generate_final_summary(state)
Generate the final consolidated summary.
Processes all collapsed summaries through the reduce chain to create a single, coherent final summary.
- Parameters:
state (haive.agents.document_modifiers.summarizer.map_branch.state.SummaryState) – Current state with collapsed summaries ready for final reduction.
- Returns:
Command updating the state with the final summary text.
- Return type:
langgraph.types.Command
- async generate_summary(state)
Generate a summary for a single document.
Processes a document through the map chain to create an individual summary. If the document exceeds token limits, it automatically splits the document into chunks and summarizes each chunk before combining them.
- Parameters:
state (haive.agents.document_modifiers.summarizer.map_branch.state.SummaryState) – Current state containing the document content to summarize. Must have a ‘content’ key with the document text.
- Returns:
Dictionary with ‘summaries’ key containing a list with the generated summary text.
- Return type:
dict
Note
This method includes automatic error recovery for token limit issues by splitting oversized documents into manageable chunks.
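The split-and-retry behavior can be illustrated with a small stand-alone sketch. Everything here is an assumption made for illustration: `summarize_with_fallback`, `split_into_chunks`, and `fake_summarize` are hypothetical names, the "summarizer" simply keeps the first three words, and fixed-size word windows stand in for the recursive text splitter the agent uses.

```python
from typing import Callable, List


def split_into_chunks(text: str, max_words: int) -> List[str]:
    """Split text into consecutive chunks of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def fake_summarize(text: str) -> str:
    """Stand-in for the map chain (assumption): keep the first three words."""
    return " ".join(text.split()[:3])


def summarize_with_fallback(
    content: str,
    max_words: int,
    summarize: Callable[[str], str] = fake_summarize,
) -> dict:
    """Summarize directly when the document fits; otherwise chunk, summarize, combine."""
    if len(content.split()) <= max_words:
        summary = summarize(content)
    else:
        partials = [summarize(chunk) for chunk in split_into_chunks(content, max_words)]
        summary = summarize(" ".join(partials))
    # Mirrors the documented return shape: a 'summaries' key holding a one-item list.
    return {"summaries": [summary]}


short_result = summarize_with_fallback("one two three four", max_words=10)
long_result = summarize_with_fallback("w1 w2 w3 w4 w5 w6 w7 w8", max_words=4)
```

In this sketch the oversized document is detected up front by its length; an implementation could equally react to a token-limit error raised by the model call.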
- length_function(documents)
Calculate total token count for documents.
Computes the sum of tokens across all provided documents using the reduce chain’s tokenizer.
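As a rough sketch of this computation, assuming documents expose their text as `page_content` and that a token counter is passed in explicitly (the `Doc` class and `total_tokens` function are hypothetical stand-ins, not the agent's own types):

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Doc:
    """Minimal stand-in for a Document object (assumption)."""
    page_content: str


def total_tokens(documents: List[Doc], get_num_tokens: Callable[[str], int]) -> int:
    """Sum the token counts of all documents using the supplied counter."""
    return sum(get_num_tokens(doc.page_content) for doc in documents)


docs = [Doc("alpha beta"), Doc("gamma delta epsilon")]
# A whitespace word count stands in for a real tokenizer here.
n_tokens = total_tokens(docs, lambda text: len(text.split()))
```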
- map_summaries(state)
Map documents to summary generation tasks.
Creates parallel summary generation tasks for each input document. Each document is sent to the generate_summary node for processing.
- Parameters:
state (haive.agents.document_modifiers.summarizer.map_branch.state.SummaryState) – Current state containing the list of documents to summarize. Must have a ‘contents’ key with a list of document texts.
- Returns:
List of Send commands, one for each document to summarize.
- Return type:
list[langgraph.types.Send]
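The fan-out pattern can be sketched without LangGraph installed. The `Send` class below is a local stand-in that only mimics the node-plus-payload shape of `langgraph.types.Send`, and `fan_out_documents` is a hypothetical name; the ‘content’ payload key follows the requirement stated for generate_summary.

```python
from typing import Any, Dict, List, NamedTuple


class Send(NamedTuple):
    """Stand-in for langgraph.types.Send (assumption): a target node plus its input."""
    node: str
    arg: Any


def fan_out_documents(state: Dict[str, List[str]]) -> List[Send]:
    """Emit one Send per document, each targeting the generate_summary node."""
    return [Send("generate_summary", {"content": content}) for content in state["contents"]]


sends = fan_out_documents({"contents": ["doc one", "doc two"]})
```

Because each `Send` carries its own payload, the target node sees only one document at a time, which is what allows the summaries to be generated in parallel.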
- setup_workflow()
Set up the map-reduce summarization workflow.
Constructs a StateGraph that implements the following workflow:
1. Map phase: Generate summaries for each input document
2. Collect phase: Gather all individual summaries
3. Collapse phase: Combine summaries if they exceed token limits
4. Final phase: Generate the final consolidated summary
The workflow includes conditional edges that determine whether intermediate summaries need to be collapsed based on token counts.
Note
This method is called automatically during agent initialization and does not need to be invoked manually.
- Return type:
None
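The control flow of the four phases can be approximated by an ordinary loop. This is a sketch under stated assumptions, not the StateGraph the method builds: `run_map_reduce` is a hypothetical function, the map, reduce, and length callables are toy stand-ins, and the collapse step simply reduces summaries in pairs instead of grouping them by token count.

```python
from typing import Callable, List


def run_map_reduce(
    contents: List[str],
    token_max: int,
    map_fn: Callable[[str], str],
    reduce_fn: Callable[[List[str]], str],
    length_fn: Callable[[List[str]], int],
) -> str:
    """Plain-Python walk through the four phases described above."""
    if not contents:
        raise ValueError("No documents provided for summarization")
    # 1. Map phase: summarize each document independently.
    summaries = [map_fn(text) for text in contents]
    # 2. Collect phase: gather the individual summaries.
    collapsed = list(summaries)
    # 3. Collapse phase: reduce pairs of summaries until the total fits the limit.
    while length_fn(collapsed) > token_max and len(collapsed) > 1:
        collapsed = [reduce_fn(collapsed[i:i + 2]) for i in range(0, len(collapsed), 2)]
    # 4. Final phase: reduce whatever remains into one summary.
    return reduce_fn(collapsed)


# Toy stand-ins (assumptions): keep two words per document, one word per reduced part.
result = run_map_reduce(
    contents=["a1 a2 a3", "b1 b2 b3", "c1 c2 c3", "d1 d2 d3"],
    token_max=4,
    map_fn=lambda text: " ".join(text.split()[:2]),
    reduce_fn=lambda parts: " ".join(p.split()[0] for p in parts),
    length_fn=lambda parts: sum(len(p.split()) for p in parts),
)
```

The `while` loop plays the role of the conditional edge: it keeps routing back to the collapse step until the summaries fit within `token_max`.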
- should_collapse(state)
Determine if summaries need further collapsing.
Checks if the total token count of collapsed summaries exceeds the configured limit. If so, directs to further collapsing; otherwise proceeds to final summary generation.
- Parameters:
state (haive.agents.document_modifiers.summarizer.map_branch.state.SummaryState) – Current state with collapsed summaries to evaluate.
- Returns:
Name of the next node: ‘collapse_summaries’ if over limit, ‘generate_final_summary’ otherwise.
- Return type:
str
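A routing function of this kind might look like the following sketch. The name `route_after_collect` and the whitespace-based token count are assumptions; only the two node names come from the description above.

```python
from typing import List, Literal


def route_after_collect(
    summaries: List[str], token_max: int
) -> Literal["collapse_summaries", "generate_final_summary"]:
    """Return the next node name based on the total (whitespace) token count."""
    total = sum(len(s.split()) for s in summaries)
    return "collapse_summaries" if total > token_max else "generate_final_summary"


over_limit = route_after_collect(["a b c", "d e"], token_max=4)
within_limit = route_after_collect(["a b c", "d e"], token_max=5)
```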
- agents.document_modifiers.summarizer.map_branch.agent.build_agent()
Create a default SummarizerAgent instance.
- Returns:
SummarizerAgent with default configuration.
- Return type:
SummarizerAgent