agents.document_modifiers.summarizer.map_branch.agent

Map-Reduce Summarizer Agent for document summarization.

This module provides the SummarizerAgent class which implements a map-reduce approach to document summarization. It can handle large documents by splitting them into manageable chunks, summarizing each chunk, and then combining the summaries into a final coherent summary.

The agent enforces token limits and automatically falls back to splitting oversized documents into chunks.

Classes:

SummarizerAgent: Main agent for map-reduce document summarization

Examples

Basic usage:

from haive.agents.document_modifiers.summarizer.map_branch import SummarizerAgent
from haive.agents.document_modifiers.summarizer.map_branch.config import SummarizerAgentConfig

config = SummarizerAgentConfig(
    token_max=1000,
    name="document_summarizer"
)
agent = SummarizerAgent(config)

documents = ["Long document text 1...", "Long document text 2..."]
result = agent.run({"contents": documents})
summary = result["final_summary"]

With custom token limits:

# custom_map_config and custom_reduce_config are placeholders for engine configs defined elsewhere
config = SummarizerAgentConfig(
    token_max=2000,  # Allow longer intermediate summaries
    engines={
        "map_chain": custom_map_config,
        "reduce_chain": custom_reduce_config
    }
)
agent = SummarizerAgent(config)

See also

  • SummarizerAgentConfig: Configuration class

  • SummaryState: State management

Classes

SummarizerAgent

Agent that summarizes documents using a map-reduce approach.

Functions

build_agent()

Create a default SummarizerAgent instance.

Module Contents

class agents.document_modifiers.summarizer.map_branch.agent.SummarizerAgent(config=SummarizerAgentConfig())

Bases: haive.core.engine.agent.agent.Agent[haive.agents.document_modifiers.summarizer.map_branch.config.SummarizerAgentConfig]

Agent that summarizes documents using a map-reduce approach.

This agent implements a document summarization workflow that handles both large documents and multiple documents in a single run. It uses a map-reduce pattern: documents are first summarized individually (map phase), then the summaries are combined and reduced to a final summary (reduce phase).

The agent automatically handles token limit constraints by:

  1. Splitting oversized documents into manageable chunks

  2. Summarizing chunks individually

  3. Collapsing intermediate summaries when they exceed token limits

  4. Producing a coherent final summary

Parameters:

config (haive.agents.document_modifiers.summarizer.map_branch.config.SummarizerAgentConfig) – Configuration object containing token limits, LLM settings, and workflow parameters.

token_max

Maximum token limit for intermediate summaries

map_chain

Runnable for individual document summarization

reduce_chain

Runnable for combining and reducing summaries

text_splitter

Utility for splitting oversized documents

Examples

Basic document summarization:

config = SummarizerAgentConfig(token_max=1000)
agent = SummarizerAgent(config)

docs = ["First document content...", "Second document content..."]
result = agent.run({"contents": docs})
print(result["final_summary"])

Handling large documents:

# Agent automatically splits and processes large documents
large_doc = "Very long document content..." * 1000
result = agent.run({"contents": [large_doc]})
# The agent will chunk, summarize, and combine automatically

With custom configuration:

# custom_map_config and custom_reduce_config are placeholders for engine configs defined elsewhere
config = SummarizerAgentConfig(
    token_max=2000,
    engines={
        "map_chain": custom_map_config,
        "reduce_chain": custom_reduce_config
    }
)
agent = SummarizerAgent(config)

Note

The agent uses recursive text splitting to handle documents that exceed token limits. Chunk summaries are automatically combined using the reduce chain to maintain coherence.

Raises:
  • ValueError – If no documents are provided for summarization

  • RuntimeError – If summarization fails after all retry attempts

See also

  • SummarizerAgentConfig: Configuration options

  • SummaryState: State management for the workflow

  • setup_workflow(): Workflow construction details

Initialize the SummarizerAgent with configuration.

Sets up the map and reduce chains for document processing and initializes the text splitter for handling oversized documents.

Parameters:

config (haive.agents.document_modifiers.summarizer.map_branch.config.SummarizerAgentConfig) – Agent configuration with token limits and LLM settings. Defaults to a new instance with default values.

async collapse_summaries(state)

Collapse summaries that exceed token limits.

When intermediate summaries collectively exceed the token limit, this method splits them into groups and reduces each group to a more concise summary.

Parameters:

state (haive.agents.document_modifiers.summarizer.map_branch.state.SummaryState) – Current state containing Document objects to collapse. Must have a ‘collapsed_summaries’ key.

Returns:

Command updating the state with reduced summaries.

Return type:

langgraph.types.Command
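
The grouping step described for collapse_summaries can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the agent's implementation: split_into_groups and the whitespace-based count_tokens are hypothetical stand-ins for whatever splitter and tokenizer the agent actually uses, and summaries are plain strings rather than Document objects.

```python
from typing import Callable


def count_tokens(text: str) -> int:
    """Hypothetical tokenizer stand-in: one token per whitespace-separated word."""
    return len(text.split())


def split_into_groups(
    summaries: list[str],
    token_max: int,
    length_fn: Callable[[str], int] = count_tokens,
) -> list[list[str]]:
    """Greedily pack summaries into groups whose total length stays within token_max."""
    groups: list[list[str]] = []
    current: list[str] = []
    current_len = 0
    for summary in summaries:
        size = length_fn(summary)
        # Close the current group when adding this summary would exceed the limit.
        if current and current_len + size > token_max:
            groups.append(current)
            current, current_len = [], 0
        current.append(summary)
        current_len += size
    if current:
        groups.append(current)
    return groups


groups = split_into_groups(["a b c", "d e", "f g h i", "j"], token_max=5)
```

Each resulting group would then be passed through the reduce step to produce one shorter summary per group. A single summary that is itself larger than token_max still ends up in a group of its own in this sketch.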

collect_summaries(state)

Collect individual summaries into document objects.

Transforms the list of summary strings into Document objects for further processing in the collapse phase.

Parameters:

state (haive.agents.document_modifiers.summarizer.map_branch.state.SummaryState) – Current state containing individual summaries. Must have a ‘summaries’ key with a list of summary texts.

Returns:

Command updating the state with collapsed_summaries as Document objects.

Return type:

langgraph.types.Command

async generate_final_summary(state)

Generate the final consolidated summary.

Processes all collapsed summaries through the reduce chain to create a single, coherent final summary.

Parameters:

state (haive.agents.document_modifiers.summarizer.map_branch.state.SummaryState) – Current state with collapsed summaries ready for final reduction.

Returns:

Command updating the state with the final summary text.

Return type:

langgraph.types.Command

async generate_summary(state)

Generate a summary for a single document.

Processes a document through the map chain to create an individual summary. If the document exceeds token limits, it automatically splits the document into chunks and summarizes each chunk before combining them.

Parameters:

state (haive.agents.document_modifiers.summarizer.map_branch.state.SummaryState) – Current state containing the document content to summarize. Must have a ‘content’ key with the document text.

Returns:

Dictionary with a ‘summaries’ key containing a list that holds the generated summary text.

Return type:

dict

Note

This method includes automatic error recovery for token limit issues by splitting oversized documents into manageable chunks.
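
The split-and-summarize fallback can be illustrated with a small synchronous sketch. Everything here is an assumption made for illustration: fake_summarize stands in for the map chain, split_text is a naive word-based splitter rather than the agent's text splitter, and joining chunk summaries with a space is only one possible way to combine them. The real method is async.

```python
def count_tokens(text: str) -> int:
    """Hypothetical tokenizer stand-in: one token per whitespace-separated word."""
    return len(text.split())


def split_text(text: str, chunk_tokens: int) -> list[str]:
    """Naive splitter: consecutive chunks of at most chunk_tokens words."""
    words = text.split()
    return [
        " ".join(words[i:i + chunk_tokens])
        for i in range(0, len(words), chunk_tokens)
    ]


def fake_summarize(text: str) -> str:
    """Stand-in for the map chain: keep the first three words."""
    return " ".join(text.split()[:3])


def summarize_with_fallback(content: str, token_max: int) -> dict[str, list[str]]:
    """Summarize directly when the text fits, otherwise chunk it first."""
    if count_tokens(content) <= token_max:
        return {"summaries": [fake_summarize(content)]}
    # Oversized input: summarize each chunk, then join the chunk summaries.
    chunk_summaries = [fake_summarize(c) for c in split_text(content, token_max)]
    return {"summaries": [" ".join(chunk_summaries)]}


short_result = summarize_with_fallback("one two three four", token_max=10)
long_result = summarize_with_fallback(
    "w1 w2 w3 w4 w5 w6 w7 w8 w9 w10", token_max=4
)
```

Both paths return the same shape, a dictionary with a single-element ‘summaries’ list, so downstream steps do not need to know whether chunking occurred.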

length_function(documents)

Calculate total token count for documents.

Computes the sum of tokens across all provided documents using the reduce chain’s tokenizer.

Parameters:

documents (list[langchain_core.documents.Document]) – List of Document objects to count tokens for.

Returns:

Total number of tokens across all documents.

Return type:

int
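
A rough sketch of the token-summing logic follows. The Doc dataclass is a hypothetical stand-in for langchain_core.documents.Document (only the page_content field is modeled), and count_tokens approximates a tokenizer by splitting on whitespace; the agent itself uses the reduce chain's tokenizer.

```python
from dataclasses import dataclass


@dataclass
class Doc:
    """Hypothetical stand-in for langchain_core.documents.Document."""
    page_content: str


def count_tokens(text: str) -> int:
    """Hypothetical tokenizer stand-in: one token per whitespace-separated word."""
    return len(text.split())


def length_function(documents: list[Doc]) -> int:
    """Sum the token counts of every document's page_content."""
    return sum(count_tokens(doc.page_content) for doc in documents)


total = length_function([Doc("alpha beta"), Doc("gamma delta epsilon")])
```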

map_summaries(state)

Map documents to summary generation tasks.

Creates parallel summary generation tasks for each input document. Each document is sent to the generate_summary node for processing.

Parameters:

state (haive.agents.document_modifiers.summarizer.map_branch.state.SummaryState) – Current state containing the list of documents to summarize. Must have a ‘contents’ key with a list of document texts.

Returns:

List of Send commands, one for each document to summarize.

Return type:

list[langgraph.types.Send]
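
The fan-out can be sketched as below. To keep the example self-contained, Send is defined here as a local stand-in that mirrors the (node, arg) shape of langgraph.types.Send; the state is modeled as a plain dict, which is an assumption about how SummaryState is accessed.

```python
from typing import Any, NamedTuple


class Send(NamedTuple):
    """Local stand-in mirroring the (node, arg) shape of langgraph.types.Send."""
    node: str
    arg: dict[str, Any]


def map_summaries(state: dict[str, list[str]]) -> list[Send]:
    """Emit one Send per document, each targeting the generate_summary node."""
    return [
        Send("generate_summary", {"content": content})
        for content in state["contents"]
    ]


sends = map_summaries({"contents": ["doc one", "doc two"]})
```

Each Send carries its own ‘content’ payload, which matches the ‘content’ key that generate_summary expects in its input state.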

setup_workflow()

Set up the map-reduce summarization workflow.

Constructs a StateGraph that implements the following workflow:

  1. Map phase: Generate summaries for each input document

  2. Collect phase: Gather all individual summaries

  3. Collapse phase: Combine summaries if they exceed token limits

  4. Final phase: Generate the final consolidated summary

The workflow includes conditional edges that determine whether intermediate summaries need to be collapsed based on token counts.

Note

This method is called automatically during agent initialization and does not need to be invoked manually.

Return type:

None
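
The control flow of the four phases can be approximated without a graph library. The sketch below is a plain-Python analogue, not the StateGraph the method builds: shorten is a hypothetical stand-in for both LLM chains, and merging adjacent pairs is just one simple collapse strategy chosen so the loop is guaranteed to terminate.

```python
def count_tokens(text: str) -> int:
    """Hypothetical tokenizer stand-in: one token per whitespace-separated word."""
    return len(text.split())


def shorten(text: str, keep: int = 2) -> str:
    """Stand-in for an LLM chain: keep the first `keep` words."""
    return " ".join(text.split()[:keep])


def run_workflow(contents: list[str], token_max: int) -> str:
    # 1. Map phase: summarize each document independently.
    summaries = [shorten(content) for content in contents]
    # 2. Collect phase: gather the individual summaries.
    collapsed = list(summaries)
    # 3. Collapse phase: repeat while the combined summaries are over the limit.
    while sum(count_tokens(s) for s in collapsed) > token_max and len(collapsed) > 1:
        collapsed = [
            shorten(" ".join(collapsed[i:i + 2]))
            for i in range(0, len(collapsed), 2)
        ]
    # 4. Final phase: reduce what remains to a single summary.
    return shorten(" ".join(collapsed), keep=token_max)


final = run_workflow(
    ["a1 a2 a3", "b1 b2 b3", "c1 c2 c3", "d1 d2 d3"], token_max=4
)
```

The while condition plays the role of the conditional edge: it decides after each collapse pass whether another pass is needed or the final reduction can run.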

should_collapse(state)

Determine if summaries need further collapsing.

Checks if the total token count of collapsed summaries exceeds the configured limit. If so, directs to further collapsing; otherwise proceeds to final summary generation.

Parameters:

state (haive.agents.document_modifiers.summarizer.map_branch.state.SummaryState) – Current state with collapsed summaries to evaluate.

Returns:

Name of the next node: ‘collapse_summaries’ if over the limit, ‘generate_final_summary’ otherwise.

Return type:

str
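
The routing decision reduces to a single comparison. In this sketch the function takes the summaries and limit as explicit arguments and counts tokens by whitespace; both are simplifying assumptions, since the real method reads them from the state and the agent's configuration.

```python
def count_tokens(text: str) -> int:
    """Hypothetical tokenizer stand-in: one token per whitespace-separated word."""
    return len(text.split())


def should_collapse(collapsed_summaries: list[str], token_max: int) -> str:
    """Return the name of the next node based on the combined token count."""
    total = sum(count_tokens(s) for s in collapsed_summaries)
    if total > token_max:
        return "collapse_summaries"
    return "generate_final_summary"


over = should_collapse(["a b c", "d e f"], token_max=5)
at_limit = should_collapse(["a b c", "d e"], token_max=5)
```

A total exactly equal to the limit routes to the final summary here, following the "exceeds the configured limit" wording above.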

agents.document_modifiers.summarizer.map_branch.agent.build_agent()

Create a default SummarizerAgent instance.

Returns:

SummarizerAgent with default configuration.

Return type:

SummarizerAgent