agents.document_modifiers.tnt.state
State management for taxonomy generation workflow.
This module defines the state schema used throughout the taxonomy generation process. It provides a structured way to track documents, their groupings into minibatches, and the evolution of taxonomy clusters over multiple iterations.
Examples
Basic usage of the state class:
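from agents.document_modifiers.tnt.state import TaxonomyGenerationState

# Doc is the project's document class; its import path is not shown on this page.
state = TaxonomyGenerationState(
    documents=[Doc(id="1", content="text")],
    minibatches=[[0]],
    clusters=[[{"id": 1, "name": "Category"}]],
)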
Classes
TaxonomyGenerationState | Represents the state passed between graph nodes in the taxonomy generation process.
Module Contents
- class agents.document_modifiers.tnt.state.TaxonomyGenerationState(/, **data)
Bases: pydantic.BaseModel
Represents the state passed between graph nodes in the taxonomy generation process.
This class maintains the complete state of the taxonomy generation workflow, tracking raw documents, their organization into processing batches, and the history of taxonomy revisions.
- Parameters:
data (Any)
- documents
List of document objects, each containing:
  - id: Unique identifier
  - content: Raw text
  - summary: Generated summary (added in first step)
  - explanation: Summary explanation (added in first step)
  - category: Assigned taxonomy category (added later)
- Type:
List[Doc]
- minibatches
Groups of document indices for batch processing. Each inner list holds integer indices into the documents list.
- Type:
List[List[int]]
- clusters
History of taxonomy revisions. Each revision is a list of cluster dictionaries containing:
  - id: Cluster identifier
  - name: Category name
  - description: Category description
- Type:
List[List[dict]]
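The relationship between the three attributes can be sketched with plain Python data. This is only an illustration under stated assumptions: plain dicts stand in for Doc objects, make_minibatches is a hypothetical helper that is not part of this module, and the newest taxonomy revision is assumed to be the last element of clusters.

```python
# Stand-in data: plain dicts are used instead of the real Doc class,
# whose import path is not shown on this page.
documents = [
    {"id": "a", "content": "first text"},
    {"id": "b", "content": "second text"},
    {"id": "c", "content": "third text"},
]


def make_minibatches(n_docs, batch_size):
    """Hypothetical helper: split indices 0..n_docs-1 into consecutive batches."""
    indices = list(range(n_docs))
    return [indices[i:i + batch_size] for i in range(0, n_docs, batch_size)]


# Each inner list holds indices into `documents`, not the documents themselves.
minibatches = make_minibatches(len(documents), batch_size=2)

# Resolve each minibatch of indices back to the documents it refers to.
resolved = [[documents[i] for i in batch] for batch in minibatches]

# `clusters` holds one list of cluster dicts per taxonomy revision.
clusters = [
    [{"id": 1, "name": "Tech", "description": "Technology"}],
]

# Assumption: each new revision is appended, so the latest one is clusters[-1].
clusters.append([
    {"id": 1, "name": "Tech", "description": "Technology"},
    {"id": 2, "name": "Health", "description": "Health and medicine"},
])
latest_taxonomy = clusters[-1]
```

Storing indices rather than document copies keeps each minibatch small and means that fields added to a document later (summary, category) are visible through every batch that references it.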
Examples
>>> docs = [Doc(id="1", content="text")]
>>> state = TaxonomyGenerationState(
...     documents=docs,
...     minibatches=[[0]],
...     clusters=[[{"id": 1, "name": "Tech", "description": "Technology"}]]
... )
Create a new model by parsing and validating input data from keyword arguments.
Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.
The self parameter is explicitly positional-only so that "self" can be used as a field name.
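The validation behavior described above can be sketched as follows. Both classes here are hypothetical stand-ins that only mirror the documented field types; the real Doc and TaxonomyGenerationState may declare defaults, extra fields, or validators not shown on this page. The try_build helper is likewise an assumption made for illustration.

```python
from typing import List

from pydantic import BaseModel, ValidationError


class Doc(BaseModel):
    """Stand-in for the real Doc class; only id and content are modeled."""
    id: str
    content: str


class TaxonomyStateSketch(BaseModel):
    """Hypothetical model mirroring the documented attribute types."""
    documents: List[Doc]
    minibatches: List[List[int]]
    clusters: List[List[dict]]


def try_build(**data):
    """Return (state, None) on success or (None, error) if validation fails."""
    try:
        return TaxonomyStateSketch(**data), None
    except ValidationError as exc:
        return None, exc


# Well-formed input: the model is created and no error is returned.
ok_state, ok_err = try_build(
    documents=[Doc(id="1", content="text")],
    minibatches=[[0]],
    clusters=[[{"id": 1, "name": "Tech", "description": "Technology"}]],
)

# Malformed input: minibatches must contain integers, so this string fails.
bad_state, bad_err = try_build(
    documents=[Doc(id="1", content="text")],
    minibatches=[["not-an-index"]],
    clusters=[],
)
```

Catching ValidationError at the point of construction lets a graph node reject malformed state early instead of failing later when a minibatch index is used to look up a document.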