haive.core.schema.schema_composer
SchemaComposer for the Haive Schema System.
This module provides the SchemaComposer class, which offers a streamlined API for building state schemas dynamically from various components. It is designed for schema composition: complex state schemas are created by combining fields from multiple sources.
The SchemaComposer is particularly useful for:
- Building schemas from heterogeneous components (engines, models, dictionaries)
- Dynamically creating schemas at runtime based on available components
- Composing schemas with proper field sharing, reducers, and engine I/O mappings
- Ensuring consistent state handling across complex agent architectures
Key features include:
- Automatic field extraction from components
- Field definition management with comprehensive metadata
- Support for shared fields between parent and child graphs
- Tracking of engine input/output relationships
- Integration with structured output models
- Rich visualization for debugging and analysis
Examples
    from haive.core.schema import SchemaComposer
    from typing import List
    from langchain_core.messages import BaseMessage
    from pydantic import Field
    import operator

    # Create a new composer
    composer = SchemaComposer(name="ConversationState")

    # Add fields manually
    composer.add_field(
        name="messages",
        field_type=List[BaseMessage],
        default_factory=list,
        description="Conversation history",
        shared=True,
        reducer="add_messages",
    )

    composer.add_field(
        name="context",
        field_type=List[str],
        default_factory=list,
        description="Retrieved document contexts",
        reducer=operator.add,
    )

    # Extract fields from components
    # (retriever_engine, llm_engine and memory_component are assumed to be defined elsewhere)
    composer.add_fields_from_components([
        retriever_engine,
        llm_engine,
        memory_component,
    ])

    # Build the schema
    ConversationState = composer.build()

    # Use the schema
    state = ConversationState()
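The reducer arguments in the example above decide how an update to a field is merged into the existing value. The following is a minimal, stand-alone sketch of that idea using only the standard library. The `apply_update` helper is hypothetical and is not part of Haive; it only illustrates what a reducer such as `operator.add` does to a list-valued field.

```python
import operator
from typing import Any, Callable, Dict

# Hypothetical helper (not part of Haive): merge an update into a state dict.
# Fields with a registered reducer are combined; all other fields are overwritten.
def apply_update(
    state: Dict[str, Any],
    update: Dict[str, Any],
    reducers: Dict[str, Callable[[Any, Any], Any]],
) -> Dict[str, Any]:
    merged = dict(state)  # shallow copy so the original state is left untouched
    for key, value in update.items():
        reducer = reducers.get(key)
        if reducer is not None and key in merged:
            merged[key] = reducer(merged[key], value)
        else:
            merged[key] = value
    return merged

# "context" accumulates via operator.add (list concatenation); "query" has no reducer.
reducers = {"context": operator.add}
state = {"context": ["doc A"], "query": "old"}
new_state = apply_update(state, {"context": ["doc B"], "query": "new"}, reducers)
print(new_state)
```

With a reducer the field accumulates values across updates; without one, the latest value simply replaces the previous one.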
Classes
SchemaComposer – Utility for building state schemas dynamically from component fields.
Module Contents
- class haive.core.schema.schema_composer.SchemaComposer(name='ComposedSchema', base_state_schema=None)
Utility for building state schemas dynamically from component fields.
The SchemaComposer provides a high-level, builder-style API for creating state schemas by combining fields from various sources. It handles the complex details of field extraction, metadata management, and schema generation, providing a streamlined interface for schema composition.
Key capabilities include:
- Dynamically extracting fields from components (engines, models, dictionaries)
- Adding and configuring fields individually with comprehensive options
- Tracking field relationships, shared status, and reducer functions
- Managing engine I/O mappings for proper state handling
- Building optimized schema classes with the right configuration
- Supporting nested state schemas and structured output models
- Providing rich visualization for debugging and analysis
This class is the primary builder interface for dynamic schema creation in the Haive Schema System, offering a more declarative approach than StateSchemaManager. It’s particularly useful for creating schemas at runtime based on available components, ensuring consistent state handling across complex agent architectures.
SchemaComposer is designed to be used either directly or through class methods like from_components() for simplified schema creation from a list of components.
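To make the builder-style pattern concrete, here is a small stand-alone sketch. `MiniComposer` is a hypothetical toy class built on the standard library's `dataclasses.make_dataclass`. It is not Haive's implementation and ignores sharing, reducers, and engine I/O; it only shows how "add fields, return self, then build a class" can fit together.

```python
from dataclasses import field, fields, make_dataclass
from typing import Any, Callable, List, Optional, Tuple

class MiniComposer:
    """Toy builder (hypothetical, not Haive's class) that collects fields and builds a dataclass."""

    def __init__(self, name: str = "ComposedSchema") -> None:
        self.name = name
        self._fields: List[Tuple[str, Any, Any]] = []

    def add_field(
        self,
        name: str,
        field_type: Any,
        default: Any = None,
        default_factory: Optional[Callable[[], Any]] = None,
    ) -> "MiniComposer":
        # Prefer a factory for mutable defaults, otherwise use the plain default value.
        if default_factory is not None:
            spec = field(default_factory=default_factory)
        else:
            spec = field(default=default)
        self._fields.append((name, field_type, spec))
        return self  # returning self is what enables method chaining

    def build(self) -> type:
        # Generate a new class at runtime from the collected field definitions.
        return make_dataclass(self.name, self._fields)

ChatState = (
    MiniComposer("ChatState")
    .add_field("messages", List[str], default_factory=list)
    .add_field("summary", Optional[str])
    .build()
)
state = ChatState()
print(state)
```

The real SchemaComposer produces a StateSchema subclass rather than a dataclass, but the calling style is the same: configure fields step by step, then call build() once.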
Initialize a new SchemaComposer.
- Parameters:
name (str) – The name for the composed schema class. Defaults to “ComposedSchema”.
base_state_schema (type[haive.core.schema.state_schema.StateSchema] | None) – Optional custom base state schema to use. If not provided, the composer will auto-detect the appropriate base class.
Examples
Creating a schema composer for a conversational agent:
    composer = SchemaComposer(name="ConversationState")
    composer.add_field("messages", List[BaseMessage], default_factory=list)
    schema_class = composer.build()
Using a custom base schema:
    from haive.core.schema.prebuilt import MessagesStateWithTokenUsage

    composer = SchemaComposer(
        name="TokenAwareState",
        base_state_schema=MessagesStateWithTokenUsage,
    )
- add_engine(engine)
Add an engine to the composer for tracking and later updates.
- Parameters:
engine (Any) – Engine to add
- Returns:
Self for chaining
- Return type:
SchemaComposer
- add_engine_management()
Add standardized engine management fields to the schema.
This method adds the new engine management pattern to support:
- Optional ‘engine’ field for the primary/main engine
- Explicit ‘engines’ dict field (previously implicit)
- Automatic synchronization between the two
This is part of the schema simplification effort to provide clearer patterns for engine management while maintaining backward compatibility.
- Returns:
Self for chaining
- Return type:
SchemaComposer
- add_field(name, field_type, default=None, default_factory=None, description=None, shared=False, reducer=None, source=None, input_for=None, output_from=None)
Add a field definition to the schema.
This method adds a field to the schema being composed, with comprehensive configuration for type, default values, sharing behavior, reducer functions, and engine I/O relationships. It handles special cases like fields provided by the base class and nested StateSchema fields.
The method performs validation on the field type and ensures proper tracking of metadata for schema generation. It’s the core building block for schema composition, allowing fine-grained control over field properties.
- Parameters:
name (str) – Field name
field_type (type) – Type of the field (e.g., str, List[int], Optional[Dict[str, Any]])
default (Any) – Default value for the field
default_factory (collections.abc.Callable[[], Any] | None) – Optional factory function for creating default values
description (str | None) – Optional field description for documentation
shared (bool) – Whether field is shared with parent graph (enables state synchronization)
reducer (collections.abc.Callable | None) – Optional reducer function for merging field values during state updates
source (str | None) – Optional source identifier (component or module name)
input_for (list[str] | None) – Optional list of engines this field is input for
output_from (list[str] | None) – Optional list of engines this field is output from
- Returns:
Self for method chaining to enable fluent API style
- Return type:
SchemaComposer
Examples
    # Assumes SchemaComposer, List, BaseMessage and add_messages are already imported
    composer = SchemaComposer(name="MyState")
    composer.add_field(
        name="messages",
        field_type=List[BaseMessage],
        default_factory=list,
        description="Conversation history",
        shared=True,
        reducer=add_messages,
        input_for=["memory_engine"],
        output_from=["llm_engine"],
    )
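The `input_for` and `output_from` arguments record which engines read or write a field. As a rough illustration of that bookkeeping, the sketch below keeps two maps from engine name to field names. `IOTracker` is a hypothetical stand-in written for this example; how Haive stores these relationships internally is not documented here.

```python
from collections import defaultdict
from typing import DefaultDict, List, Optional, Set

class IOTracker:
    """Hypothetical bookkeeping for engine I/O relationships (not Haive's implementation)."""

    def __init__(self) -> None:
        self.inputs: DefaultDict[str, Set[str]] = defaultdict(set)   # engine -> fields it reads
        self.outputs: DefaultDict[str, Set[str]] = defaultdict(set)  # engine -> fields it writes

    def register(
        self,
        field_name: str,
        input_for: Optional[List[str]] = None,
        output_from: Optional[List[str]] = None,
    ) -> "IOTracker":
        for engine in input_for or []:
            self.inputs[engine].add(field_name)
        for engine in output_from or []:
            self.outputs[engine].add(field_name)
        return self

tracker = IOTracker()
tracker.register("messages", input_for=["memory_engine"], output_from=["llm_engine"])
tracker.register("context", input_for=["llm_engine"])

print(dict(tracker.inputs))
print(dict(tracker.outputs))
```

A mapping of this kind is what allows a graph runtime to route only the relevant part of the state to each engine.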
- add_fields_from_components(components)
Add fields from multiple components to the schema.
This method intelligently processes a list of heterogeneous components, automatically detecting their types and extracting fields using the appropriate extraction strategy. It supports engines, Pydantic models, dictionaries, and other component types, providing a unified interface for schema composition from diverse sources.
The method first detects base class requirements (such as the need for messages or tools fields) and then processes each component individually, delegating to specialized field extraction methods based on component type. After processing all components, it ensures standard fields are present and properly configured.
- Parameters:
components (list[Any]) – List of components to extract fields from, which can include:
- Engine instances (with engine_type attribute)
- Pydantic BaseModel instances or classes
- Dictionaries of field definitions
- Other component types with field information
- Returns:
Self for method chaining to enable fluent API style
- Return type:
SchemaComposer
Examples
    # Create a schema from multiple components
    # (llm_engine, retriever_engine and MemoryConfig are assumed to be defined elsewhere)
    composer = SchemaComposer(name="AgentState")
    composer.add_fields_from_components([
        llm_engine,        # Engine instance
        retriever_engine,  # Engine instance
        MemoryConfig,      # Pydantic model class
        {"context": (List[str], list, {"description": "Retrieved documents"})},
    ])
Note
This is one of the most powerful methods in SchemaComposer, as it can automatically build a complete schema from a list of components without requiring manual field definition. It’s particularly useful for dynamic composition of schemas at runtime.
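The type-based dispatch described for add_fields_from_components can be pictured with a short, stand-alone sketch. Everything here is an assumption made for illustration: `classify` and `collect_field_types` are hypothetical helpers, `FakeEngine` only mimics the documented `engine_type` attribute, a plain annotated class stands in for a Pydantic model, and the first element of each dictionary tuple is taken to be the field type.

```python
from typing import Any, Dict, List, get_type_hints

class FakeEngine:
    """Hypothetical engine stand-in; only the engine_type attribute matters here."""
    engine_type = "llm"

class MemoryConfig:
    """Plain annotated class standing in for a Pydantic model class."""
    max_items: int = 10

def classify(component: Any) -> str:
    # Order matters: check the most specific shapes first.
    if isinstance(component, dict):
        return "dict"
    if hasattr(component, "engine_type"):
        return "engine"
    if isinstance(component, type) and get_type_hints(component):
        return "model"
    return "other"

def collect_field_types(components: List[Any]) -> Dict[str, Any]:
    found: Dict[str, Any] = {}
    for component in components:
        kind = classify(component)
        if kind == "dict":
            # Assumed layout: name -> (type, default or factory, metadata)
            for name, spec in component.items():
                found[name] = spec[0]
        elif kind == "model":
            found.update(get_type_hints(component))
        # Engines and unknown components are skipped in this sketch.
    return found

components = [
    FakeEngine(),
    MemoryConfig,
    {"context": (List[str], list, {"description": "Retrieved documents"})},
]
kinds = [classify(c) for c in components]
field_types = collect_field_types(components)
print(kinds)
print(field_types)
```

The real method additionally extracts fields from engines and ensures standard fields are present, which this sketch leaves out.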
- add_fields_from_dict(fields_dict)
Add fields from a dictionary definition.
- add_fields_from_engine(engine)
Extract fields from an Engine object with enhanced nested schema handling.
- Parameters:
engine (Any) – Engine to extract fields from
- Returns:
Self for chaining
- Return type:
SchemaComposer
- add_fields_from_model(model)
Extract fields from a Pydantic model with improved handling of nested schemas.
- Parameters:
model (type[pydantic.BaseModel]) – Pydantic model to extract fields from
- Returns:
Self for chaining
- Return type:
SchemaComposer
- add_standard_field(field_name, **kwargs)
Add a standard field from the field registry.
- Parameters:
field_name (str) – Name of the standard field (e.g., ‘messages’, ‘context’)
**kwargs – Additional arguments to pass to the field factory
- Returns:
Self for chaining
- Return type:
SchemaComposer
- build()
Build and return a StateSchema class with all defined fields and metadata.
This method finalizes the schema composition process by generating a concrete StateSchema subclass with the appropriate base class (determined by detected requirements) and all the fields, metadata, and behaviors defined during composition. It performs comprehensive setup of the schema class, including:
- Field generation with proper types, defaults, and metadata
- Configuration of shared fields for parent-child graph relationships
- Setup of reducer functions for state merging
- Engine I/O tracking for proper state routing
- Structured output model integration
- Schema post-initialization for nested fields, dictionaries, and engine tool synchronization
- Rich visualization for debugging (when debug logging is enabled)
The generated schema is a fully functional Pydantic model subclass that can be instantiated directly or used as a state schema in a LangGraph workflow.
Engine Tool Synchronization
This method stores engines directly on the schema class and implements an enhanced model_post_init that ensures:
- Class-level engines are made available on instances
- For ToolState subclasses, tools from class-level engines are automatically synced to the instance’s tools list
This functionality bridges the gap between class-level engine storage and instance-level tool management, ensuring that tools from engines stored by SchemaComposer are properly synchronized with ToolState instances.
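The synchronization described above can be sketched with standard-library dataclasses, where `__post_init__` plays the role that model_post_init plays in a Pydantic model. `FakeEngine` and `ToolStateSketch` are hypothetical names used only for this illustration, and the deduplication rule (skip tools already present) is an assumption.

```python
from dataclasses import dataclass, field
from typing import ClassVar, Dict, List

@dataclass
class FakeEngine:
    """Hypothetical engine that carries a list of tool names."""
    name: str
    tools: List[str] = field(default_factory=list)

@dataclass
class ToolStateSketch:
    """Toy state class: engines live on the class, tools live on each instance."""
    tools: List[str] = field(default_factory=list)
    engines: ClassVar[Dict[str, FakeEngine]] = {}

    def __post_init__(self) -> None:
        # Copy tools from class-level engines onto this instance, skipping duplicates.
        for engine in type(self).engines.values():
            for tool in engine.tools:
                if tool not in self.tools:
                    self.tools.append(tool)

# A composer-like step would attach engines to the generated class.
ToolStateSketch.engines = {"llm": FakeEngine("llm", tools=["search", "calculator"])}

state = ToolStateSketch(tools=["search"])
print(state.tools)
```

Because the engines are stored on the class, every new instance sees them without having to pass them to the constructor.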
- Returns:
A StateSchema subclass with all defined fields, metadata, and behaviors
- Return type:
type[haive.core.schema.state_schema.StateSchema]
- build_with_engine_generics(name=None)
Build a StateSchema with resolved engine generics.
- Parameters:
name (str | None) – Optional name for the schema class
- Returns:
StateSchema class with concrete engine types
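- Return type:
type[haive.core.schema.state_schema.StateSchema]
- classmethod compose_input_schema(components, name='InputSchema')
Create an input schema from components, focusing on input fields.
- classmethod compose_output_schema(components, name='OutputSchema')
Create an output schema from components, focusing on output fields.
- compose_state_from_io(input_schema, output_schema)
Compose a state schema from input and output schemas using this composer.
- Parameters:
input_schema – Schema that supplies the input fields
output_schema – Schema that supplies the output fields
- Returns:
StateSchema subclass
- Return type:
type[haive.core.schema.state_schema.StateSchema]
- configure_messages_field(with_reducer=True, force_add=False)
Configure a messages field with appropriate settings if it exists or if requested.
- Parameters:
with_reducer (bool) – Whether to configure the messages field with a reducer
force_add (bool) – Whether to add the messages field even if it does not already exist
- Returns:
Self for chaining
- Return type:
SchemaComposer
- classmethod create_message_state(additional_fields=None, name='MessageState')
Create a schema with messages field and additional fields.
- Parameters:
additional_fields – Optional additional fields to include alongside the messages field
name (str) – Name for the generated schema class
- Returns:
StateSchema subclass with messages field
- Return type:
type[haive.core.schema.state_schema.StateSchema]
- classmethod create_state_from_io_schemas(input_schema, output_schema, name='ComposedStateSchema')
Create a state schema that combines input and output schemas.
- Parameters:
input_schema – Schema that supplies the input fields
output_schema – Schema that supplies the output fields
name (str) – Name for the generated schema class
- Returns:
StateSchema subclass that inherits from both input and output schemas
- Return type:
type[haive.core.schema.state_schema.StateSchema]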
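Since create_state_from_io_schemas is described as returning a class that inherits from both schemas, the idea can be illustrated with multiple inheritance on plain dataclasses. This is a sketch under stated assumptions: `QueryInput`, `AnswerOutput`, and `combine` are hypothetical names, and dataclasses stand in for the Pydantic-based schemas that Haive actually uses.

```python
from dataclasses import dataclass, field, fields
from typing import List, Optional

@dataclass
class QueryInput:
    query: str = ""

@dataclass
class AnswerOutput:
    answer: Optional[str] = None
    sources: List[str] = field(default_factory=list)

def combine(input_schema: type, output_schema: type, name: str = "ComposedStateSchema") -> type:
    # Create a new class that inherits from both schemas, then let dataclass
    # regenerate __init__ so it accepts the fields of both parents.
    combined = type(name, (input_schema, output_schema), {})
    return dataclass(combined)

State = combine(QueryInput, AnswerOutput)
state = State(query="What is a reducer?")
print(state)
```

Keyword arguments are used when instantiating because the order of inherited fields follows the method resolution order rather than the order of the arguments to `combine`.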
- extract_tool_schemas(tools)[source]¶
Extract input and output schemas from tools with improved parsing detection.
- Parameters:
tools (list[Any]) – List of tools to analyze
- Return type:
None
- classmethod from_components(components, name='ComposedSchema', base_state_schema=None)
Create and build a StateSchema directly from a list of components.
This convenience class method provides a simplified, one-step approach to schema creation from components. It creates a SchemaComposer instance, processes all components to extract fields, ensures standard fields are present, and builds the final StateSchema in a single operation.
This is the recommended entry point for most schema composition needs, as it handles all the details of schema composition in a single method call. It’s particularly useful when you want to quickly create a schema from existing components without detailed customization.
- Parameters:
components (list[Any]) – List of components to extract fields from, which can include:
- Engine instances (with engine_type attribute)
- Pydantic BaseModel instances or classes
- Dictionaries of field definitions
- Other component types with field information
name (str) – Name for the generated schema class
base_state_schema (type[haive.core.schema.state_schema.StateSchema] | None) – Optional custom base state schema to use. If not provided, the composer will auto-detect the appropriate base class.
- Returns:
A fully constructed StateSchema subclass ready for instantiation
- Return type:
type[haive.core.schema.state_schema.StateSchema]
Examples
    # Create a schema from components in one step
    # (llm_engine, retriever_engine and memory_component are assumed to be defined elsewhere)
    ConversationState = SchemaComposer.from_components(
        [llm_engine, retriever_engine, memory_component],
        name="ConversationState",
    )

    # Use the schema
    state = ConversationState()

    # With custom base schema for token tracking
    from haive.core.schema.prebuilt import MessagesStateWithTokenUsage

    TokenAwareState = SchemaComposer.from_components(
        [llm_engine],
        name="TokenAwareState",
        base_state_schema=MessagesStateWithTokenUsage,
    )
Note
This method automatically detects which base class to use (StateSchema, MessagesStateWithTokenUsage, or ToolState) based on the components provided, ensuring the schema has the appropriate functionality for the detected requirements. When messages are detected, it now uses MessagesStateWithTokenUsage by default for better token tracking.
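The base-class detection mentioned in the note can be pictured as a simple decision over the detected field names. The sketch below is illustrative only: `pick_base_name` is a hypothetical helper, it returns class names as strings instead of real classes, and the precedence (a tools field wins over a messages field) is an assumption that the documentation above does not confirm.

```python
from typing import Iterable

def pick_base_name(field_names: Iterable[str]) -> str:
    """Hypothetical selection rule; the precedence between tools and messages is assumed."""
    names = set(field_names)
    if "tools" in names:
        return "ToolState"
    if "messages" in names:
        return "MessagesStateWithTokenUsage"
    return "StateSchema"

print(pick_base_name(["messages", "context"]))
print(pick_base_name(["messages", "tools"]))
print(pick_base_name(["context"]))
```

Passing base_state_schema explicitly, as in the second example above, bypasses this kind of auto-detection.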
- get_engine_union_type()
Get a Union type of all concrete engine types.
- Return type:
Any | None
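Building a Union type from a runtime collection of classes is possible with `typing.Union`, which accepts a tuple of types as its subscript. The sketch below shows that technique in isolation. `union_of` and the two engine classes are hypothetical names, and returning None for an empty input simply mirrors the `Any | None` return type listed above.

```python
from typing import Any, Optional, Sequence, Union

class LLMEngine:
    """Hypothetical concrete engine type."""

class RetrieverEngine:
    """Hypothetical concrete engine type."""

def union_of(types: Sequence[type]) -> Optional[Any]:
    # Drop duplicates while keeping the original order.
    unique = tuple(dict.fromkeys(types))
    if not unique:
        return None
    # Subscripting Union with a tuple is equivalent to Union[A, B, ...].
    return Union[unique]

engine_union = union_of([LLMEngine, RetrieverEngine, LLMEngine])
print(engine_union)
```

A union with a single member collapses to that member, so `union_of([LLMEngine])` is just `LLMEngine`.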
- mark_as_input_field(field_name, engine_name)
Mark a field as an input field for a specific engine.
- Parameters:
field_name (str) – Name of the field to mark
engine_name (str) – Name of the engine that takes the field as input
- Returns:
Self for chaining
- Return type:
SchemaComposer
- mark_as_output_field(field_name, engine_name)
Mark a field as an output field for a specific engine.
- Parameters:
field_name (str) – Name of the field to mark
engine_name (str) – Name of the engine that produces the field as output
- Returns:
Self for chaining
- Return type:
SchemaComposer
- classmethod merge(first, second, name='MergedSchema')
Merge two SchemaComposer instances.
- Parameters:
first (SchemaComposer) – First composer
second (SchemaComposer) – Second composer
name (str) – Name for merged composer
- Returns:
New merged SchemaComposer
- Return type:
SchemaComposer
- to_manager()
Convert to a StateSchemaManager for further manipulation.
- Returns:
StateSchemaManager instance
- Return type:
StateSchemaManager