haive.core.schema.schema_composer

SchemaComposer for the Haive Schema System.

This module provides the SchemaComposer class, which offers a streamlined API for building state schemas dynamically from various components, enabling the creation of complex state schemas by combining fields from multiple sources.

The SchemaComposer is particularly useful for:

  • Building schemas from heterogeneous components (engines, models, dictionaries)
  • Dynamically creating schemas at runtime based on available components
  • Composing schemas with proper field sharing, reducers, and engine I/O mappings
  • Ensuring consistent state handling across complex agent architectures

Key features include:

  • Automatic field extraction from components
  • Field definition management with comprehensive metadata
  • Support for shared fields between parent and child graphs
  • Tracking of engine input/output relationships
  • Integration with structured output models
  • Rich visualization for debugging and analysis

Examples

from haive.core.schema import SchemaComposer
from typing import List
from langchain_core.messages import BaseMessage
from pydantic import Field
import operator

# Create a new composer
composer = SchemaComposer(name="ConversationState")

# Add fields manually
composer.add_field(
    name="messages",
    field_type=List[BaseMessage],
    default_factory=list,
    description="Conversation history",
    shared=True,
    reducer="add_messages",
)

composer.add_field(
    name="context",
    field_type=List[str],
    default_factory=list,
    description="Retrieved document contexts",
    reducer=operator.add,
)

# Extract fields from components
composer.add_fields_from_components([
    retriever_engine,
    llm_engine,
    memory_component,
])

# Build the schema
ConversationState = composer.build()

# Use the schema
state = ConversationState()

Classes

SchemaComposer

Utility for building state schemas dynamically from component fields.

Module Contents

class haive.core.schema.schema_composer.SchemaComposer(name='ComposedSchema', base_state_schema=None)[source]

Utility for building state schemas dynamically from component fields.

The SchemaComposer provides a high-level, builder-style API for creating state schemas by combining fields from various sources. It handles the complex details of field extraction, metadata management, and schema generation, providing a streamlined interface for schema composition.

Key capabilities include:

  • Dynamically extracting fields from components (engines, models, dictionaries)
  • Adding and configuring fields individually with comprehensive options
  • Tracking field relationships, shared status, and reducer functions
  • Managing engine I/O mappings for proper state handling
  • Building optimized schema classes with the right configuration
  • Supporting nested state schemas and structured output models
  • Providing rich visualization for debugging and analysis

This class is the primary builder interface for dynamic schema creation in the Haive Schema System, offering a more declarative approach than StateSchemaManager. It’s particularly useful for creating schemas at runtime based on available components, ensuring consistent state handling across complex agent architectures.

SchemaComposer is designed to be used either directly or through class methods like from_components() for simplified schema creation from a list of components.

Initialize a new SchemaComposer.

Parameters:
  • name (str) – The name for the composed schema class. Defaults to “ComposedSchema”.

  • base_state_schema (type[haive.core.schema.state_schema.StateSchema] | None) – Optional custom base state schema to use. If not provided, the composer will auto-detect the appropriate base class.

Examples

Creating a schema composer for a conversational agent:

composer = SchemaComposer(name="ConversationState")
composer.add_field("messages", List[BaseMessage], default_factory=list)
schema_class = composer.build()

Using a custom base schema:

from haive.core.schema.prebuilt import MessagesStateWithTokenUsage
composer = SchemaComposer(
    name="TokenAwareState",
    base_state_schema=MessagesStateWithTokenUsage
)
add_engine(engine)[source]

Add an engine to the composer for tracking and later updates.

Parameters:

engine (Any) – Engine to add

Returns:

Self for chaining

Return type:

SchemaComposer

add_engine_management()[source]

Add standardized engine management fields to the schema.

This method adds the new engine management pattern, which supports:

  • An optional 'engine' field for the primary/main engine
  • An explicit 'engines' dict field (previously implicit)
  • Automatic synchronization between the two

This is part of the schema simplification effort to provide clearer patterns for engine management while maintaining backward compatibility.

Returns:

Self for chaining

Return type:

SchemaComposer
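The synchronization between the 'engine' and 'engines' fields can be illustrated with a plain-Python sketch. The field names follow the docstring above, but the exact synchronization rule (including the "main" key) is an assumption for illustration, not haive's actual implementation:

```python
# Illustrative sketch of the engine/engines synchronization pattern.
# The "main" key and the promotion rules are assumptions.

class EngineState:
    def __init__(self, engine=None, engines=None):
        self.engine = engine                 # optional primary engine
        self.engines = dict(engines or {})   # explicit name -> engine map
        self._sync()

    def _sync(self):
        # A primary engine that is missing from the dict gets registered.
        if self.engine is not None and "main" not in self.engines:
            self.engines["main"] = self.engine
        # With no primary engine set, promote the 'main' entry.
        if self.engine is None and "main" in self.engines:
            self.engine = self.engines["main"]

state = EngineState(engine="llm_engine")
```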

add_field(name, field_type, default=None, default_factory=None, description=None, shared=False, reducer=None, source=None, input_for=None, output_from=None)[source]

Add a field definition to the schema.

This method adds a field to the schema being composed, with comprehensive configuration for type, default values, sharing behavior, reducer functions, and engine I/O relationships. It handles special cases like fields provided by the base class and nested StateSchema fields.

The method performs validation on the field type and ensures proper tracking of metadata for schema generation. It’s the core building block for schema composition, allowing fine-grained control over field properties.

Parameters:
  • name (str) – Field name

  • field_type (type) – Type of the field (e.g., str, List[int], Optional[Dict[str, Any]])

  • default (Any) – Default value for the field

  • default_factory (collections.abc.Callable[[], Any] | None) – Optional factory function for creating default values

  • description (str | None) – Optional field description for documentation

  • shared (bool) – Whether field is shared with parent graph (enables state synchronization)

  • reducer (collections.abc.Callable | None) – Optional reducer function for merging field values during state updates

  • source (str | None) – Optional source identifier (component or module name)

  • input_for (list[str] | None) – Optional list of engines this field is input for

  • output_from (list[str] | None) – Optional list of engines this field is output from

Returns:

Self for method chaining to enable fluent API style

Return type:

SchemaComposer

Examples

composer = SchemaComposer(name="MyState")
composer.add_field(
    name="messages",
    field_type=List[BaseMessage],
    default_factory=list,
    description="Conversation history",
    shared=True,
    reducer=add_messages,
    input_for=["memory_engine"],
    output_from=["llm_engine"],
)
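The role of a reducer can be shown independently of the framework: when a state update carries a value for a field that already has one, the reducer decides how the two merge (here `operator.add` on lists, as in the module-level example; the merge loop itself is a simplified sketch, not haive's update machinery):

```python
import operator

# Simplified sketch of reducer-based state merging: fields with a
# reducer are combined, fields without one are overwritten.
reducers = {"context": operator.add}

def merge_states(old, update):
    merged = dict(old)
    for key, value in update.items():
        reducer = reducers.get(key)
        if reducer is not None and key in merged:
            merged[key] = reducer(merged[key], value)
        else:
            merged[key] = value
    return merged

old = {"context": ["doc1"], "query": "first"}
new = merge_states(old, {"context": ["doc2"], "query": "second"})
# "context" accumulates via the reducer; "query" is simply replaced.
```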

add_fields_from_components(components)[source]

Add fields from multiple components to the schema.

This method intelligently processes a list of heterogeneous components, automatically detecting their types and extracting fields using the appropriate extraction strategy. It supports engines, Pydantic models, dictionaries, and other component types, providing a unified interface for schema composition from diverse sources.

The method first detects base class requirements (such as the need for messages or tools fields) and then processes each component individually, delegating to specialized field extraction methods based on component type. After processing all components, it ensures standard fields are present and properly configured.

Parameters:

components (list[Any]) – List of components to extract fields from, which can include:

  • Engine instances (with an engine_type attribute)
  • Pydantic BaseModel instances or classes
  • Dictionaries of field definitions
  • Other component types with field information

Returns:

Self for method chaining to enable fluent API style

Return type:

SchemaComposer

Examples

# Create a schema from multiple components
composer = SchemaComposer(name="AgentState")
composer.add_fields_from_components([
    llm_engine,        # Engine instance
    retriever_engine,  # Engine instance
    MemoryConfig,      # Pydantic model class
    {"context": (List[str], list, {"description": "Retrieved documents"})},
])

Note

This is one of the most powerful methods in SchemaComposer, as it can automatically build a complete schema from a list of components without requiring manual field definition. It’s particularly useful for dynamic composition of schemas at runtime.

add_fields_from_dict(fields_dict)[source]

Add fields from a dictionary definition.

Parameters:

fields_dict (dict[str, Any]) – Dictionary mapping field names to type/value information

Returns:

Self for chaining

Return type:

SchemaComposer
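One plausible shape for fields_dict follows the (type, default, metadata) tuple form used in this page's examples. The sketch below shows how such a dictionary might be normalized into per-field definitions; the tuple layout and the normalized structure are assumptions for illustration:

```python
from typing import List

# Sketch: normalize a fields_dict in (type, default, metadata) tuple
# form into plain per-field definition dicts. The layout is an
# assumption based on this page's examples.
def normalize_fields(fields_dict):
    normalized = {}
    for name, spec in fields_dict.items():
        if isinstance(spec, tuple):
            field_type = spec[0]
            default = spec[1] if len(spec) > 1 else None
            meta = spec[2] if len(spec) > 2 else {}
        else:
            # A bare type is also accepted as shorthand.
            field_type, default, meta = spec, None, {}
        normalized[name] = {"type": field_type, "default": default, **meta}
    return normalized

fields = normalize_fields({
    "context": (List[str], list, {"description": "Retrieved documents"}),
    "query": str,
})
```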

add_fields_from_engine(engine)[source]

Extract fields from an Engine object with enhanced nested schema handling.

Parameters:

engine (Any) – Engine to extract fields from

Returns:

Self for chaining

Return type:

SchemaComposer

add_fields_from_model(model)[source]

Extract fields from a Pydantic model with improved handling of nested schemas.

Parameters:

model (type[pydantic.BaseModel]) – Pydantic model to extract fields from

Returns:

Self for chaining

Return type:

SchemaComposer

add_standard_field(field_name, **kwargs)[source]

Add a standard field from the field registry.

Parameters:
  • field_name (str) – Name of the standard field (e.g., 'messages', 'context')

  • **kwargs – Additional arguments to pass to the field factory

Returns:

Self for chaining

Return type:

SchemaComposer

build()[source]

Build and return a StateSchema class with all defined fields and metadata.

This method finalizes the schema composition process by generating a concrete StateSchema subclass with the appropriate base class (determined by detected requirements) and all the fields, metadata, and behaviors defined during composition. It performs comprehensive setup of the schema class, including:

  1. Field generation with proper types, defaults, and metadata

  2. Configuration of shared fields for parent-child graph relationships

  3. Setup of reducer functions for state merging

  4. Engine I/O tracking for proper state routing

  5. Structured output model integration

  6. Schema post-initialization for nested fields, dictionaries, and engine tool synchronization

  7. Rich visualization for debugging (when debug logging is enabled)

The generated schema is a fully functional Pydantic model subclass that can be instantiated directly or used as a state schema in a LangGraph workflow.

Engine Tool Synchronization:

This method stores engines directly on the schema class and implements an enhanced model_post_init that ensures:

  1. Class-level engines are made available on instances

  2. For ToolState subclasses, tools from class-level engines are automatically synced to the instance’s tools list

This functionality bridges the gap between class-level engine storage and instance-level tool management, ensuring that tools from engines stored by SchemaComposer are properly synchronized with ToolState instances.

Returns:

A StateSchema subclass with all defined fields, metadata, and behaviors

Return type:

type[haive.core.schema.state_schema.StateSchema]
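Conceptually, build() turns the accumulated field definitions into a concrete class. A minimal stdlib analogue of that dynamic-class step is shown below; haive actually generates a Pydantic StateSchema subclass, not a dataclass, so this only illustrates the idea:

```python
from dataclasses import make_dataclass, field

# Field definitions accumulated during composition: (name, type, default).
field_defs = [
    ("messages", list, field(default_factory=list)),
    ("query", str, ""),
]

# Dynamically generate a concrete class from the definitions,
# loosely mirroring what build() does for a StateSchema subclass.
ConversationState = make_dataclass("ConversationState", field_defs)

state = ConversationState()
```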

build_with_engine_generics(name=None)[source]

Build a StateSchema with resolved engine generics.

Parameters:

name (str | None) – Optional name for the schema class

Returns:

StateSchema class with concrete engine types

Return type:

type[haive.core.schema.state_schema.StateSchema]

classmethod compose_input_schema(components, name='InputSchema')[source]

Create an input schema from components, focusing on input fields.

Parameters:
  • components (list[Any]) – List of components to extract fields from

  • name (str) – Name for the schema

Returns:

BaseModel subclass optimized for input

Return type:

type[pydantic.BaseModel]
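Focusing on input fields can be sketched with the input_for/output_from tracking described for add_field(); the field structure below is illustrative, not haive's internal representation:

```python
# Sketch: select the input-only subset of composed fields, using the
# input_for/output_from bookkeeping described on this page.
fields = {
    "query":    {"type": str,  "input_for": ["llm_engine"], "output_from": []},
    "answer":   {"type": str,  "input_for": [],             "output_from": ["llm_engine"]},
    "messages": {"type": list, "input_for": ["llm_engine"], "output_from": ["llm_engine"]},
}

def input_fields(fields):
    # Keep every field that feeds at least one engine.
    return {name: spec for name, spec in fields.items() if spec["input_for"]}

inputs = input_fields(fields)
```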

classmethod compose_output_schema(components, name='OutputSchema')[source]

Create an output schema from components, focusing on output fields.

Parameters:
  • components (list[Any]) – List of components to extract fields from

  • name (str) – Name for the schema

Returns:

BaseModel subclass optimized for output

Return type:

type[pydantic.BaseModel]

compose_state_from_io(input_schema, output_schema)[source]

Compose a state schema from input and output schemas using this composer.

Parameters:
  • input_schema (type[pydantic.BaseModel]) – Input schema class

  • output_schema (type[pydantic.BaseModel]) – Output schema class

Returns:

StateSchema subclass

Return type:

type[haive.core.schema.state_schema.StateSchema]

configure_messages_field(with_reducer=True, force_add=False)[source]

Configure a messages field with appropriate settings if it exists or if requested.

Parameters:
  • with_reducer (bool) – Whether to add a reducer for the messages field

  • force_add (bool) – Whether to add the messages field if it doesn’t exist

Returns:

Self for chaining

Return type:

SchemaComposer

classmethod create_message_state(additional_fields=None, name='MessageState')[source]

Create a schema with messages field and additional fields.

Parameters:
  • additional_fields (dict[str, Any] | None) – Optional dictionary of additional fields to add

  • name (str) – Name for the schema

Returns:

StateSchema subclass with messages field

Return type:

type[haive.core.schema.state_schema.StateSchema]

classmethod create_state_from_io_schemas(input_schema, output_schema, name='ComposedStateSchema')[source]

Create a state schema that combines input and output schemas.

Parameters:
  • input_schema (type[pydantic.BaseModel]) – Input schema class

  • output_schema (type[pydantic.BaseModel]) – Output schema class

  • name (str) – Name for the composed schema

Returns:

StateSchema subclass that inherits from both input and output schemas

Return type:

type[haive.core.schema.state_schema.StateSchema]
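The "inherits from both input and output schemas" behavior can be sketched with plain classes standing in for Pydantic models; the composed class picks up fields from both bases:

```python
# Sketch: combine input and output schemas by inheritance, as the
# docstring describes. Plain classes stand in for Pydantic models.
class InputSchema:
    query: str = ""

class OutputSchema:
    answer: str = ""

# The composed state inherits fields from both schemas.
ComposedStateSchema = type("ComposedStateSchema", (InputSchema, OutputSchema), {})

state = ComposedStateSchema()
```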

extract_tool_schemas(tools)[source]

Extract input and output schemas from tools with improved parsing detection.

Parameters:

tools (list[Any]) – List of tools to analyze

Return type:

None

classmethod from_components(components, name='ComposedSchema', base_state_schema=None)[source]

Create and build a StateSchema directly from a list of components.

This convenience class method provides a simplified, one-step approach to schema creation from components. It creates a SchemaComposer instance, processes all components to extract fields, ensures standard fields are present, and builds the final StateSchema in a single operation.

This is the recommended entry point for most schema composition needs, as it handles all the details of schema composition in a single method call. It’s particularly useful when you want to quickly create a schema from existing components without detailed customization.

Parameters:
  • components (list[Any]) – List of components to extract fields from, which can include:

      • Engine instances (with an engine_type attribute)
      • Pydantic BaseModel instances or classes
      • Dictionaries of field definitions
      • Other component types with field information

  • name (str) – Name for the generated schema class

  • base_state_schema (type[haive.core.schema.state_schema.StateSchema] | None) – Optional custom base state schema to use. If not provided, the composer will auto-detect the appropriate base class.

Returns:

A fully constructed StateSchema subclass ready for instantiation

Return type:

type[haive.core.schema.state_schema.StateSchema]

Examples

# Create a schema from components in one step
ConversationState = SchemaComposer.from_components(
    [llm_engine, retriever_engine, memory_component],
    name="ConversationState",
)

# Use the schema
state = ConversationState()

# With a custom base schema for token tracking
from haive.core.schema.prebuilt import MessagesStateWithTokenUsage
TokenAwareState = SchemaComposer.from_components(
    [llm_engine],
    name="TokenAwareState",
    base_state_schema=MessagesStateWithTokenUsage,
)

Note

This method automatically detects which base class to use (StateSchema, MessagesStateWithTokenUsage, or ToolState) based on the components provided, ensuring the schema has the appropriate functionality for the detected requirements. When messages are detected, it now uses MessagesStateWithTokenUsage by default for better token tracking.

get_engine_union_type()[source]

Get a Union type of all concrete engine types.

Return type:

Any | None

get_engines_by_type(engine_type)[source]

Get all engines of a specific type.

Parameters:

engine_type (str) – Type of engines to retrieve

Returns:

List of engines of the specified type

Return type:

list[Any]

mark_as_input_field(field_name, engine_name)[source]

Mark a field as input field for a specific engine.

Parameters:
  • field_name (str) – Name of the field

  • engine_name (str) – Name of the engine

Returns:

Self for chaining

Return type:

SchemaComposer

mark_as_output_field(field_name, engine_name)[source]

Mark a field as output field for a specific engine.

Parameters:
  • field_name (str) – Name of the field

  • engine_name (str) – Name of the engine

Returns:

Self for chaining

Return type:

SchemaComposer

classmethod merge(first, second, name='MergedSchema')[source]

Merge two SchemaComposer instances.

Parameters:
  • first (SchemaComposer) – First composer to merge

  • second (SchemaComposer) – Second composer to merge

  • name (str) – Name for the merged schema. Defaults to "MergedSchema".

Returns:

New merged SchemaComposer

Return type:

SchemaComposer
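A plausible merge rule over the two composers' field-definition maps is shown below. Which composer wins on name collisions is an assumption here (the sketch lets the second override the first); the docstring does not specify precedence:

```python
# Sketch: merge two composers' field-definition maps. Whether the
# first or the second composer wins on collisions is an assumption.
first_fields = {
    "messages": {"type": list},
    "query": {"type": str},
}
second_fields = {
    "query": {"type": str, "description": "User query"},
    "answer": {"type": str},
}

def merge_fields(first, second):
    merged = dict(first)
    merged.update(second)  # second overrides on name collisions
    return merged

merged = merge_fields(first_fields, second_fields)
```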

resolve_engine_types()[source]

Resolve engine types from added engines for generic typing.

Returns:

Dictionary mapping engine names to their concrete types

Return type:

dict[str, type]

show_engines()[source]

Display a summary of all registered engines.

Return type:

None

to_manager()[source]

Convert to a StateSchemaManager for further manipulation.

Returns:

StateSchemaManager instance

Return type:

haive.core.schema.schema_manager.StateSchemaManager

update_engine_provider(engine_type, updates)[source]

Update configuration for all engines of a specific type.

Parameters:
  • engine_type (str) – Type of engines to update (e.g., "llm", "retriever")

  • updates (dict[str, Any]) – Dictionary of updates to apply

Returns:

Self for chaining

Return type:

SchemaComposer
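The update-by-type behavior can be sketched as follows; engines are plain dicts here for illustration, whereas real haive engines are objects with typed configuration:

```python
# Sketch: apply a dict of updates to every engine of a given type.
engines = [
    {"name": "primary_llm", "engine_type": "llm", "model": "gpt-4o"},
    {"name": "reranker", "engine_type": "retriever", "top_k": 5},
    {"name": "fallback_llm", "engine_type": "llm", "model": "gpt-4o-mini"},
]

def update_engine_provider(engines, engine_type, updates):
    for engine in engines:
        if engine["engine_type"] == engine_type:
            engine.update(updates)
    return engines

# Set temperature on all LLM engines; the retriever is untouched.
update_engine_provider(engines, "llm", {"temperature": 0.0})
```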