haive.core.schema.prebuilt.query_state

Query State Schema for Advanced RAG and Document Processing.

This module provides comprehensive query state management for advanced RAG workflows, document processing, and multi-query scenarios. It builds on top of MessagesState and DocumentState to provide a unified query processing interface.

The QueryState enables:

- Multi-query processing and refinement
- Query expansion and optimization
- Retrieval strategy management
- Context tracking and memory
- Source citation and provenance
- Time-weighted and filtered queries
- Self-query and adaptive retrieval
- Query result caching and optimization

Examples

Basic query processing:

from haive.core.schema.prebuilt.query_state import QueryState

state = QueryState(
    original_query="What are the latest trends in AI?",
    query_type="research",
    retrieval_strategy="adaptive"
)

Advanced multi-query workflow:

state = QueryState(
    original_query="Analyze Q4 2024 financial performance",
    refined_queries=[
        "Q4 2024 revenue growth analysis",
        "Fourth quarter 2024 profit margins",
        "2024 Q4 market performance comparison"
    ],
    query_expansion_enabled=True,
    time_weighted_retrieval=True,
    source_filters=["financial_reports", "earnings_calls"]
)
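
The `time_weighted_retrieval` flag above implies that document recency influences ranking, but this page does not say how QueryState applies it. As a rough illustration only, the sketch below uses one common formulation (semantic similarity plus an exponentially decaying recency bonus, the approach taken by LangChain's time-weighted retriever). The helper name `time_weighted_score` and the `decay_rate` default are assumptions, not part of the Haive API.

```python
from datetime import datetime, timedelta


def time_weighted_score(
    similarity: float,
    last_accessed: datetime,
    now: datetime,
    decay_rate: float = 0.01,
) -> float:
    """Combine semantic similarity with an exponential recency bonus.

    Hypothetical helper for illustration; not a haive function.
    """
    hours_passed = (now - last_accessed).total_seconds() / 3600.0
    # The recency term decays from 1.0 toward 0.0 as the document ages.
    return similarity + (1.0 - decay_rate) ** hours_passed


now = datetime(2025, 1, 1, 12, 0, 0)
# A slightly less similar but fresh document ...
fresh = time_weighted_score(0.70, now - timedelta(hours=1), now)
# ... can outrank a more similar document that is a month old.
stale = time_weighted_score(0.75, now - timedelta(days=30), now)
print(fresh > stale)  # True
```

A larger `decay_rate` makes older documents fade faster; a value near zero makes the ranking depend almost entirely on similarity.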

Self-query with structured output:

from haive.core.schema.prebuilt.query_state import QueryState, QueryType, RetrievalStrategy

state = QueryState(
    original_query="Find all documents about machine learning published after 2023",
    query_type=QueryType.STRUCTURED,
    retrieval_strategy=RetrievalStrategy.SELF_QUERY,
    structured_query_enabled=True,
    metadata_filters={"year": {"$gt": 2023}, "topic": "machine_learning"}
)
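
The `metadata_filters` value above uses Mongo-style comparison operators. How Haive or the underlying vector store evaluates them is not described on this page, so the following is only a minimal sketch of the semantics such a filter usually has. The function `matches_filters` is hypothetical, and every operator other than `$gt` and `$gte` (the only two that appear in this module's examples) is an assumption.

```python
from typing import Any

# Comparison operators in the Mongo-like style used by the example above.
# Only "$gt" and "$gte" appear in the documented examples; the rest are assumed.
_OPERATORS = {
    "$gt": lambda value, target: value > target,
    "$gte": lambda value, target: value >= target,
    "$lt": lambda value, target: value < target,
    "$lte": lambda value, target: value <= target,
    "$eq": lambda value, target: value == target,
    "$ne": lambda value, target: value != target,
}


def matches_filters(metadata: dict[str, Any], filters: dict[str, Any]) -> bool:
    """Return True if the metadata satisfies every filter condition."""
    for field, condition in filters.items():
        if field not in metadata:
            return False
        value = metadata[field]
        if isinstance(condition, dict):
            # Operator form, e.g. {"$gt": 2023}; all operators must hold.
            for op, target in condition.items():
                if op not in _OPERATORS:
                    raise ValueError(f"Unsupported operator: {op}")
                if not _OPERATORS[op](value, target):
                    return False
        elif value != condition:
            # Plain value form, e.g. "topic": "machine_learning".
            return False
    return True


filters = {"year": {"$gt": 2023}, "topic": "machine_learning"}
docs = [
    {"year": 2024, "topic": "machine_learning"},
    {"year": 2023, "topic": "machine_learning"},
    {"year": 2025, "topic": "databases"},
]
matching = [doc for doc in docs if matches_filters(doc, filters)]
print(matching)  # [{'year': 2024, 'topic': 'machine_learning'}]
```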

Author: Claude (Haive AI Agent Framework)
Version: 1.0.0

Classes

`QueryComplexity`	Query complexity levels for processing optimization.
`QueryIntent`	Intent classification for query processing.
`QueryMetrics`	Metrics and analytics for query processing.
`QueryProcessingConfig`	Configuration for query processing behavior.
`QueryResult`	Result container for query processing.
`QueryState`	Comprehensive query state for advanced RAG and document processing.
`QueryType`	Types of queries supported by the query processing system.
`RetrievalStrategy`	Retrieval strategies for query processing.

Module Contents

class haive.core.schema.prebuilt.query_state.QueryComplexity

Bases: str, enum.Enum

Query complexity levels for processing optimization.

class haive.core.schema.prebuilt.query_state.QueryIntent

Bases: str, enum.Enum

Intent classification for query processing.

class haive.core.schema.prebuilt.query_state.QueryMetrics(/, **data)

Bases: pydantic.BaseModel

Metrics and analytics for query processing.

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

Parameters: data (Any)

class haive.core.schema.prebuilt.query_state.QueryProcessingConfig(/, **data)

Bases: pydantic.BaseModel

Configuration for query processing behavior.

Parameters: data (Any)

class haive.core.schema.prebuilt.query_state.QueryResult(/, **data)

Bases: pydantic.BaseModel

Result container for query processing.

Parameters: data (Any)

class haive.core.schema.prebuilt.query_state.QueryState(/, **data)

Bases: haive.core.schema.prebuilt.messages_state.MessagesState, haive.core.schema.prebuilt.document_state.DocumentState

State schema for conversation management with LangChain integration.

MessagesState is a specialized StateSchema that provides comprehensive message handling capabilities for conversational AI agents. It extends the base StateSchema with specific functionality for working with LangChain message types, message filtering, and conversation management.

This schema serves as the foundation for conversation-based agent states in the Haive framework, providing seamless integration with LangGraph for agent workflows. It includes built-in support for all standard message types (Human, AI, System, Tool) and handles message conversion, ordering, and serialization.

Key features include:

- Automatic message conversion between different formats (dict/object)
- System message handling with proper ordering enforcement
- Message filtering by type, content, or custom criteria
- Token counting and length estimation for context management
- Conversation history manipulation (truncation, filtering, etc.)
- LangGraph integration with proper message reducers
- Conversion to formats required by different LLM providers
- Conversation round tracking and analysis
- Tool call deduplication and error handling
- Message transformation utilities

Note: For token usage tracking, use MessagesStateWithTokenUsage instead.

The messages field is automatically shared with parent/child graphs and configured with the appropriate reducer function for merging message lists during state updates.

This class is commonly used as a base class for more specialized agent states that need conversation capabilities, and is the default base class used by SchemaComposer when message handling is detected in the components being composed.

Parameters: data (Any)

class haive.core.schema.prebuilt.query_state.QueryState(/, **data)

Bases: QueryState

Comprehensive query state for advanced RAG and document processing.

This state schema combines messages, documents, and query-specific information to provide a complete context for query processing workflows. It supports multi-query scenarios, retrieval strategies, and advanced RAG features.

The state includes:

- Query processing and refinement
- Document context and retrieval
- Multi-query coordination
- Retrieval strategy management
- Results and metrics tracking
- Source citation and provenance
- Time-weighted and filtered queries
- Adaptive and self-query capabilities

Examples

Basic query state:

from haive.core.schema.prebuilt.query_state import QueryState, QueryType, RetrievalStrategy

state = QueryState(
    original_query="What is quantum computing?",
    query_type=QueryType.SIMPLE,
    retrieval_strategy=RetrievalStrategy.BASIC
)

Advanced research query:

state = QueryState(
    original_query="Analyze the impact of AI on healthcare",
    query_type=QueryType.RESEARCH,
    retrieval_strategy=RetrievalStrategy.ADAPTIVE,
    query_expansion_enabled=True,
    time_weighted_retrieval=True,
    source_filters=["medical_journals", "clinical_trials"],
    metadata_filters={"publication_year": {"$gte": 2020}}
)

Multi-query workflow:

state = QueryState(
    original_query="Compare Q3 vs Q4 2024 performance",
    refined_queries=[
        "Q3 2024 financial results analysis",
        "Q4 2024 earnings report summary",
        "Q3 Q4 2024 performance comparison"
    ],
    query_type=QueryType.COMPARISON,
    retrieval_strategy=RetrievalStrategy.MULTI_QUERY
)

Parameters: data (Any)

class Config

Pydantic configuration for the DocumentState schema.

arbitrary_types_allowed

Allows Pydantic to handle arbitrary types, which is useful for complex data structures like langchain_core.documents.Document.

Type: bool

add_citation(citation)

Add a citation to the state.

Parameters: citation (dict[str, Any])
Return type: None

add_context_document(document)

Add a context document to the state.

Parameters: document (langchain_core.documents.Document)
Return type: None

add_error(error, context=None)

Add an error to the history.

Parameters:
    error (str)
    context (dict[str, Any] | None)
Return type: None

add_expanded_query(query)

Add an expanded query to the list.

Parameters: query (str)
Return type: None

add_intermediate_result(result)

Add an intermediate result to tracking.

Parameters: result (dict[str, Any])
Return type: None

add_query_variation(query)

Add a query variation to the list.

Parameters: query (str)
Return type: None

add_refined_query(query)

Add a refined query to the list.

Parameters: query (str)
Return type: None

add_retrieved_document(document)

Add a retrieved document to the state.

Parameters: document (langchain_core.documents.Document)
Return type: None

create_cache_key()

Create a cache key for the current query state.

Return type: str
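
This page does not say which fields feed into the cache key or how it is encoded. One plausible approach, shown here only as a sketch, is to hash a canonical JSON rendering of the query, strategy, and filters so that equivalent states map to the same key. The function `make_cache_key`, its parameters, and the choice of SHA-256 are all assumptions for illustration.

```python
import hashlib
import json
from typing import Any


def make_cache_key(query: str, strategy: str, filters: dict[str, Any]) -> str:
    """Build a deterministic key from the parts of a query that affect results.

    Hypothetical helper; the real create_cache_key() may use other fields.
    """
    payload = {
        # Normalizing whitespace and case is an assumption of this sketch.
        "query": query.strip().lower(),
        "strategy": strategy,
        "filters": filters,
    }
    # sort_keys makes the JSON independent of dict insertion order.
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


key_a = make_cache_key(
    "What is quantum computing?",
    "basic",
    {"year": {"$gte": 2020}, "topic": "physics"},
)
key_b = make_cache_key(
    "  what is quantum computing?  ",
    "basic",
    {"topic": "physics", "year": {"$gte": 2020}},
)
key_c = make_cache_key("What is quantum computing?", "adaptive", {})
print(key_a == key_b)  # True: same query after normalization, same filters
print(key_a == key_c)  # False: different strategy and filters
```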

get_active_filters()

Get all active filters for the query.

Return type: dict[str, Any]
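
The shape of the returned dictionary is not documented here. As a sketch of what "active" could mean, the example below gathers only the filters that are actually set. The input names `source_filters` and `metadata_filters` come from the examples on this page; `time_range`, the function name, and the output keys are assumptions.

```python
from datetime import datetime
from typing import Any, Optional


def collect_active_filters(
    source_filters: list[str],
    metadata_filters: dict[str, Any],
    time_range: Optional[dict[str, datetime]] = None,
) -> dict[str, Any]:
    """Return only the filters that are set (hypothetical output layout)."""
    active: dict[str, Any] = {}
    if source_filters:
        active["sources"] = list(source_filters)
    if metadata_filters:
        active["metadata"] = dict(metadata_filters)
    if time_range:
        active["time_range"] = dict(time_range)
    return active


active = collect_active_filters(
    source_filters=["medical_journals", "clinical_trials"],
    metadata_filters={"publication_year": {"$gte": 2020}},
)
print(sorted(active))  # ['metadata', 'sources']
```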

get_all_documents()

Get all documents including raw, context, and retrieved.

Return type: list[langchain_core.documents.Document]

get_all_queries()

Get all queries including original, refined, and expanded.

Return type: list[str]
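
A minimal sketch of the combination described above, written as a standalone function. Whether the real method removes duplicates or preserves this particular order is not stated on this page; both behaviors are assumptions here, as is the function name.

```python
def all_queries(original: str, refined: list[str], expanded: list[str]) -> list[str]:
    """Merge queries in original, refined, expanded order, dropping repeats.

    Hypothetical stand-in for get_all_queries(); de-duplication is assumed.
    """
    seen: set[str] = set()
    ordered: list[str] = []
    for query in [original, *refined, *expanded]:
        if query not in seen:
            seen.add(query)
            ordered.append(query)
    return ordered


queries = all_queries(
    "Compare Q3 vs Q4 2024 performance",
    ["Q3 2024 financial results analysis", "Q4 2024 earnings report summary"],
    ["Q4 2024 earnings report summary", "Q3 Q4 2024 performance comparison"],
)
print(len(queries))  # 4: the repeated Q4 query is kept once
```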

get_confidence_score(source)

Get confidence score for a source.

Parameters: source (str)
Return type: float

get_processing_summary()

Get a summary of processing statistics.

Return type: dict[str, Any]

is_multi_query_workflow()

Check if this is a multi-query workflow.

Return type: bool

requires_structured_output()

Check if structured output is required.

Return type: bool

set_confidence_score(source, score)

Set confidence score for a source.

Parameters:
    source (str)
    score (float)
Return type: None
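
The signatures of `set_confidence_score` and `get_confidence_score` suggest a per-source mapping from name to score. The sketch below models that with a plain dictionary. The class `ConfidenceTracker`, the 0.0 to 1.0 range check, and the 0.0 default for unknown sources are assumptions; the real methods may behave differently.

```python
class ConfidenceTracker:
    """Hypothetical stand-in for the confidence-score part of QueryState."""

    def __init__(self) -> None:
        self._scores: dict[str, float] = {}

    def set_confidence_score(self, source: str, score: float) -> None:
        # Assumed convention: scores are probabilities in [0.0, 1.0].
        if not 0.0 <= score <= 1.0:
            raise ValueError(f"score must be between 0.0 and 1.0, got {score}")
        self._scores[source] = score

    def get_confidence_score(self, source: str) -> float:
        # Assumed default: an unscored source has zero confidence.
        return self._scores.get(source, 0.0)


tracker = ConfidenceTracker()
tracker.set_confidence_score("medical_journals", 0.9)
print(tracker.get_confidence_score("medical_journals"))  # 0.9
print(tracker.get_confidence_score("clinical_trials"))   # 0.0
```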

update_stage(stage)

Update the current processing stage.

Parameters: stage (str)
Return type: None

classmethod validate_original_query(v)

Validate that the original query is not empty.

Parameters: v (str)
Return type: str

classmethod validate_refined_queries(v)

Validate refined queries are not empty.

Parameters: v (list[str])
Return type: list[str]
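
The two query validators above take a value and return it, which is the usual shape of a Pydantic field validator. Their exact rules are not shown on this page. The plain-function sketch below assumes that whitespace is stripped and that an empty string raises `ValueError`; the real validators might instead drop empty entries or leave whitespace untouched.

```python
def validate_original_query(value: str) -> str:
    """Reject an empty or whitespace-only query (assumed behavior)."""
    cleaned = value.strip()
    if not cleaned:
        raise ValueError("original_query must not be empty")
    return cleaned


def validate_refined_queries(values: list[str]) -> list[str]:
    """Reject a list that contains an empty query (assumed behavior)."""
    cleaned = [value.strip() for value in values]
    if any(not value for value in cleaned):
        raise ValueError("refined_queries must not contain empty strings")
    return cleaned


print(validate_original_query("  What is quantum computing?  "))
print(validate_refined_queries(["Q3 2024 results ", " Q4 2024 results"]))
```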

classmethod validate_time_range(v)

Validate time range filter has valid start and end dates.

Parameters: v (dict[str, datetime.datetime] | None)
Return type: dict[str, datetime.datetime] | None
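
The type signature shows that the time range is an optional dictionary of datetimes, but the key names and the exact checks are not documented here. The sketch below assumes keys named `"start"` and `"end"` and treats a start later than the end as invalid; both are assumptions for illustration.

```python
from datetime import datetime
from typing import Optional


def validate_time_range(
    value: Optional[dict[str, datetime]],
) -> Optional[dict[str, datetime]]:
    """Check that a time range has both bounds in order (assumed rules)."""
    if value is None:
        # No time filter was requested.
        return None
    missing = {"start", "end"} - set(value)
    if missing:
        raise ValueError(f"time range is missing keys: {sorted(missing)}")
    if value["start"] > value["end"]:
        raise ValueError("time range start must not be after end")
    return value


q4_2024 = {"start": datetime(2024, 10, 1), "end": datetime(2024, 12, 31)}
print(validate_time_range(q4_2024) == q4_2024)  # True
print(validate_time_range(None))  # None
```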

class haive.core.schema.prebuilt.query_state.QueryType

Bases: str, enum.Enum

Types of queries supported by the query processing system.
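
Because the enums in this module derive from both `str` and `enum.Enum`, their members compare equal to plain strings and serialize to JSON without a custom encoder. That is consistent with the module examples passing `query_type="research"` in one place and `QueryType.RESEARCH` in another. The stand-in class below illustrates the behavior: the member names come from the examples on this page, but the lowercase values (other than `"research"`) are assumptions, and the real enum may define more members.

```python
import json
from enum import Enum


class QueryTypeSketch(str, Enum):
    """Stand-in for QueryType; values are assumed, not taken from haive."""

    SIMPLE = "simple"
    RESEARCH = "research"
    COMPARISON = "comparison"
    STRUCTURED = "structured"


# Lookup by value returns the singleton member.
print(QueryTypeSketch("research") is QueryTypeSketch.RESEARCH)  # True

# The str mixin makes members compare equal to their string value.
print(QueryTypeSketch.RESEARCH == "research")  # True

# Members serialize as ordinary JSON strings.
encoded = json.dumps({"query_type": QueryTypeSketch.RESEARCH})
print(encoded)  # {"query_type": "research"}
```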

class haive.core.schema.prebuilt.query_state.RetrievalStrategy

Bases: str, enum.Enum

Retrieval strategies for query processing.
