haive.core.engine.retriever.providers.SelfQueryRetrieverConfig
Self-Query Retriever implementation for the Haive framework.
This module provides a configuration class for the Self-Query retriever, which enables natural language queries to be converted into structured queries that can filter on document metadata and perform semantic similarity search.
The SelfQueryRetriever works by:
1. Using an LLM to parse natural language queries into structured components
2. Extracting filter conditions for metadata (date, category, etc.)
3. Extracting the semantic search query component
4. Performing both metadata filtering and vector similarity search
5. Returning documents that match both criteria
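The steps above can be illustrated with a small, dependency-free sketch. It is not the Haive or LangChain implementation: the LLM parsing step is replaced by a keyword lookup, the vector similarity by a word-overlap count, and the names `StructuredQuery`, `parse_query`, `overlap_score`, and `self_query_search` are hypothetical, chosen only for this illustration.

```python
from dataclasses import dataclass, field


@dataclass
class StructuredQuery:
    """Hypothetical container for the two parts a self-query parser extracts."""

    query: str  # semantic search text (step 3)
    filters: dict = field(default_factory=dict)  # metadata filters (step 2)


def parse_query(text: str, known_genres: set[str]) -> StructuredQuery:
    """Stand-in for the LLM step (step 1): treat a known genre word as a filter."""
    words = text.lower().split()
    genre = next((w for w in words if w in known_genres), None)
    remaining = " ".join(w for w in words if w != genre)
    filters = {"genre": genre} if genre else {}
    return StructuredQuery(query=remaining, filters=filters)


def overlap_score(query: str, content: str) -> int:
    """Toy similarity: count of shared words. A real retriever uses embeddings."""
    return len(set(query.lower().split()) & set(content.lower().split()))


def self_query_search(text: str, docs: list[dict], known_genres: set[str], k: int = 2) -> list[dict]:
    structured = parse_query(text, known_genres)
    # Step 4a: keep only documents whose metadata matches every filter.
    candidates = [
        d for d in docs
        if all(d["metadata"].get(key) == value for key, value in structured.filters.items())
    ]
    # Step 4b: rank the remaining documents by similarity to the semantic query.
    ranked = sorted(
        candidates,
        key=lambda d: overlap_score(structured.query, d["content"]),
        reverse=True,
    )
    # Step 5: return the top-k documents that satisfy both criteria.
    return ranked[:k]


DOCS = [
    {"content": "a heist crew robs a casino", "metadata": {"genre": "action"}},
    {"content": "a detective chases a heist crew", "metadata": {"genre": "thriller"}},
    {"content": "a car chase across the desert", "metadata": {"genre": "action"}},
]

results = self_query_search("action movies about a heist", DOCS, {"action", "thriller"}, k=1)
```

In this toy run the word "action" becomes a metadata filter, so the thriller document is excluded before ranking even though it also mentions a heist.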
This retriever is particularly useful when:
- Documents have rich metadata that should be queryable
- Semantic search needs to be combined with structured filtering
- Users want to query both content and attributes naturally
- A system needs precise control over search scope
The implementation integrates with LangChain's SelfQueryRetriever while providing a consistent Haive configuration interface with metadata schema support.
Classes
SelfQueryRetrieverConfig | Configuration for Self-Query retriever in the Haive framework.
Module Contents
- class haive.core.engine.retriever.providers.SelfQueryRetrieverConfig.SelfQueryRetrieverConfig
Bases:
haive.core.engine.retriever.retriever.BaseRetrieverConfig
Configuration for Self-Query retriever in the Haive framework.
This retriever converts natural language queries into structured queries that can filter on document metadata and perform semantic similarity search.
- retriever_type
The type of retriever (always SELF_QUERY).
- Type:
- vectorstore_config
Vector store for semantic search.
- Type:
- llm_config
LLM for parsing natural language queries.
- Type:
- metadata_field_info
Metadata fields that can be filtered on.
- Type:
List[Dict]
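As a rough illustration of what such a list might look like, the sketch below builds a two-field schema and checks it for well-formedness. The `name`, `description`, and `type` keys are taken from the example on this page; the `year` field, the `REQUIRED_KEYS` set, and the `validate_metadata_fields` helper are assumptions made for this sketch and are not part of Haive. Whether Haive performs any such validation itself is not stated here.

```python
# Keys that appear in this page's example schema; treating all three as
# required is an assumption of this sketch.
REQUIRED_KEYS = {"name", "description", "type"}


def validate_metadata_fields(fields: list[dict]) -> list[str]:
    """Hypothetical helper: return a list of problems (empty if none found)."""
    problems = []
    seen_names = set()
    for index, field_info in enumerate(fields):
        missing = REQUIRED_KEYS - field_info.keys()
        if missing:
            problems.append(f"field {index}: missing keys {sorted(missing)}")
        name = field_info.get("name")
        if name is not None:
            if name in seen_names:
                problems.append(f"field {index}: duplicate name {name!r}")
            seen_names.add(name)
    return problems


metadata_fields = [
    {"name": "genre", "description": "The genre of the movie", "type": "string"},
    # Illustrative second field, not taken from the original example.
    {"name": "year", "description": "The year the movie was released", "type": "integer"},
]

problems = validate_metadata_fields(metadata_fields)
```

Running a check like this before building the configuration can surface a misspelled or missing key early, since the descriptions are what the LLM relies on when deciding which filters to extract.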
Examples
>>> from haive.core.engine.retriever import SelfQueryRetrieverConfig
>>> from haive.core.engine.vectorstore.providers.ChromaVectorStoreConfig import ChromaVectorStoreConfig
>>> from haive.core.engine.aug_llm import AugLLMConfig
>>>
>>> # Create vector store and LLM configs
>>> vs_config = ChromaVectorStoreConfig(name="docs", collection_name="documents")
>>> llm_config = AugLLMConfig(model_name="gpt-3.5-turbo", provider="openai")
>>>
>>> # Define metadata schema
>>> metadata_fields = [
...     {
...         "name": "genre",
...         "description": "The genre of the movie",
...         "type": "string"
...     }
... ]
>>>
>>> # Create self-query retriever
>>> config = SelfQueryRetrieverConfig(
...     name="self_query_retriever",
...     vectorstore_config=vs_config,
...     llm_config=llm_config,
...     document_content_description="Movie reviews and summaries",
...     metadata_field_info=metadata_fields,
...     k=5
... )
>>>
>>> # Instantiate and use the retriever
>>> retriever = config.instantiate()
>>> docs = retriever.get_relevant_documents("action movies from the 1990s")
- instantiate()
Create a Self-Query retriever from this configuration.
- Returns:
Instantiated retriever ready for self-query retrieval.
- Return type:
SelfQueryRetriever
- Raises:
ImportError - If required packages are not available.
ValueError - If configuration is invalid.