haive.core.engine.retriever.providers.ContextualCompressionRetrieverConfig¶

Contextual Compression Retriever implementation for the Haive framework.

This module provides a configuration class for the Contextual Compression retriever, which compresses retrieved documents to extract only the most relevant information relative to the query, improving both relevance and efficiency.

The ContextualCompressionRetriever works by:

1. Using a base retriever to get initial document candidates
2. Applying a compressor (LLM or extractive) to compress each document
3. Extracting only the parts of documents that are relevant to the query
4. Returning compressed, more focused document content
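The four steps above can be sketched, independently of Haive, as a minimal pure-Python pipeline. The term-overlap retriever and sentence-level compressor here are illustrative stand-ins for the real base retriever and LLM compressor, not Haive or LangChain APIs:

```python
def base_retrieve(query, documents, k=2):
    """Step 1: rank documents by naive term overlap and return the top k."""
    terms = set(query.lower().split())
    scored = sorted(documents, key=lambda d: -len(terms & set(d.lower().split())))
    return scored[:k]

def compress(document, query):
    """Steps 2-3: keep only the sentences that share a term with the query."""
    terms = set(query.lower().split())
    sentences = [s.strip() for s in document.split(".") if s.strip()]
    kept = [s for s in sentences if terms & set(s.lower().split())]
    return ". ".join(kept)

def contextual_compression_retrieve(query, documents):
    """Step 4: return compressed, query-focused document content."""
    candidates = base_retrieve(query, documents)
    compressed = [compress(d, query) for d in candidates]
    return [c for c in compressed if c]  # drop documents compressed to nothing

docs = [
    "Transformers dominate NLP. The weather was sunny. Attention is key.",
    "Gradient descent optimizes loss. Attention mechanisms weigh tokens.",
    "Cooking pasta requires boiling water.",
]
print(contextual_compression_retrieve("attention mechanisms", docs))
```

In the real retriever, an LLM compressor replaces the keyword filter, but the shape of the pipeline is the same: retrieve candidates, compress each against the query, return only what survives.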

This retriever is particularly useful when:

- Documents are long and contain irrelevant sections
- You need to reduce token usage in downstream processing
- You want to improve precision by filtering out noise
- You are building systems with strict context length limits

The implementation integrates with LangChain’s ContextualCompressionRetriever while providing a consistent Haive configuration interface with flexible compression options.

Classes¶

ContextualCompressionRetrieverConfig

Configuration for Contextual Compression retriever in the Haive framework.

Module Contents¶

class haive.core.engine.retriever.providers.ContextualCompressionRetrieverConfig.ContextualCompressionRetrieverConfig[source]¶

Bases: haive.core.engine.retriever.retriever.BaseRetrieverConfig

Configuration for Contextual Compression retriever in the Haive framework.

This retriever compresses retrieved documents to extract only the most relevant information relative to the query, improving both relevance and efficiency.

retriever_type¶

The type of retriever (always CONTEXTUAL_COMPRESSION).

Type: RetrieverType

base_retriever¶

The underlying retriever to get initial candidates.

Type: BaseRetrieverConfig

compressor_type¶

Type of compressor to use (‘llm_chain_extract’, ‘llm_chain_filter’).

Type: str

llm_config¶

LLM configuration for compression (required for LLM compressors).

Type: Optional[AugLLMConfig]

chunk_size¶

Maximum size of compressed chunks.

Type: int

chunk_overlap¶

Overlap between compressed chunks.

Type: int
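chunk_size and chunk_overlap behave like the parameters of a sliding-window text splitter: each chunk is at most chunk_size units long, and consecutive chunks share chunk_overlap units so that content straddling a boundary is not lost. A minimal character-level sketch of that windowing (plain Python, not the actual Haive splitter):

```python
def split_text(text, chunk_size, chunk_overlap):
    """Split text into windows of at most chunk_size characters,
    where consecutive windows share chunk_overlap characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each window advances
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

chunks = split_text("abcdefghij", chunk_size=4, chunk_overlap=2)
print(chunks)  # ['abcd', 'cdef', 'efgh', 'ghij', 'ij']
```

A larger overlap reduces the risk of splitting a relevant passage across chunks at the cost of redundant tokens.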

Examples

>>> from haive.core.engine.retriever import ContextualCompressionRetrieverConfig
>>> from haive.core.engine.retriever.providers.VectorStoreRetrieverConfig import VectorStoreRetrieverConfig
>>> from haive.core.engine.aug_llm import AugLLMConfig
>>>
>>> # Create base retriever and LLM config
>>> # (vs_config: a previously created vector store configuration)
>>> base_config = VectorStoreRetrieverConfig(name="base", vectorstore_config=vs_config)
>>> llm_config = AugLLMConfig(model_name="gpt-3.5-turbo", provider="openai")
>>>
>>> # Create contextual compression retriever
>>> config = ContextualCompressionRetrieverConfig(
...     name="compression_retriever",
...     base_retriever=base_config,
...     compressor_type="llm_chain_extract",
...     llm_config=llm_config
... )
>>>
>>> # Instantiate and use the retriever
>>> retriever = config.instantiate()
>>> docs = retriever.get_relevant_documents("machine learning algorithms")
get_input_fields()[source]¶

Return input field definitions for Contextual Compression retriever.

Return type: dict[str, tuple[type, Any]]

get_output_fields()[source]¶

Return output field definitions for Contextual Compression retriever.

Return type: dict[str, tuple[type, Any]]

instantiate()[source]¶

Create a Contextual Compression retriever from this configuration.

Returns: Instantiated retriever ready for compression retrieval.

Return type: ContextualCompressionRetriever

Raises:
  • ImportError – If required packages are not available.

  • ValueError – If configuration is invalid.

classmethod validate_compressor_type(v)[source]¶

Validate that the compressor type is one of the supported values (‘llm_chain_extract’, ‘llm_chain_filter’).

classmethod validate_llm_config_required(v, info)[source]¶

Validate that LLM config is provided for LLM compressors.
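The behavior these two validators describe can be approximated as plain functions. This is a hedged sketch of the documented rules, not the actual Haive source; the dict passed as `info` stands in for the validator context that exposes sibling fields:

```python
ALLOWED_COMPRESSORS = {"llm_chain_extract", "llm_chain_filter"}

def validate_compressor_type(v):
    """Reject compressor types outside the documented set."""
    if v not in ALLOWED_COMPRESSORS:
        raise ValueError(
            f"compressor_type must be one of {sorted(ALLOWED_COMPRESSORS)}, got {v!r}"
        )
    return v

def validate_llm_config_required(v, info):
    """Require llm_config whenever an LLM-based compressor is selected."""
    compressor_type = info.get("compressor_type")
    if compressor_type in ALLOWED_COMPRESSORS and v is None:
        raise ValueError(
            f"llm_config is required when compressor_type={compressor_type!r}"
        )
    return v
```

Running both checks at configuration time surfaces a missing llm_config immediately, rather than as an opaque failure when instantiate() later tries to build the LLM compressor.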