haive.core.engine.retriever.providers.VespaRetrieverConfig

Vespa Retriever implementation for the Haive framework.

This module provides a configuration class for the Vespa retriever, which uses the Vespa search engine for search and retrieval. Vespa is a full-featured search engine and vector database that supports vector search, lexical search, and hybrid ranking in a single query.

The VespaRetriever works by:

1. Connecting to a Vespa application
2. Supporting both vector and text search simultaneously
3. Providing advanced ranking and filtering capabilities
4. Enabling real-time search and content updates

This retriever is particularly useful when you:

- Need hybrid search combining vector and text search
- Require real-time search with continuous updates
- Want advanced ranking and relevance tuning
- Are building large-scale search applications
- Need to search both structured and unstructured data

The implementation integrates with LangChain’s Vespa retriever while providing a consistent Haive configuration interface.
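To make the role of the content and metadata fields concrete, here is a minimal sketch of how hits in a Vespa response could be turned into documents. It is not the Haive or LangChain implementation: `Doc`, `hits_to_docs`, and `sample_response` are hypothetical names, and the sample payload only assumes Vespa's default JSON result layout (`root.children[].fields`).

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class Doc:
    """Hypothetical stand-in for a LangChain Document."""

    page_content: str
    metadata: dict[str, Any] = field(default_factory=dict)


def hits_to_docs(
    response: dict[str, Any],
    content_field: str,
    metadata_fields: list[str],
) -> list[Doc]:
    """Map each Vespa hit to a Doc using the configured field names."""
    docs = []
    for hit in response.get("root", {}).get("children", []):
        fields = hit.get("fields", {})
        # Copy only the requested metadata; missing fields become None.
        metadata = {name: fields.get(name) for name in metadata_fields}
        metadata["id"] = hit.get("id")
        docs.append(Doc(page_content=fields.get(content_field, ""), metadata=metadata))
    return docs


# Sample payload shaped like Vespa's default JSON result format (made-up data).
sample_response = {
    "root": {
        "children": [
            {
                "id": "id:docs:doc::1",
                "relevance": 0.87,
                "fields": {
                    "content": "Intro to neural networks",
                    "title": "NN 101",
                    "author": "A. Writer",
                },
            }
        ]
    }
}

docs = hits_to_docs(sample_response, "content", ["title", "author", "category"])
print(docs[0].page_content)
print(docs[0].metadata)
```

In this sketch a metadata field that is absent from a hit (here `category`) is kept with a value of `None` rather than dropped, so every document has the same metadata keys.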

Classes

VespaRetrieverConfig

Configuration for Vespa retriever in the Haive framework.

Module Contents

class haive.core.engine.retriever.providers.VespaRetrieverConfig.VespaRetrieverConfig

Bases: haive.core.engine.retriever.retriever.BaseRetrieverConfig

Configuration for Vespa retriever in the Haive framework.

This retriever uses Vespa search engine to perform hybrid search combining vector similarity and text search capabilities.

retriever_type

The type of retriever (always VESPA).

Type: RetrieverType

url

Vespa application URL.

Type: str

content_field

Field containing document content.

Type: str

k

Number of documents to retrieve.

Type: int

metadata_fields

Fields to include in metadata.

Type: List[str]

vespa_query_body

Custom Vespa query configuration.

Type: Optional[Dict]

Examples

>>> from haive.core.engine.retriever import VespaRetrieverConfig
>>>
>>> # Create the Vespa retriever config
>>> config = VespaRetrieverConfig(
...     name="vespa_retriever",
...     url="http://localhost:8080",
...     content_field="content",
...     k=10,
...     metadata_fields=["title", "author", "category"],
...     vespa_query_body={
...         "yql": "select * from sources * where userQuery()",
...         "hits": 10,
...         "ranking": "bm25"
...     }
... )
>>>
>>> # Instantiate and use the retriever
>>> retriever = config.instantiate()
>>> docs = retriever.get_relevant_documents("machine learning neural networks")
>>>
>>> # Example with hybrid search
>>> hybrid_config = VespaRetrieverConfig(
...     name="vespa_hybrid_retriever",
...     url="http://localhost:8080",
...     content_field="content",
...     vespa_query_body={
...         "yql": "select * from sources * where ({targetHits:10}nearestNeighbor(embedding,q)) or userQuery()",
...         "ranking": "hybrid",
...         "input.query(q)": "embed(@query)"
...     }
... )
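The examples above pass a static `vespa_query_body`, while the user's search text arrives at query time. One plausible way the two are combined per request is sketched below. `build_request_body` is a hypothetical helper, not part of Haive. How `k` interacts with an explicit `"hits"` entry is not stated in this documentation, so the sketch assumes an explicit `"hits"` value wins and `k` is only a fallback.

```python
from typing import Any, Optional


def build_request_body(
    query: str,
    k: int,
    base_body: Optional[dict[str, Any]] = None,
) -> dict[str, Any]:
    """Merge the per-request query text into a copy of the configured body."""
    body = dict(base_body or {})  # shallow copy so the configured body is not mutated
    body["query"] = query  # text consumed by userQuery() in the YQL
    # Assumption: fall back to a plain text search when no YQL is configured.
    body.setdefault("yql", "select * from sources * where userQuery()")
    # Assumption: an explicit "hits" value takes precedence over k.
    body.setdefault("hits", k)
    return body


configured = {
    "yql": "select * from sources * where userQuery()",
    "ranking": "bm25",
}
request_body = build_request_body("machine learning neural networks", 10, configured)
print(request_body)
```

Copying the configured body before adding `"query"` keeps the configuration reusable across requests, which matters if the same config object backs several retrievers.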

get_input_fields()

Return input field definitions for Vespa retriever.

Return type: dict[str, tuple[type, Any]]

get_output_fields()

Return output field definitions for Vespa retriever.

Return type: dict[str, tuple[type, Any]]
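The return type `dict[str, tuple[type, Any]]` maps each field name to a `(type, default)` pair. The sketch below shows one way such definitions could be consumed. The field names `query` and `k`, the helper `apply_field_definitions`, and the use of `...` (Ellipsis) to mark a required field are all assumptions for illustration; the actual definitions returned by these methods are not listed in this documentation.

```python
from typing import Any

# Hypothetical definitions in the documented shape: name -> (type, default).
# Ellipsis marks a required field (an assumed convention, borrowed from Pydantic).
input_fields: dict[str, tuple[type, Any]] = {
    "query": (str, ...),
    "k": (int, 10),
}


def apply_field_definitions(
    values: dict[str, Any],
    definitions: dict[str, tuple[type, Any]],
) -> dict[str, Any]:
    """Fill in defaults and check types against the field definitions."""
    resolved = {}
    for name, (expected_type, default) in definitions.items():
        if name in values:
            value = values[name]
        elif default is ...:
            raise ValueError(f"missing required field: {name}")
        else:
            value = default
        if not isinstance(value, expected_type):
            raise TypeError(f"{name} must be of type {expected_type.__name__}")
        resolved[name] = value
    return resolved


resolved_inputs = apply_field_definitions({"query": "vector search"}, input_fields)
print(resolved_inputs)
```

The `isinstance` check only works for plain classes such as `str` and `int`; definitions that use generic types would need a different validation approach.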

instantiate()

Create a Vespa retriever from this configuration.

Returns:

Instantiated retriever ready for hybrid search.

Return type:

VespaRetriever

Raises:

ImportError – If required packages are not available.
ValueError – If configuration is invalid.