haive.core.engine.retriever.providers.ArxivRetrieverConfig

Arxiv Retriever implementation for the Haive framework.

This module provides a configuration class for the Arxiv retriever, which retrieves academic papers from the arXiv preprint repository.

The ArxivRetriever works by:

1. Taking a search query for academic papers
2. Searching the arXiv API for matching papers
3. Returning paper abstracts and metadata as documents
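The three steps above can be sketched with the standard library alone. This is only an illustration of the flow, not how Haive or LangChain implement it: the helper names `build_query_url` and `parse_feed` and the sample feed are hypothetical, and the sketch assumes the public arXiv API query endpoint and its Atom response format. No network request is made.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"  # Atom XML namespace used by arXiv feeds


def build_query_url(query: str, max_results: int = 3) -> str:
    """Steps 1-2: turn a search query into an arXiv API request URL."""
    params = {"search_query": f"all:{query}", "start": 0, "max_results": max_results}
    return "http://export.arxiv.org/api/query?" + urlencode(params)


def parse_feed(atom_xml: str) -> list:
    """Step 3: convert Atom entries into abstract-plus-metadata records."""
    root = ET.fromstring(atom_xml)
    docs = []
    for entry in root.findall(f"{ATOM}entry"):
        docs.append({
            "page_content": entry.findtext(f"{ATOM}summary", default="").strip(),
            "metadata": {
                "title": entry.findtext(f"{ATOM}title", default="").strip(),
                "published": entry.findtext(f"{ATOM}published", default=""),
                "authors": [
                    author.findtext(f"{ATOM}name", default="")
                    for author in entry.findall(f"{ATOM}author")
                ],
            },
        })
    return docs


# Made-up feed used only to exercise the parser offline.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <entry>
    <title>Example Paper Title</title>
    <summary>  A made-up abstract used only for illustration.  </summary>
    <published>2024-01-01T00:00:00Z</published>
    <author><name>A. Author</name></author>
  </entry>
</feed>"""

url = build_query_url("machine learning transformers", max_results=3)
docs = parse_feed(SAMPLE_FEED)
```

Each record pairs the abstract text with a metadata mapping, which mirrors the "abstracts and metadata as documents" shape described above.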

This retriever is particularly useful when:

- Working with academic or research content
- Needing access to the latest preprint papers
- Building research-focused applications
- Combining it with other retrievers in academic contexts

The implementation integrates with LangChain’s ArxivRetriever while providing a consistent Haive configuration interface.

Classes

ArxivRetrieverConfig

Configuration for Arxiv retriever in the Haive framework.

Module Contents

class haive.core.engine.retriever.providers.ArxivRetrieverConfig.ArxivRetrieverConfig

Bases: haive.core.engine.retriever.retriever.BaseRetrieverConfig

Configuration for Arxiv retriever in the Haive framework.

This retriever searches the arXiv preprint repository for academic papers matching the query and returns their abstracts and metadata as documents.

retriever_type

The type of retriever (always ARXIV).

Type:

RetrieverType

top_k_results

Maximum number of papers to retrieve (default: 3).

Type:

int

load_max_docs

Maximum number of documents to load (default: 100).

Type:

int

load_all_available_meta

Whether to load all available metadata (default: False).

Type:

bool
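As a quick way to see the documented defaults side by side, the attributes can be mirrored in a small stand-in. `ArxivSettingsSketch` is hypothetical and is not the Haive class; the positivity check in `__post_init__` is an assumption added for illustration, not documented behavior.

```python
from dataclasses import dataclass


@dataclass
class ArxivSettingsSketch:
    """Hypothetical stand-in mirroring the documented attributes and defaults."""

    name: str
    top_k_results: int = 3                  # documented default: 3
    load_max_docs: int = 100                # documented default: 100
    load_all_available_meta: bool = False   # documented default: False

    def __post_init__(self) -> None:
        # Assumed sanity check (not from the source): both limits must be positive.
        if self.top_k_results < 1 or self.load_max_docs < 1:
            raise ValueError("top_k_results and load_max_docs must be >= 1")


defaults = ArxivSettingsSketch(name="arxiv_retriever")
custom = ArxivSettingsSketch(name="arxiv_retriever", top_k_results=5, load_max_docs=50)
```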

Examples

>>> from haive.core.engine.retriever import ArxivRetrieverConfig
>>>
>>> # Create the arxiv retriever config
>>> config = ArxivRetrieverConfig(
...     name="arxiv_retriever",
...     top_k_results=5,
...     load_max_docs=50
... )
>>>
>>> # Instantiate and use the retriever
>>> retriever = config.instantiate()
>>> docs = retriever.get_relevant_documents("machine learning transformers")

get_input_fields()

Return input field definitions for Arxiv retriever.

Return type:

dict[str, tuple[type, Any]]

get_output_fields()

Return output field definitions for Arxiv retriever.

Return type:

dict[str, tuple[type, Any]]
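The return type `dict[str, tuple[type, Any]]` maps each field name to a `(type, default)` pair. A minimal sketch of what such mappings might look like follows. The field names `query` and `documents` are hypothetical, and using `...` (Ellipsis) to mark a required field is the Pydantic convention, assumed here rather than confirmed by the source.

```python
from typing import Any

# Hypothetical field definitions; the real keys come from
# get_input_fields() and get_output_fields().
input_fields: dict[str, tuple[type, Any]] = {
    "query": (str, ...),       # Ellipsis: required, no default (assumed convention)
}
output_fields: dict[str, tuple[type, Any]] = {
    "documents": (list, []),   # default is an empty list
}


def describe(fields: dict) -> list:
    """Render each (type, default) pair as a readable one-line summary."""
    lines = []
    for name, (field_type, default) in fields.items():
        status = "required" if default is ... else f"default={default!r}"
        lines.append(f"{name}: {field_type.__name__} ({status})")
    return lines


input_summary = describe(input_fields)
output_summary = describe(output_fields)
```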

instantiate()

Create an Arxiv retriever from this configuration.

Returns:

Instantiated retriever ready for document retrieval.

Return type:

ArxivRetriever

Raises:

ImportError – If required packages are not available.
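An `ImportError` of this kind is commonly produced by guarding an optional dependency at instantiation time. The sketch below shows that general pattern only. `require_module` is a hypothetical helper, and which packages Haive actually checks for is an assumption (`arxiv` is used purely as an example).

```python
import importlib


def require_module(module_name: str, install_hint: str):
    """Import an optional dependency or raise ImportError with an install hint."""
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise ImportError(
            f"'{module_name}' is required for this retriever. "
            f"Install it with: pip install {install_hint}"
        ) from exc


# Runs whether or not the package is installed: either the module is
# returned, or the helpful message is printed.
try:
    arxiv_module = require_module("arxiv", "arxiv")
except ImportError as err:
    arxiv_module = None
    print(err)
```

Callers can catch the error once around `instantiate()` and surface the install hint to the user instead of a bare traceback.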