haive.core.engine.retriever.providers.WebResearchRetrieverConfig
Web Research Retriever implementation for the Haive framework.
This module provides a configuration class for the Web Research retriever, which performs advanced web research by combining web search with document processing and retrieval. It searches the web, retrieves content from URLs, processes the content, and provides comprehensive research results.
The WebResearchRetriever works by:
1. Using a web search API to find relevant URLs
2. Retrieving and processing content from those URLs
3. Chunking and embedding the retrieved content
4. Providing retrieval over the processed web content
5. Combining search results with retrieved document chunks
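The five steps above can be sketched without any network access. This is a minimal, self-contained illustration and not the Haive or LangChain implementation: `fake_search`, `fake_fetch`, the in-memory `PAGES`, and the bag-of-words "embedding" are hypothetical stand-ins for a real search API, page loader, and embedding model.

```python
import math
import re
from collections import Counter

# Hypothetical in-memory "web": URL -> page text (stands in for real pages).
PAGES = {
    "https://example.com/a": "Transformers dominate recent AI research. Attention layers scale well.",
    "https://example.com/b": "Gardening tips for spring. Tomatoes need sun and water.",
}


def fake_search(query: str) -> list[str]:
    """Step 1 stand-in: a real retriever would call a web search API here."""
    return list(PAGES)


def fake_fetch(url: str) -> str:
    """Step 2 stand-in: a real retriever would download and clean the page."""
    return PAGES[url]


def chunk(text: str) -> list[str]:
    """Step 3a: split text into sentence-sized chunks."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def embed(text: str) -> Counter:
    """Step 3b: toy bag-of-words vector (an assumption, not a real embedding)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def research(query: str, k: int = 2) -> list[tuple[str, str]]:
    """Steps 1-5: search, fetch, chunk, embed, then rank chunks against the query."""
    index = [
        (url, piece, embed(piece))
        for url in fake_search(query)
        for piece in chunk(fake_fetch(url))
    ]
    q = embed(query)
    ranked = sorted(index, key=lambda item: cosine(q, item[2]), reverse=True)
    # Each result keeps its source URL alongside the matching chunk.
    return [(url, piece) for url, piece, _ in ranked[:k]]


results = research("AI research")
```

Each result pairs a source URL with a chunk of its text, which mirrors the idea of combining search results with retrieved document chunks. A real deployment would replace the toy vectors with the configured vector store and embeddings.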
This retriever is particularly useful for:
- Getting up-to-date information from the web
- Building research applications that require current data
- Combining web search with document retrieval
- Creating systems that need comprehensive web coverage
- Building fact-checking or research assistant applications
The implementation integrates with LangChain's WebResearchRetriever while providing a consistent Haive configuration interface with secure API key management.
Classes
Configuration for Web Research retriever in the Haive framework.
Module Contents
- class haive.core.engine.retriever.providers.WebResearchRetrieverConfig.WebResearchRetrieverConfig
Bases: haive.core.common.mixins.secure_config.SecureConfigMixin, haive.core.engine.retriever.retriever.BaseRetrieverConfig
Configuration for Web Research retriever in the Haive framework.
This retriever performs comprehensive web research by searching the web, retrieving content, and providing retrieval capabilities over the collected data.
- retriever_type
The type of retriever (always WEB_RESEARCH).
- Type:
- vectorstore_config
Vector store for indexing web content.
- Type:
- llm_config
LLM for processing and summarization.
- Type:
- api_key
API key for web search (auto-resolved).
- Type:
Optional[SecretStr]
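The source does not spell out how the key is "auto-resolved". One common pattern, shown here purely as an assumption, is to prefer an explicitly supplied key and fall back to an environment variable. The function names and the `WEB_SEARCH_API_KEY` variable below are hypothetical and are not part of Haive's `SecureConfigMixin`.

```python
import os
from typing import Mapping, Optional


def resolve_api_key(
    explicit: Optional[str],
    env: Mapping[str, str] = os.environ,
    env_var: str = "WEB_SEARCH_API_KEY",  # hypothetical variable name
) -> Optional[str]:
    """Return the explicit key if given, else the environment value, else None."""
    if explicit:
        return explicit
    return env.get(env_var) or None


def mask(secret: Optional[str]) -> str:
    """Render a key for logs without revealing its value."""
    return "**********" if secret else "<unset>"


# An explicit key wins; otherwise the (injected) environment is consulted.
from_arg = resolve_api_key("abc123", env={})
from_env = resolve_api_key(None, env={"WEB_SEARCH_API_KEY": "xyz789"})
missing = resolve_api_key(None, env={})
```

Passing the environment as a parameter keeps the lookup easy to test; masking the value before logging serves the same purpose as wrapping it in a `SecretStr`.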
Examples
>>> from haive.core.engine.retriever import WebResearchRetrieverConfig
>>> from haive.core.engine.aug_llm import AugLLMConfig
>>> from haive.core.engine.vectorstore.providers.ChromaVectorStoreConfig import ChromaVectorStoreConfig
>>>
>>> # Configure components
>>> llm_config = AugLLMConfig(model_name="gpt-4", provider="openai")
>>> vectorstore_config = ChromaVectorStoreConfig(
...     name="web_research_store",
...     collection_name="web_content"
... )
>>>
>>> # Create the web research retriever config
>>> config = WebResearchRetrieverConfig(
...     name="web_research_retriever",
...     vectorstore_config=vectorstore_config,
...     llm_config=llm_config,
...     num_search_results=10,
...     num_web_pages=5
... )
>>>
>>> # Instantiate and use the retriever
>>> retriever = config.instantiate()
>>> docs = retriever.get_relevant_documents("latest AI research developments 2024")
- instantiate()
Create a Web Research retriever from this configuration.
- Returns:
Instantiated retriever ready for web research.
- Return type:
WebResearchRetriever
- Raises:
ImportError – If required packages are not available.
ValueError – If API key or configuration is invalid.
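Callers may want to handle the two documented exceptions separately. The sketch below assumes only that the config object exposes `instantiate()` as described above; `try_instantiate` and `_FakeConfig` are hypothetical helpers written for this illustration, and the fake class stands in for `WebResearchRetrieverConfig` so the example runs without Haive installed.

```python
from typing import Any, Optional, Tuple


def try_instantiate(config: Any) -> Tuple[Optional[Any], Optional[str]]:
    """Call config.instantiate() and turn the documented errors into a message."""
    try:
        return config.instantiate(), None
    except ImportError as exc:
        # Documented case: a required package is not available.
        return None, f"missing dependency: {exc}"
    except ValueError as exc:
        # Documented case: the API key or configuration is invalid.
        return None, f"invalid configuration: {exc}"


class _FakeConfig:
    """Hypothetical stand-in for WebResearchRetrieverConfig, used only for this demo."""

    def __init__(self, api_key: Optional[str]) -> None:
        self.api_key = api_key

    def instantiate(self) -> str:
        if not self.api_key:
            raise ValueError("api_key is required")
        return "retriever"


ok, ok_err = try_instantiate(_FakeConfig("secret"))
bad, bad_err = try_instantiate(_FakeConfig(None))
```

Returning a `(retriever, error)` pair lets an application report a missing dependency differently from a bad key, for example by suggesting an install step in the first case and a configuration check in the second.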