haive.core.engine.retriever.providers.GoogleDocumentAIWarehouseRetrieverConfig
Google Document AI Warehouse Retriever implementation for the Haive framework.
This module provides a configuration class for the Google Document AI Warehouse retriever, which uses Google Cloud’s Document AI Warehouse service for intelligent document processing and retrieval. Document AI Warehouse provides advanced document understanding, classification, and search capabilities.
The GoogleDocumentAIWarehouseRetriever works by:
1. Connecting to a Document AI Warehouse project
2. Performing intelligent document search and retrieval
3. Leveraging ML for document understanding and classification
4. Supporting various document types and formats
This retriever is particularly useful when you are:
- Building document management systems
- In need of intelligent document processing
- Working with complex document formats
- Looking for ML-powered document classification
- Building compliance and governance tools
The implementation integrates with LangChain’s GoogleDocumentAIWarehouseRetriever while providing a consistent Haive configuration interface with secure GCP credential management.
Classes
GoogleDocumentAIWarehouseRetrieverConfig: Configuration for Google Document AI Warehouse retriever in the Haive framework.
Module Contents
- class haive.core.engine.retriever.providers.GoogleDocumentAIWarehouseRetrieverConfig.GoogleDocumentAIWarehouseRetrieverConfig
Bases: haive.core.common.mixins.secure_config.SecureConfigMixin, haive.core.engine.retriever.retriever.BaseRetrieverConfig
Configuration for Google Document AI Warehouse retriever in the Haive framework.
This retriever uses Google Cloud Document AI Warehouse to provide intelligent document processing and retrieval with ML-powered understanding.
- retriever_type
The type of retriever (always GOOGLE_DOCUMENT_AI_WAREHOUSE).
- Type:
- api_key
Service account key (auto-resolved from GOOGLE_APPLICATION_CREDENTIALS).
- Type:
Optional[SecretStr]
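As a rough illustration of what "auto-resolved from GOOGLE_APPLICATION_CREDENTIALS" could look like in practice, the sketch below picks a service account key path, preferring an explicit value and falling back to the environment variable. The helper name resolve_credentials_path is hypothetical and not part of Haive; the actual resolution logic in SecureConfigMixin may differ.

```python
import os
from pathlib import Path
from typing import Optional


def resolve_credentials_path(explicit: Optional[str] = None) -> Optional[Path]:
    """Hypothetical helper: choose a service account key path.

    An explicitly supplied path wins; otherwise fall back to the
    GOOGLE_APPLICATION_CREDENTIALS environment variable. Returns None
    when neither is set. This is an assumption about the resolution
    order, not Haive's documented behavior.
    """
    candidate = explicit or os.environ.get("GOOGLE_APPLICATION_CREDENTIALS")
    if not candidate:
        return None
    # Expand "~" so user-relative paths resolve consistently.
    return Path(candidate).expanduser()
```

Checking for a None result before building the config makes a missing credential visible early, rather than at the first search call.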
Examples
>>> from haive.core.engine.retriever import GoogleDocumentAIWarehouseRetrieverConfig
>>>
>>> # Create the Document AI Warehouse retriever config
>>> config = GoogleDocumentAIWarehouseRetrieverConfig(
...     name="doc_ai_warehouse_retriever",
...     project_number="123456789012",
...     location="us",
...     document_schema_id="schema_id_123",
...     num_results=10
... )
>>>
>>> # Instantiate and use the retriever
>>> retriever = config.instantiate()
>>> docs = retriever.get_relevant_documents("contract analysis documents")
>>>
>>> # Example with specific schema
>>> contract_config = GoogleDocumentAIWarehouseRetrieverConfig(
...     name="contract_doc_ai_retriever",
...     project_number="123456789012",
...     location="us",
...     document_schema_id="contract_schema_456",
...     num_results=5
... )
- get_input_fields()
Return input field definitions for Google Document AI Warehouse retriever.
- get_output_fields()
Return output field definitions for Google Document AI Warehouse retriever.
- instantiate()
Create a Google Document AI Warehouse retriever from this configuration.
- Returns:
Instantiated retriever ready for document search.
- Return type:
GoogleDocumentAIWarehouseRetriever
- Raises:
ImportError – If required packages are not available.
ValueError – If GCP credentials or configuration is invalid.
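Because instantiate() is documented to raise ImportError and ValueError, callers may want to handle both explicitly. The following is a minimal sketch of one way to do that; the wrapper try_instantiate and the SupportsInstantiate protocol are hypothetical names introduced here for illustration, and the wrapper accepts any object exposing an instantiate() method.

```python
import logging
from typing import Any, Optional, Protocol

logger = logging.getLogger(__name__)


class SupportsInstantiate(Protocol):
    """Hypothetical protocol: anything with an instantiate() method."""

    def instantiate(self) -> Any: ...


def try_instantiate(config: SupportsInstantiate) -> Optional[Any]:
    """Return the instantiated retriever, or None on a documented failure."""
    try:
        return config.instantiate()
    except ImportError as exc:
        # Required packages are not installed.
        logger.error("Retriever dependencies are missing: %s", exc)
    except ValueError as exc:
        # GCP credentials or configuration were rejected.
        logger.error("Invalid credentials or configuration: %s", exc)
    return None
```

Returning None keeps the calling code simple, at the cost of hiding which error occurred; re-raising a single application-specific exception is an equally reasonable design if the caller needs to distinguish the two cases.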