haive.core.engine.document.universal_loader

Universal Document Loader with Auto-Detection.

This module provides a universal loader that automatically detects the best loader for a given input (URL, file path, text, etc.) and accepts preferences that guide loader selection.

Classes

SmartSourceRegistry

Enhanced source registry with intelligent matching.

UniversalDocumentLoader

Universal document loader with intelligent source detection.

Functions

analyze_document_source(path)

Analyze a document source and return information about available loaders.

load_document(path[, credential_manager, preferences, ...])

Convenience function to load a document from any source.

Module Contents

class haive.core.engine.document.universal_loader.SmartSourceRegistry

Enhanced source registry with intelligent matching.

Initialize the registry.

find_best_sources(path, limit=3)

Find the best source types for a given path.

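Parameters:
  • path – Path, URL, or source identifier to match

  • limit – Number of best matches to return; defaults to 3

Return type: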

list[tuple[type, float]]
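
The return type suggests that each entry pairs a source class with a numeric score. A minimal sketch of consuming such a list, assuming that a higher score means a better match (the source does not say so) and using hypothetical stand-in classes `PdfSource` and `WebSource` rather than real haive types:

```python
from typing import Optional


class PdfSource:  # hypothetical stand-in, not a haive class
    pass


class WebSource:  # hypothetical stand-in, not a haive class
    pass


def pick_best(
    candidates: list[tuple[type, float]], min_score: float = 0.0
) -> Optional[type]:
    """Return the highest-scoring source type, or None if nothing clears min_score."""
    eligible = [(cls, score) for cls, score in candidates if score >= min_score]
    if not eligible:
        return None
    # Assumption: a higher score means a better match.
    return max(eligible, key=lambda pair: pair[1])[0]


# Example data shaped like the documented return type.
candidates = [(WebSource, 0.4), (PdfSource, 0.9)]
best = pick_best(candidates, min_score=0.5)
print(best.__name__ if best else "no match")
```

In real use, the result of `SmartSourceRegistry().find_best_sources(path)` would take the place of the hand-built `candidates` list.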

class haive.core.engine.document.universal_loader.UniversalDocumentLoader(credential_manager=None)

Universal document loader with intelligent source detection.

Initialize the universal loader.

Parameters:

credential_manager (haive.core.engine.document.loaders.sources.implementation.CredentialManager | None) – Optional credential manager for authenticated sources

analyze_source(path)

Analyze a source and return information about available loaders.

Parameters:

path (str)

Return type:

dict[str, Any]

get_supported_sources()

Get list of all supported source types.

Return type:

list[str]

load(path, preferences=None, options=None, strategy=None, fallback=True)

Load documents from any source with intelligent detection.

Parameters:
  • path (str) – File path, URL, or source identifier

  • preferences (dict[str, Any] | None) – Preferences for loader selection

  • options (dict[str, Any] | None) – Loader-specific options

  • strategy (str | None) – Force a specific strategy

  • fallback (bool) – Whether to use fallback loaders

Returns:

Appropriate document loader or None

Return type:

langchain_core.document_loaders.base.BaseLoader | None

Examples

>>> from haive.core.engine.document.universal_loader import UniversalDocumentLoader
>>> loader = UniversalDocumentLoader()
>>>
>>> # Auto-detect and load
>>> doc_loader = loader.load("https://github.com/user/repo")
>>> doc_loader = loader.load("document.pdf")
>>> doc_loader = loader.load("data.csv")
>>>
>>> # With preferences
>>> doc_loader = loader.load(
...     "https://example.com",
...     preferences={"web_strategy": "playwright", "include_images": True}
... )
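
Since `load` is documented as returning a `BaseLoader` or `None`, callers likely need to guard against `None` before reading documents. A minimal sketch, assuming the returned object exposes the standard LangChain `load()` method; `StubLoader` and `documents_or_empty` are hypothetical names used only so the example runs without haive installed:

```python
from typing import Any, Optional


class StubLoader:
    """Hypothetical stand-in for a LangChain BaseLoader (only load() is mimicked)."""

    def __init__(self, docs: list[Any]) -> None:
        self._docs = docs

    def load(self) -> list[Any]:
        return list(self._docs)


def documents_or_empty(doc_loader: Optional[Any]) -> list[Any]:
    """Return the loader's documents, or an empty list when no loader was found."""
    if doc_loader is None:
        return []
    return doc_loader.load()


print(len(documents_or_empty(StubLoader(["page 1", "page 2"]))))
print(len(documents_or_empty(None)))
```

With the real loader, the value returned by `loader.load("document.pdf")` would be passed in place of `StubLoader(...)`.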

haive.core.engine.document.universal_loader.analyze_document_source(path)

Analyze a document source and return information about available loaders.

Parameters:

path (str) – Path to analyze

Returns:

Analysis information

Return type:

dict[str, Any]
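
The keys of the returned dictionary are not listed here, so a schema-agnostic way to inspect it can help. A minimal sketch; the helper `summarize` is hypothetical, and the sample dictionary and its keys are invented for illustration, not the actual output of `analyze_document_source`:

```python
from typing import Any


def summarize(analysis: dict[str, Any]) -> list[str]:
    """Render each top-level entry as 'key (type): value', sorted by key."""
    return [
        f"{key} ({type(value).__name__}): {value!r}"
        for key, value in sorted(analysis.items())
    ]


# Invented sample; real keys may differ.
sample = {"path": "data.csv", "candidates": ["csv", "text"]}
for line in summarize(sample):
    print(line)
```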

haive.core.engine.document.universal_loader.load_document(path, credential_manager=None, preferences=None, options=None, strategy=None)

Convenience function to load a document from any source.

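Parameters:
  • path – File path, URL, or source identifier

  • credential_manager – Optional credential manager for authenticated sources

  • preferences – Preferences for loader selection

  • options – Loader-specific options

  • strategy – Force a specific strategy

Returns: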

Document loader or None

Return type:

langchain_core.document_loaders.base.BaseLoader | None
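
When loading several sources, it can be useful to separate those that resolved to a loader from those that returned `None`. A minimal sketch, assuming only the documented `BaseLoader | None` return; the loading function is passed in as `load_fn` (a hypothetical parameter) so that `load_document` could be supplied in real use, while a stub is used here:

```python
from typing import Any, Callable, Optional


def partition_sources(
    paths: list[str],
    load_fn: Callable[[str], Optional[Any]],
) -> tuple[dict[str, Any], list[str]]:
    """Split paths into {path: loader} and a list of paths with no loader."""
    resolved: dict[str, Any] = {}
    unresolved: list[str] = []
    for path in paths:
        loader = load_fn(path)
        if loader is None:
            unresolved.append(path)
        else:
            resolved[path] = loader
    return resolved, unresolved


def stub_load(path: str) -> Optional[str]:
    """Hypothetical stand-in for load_document: 'resolves' only .csv paths."""
    return f"loader for {path}" if path.endswith(".csv") else None


resolved, unresolved = partition_sources(["data.csv", "notes.xyz"], stub_load)
print(sorted(resolved), unresolved)
```

Injecting the loading function keeps the helper testable without network access or credentials.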