haive.core.engine.document.loaders.registry
Document loader registry system.
This module provides a registry for document loaders, allowing them to be registered, looked up, and managed throughout the application.
Classes
DocumentLoaderRegistry | Registry for document loaders.
LoaderMetadata | Metadata for a document loader.
Functions
create_loader | Create a loader instance by name.
get_default_registry | Get the default document loader registry.
get_loader | Get a loader by name from the default registry.
register_loader | Decorator to register a document loader.
Module Contents
- class haive.core.engine.document.loaders.registry.DocumentLoaderRegistry
Bases: haive.core.registry.base.AbstractRegistry[type[langchain_core.document_loaders.base.BaseLoader]]
Registry for document loaders.
This registry keeps track of document loader classes and their metadata, allowing for discovery and instantiation of loaders based on source types.
Initialize the registry with empty storage.
- get(item_type, name)
Get a loader by source type and name.
- Parameters:
item_type (haive.core.engine.document.loaders.sources.source_types.SourceCategory) – Source type
name (str) – Loader name
- Returns:
Loader class if found, None otherwise
- Return type:
type[langchain_core.document_loaders.base.BaseLoader] | None
- get_all(item_type)
Get all loaders for a specific source type.
- Parameters:
item_type (haive.core.engine.document.loaders.sources.source_types.SourceCategory) – Source type
- Returns:
Dictionary mapping loader names to loader classes
- Return type:
dict[str, type[langchain_core.document_loaders.base.BaseLoader]]
- get_all_metadata()
Get metadata for all registered loaders.
- Returns:
Dictionary mapping loader names to metadata
- Return type:
dict[str, LoaderMetadata]
- get_metadata(name)
Get metadata for a specific loader.
- Parameters:
name (str) – Loader name
- Returns:
Loader metadata if found, None otherwise
- Return type:
LoaderMetadata | None
- list(item_type)
List all loader names for a specific source type.
- Parameters:
item_type (haive.core.engine.document.loaders.sources.source_types.SourceCategory) – Source type
- Returns:
List of loader names
- Return type:
list[str]
- register(loader_class, metadata)
Register a document loader with metadata.
- Parameters:
loader_class (type[langchain_core.document_loaders.base.BaseLoader]) – Loader class to register
metadata (LoaderMetadata) – Metadata for the loader
- Returns:
The registered loader class
- Return type:
type[langchain_core.document_loaders.base.BaseLoader]
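The methods above can be illustrated with a small stand-in. This is only a sketch of the registry pattern, not haive's implementation: `SketchRegistry`, the two-member `SourceCategory` enum, the `LoaderMetadata` fields, and `TextLoader` are all hypothetical, and the storage layout (classes grouped by source type, metadata keyed by loader name) is an assumption.

```python
from dataclasses import dataclass
from enum import Enum


class SourceCategory(Enum):
    """Hypothetical stand-in for haive's SourceCategory."""
    FILE = "file"
    WEB = "web"


@dataclass
class LoaderMetadata:
    """Hypothetical stand-in; the real model's fields are not listed in this page."""
    name: str
    source_type: SourceCategory


class SketchRegistry:
    """Illustrative registry mirroring the documented method names."""

    def __init__(self):
        # Start with empty storage (assumed layout).
        self._loaders = {}   # SourceCategory -> {loader name -> loader class}
        self._metadata = {}  # loader name -> LoaderMetadata

    def register(self, loader_class, metadata):
        by_name = self._loaders.setdefault(metadata.source_type, {})
        by_name[metadata.name] = loader_class
        self._metadata[metadata.name] = metadata
        return loader_class  # returning the class lets a decorator reuse this method

    def get(self, item_type, name):
        # None when either the source type or the name is unknown.
        return self._loaders.get(item_type, {}).get(name)

    def get_all(self, item_type):
        return dict(self._loaders.get(item_type, {}))

    def list(self, item_type):
        return sorted(self._loaders.get(item_type, {}))

    def get_metadata(self, name):
        return self._metadata.get(name)

    def get_all_metadata(self):
        return dict(self._metadata)


class TextLoader:
    """Placeholder; a real loader would subclass BaseLoader."""


registry = SketchRegistry()
registry.register(TextLoader, LoaderMetadata(name="text", source_type=SourceCategory.FILE))
```

In this sketch, lookups for an unregistered name or source type return None rather than raising, which matches the documented `... | None` return types of `get` and `get_metadata`.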
- class haive.core.engine.document.loaders.registry.LoaderMetadata(/, **data)
Bases:
pydantic.BaseModel
Metadata for a document loader.
Create a new model by parsing and validating input data from keyword arguments.
Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.
self is explicitly positional-only to allow self as a field name.
- Parameters:
data (Any)
- model_config
Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
- haive.core.engine.document.loaders.registry.create_loader(loader_name, **kwargs)
Create a loader instance by name.
- Parameters:
loader_name (str)
- Return type:
langchain_core.document_loaders.base.BaseLoader | None
- haive.core.engine.document.loaders.registry.get_default_registry()
Get the default document loader registry.
- Return type:
DocumentLoaderRegistry
- haive.core.engine.document.loaders.registry.get_loader(loader_name)
Get a loader by name from the default registry.
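One plausible way the three module-level helpers fit together is sketched here. It assumes a lazily created module-level registry and a flat name-to-class mapping; neither is confirmed by this page, and `SketchRegistry` and `EchoLoader` are hypothetical names used only for illustration.

```python
_default_registry = None  # assumed module-level cache


class SketchRegistry:
    """Hypothetical stand-in holding a flat name -> loader class mapping."""

    def __init__(self):
        self.loaders = {}


def get_default_registry():
    # Create the shared registry on first use, then keep returning the same object.
    global _default_registry
    if _default_registry is None:
        _default_registry = SketchRegistry()
    return _default_registry


def get_loader(loader_name):
    # Look up the class only; None if nothing is registered under that name.
    return get_default_registry().loaders.get(loader_name)


def create_loader(loader_name, **kwargs):
    # Instantiate the class with the caller's keyword arguments, or give up with None.
    loader_class = get_loader(loader_name)
    if loader_class is None:
        return None
    return loader_class(**kwargs)


class EchoLoader:
    """Placeholder loader; a real one would subclass BaseLoader."""

    def __init__(self, path):
        self.path = path


get_default_registry().loaders["echo"] = EchoLoader
loader = create_loader("echo", path="notes.txt")
missing = create_loader("unknown")
```

Because `create_loader` is documented as returning `BaseLoader | None`, callers should check for None before using the result.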
- haive.core.engine.document.loaders.registry.register_loader(source_type, name=None, description=None, requires_async=False, file_extensions=None, url_patterns=None, config_schema=None)
Decorator to register a document loader.
- Parameters:
source_type (haive.core.engine.document.loaders.sources.source_types.SourceCategory) – Type of source this loader handles
name (str | None) – Optional custom name for the loader
description (str | None) – Optional description of the loader
requires_async (bool) – Whether this loader requires async operations
file_extensions (list[str] | None) – List of file extensions this loader can handle
url_patterns (list[str] | None) – List of URL patterns this loader can handle
config_schema (type[pydantic.BaseModel] | None) – Optional Pydantic model for configuration
- Returns:
Decorator function
- Return type:
collections.abc.Callable[[type[langchain_core.document_loaders.base.BaseLoader]], type[langchain_core.document_loaders.base.BaseLoader]]
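The decorator-factory shape described above (a function that takes metadata arguments and returns a class decorator) can be sketched as follows. This is a simplified illustration, not haive's code: `REGISTRY`, `Meta`, and the loader classes are hypothetical, the source type is a plain string instead of a `SourceCategory`, only a subset of the documented parameters is modelled, and falling back to the class name when `name` is omitted is an assumption.

```python
from dataclasses import dataclass, field
from typing import List, Optional

REGISTRY = {}  # hypothetical store: loader name -> (loader class, Meta)


@dataclass
class Meta:
    """Hypothetical metadata record built from the decorator arguments."""
    name: str
    source_type: str
    description: Optional[str] = None
    requires_async: bool = False
    file_extensions: List[str] = field(default_factory=list)


def register_loader(source_type, name=None, description=None,
                    requires_async=False, file_extensions=None):
    def decorator(loader_class):
        meta = Meta(
            # Assumption: an omitted name defaults to the class name.
            name=name or loader_class.__name__,
            source_type=source_type,
            description=description,
            requires_async=requires_async,
            file_extensions=list(file_extensions or []),
        )
        REGISTRY[meta.name] = (loader_class, meta)
        # Return the class unchanged so the decorated name still refers to it.
        return loader_class
    return decorator


@register_loader("file", name="markdown", file_extensions=[".md"])
class MarkdownLoader:
    """Placeholder loader class."""


@register_loader("web")
class HtmlLoader:
    """Placeholder loader class registered under its own class name."""
```

Returning the class unchanged matches the documented return type, a callable from loader class to loader class, so decorated classes remain directly usable.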