haive.core.models.llm.providers.huggingface

HuggingFace Provider Module.

This module implements the HuggingFace language model provider for the Haive framework, supporting both HuggingFace Hub-hosted models and local transformer models.

The provider handles API key management (for Hub), model configuration, and safe imports of the langchain-huggingface package dependencies.
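The safe-import handling mentioned above can be sketched with the standard library. This is a minimal illustration, not the module's actual guard; the helper name `ensure_huggingface_installed` is hypothetical, though the distribution `langchain-huggingface` does install the importable package `langchain_huggingface`:

```python
import importlib.util


def ensure_huggingface_installed() -> None:
    """Raise a helpful error if langchain-huggingface is missing.

    Hypothetical helper illustrating the safe-import pattern; the
    provider's real guard may differ.
    """
    # The "langchain-huggingface" distribution installs the
    # "langchain_huggingface" package.
    if importlib.util.find_spec("langchain_huggingface") is None:
        raise ImportError(
            "HuggingFaceProvider requires the langchain-huggingface package. "
            "Install it with: pip install langchain-huggingface"
        )
```

Checking with `find_spec` rather than a bare `import` keeps the check cheap and avoids importing heavy transformer dependencies at module load time.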

Examples

Hub-hosted model:

from haive.core.models.llm.providers.huggingface import HuggingFaceProvider

provider = HuggingFaceProvider(
    model="microsoft/DialoGPT-medium",
    temperature=0.7,
    max_tokens=1000
)
llm = provider.instantiate()

Local transformer model:

provider = HuggingFaceProvider(
    model="gpt2",
    device_map="auto",
    load_in_8bit=True,
    temperature=0.8
)

Classes

HuggingFaceProvider

HuggingFace language model provider configuration.

Module Contents

class haive.core.models.llm.providers.huggingface.HuggingFaceProvider(/, **data)

Bases: haive.core.models.llm.providers.base.BaseLLMProvider

HuggingFace language model provider configuration.

This provider supports both HuggingFace Hub-hosted models and local transformer models, providing access to thousands of open-source models.

Parameters:
  • data (Any)

  • requests_per_second (float | None)

  • tokens_per_second (int | None)

  • tokens_per_minute (int | None)

  • max_retries (int)

  • retry_delay (float)

  • check_every_n_seconds (float | None)

  • burst_size (int | None)

  • provider (LLMProvider)

  • model (str | None)

  • name (str | None)

  • api_key (SecretStr)

  • cache_enabled (bool)

  • cache_ttl (int | None)

  • extra_params (dict[str, Any] | None)

  • debug (bool)

  • temperature (float | None)

  • max_tokens (int | None)

  • top_p (float | None)

  • top_k (int | None)

  • repetition_penalty (float | None)

  • device_map (str | None)

  • load_in_8bit (bool)

  • load_in_4bit (bool)

  • trust_remote_code (bool)
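The rate-limiting parameters above (requests_per_second, burst_size, and related fields) follow the familiar token-bucket pattern. The sketch below is a minimal, self-contained illustration of that pattern, not the framework's actual rate limiter:

```python
import time


class TokenBucket:
    """Minimal token-bucket limiter (illustration only).

    ``requests_per_second`` is the steady refill rate;
    ``burst_size`` caps how many requests may fire back-to-back.
    """

    def __init__(self, requests_per_second: float, burst_size: int) -> None:
        self.rate = requests_per_second
        self.capacity = burst_size
        self.tokens = float(burst_size)
        self.last = time.monotonic()

    def acquire(self) -> float:
        """Consume one token, sleeping if necessary; return seconds waited."""
        now = time.monotonic()
        # Refill proportionally to elapsed time, never above capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens < 1.0:
            wait = (1.0 - self.tokens) / self.rate
            time.sleep(wait)
            self.tokens = 1.0
            self.last = time.monotonic()
        else:
            wait = 0.0
        self.tokens -= 1.0
        return wait
```

Calls within the burst budget return immediately with a wait of 0.0; once the bucket is drained, each call blocks just long enough for one token to refill.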

provider

Always LLMProvider.HUGGINGFACE

Type:

LLMProvider

model

The HuggingFace model identifier (a Hub repository ID or local model name)

Type:

str

temperature

Sampling temperature (0.0-2.0)

Type:

float

max_tokens

Maximum tokens in response

Type:

int

top_p

Nucleus sampling parameter

Type:

float

top_k

Top-k sampling parameter

Type:

int

repetition_penalty

Repetition penalty parameter

Type:

float

device_map

Device mapping strategy for model loading (e.g. "auto")

Type:

str

load_in_8bit

Use 8-bit quantization

Type:

bool

load_in_4bit

Use 4-bit quantization

Type:

bool

trust_remote_code

Whether to allow execution of custom code from the model repository

Type:

bool

Examples

Popular conversational model:

provider = HuggingFaceProvider(
    model="microsoft/DialoGPT-medium",
    temperature=0.7,
    max_tokens=1000
)

Local model with quantization:

provider = HuggingFaceProvider(
    model="meta-llama/Llama-2-7b-chat-hf",
    load_in_8bit=True,
    device_map="auto",
    temperature=0.1
)
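How the loading-related fields above might translate into transformer-loading keyword arguments can be sketched as a plain dictionary build. The field names come from the Parameters list above; the mapping itself is an illustration, not the provider's actual code:

```python
from __future__ import annotations

from typing import Any


def build_model_kwargs(
    device_map: str | None = None,
    load_in_8bit: bool = False,
    load_in_4bit: bool = False,
    trust_remote_code: bool = False,
) -> dict[str, Any]:
    """Collect only the loading options that are actually set.

    Hypothetical helper; shows how provider fields could map onto
    model-loading keyword arguments.
    """
    kwargs: dict[str, Any] = {}
    if device_map is not None:
        kwargs["device_map"] = device_map
    if load_in_8bit:
        kwargs["load_in_8bit"] = True
    if load_in_4bit:
        kwargs["load_in_4bit"] = True
    if trust_remote_code:
        kwargs["trust_remote_code"] = True
    return kwargs
```

Omitting unset options keeps defaults in the hands of the underlying loader rather than pinning them in the provider.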

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

classmethod get_models()

Get a list of popular HuggingFace model identifiers.

Return type:

list[str]
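A usage sketch for this classmethod follows. Since the real list comes from the installed framework, the stand-in class and its model names below are purely illustrative:

```python
class _ProviderSketch:
    """Hypothetical stand-in mirroring HuggingFaceProvider.get_models()."""

    @classmethod
    def get_models(cls) -> list[str]:
        # Illustrative entries only; the framework's curated list may differ.
        return [
            "gpt2",
            "microsoft/DialoGPT-medium",
            "meta-llama/Llama-2-7b-chat-hf",
        ]


models = _ProviderSketch.get_models()
```

Because it is a classmethod, the list can be inspected without constructing a provider instance first.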

max_tokens: int | None = None

Maximum total tokens for this model.