haive.core.models.llm.providers.huggingface

HuggingFace Provider Module.

This module implements the HuggingFace language model provider for the Haive framework, supporting both HuggingFace Hub-hosted models and local transformer models.

The provider handles API key management (for Hub), model configuration, and safe imports of the langchain-huggingface package dependencies.
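The safe-import handling mentioned above can be sketched with the standard library. This is a minimal illustration, not the module's actual guard; the helper name `ensure_huggingface_installed` is hypothetical, though the distribution `langchain-huggingface` does install the importable package `langchain_huggingface`:

```python
import importlib.util


def ensure_huggingface_installed() -> None:
    """Raise a helpful error if langchain-huggingface is missing.

    Hypothetical helper illustrating the safe-import pattern; the
    provider's real guard may differ.
    """
    # The "langchain-huggingface" distribution installs the
    # "langchain_huggingface" package.
    if importlib.util.find_spec("langchain_huggingface") is None:
        raise ImportError(
            "HuggingFaceProvider requires the langchain-huggingface package. "
            "Install it with: pip install langchain-huggingface"
        )
```

Checking with `find_spec` rather than a bare `import` keeps the check cheap and avoids importing heavy transformer dependencies at module load time.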

Examples

Hub-hosted model:

from haive.core.models.llm.providers.huggingface import HuggingFaceProvider

provider = HuggingFaceProvider(
    model="microsoft/DialoGPT-medium",
    temperature=0.7,
    max_tokens=1000
)
llm = provider.instantiate()

Local transformer model:

provider = HuggingFaceProvider(
    model="gpt2",
    device_map="auto",
    load_in_8bit=True,
    temperature=0.8
)

Classes

HuggingFaceProvider

HuggingFace language model provider configuration.

Module Contents

class haive.core.models.llm.providers.huggingface.HuggingFaceProvider(/, **data)

Bases: haive.core.models.llm.providers.base.BaseLLMProvider

HuggingFace language model provider configuration.

This provider supports both HuggingFace Hub-hosted models and local transformer models, providing access to thousands of open-source models.

Parameters:
  • data (Any)

  • requests_per_second (float | None)

  • tokens_per_second (int | None)

  • tokens_per_minute (int | None)

  • max_retries (int)

  • retry_delay (float)

  • check_every_n_seconds (float | None)

  • burst_size (int | None)

  • provider (LLMProvider)

  • model (str | None)

  • name (str | None)

  • api_key (SecretStr)

  • cache_enabled (bool)

  • cache_ttl (int | None)

  • extra_params (dict[str, Any] | None)

  • debug (bool)

  • temperature (float | None)

  • max_tokens (int | None)

  • top_p (float | None)

  • top_k (int | None)

  • repetition_penalty (float | None)

  • device_map (str | None)

  • load_in_8bit (bool)

  • load_in_4bit (bool)

  • trust_remote_code (bool)
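The rate-limiting parameters above (requests_per_second, burst_size, and related fields) follow the familiar token-bucket pattern. The sketch below is a minimal, self-contained illustration of that pattern, not the framework's actual rate limiter:

```python
import time


class TokenBucket:
    """Minimal token-bucket limiter (illustration only).

    ``requests_per_second`` is the steady refill rate;
    ``burst_size`` caps how many requests may fire back-to-back.
    """

    def __init__(self, requests_per_second: float, burst_size: int) -> None:
        self.rate = requests_per_second
        self.capacity = burst_size
        self.tokens = float(burst_size)
        self.last = time.monotonic()

    def acquire(self) -> float:
        """Consume one token, sleeping if necessary; return seconds waited."""
        now = time.monotonic()
        # Refill proportionally to elapsed time, never above capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens < 1.0:
            wait = (1.0 - self.tokens) / self.rate
            time.sleep(wait)
            self.tokens = 1.0
            self.last = time.monotonic()
        else:
            wait = 0.0
        self.tokens -= 1.0
        return wait
```

Calls within the burst budget return immediately with a wait of 0.0; once the bucket is drained, each call blocks just long enough for one token to refill.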

provider

Always LLMProvider.HUGGINGFACE

Type:

LLMProvider

model

The HuggingFace model identifier (a Hub repository ID or local model name)

Type:

str

temperature

Sampling temperature (0.0-2.0)

Type:

float

max_tokens

Maximum tokens in response

Type:

int

top_p

Nucleus sampling parameter

Type:

float

top_k

Top-k sampling parameter

Type:

int

repetition_penalty

Repetition penalty parameter

Type:

float

device_map

Device mapping strategy for model loading (e.g. "auto")

Type:

str

load_in_8bit

Use 8-bit quantization

Type:

bool

load_in_4bit

Use 4-bit quantization

Type:

bool

trust_remote_code

Whether to allow execution of custom code from the model repository

Type:

bool

Examples

Popular conversational model:

provider = HuggingFaceProvider(
    model="microsoft/DialoGPT-medium",
    temperature=0.7,
    max_tokens=1000
)

Local model with quantization:

provider = HuggingFaceProvider(
    model="meta-llama/Llama-2-7b-chat-hf",
    load_in_8bit=True,
    device_map="auto",
    temperature=0.1
)
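How the loading-related fields above might translate into transformer-loading keyword arguments can be sketched as a plain dictionary build. The field names come from the Parameters list above; the mapping itself is an illustration, not the provider's actual code:

```python
from __future__ import annotations

from typing import Any


def build_model_kwargs(
    device_map: str | None = None,
    load_in_8bit: bool = False,
    load_in_4bit: bool = False,
    trust_remote_code: bool = False,
) -> dict[str, Any]:
    """Collect only the loading options that are actually set.

    Hypothetical helper; shows how provider fields could map onto
    model-loading keyword arguments.
    """
    kwargs: dict[str, Any] = {}
    if device_map is not None:
        kwargs["device_map"] = device_map
    if load_in_8bit:
        kwargs["load_in_8bit"] = True
    if load_in_4bit:
        kwargs["load_in_4bit"] = True
    if trust_remote_code:
        kwargs["trust_remote_code"] = True
    return kwargs
```

Omitting unset options keeps defaults in the hands of the underlying loader rather than pinning them in the provider.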

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

classmethod get_models()

Get a list of popular HuggingFace model identifiers.

Return type:

list[str]
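A usage sketch for this classmethod follows. Since the real list comes from the installed framework, the stand-in class and its model names below are purely illustrative:

```python
class _ProviderSketch:
    """Hypothetical stand-in mirroring HuggingFaceProvider.get_models()."""

    @classmethod
    def get_models(cls) -> list[str]:
        # Illustrative entries only; the framework's curated list may differ.
        return [
            "gpt2",
            "microsoft/DialoGPT-medium",
            "meta-llama/Llama-2-7b-chat-hf",
        ]


models = _ProviderSketch.get_models()
```

Because it is a classmethod, the list can be inspected without constructing a provider instance first.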

max_tokens: int | None = None

Maximum total tokens for this model.