haive.core.engine
Haive Engine System - The Universal AI Component Interface
THE BEATING HEART OF INTELLIGENT AI ARCHITECTURES
Welcome to the Engine System - a revolutionary abstraction layer that unifies all AI components under a single, elegant interface. This isn't just another wrapper around language models; it's a comprehensive orchestration platform that enables seamless integration, composition, and enhancement of any AI capability imaginable.
ENGINE ARCHITECTURE OVERVIEW
The Engine System represents a paradigm shift in how AI components are built, composed, and orchestrated. Every AI capability - from language models to vector stores, from tools to retrievers - is unified under the Engine abstraction, enabling:
- 1. Universal Interface
InvokableEngine: Synchronous and asynchronous execution for any AI component
Streaming Support: Real-time token streaming for responsive experiences
Batch Processing: Efficient parallel execution of multiple requests
Error Recovery: Automatic retries and graceful degradation
- 2. Enhanced Language Models
AugLLM: Supercharged LLMs with tool use, structured output, and memory
Multi-Provider Support: OpenAI, Anthropic, Google, Hugging Face, and more
Dynamic Configuration: Runtime model switching without code changes
Cost Optimization: Automatic model selection based on task complexity
- 3. Retrieval Systems (a hybrid search sketch follows this list)
Vector Search: Semantic retrieval with multiple embedding models
Hybrid Search: Combine semantic, keyword, and metadata filtering
Reranking: Advanced relevance scoring with cross-encoders
Incremental Indexing: Update knowledge bases without full rebuilds
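As a sketch of the hybrid option: the "hybrid" search type and the alpha weighting below are assumptions about create_retriever's options, not confirmed by this page; only create_retriever and search_kwargs appear in the Quick Start below.
>>> # Assumption: a "hybrid" search_type alongside "similarity"; alpha mixes
>>> # semantic and keyword scores. vectorstore is created as in the Quick Start.
>>> hybrid_retriever = create_retriever(
>>>     vectorstore=vectorstore,
>>>     search_type="hybrid",
>>>     search_kwargs={"k": 10, "alpha": 0.5}
>>> )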
- 4. Document Processing (a loading sketch follows this list)
Universal Loaders: Handle any document format automatically
Intelligent Chunking: Context-aware text splitting strategies
Metadata Extraction: Automatic extraction of document properties
Format Detection: Smart content type identification
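A minimal sketch of a load-and-chunk flow. The create_document_loader name and its parameters are hypothetical illustrations; the real API lives in the document/ module listed under ENGINE MODULES.
>>> # Hypothetical API for illustration: create_document_loader and the
>>> # chunk_size/chunk_overlap parameters are assumptions, not confirmed here.
>>> loader = create_document_loader("report.pdf")  # format auto-detected
>>> docs = loader.load()                           # parsed documents with metadata
>>> chunks = loader.split(docs, chunk_size=1000, chunk_overlap=200)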
- 5. Tool Integration (a typed-tool sketch follows this list)
Type-Safe Tools: Full validation and error handling
Automatic Discovery: Find and register tools dynamically
Parallel Execution: Run multiple tools simultaneously
Result Aggregation: Intelligent combination of tool outputs
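A minimal sketch of a type-safe tool. Whether plain annotated callables can be passed directly in tools=[...] is an assumption; the tool/ module listed under ENGINE MODULES owns the actual creation API.
>>> # Assumption: a plain annotated callable can be passed in tools=[...];
>>> # type hints drive validation per the Best Practices section below.
>>> def get_weather(city: str) -> str:
>>>     """Return a short weather summary for a city."""
>>>     return f"Sunny in {city}"
>>>
>>> llm = AugLLMConfig(model="gpt-4", tools=[get_weather], tool_choice="auto")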
- 6. Vector Storage (a query sketch follows this list)
Multi-Backend Support: Pinecone, Weaviate, Chroma, pgvector, and more
Automatic Embeddings: Generate and cache embeddings efficiently
Similarity Search: Fast k-NN search with filtering
Persistence: Durable storage with backup and recovery
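A hedged sketch of querying a store created with create_vectorstore (see the Quick Start below). The similarity_search method and filter parameter follow common vector-store conventions and are assumptions here.
>>> # Assumption: a similarity_search method with k and metadata filtering,
>>> # following common vector-store conventions; not confirmed by this page.
>>> matches = vectorstore.similarity_search(
>>>     "transformer architectures",
>>>     k=5,
>>>     filter={"source": "papers"}
>>> )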
QUICK START
Examples
>>> from haive.core.engine import AugLLMConfig, create_retriever, create_vectorstore
>>>
>>> # 1. Create an enhanced LLM with tools and structured output
>>> llm = AugLLMConfig(
>>>     model="gpt-4",
>>>     temperature=0.7,
>>>     tools=["web_search", "calculator", "code_executor"],
>>>     structured_output_model=AnalysisResult,  # a user-defined Pydantic model
>>>     system_message="You are a helpful AI assistant with tool access."
>>> )
>>>
>>> # 2. Create a vector store for knowledge management
>>> vectorstore = create_vectorstore(
>>>     type="pinecone",
>>>     index_name="knowledge_base",
>>>     embedding_model="text-embedding-3-large"
>>> )
>>>
>>> # 3. Create a retriever for RAG workflows
>>> retriever = create_retriever(
>>>     vectorstore=vectorstore,
>>>     search_type="similarity",
>>>     search_kwargs={"k": 5, "score_threshold": 0.7}
>>> )
>>>
>>> # 4. Compose engines for complex workflows
>>> from haive.core.engine import compose_runnable
>>>
>>> rag_chain = compose_runnable([
>>>     retriever,
>>>     llm.with_context_from_retriever()
>>> ])
>>>
>>> # 5. Execute with streaming
>>> async for chunk in rag_chain.astream("What are the latest AI breakthroughs?"):
>>>     print(chunk, end="", flush=True)
KEY INNOVATIONS
- 1. Unified Execution Model
Every engine supports the same interface:
>>> # Synchronous
>>> result = engine.invoke(input_data)
>>>
>>> # Asynchronous
>>> result = await engine.ainvoke(input_data)
>>>
>>> # Streaming
>>> for chunk in engine.stream(input_data):
>>>     process(chunk)
>>>
>>> # Batch processing
>>> results = engine.batch([input1, input2, input3])
- 2. Dynamic Composition
Engines can be composed like building blocks:
>>> # Chain engines together
>>> pipeline = retriever | reranker | llm | output_parser
>>>
>>> # Parallel execution
>>> parallel = retriever & web_search & database_query
>>>
>>> # Conditional routing
>>> router = conditional_engine(
>>>     condition=lambda x: x.get("type") == "technical",
>>>     if_true=technical_llm,
>>>     if_false=general_llm
>>> )
- 3. Intelligent Caching
Automatic result caching with semantic similarity:
>>> cached_llm = llm.with_caching(
>>>     cache_type="semantic",
>>>     similarity_threshold=0.95,
>>>     ttl=3600
>>> )
- 4. Observability Built-In
Every engine emits detailed telemetry:
>>> # Automatic metrics collection
>>> llm.metrics.latency_p95   # 95th percentile latency
>>> llm.metrics.token_usage   # Token consumption
>>> llm.metrics.error_rate    # Error percentage
ENGINE MODULES
Core Modules:
- aug_llm/: Enhanced language model configurations and factories
- base/: Base engine classes, protocols, and registry system
- tool/: Tool creation, discovery, and execution engine
- output_parser/: Structured output parsing and validation
Data Processing:
- document/: Universal document loading and processing
- embedding/: Embedding generation for multiple providers
- vectorstore/: Vector database integrations
- retriever/: Retrieval strategies and implementations
Specialized Engines:
- agent/: Agent-specific engine components (heavy, lazy-loaded)
- prompt_template/: Dynamic prompt generation engines
ARCHITECTURE PATTERNS
1. Provider Abstraction
>>> # Switch providers without changing code
>>> config = AugLLMConfig(
>>>     model="gpt-4",       # or "claude-3", "gemini-pro", "llama-2"
>>>     provider="openai"    # auto-detected from model
>>> )
2. Engine Registry
>>> # Register custom engines
>>> @EngineRegistry.register("my_custom_engine")
>>> class MyCustomEngine(InvokableEngine):
>>>     async def ainvoke(self, input_data):
>>>         # Custom implementation
>>>         return process(input_data)
>>>
>>> # Use anywhere
>>> engine = EngineRegistry.create("my_custom_engine", config)
3. Middleware Pattern
>>> # Add capabilities to any engine
>>> enhanced = base_engine.pipe(
>>>     add_retry(max_attempts=3),
>>>     add_rate_limiting(requests_per_minute=100),
>>>     add_caching(ttl=3600),
>>>     add_logging(level="DEBUG")
>>> )
ADVANCED FEATURES
1. Multi-Modal Support
>>> vision_llm = AugLLMConfig(
>>>     model="gpt-4-vision",
>>>     accept_types=["text", "image", "video"]
>>> )
>>>
>>> result = vision_llm.invoke({
>>>     "text": "What's in this image?",
>>>     "image": image_data
>>> })
2. Function Calling
>>> llm_with_tools = AugLLMConfig(
>>>     model="gpt-4",
>>>     tools=[weather_tool, calculator_tool],
>>>     tool_choice="auto"  # or "required", "none", or a specific tool
>>> )
3. Structured Output
>>> from typing import List
>>> from pydantic import BaseModel
>>>
>>> class Analysis(BaseModel):
>>>     sentiment: str
>>>     confidence: float
>>>     key_points: List[str]
>>>
>>> llm = AugLLMConfig(
>>>     model="gpt-4",
>>>     structured_output_model=Analysis
>>> )
>>>
>>> result: Analysis = llm.invoke("Analyze this text...")
4. Streaming with Callbacks
>>> async def on_token(token: str):
>>>     print(token, end="", flush=True)
>>>
>>> async def on_complete(result: dict):
>>>     print(f"\nTokens used: {result['usage']['total_tokens']}")
>>>
>>> llm = AugLLMConfig(
>>>     streaming=True,
>>>     callbacks={
>>>         "on_llm_new_token": on_token,
>>>         "on_llm_end": on_complete
>>>     }
>>> )
PERFORMANCE OPTIMIZATIONS
The Engine System is optimized for production workloads (a combined sketch follows this list):
Lazy Loading: Heavy components loaded only when needed
Connection Pooling: Reuse connections to external services
Batch Processing: Efficient handling of multiple requests
Automatic Retries: Exponential backoff for transient failures
Resource Management: Automatic cleanup and garbage collection
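A minimal sketch combining two of these optimizations using only calls shown elsewhere on this page: add_retry and .pipe() from the middleware pattern, and batch() from the unified execution model.
>>> # Wrap an engine with automatic retries, then batch multiple requests.
>>> resilient = base_engine.pipe(add_retry(max_attempts=3))
>>> results = resilient.batch([request_a, request_b, request_c])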
BEST PRACTICES
Use Type Hints: Engines validate inputs based on type annotations
Handle Errors: Always wrap engine calls in try-except blocks (see the sketch after this list)
Monitor Usage: Track token consumption and API costs
Cache Wisely: Use semantic caching for expensive operations
Compose Engines: Build complex behaviors from simple components
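A short sketch of the error-handling and monitoring practices above. The exception types engines raise are not documented on this page, so a broad catch is used for illustration; fallback_response is a placeholder.
>>> import logging
>>> log = logging.getLogger(__name__)
>>>
>>> try:
>>>     result = llm.invoke("Summarize the quarterly report.")
>>> except Exception as exc:  # engine-specific exception types are assumptions
>>>     log.warning("Engine call failed: %s", exc)
>>>     result = fallback_response  # placeholder fallback value
>>>
>>> # Monitor usage after the call (metrics attributes shown under KEY INNOVATIONS)
>>> print(llm.metrics.token_usage, llm.metrics.error_rate)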
SEE ALSO
haive.core.graph: Build visual workflows with engines
haive.core.schema: State management for engine outputs
haive.core.tools: Create custom tools as engines
haive.agents: Pre-built agents using the engine system
The Engine System: Where AI Components Become Intelligent Building Blocks