agents.memory.graph_rag_retriever¶
Graph RAG Retriever for Memory System.
This module implements a Graph RAG retriever that combines knowledge graph traversal with vector similarity search to provide comprehensive memory retrieval with relationship context and semantic understanding.
Classes¶
- GraphRAGResult: Comprehensive result from Graph RAG retrieval combining knowledge graph and vector search.
- GraphRAGRetriever: Advanced Graph RAG retriever that combines knowledge graph traversal with vector similarity search.
- GraphRAGRetrieverConfig: Configuration for Graph RAG retriever with comprehensive customization options.
Module Contents¶
- class agents.memory.graph_rag_retriever.GraphRAGResult(/, **data)¶
Bases:
pydantic.BaseModel
Comprehensive result from Graph RAG retrieval combining knowledge graph and vector search.
This class encapsulates all information from a Graph RAG retrieval operation, including retrieved memories, graph traversal results, scoring information, and performance metrics for analysis and optimization.
- Parameters:
data (Any)
- query¶
Original user query that was processed
- memories¶
Retrieved memories from both vector and graph sources
- start_entities¶
Initial entities identified in the query for graph traversal
- traversed_entities¶
All entities explored during graph traversal
- relationship_paths¶
Relationship paths discovered during graph traversal
- graph_nodes_explored¶
Number of graph nodes explored (for testing compatibility)
- graph_paths¶
Alias for relationship_paths (for backward compatibility)
- similarity_scores¶
Vector similarity scores for each memory
- graph_scores¶
Graph centrality scores for each memory
- final_scores¶
Combined final scores used for ranking
- total_time_ms¶
Total processing time for the entire operation
- graph_traversal_time_ms¶
Time spent on graph traversal
- vector_search_time_ms¶
Time spent on vector search
- query_intent¶
Analyzed query intent and characteristics
- expansion_terms¶
Query expansion terms used for enhanced retrieval
Examples
Accessing Graph RAG results:
result = await retriever.retrieve_memories(
    "What are the connections between Python and machine learning?"
)
print(f"Query: {result.query}")
print(f"Retrieved {len(result.memories)} memories")
print(f"Explored {result.graph_nodes_explored} graph nodes")
print(f"Found {len(result.relationship_paths)} relationship paths")
print(f"Total time: {result.total_time_ms:.1f}ms")

# Access individual memories with scores
for i, memory in enumerate(result.memories):
    sim_score = result.similarity_scores[i]
    graph_score = result.graph_scores[i]
    final_score = result.final_scores[i]
    print(f"Memory {i+1}: {memory['content'][:100]}...")
    print(f"  Similarity: {sim_score:.2f}, Graph: {graph_score:.2f}, Final: {final_score:.2f}")
Analyzing graph traversal results:
result = await retriever.retrieve_memories("machine learning algorithms")
print(f"Starting entities: {[e.name for e in result.start_entities]}")
print(f"Traversed entities: {[e.name for e in result.traversed_entities]}")

# Analyze relationship paths
for i, path in enumerate(result.relationship_paths):
    print(f"Path {i+1}:")
    for rel in path:
        print(f"  {rel.source_id} -> {rel.target_id} ({rel.relationship_type})")
Performance analysis:
result = await retriever.retrieve_memories("complex query")
print("Performance breakdown:")
print(f"  Graph traversal: {result.graph_traversal_time_ms:.1f}ms")
print(f"  Vector search: {result.vector_search_time_ms:.1f}ms")
print(f"  Total time: {result.total_time_ms:.1f}ms")

# Query expansion analysis
if result.expansion_terms:
    print(f"Query expanded with: {result.expansion_terms}")
Getting top memories:
result = await retriever.retrieve_memories("Python programming")

# Get top 5 memories by final score
top_memories = result.get_top_memories(limit=5)
for i, memory in enumerate(top_memories):
    print(f"Top {i+1}: {memory['content'][:50]}...")
Create a new model by parsing and validating input data from keyword arguments.
Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.
self is explicitly positional-only to allow self as a field name.
- get_top_memories(limit=10)¶
Get top memories by final score with ranking.
- Parameters:
limit (int) – Maximum number of memories to return
- Returns:
Top memories sorted by final score
- Return type:
List[Dict[str, Any]]
Examples
Get top memories:
result = await retriever.retrieve_memories("Python programming")
top_memories = result.get_top_memories(limit=5)
for i, memory in enumerate(top_memories):
    print(f"Top {i+1}: {memory['content'][:50]}...")
- class agents.memory.graph_rag_retriever.GraphRAGRetriever(config)¶
Advanced Graph RAG retriever that combines knowledge graph traversal with vector similarity search.
The GraphRAGRetriever enhances traditional vector-based memory retrieval by leveraging knowledge graph structure to discover relevant memories through entity relationships and semantic connections. This approach provides more comprehensive and contextually rich retrieval results.
- Key Features:
Intelligent entity identification from queries using LLM analysis
Multi-hop graph traversal to discover related entities and concepts
Hybrid scoring combining vector similarity, graph centrality, and importance
Query expansion with semantically related terms
Relationship path discovery for context understanding
Bidirectional graph traversal for comprehensive coverage
Performance optimization with configurable limits and thresholds
- Retrieval Process:
Query Analysis: Parse query intent and identify mentioned entities
Entity Identification: Match query entities to knowledge graph nodes
Graph Traversal: Explore relationships to find connected entities
Memory Retrieval: Collect memories from both vector search and graph entities
Scoring: Combine similarity, centrality, importance, and recency scores
Ranking: Sort results by final score and return top memories
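The scoring and ranking steps above can be sketched in isolation. This is an illustrative assumption of how the four score components might be blended, with hypothetical default weights; the library's actual combination logic may differ.

```python
# Hypothetical sketch of the hybrid scoring described above. The weight names
# mirror GraphRAGRetrieverConfig fields, but the default values and the linear
# combination are assumptions for illustration, not the library's code.

def combine_scores(similarity, graph, importance, recency,
                   similarity_weight=0.4, graph_weight=0.3,
                   importance_weight=0.2, recency_weight=0.1):
    """Blend the four per-memory scores into a single ranking score."""
    return (similarity_weight * similarity
            + graph_weight * graph
            + importance_weight * importance
            + recency_weight * recency)


def rank_memories(memories, final_scores, limit=10):
    """Sort memories by final score, highest first, and keep the top `limit`."""
    ranked = sorted(zip(memories, final_scores),
                    key=lambda pair: pair[1], reverse=True)
    return [memory for memory, _ in ranked[:limit]]
```

A memory with high vector similarity but no graph connections can still rank below a moderately similar memory that sits at a well-connected graph node, which is the point of the hybrid approach.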
- config¶
Configuration object with all retrieval settings
- memory_store¶
Memory store manager for basic storage operations
- classifier¶
Memory classifier for query intent analysis
- kg_generator¶
Knowledge graph generator for entity and relationship data
- llm¶
LLM runnable for query analysis and entity identification
- entity_identification_prompt¶
Prompt template for entity identification
- relationship_path_analysis_prompt¶
Prompt template for path analysis
Examples
Basic Graph RAG retrieval:
# Create retriever
retriever = GraphRAGRetriever(config)

# Retrieve memories with graph enhancement
result = await retriever.retrieve_memories(
    "What are the connections between Python and machine learning?"
)
print(f"Retrieved {len(result.memories)} memories")
print(f"Explored {result.graph_nodes_explored} graph nodes")
print(f"Found {len(result.relationship_paths)} relationship paths")

# Access memories with graph context
for memory in result.memories:
    graph_context = memory.get("graph_context", [])
    if graph_context:
        entities = [ctx["entity_name"] for ctx in graph_context]
        print(f"Memory connected to entities: {entities}")
Advanced retrieval with custom parameters:
# Retrieve with specific settings
result = await retriever.retrieve_memories(
    query="How do neural networks work?",
    limit=15,
    memory_types=[MemoryType.SEMANTIC, MemoryType.PROCEDURAL],
    namespace=("user", "ml", "concepts"),
    enable_graph_traversal=True,
    max_graph_depth=3,
)

# Analyze scoring components
for i, memory in enumerate(result.memories):
    sim_score = result.similarity_scores[i]
    graph_score = result.graph_scores[i]
    final_score = result.final_scores[i]
    print(f"Memory {i+1}: Final={final_score:.2f} "
          f"(Sim={sim_score:.2f}, Graph={graph_score:.2f})")
Entity context exploration:
# Get comprehensive context for a specific entity
context = await retriever.get_entity_context("Python")
print(f"Entity: {context['entity'].name}")
print(f"Connections: {context['total_connections']}")
print(f"Associated memories: {context['memory_count']}")

# Explore entity neighborhood
neighborhood = context["neighborhood"]
for level, entities in neighborhood.get("levels", {}).items():
    print(f"Level {level}: {[e.name for e in entities]}")
Relationship path analysis:
# Find paths between entities
paths = await retriever.find_relationship_paths(
    "Python", "Machine Learning", max_depth=3
)
for i, path in enumerate(paths):
    print(f"Path {i+1}:")
    for rel in path:
        print(f"  {rel.source_id} -> {rel.target_id} ({rel.relationship_type})")
Performance monitoring:
result = await retriever.retrieve_memories("complex query")
print("Performance breakdown:")
print(f"  Graph traversal: {result.graph_traversal_time_ms:.1f}ms")
print(f"  Vector search: {result.vector_search_time_ms:.1f}ms")
print(f"  Total time: {result.total_time_ms:.1f}ms")

# Query expansion analysis
if result.expansion_terms:
    print(f"Query expanded with: {result.expansion_terms}")
Note
The retriever automatically balances graph traversal depth and performance based on the configuration. For large knowledge graphs, consider reducing max_traversal_depth and increasing min_relationship_confidence for better performance.
Initialize the Graph RAG retriever with comprehensive configuration.
Sets up all components needed for Graph RAG retrieval including memory stores, knowledge graph generators, LLM for query analysis, and prompt templates.
- Parameters:
config (GraphRAGRetrieverConfig) – GraphRAGRetrieverConfig with all necessary components and settings
- Raises:
ValueError – If required components are missing in config
Examples
Basic initialization:
config = GraphRAGRetrieverConfig(
    memory_store_manager=store_manager,
    memory_classifier=classifier,
    kg_generator=kg_generator,
)
retriever = GraphRAGRetriever(config)
With custom LLM configuration:
config = GraphRAGRetrieverConfig(
    memory_store_manager=store_manager,
    memory_classifier=classifier,
    kg_generator=kg_generator,
    llm_config=AugLLMConfig(
        model="gpt-4",
        temperature=0.1,
        max_tokens=500,
    ),
)
retriever = GraphRAGRetriever(config)
Note
The retriever validates all required components during initialization and sets up optimized prompt templates for entity identification and relationship path analysis.
- async find_relationship_paths(entity1, entity2, max_depth=3)¶
Find relationship paths between two entities in the knowledge graph.
This method performs a breadth-first search to discover all possible relationship paths connecting two entities within the specified depth limit. It’s useful for understanding how entities are connected and for providing context about their relationships in query responses.
- Parameters:
entity1 (str) – Name of the first entity (e.g., "Python")
entity2 (str) – Name of the second entity (e.g., "Machine Learning")
max_depth (int) – Maximum number of relationship hops to search (default: 3)
- Returns:
List of relationship paths, where each path is a list of KnowledgeGraphRelationship objects representing the sequence of relationships connecting the two entities. Limited to 10 paths to prevent excessive computation.
- Return type:
List[List[KnowledgeGraphRelationship]]
Examples
Find direct and indirect connections:
paths = await retriever.find_relationship_paths(
    "Python", "Machine Learning", max_depth=3
)
print(f"Found {len(paths)} paths between Python and Machine Learning")
for i, path in enumerate(paths):
    print(f"Path {i+1}:")
    for rel in path:
        print(f"  {rel.source_id} -> {rel.target_id} ({rel.relationship_type})")
        print(f"  Confidence: {rel.confidence:.2f}")
Analyze relationship strength:
paths = await retriever.find_relationship_paths("AI", "Ethics")
if paths:
    # Find strongest path (highest average confidence)
    strongest_path = max(
        paths, key=lambda p: sum(rel.confidence for rel in p) / len(p)
    )
    avg_confidence = sum(rel.confidence for rel in strongest_path) / len(strongest_path)
    print(f"Strongest path has {len(strongest_path)} hops "
          f"with confidence {avg_confidence:.2f}")

    # Analyze relationship types
    rel_types = [rel.relationship_type for rel in strongest_path]
    print(f"Relationship sequence: {' -> '.join(rel_types)}")
Find shortest path:
paths = await retriever.find_relationship_paths(
    "Neural Networks", "Deep Learning", max_depth=2
)
if paths:
    shortest_path = min(paths, key=len)
    print(f"Shortest path has {len(shortest_path)} hops")

    # Display path details
    for rel in shortest_path:
        print(f"{rel.source_id} --[{rel.relationship_type}]--> {rel.target_id}")
Check for no connection:
paths = await retriever.find_relationship_paths(
    "Unrelated Topic 1", "Unrelated Topic 2"
)
if not paths:
    print("No relationship paths found between these entities")
else:
    print(f"Found {len(paths)} connecting paths")
Note
The search is limited to 10 paths to prevent excessive computation on highly connected graphs. Paths are found using breadth-first search, so shorter paths are discovered first. For very large knowledge graphs, consider reducing max_depth for better performance.
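The breadth-first path search described above can be sketched over a toy adjacency map. The `edges` structure and tuple-based relationships here are illustrative assumptions standing in for KnowledgeGraphRelationship objects; only the algorithm (BFS with a depth limit, a cycle guard, and the documented 10-path cap) follows the description.

```python
from collections import deque

# Illustrative sketch of depth-limited BFS path enumeration, as described
# above. `edges` maps entity -> list of (neighbor, relationship_type); each
# returned path is a list of (source, target, relationship_type) tuples.
# This is not the library's implementation.

def find_paths(edges, start, goal, max_depth=3, max_paths=10):
    """Enumerate relationship paths from start to goal, shortest first."""
    paths = []
    queue = deque([(start, [])])  # (current entity, edges traversed so far)
    while queue and len(paths) < max_paths:
        node, path = queue.popleft()
        if len(path) >= max_depth:
            continue  # depth limit reached on this branch
        for neighbor, rel_type in edges.get(node, []):
            # Cycle guard: skip entities already on this path
            visited = {start} | {target for _, target, _ in path}
            if neighbor in visited:
                continue
            step = (node, neighbor, rel_type)
            if neighbor == goal:
                paths.append(path + [step])
            else:
                queue.append((neighbor, path + [step]))
    return paths
```

Because the frontier expands level by level, one-hop paths are emitted before two-hop paths, matching the note that shorter paths are discovered first.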
- async get_entity_context(entity_name)¶
Get comprehensive context information for a specific entity in the knowledge graph.
This method provides detailed information about an entity including its neighborhood, associated memories, and connection statistics. It’s useful for understanding the role and importance of an entity within the knowledge graph structure.
- Parameters:
entity_name (str) – Name of the entity to get context for (e.g., “Python”, “Machine Learning”)
- Returns:
- Comprehensive entity context containing:
entity: The KnowledgeGraphNode object with entity details
neighborhood: Dictionary with entity’s neighborhood structure by depth levels
associated_memories: List of memories directly associated with this entity
total_connections: Number of entities connected to this entity
memory_count: Number of memories referencing this entity
error: Error message if entity not found
- Return type:
Dict[str, Any]
Examples
Get context for a specific entity:
context = await retriever.get_entity_context("Python")
if "error" not in context:
    entity = context["entity"]
    print(f"Entity: {entity.name} ({entity.type})")
    print(f"Confidence: {entity.confidence:.2f}")
    print(f"Total connections: {context['total_connections']}")
    print(f"Associated memories: {context['memory_count']}")

    # Explore neighborhood structure
    neighborhood = context["neighborhood"]
    for level, entities in neighborhood.get("levels", {}).items():
        print(f"Level {level}: {[e.name for e in entities]}")

    # Access associated memories
    for memory in context["associated_memories"]:
        print(f"Memory: {memory['content'][:100]}...")
Handle entity not found:
context = await retriever.get_entity_context("NonexistentEntity")
if "error" in context:
    print(f"Error: {context['error']}")
else:
    print(f"Found entity: {context['entity'].name}")
Analyze entity importance:
context = await retriever.get_entity_context("Machine Learning")
if "error" not in context:
    entity = context["entity"]
    connections = context["total_connections"]
    memories = context["memory_count"]

    # Calculate importance score
    importance = (connections * 0.6) + (memories * 0.4)
    print(f"Entity importance score: {importance:.2f}")

    # Analyze neighborhood diversity
    entity_types = set()
    for level_entities in context["neighborhood"].get("levels", {}).values():
        for neighbor in level_entities:
            entity_types.add(neighbor.type)
    print(f"Connected entity types: {list(entity_types)}")
Note
This method explores the entity’s neighborhood to depth 2 by default, which provides a good balance between comprehensiveness and performance. For very large knowledge graphs, consider the performance implications of deep neighborhood exploration.
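The level-by-level neighborhood structure returned in `neighborhood["levels"]` can be sketched as a depth-limited frontier expansion. The `adjacency` map and function below are hypothetical, not the library's API; only the depth-2 default and the grouping by distance follow the description above.

```python
# Hypothetical sketch of depth-limited neighborhood exploration, assuming a
# simple entity -> neighbors adjacency map. The real method works over
# KnowledgeGraphNode objects; this only illustrates the level grouping.

def neighborhood_by_level(adjacency, entity, max_depth=2):
    """Group an entity's neighbors by their hop distance (level) from it."""
    levels = {}
    visited = {entity}
    frontier = [entity]
    for depth in range(1, max_depth + 1):
        next_frontier = []
        for node in frontier:
            for neighbor in adjacency.get(node, []):
                if neighbor not in visited:
                    visited.add(neighbor)
                    next_frontier.append(neighbor)
        if not next_frontier:
            break  # neighborhood exhausted before max_depth
        levels[depth] = next_frontier
        frontier = next_frontier
    return levels
```

Each entity appears at its minimum distance only, so the level dictionary partitions the neighborhood rather than repeating nodes at deeper levels.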
- async retrieve_memories(query, limit=None, memory_types=None, namespace=None, enable_graph_traversal=True, max_graph_depth=None)¶
Retrieve memories using Graph RAG approach.
- Parameters:
query (str) – User query
limit (int | None) – Maximum number of memories to retrieve
memory_types (list[haive.agents.memory.core.types.MemoryType] | None) – Specific memory types to focus on
namespace (tuple[str, ...] | None) – Memory namespace to search
enable_graph_traversal (bool) – Whether to use graph traversal
max_graph_depth (int | None) – Maximum depth for graph traversal (overrides config)
- Returns:
GraphRAGResult with retrieved memories and graph context
- Return type:
GraphRAGResult
- class agents.memory.graph_rag_retriever.GraphRAGRetrieverConfig(/, **data)¶
Bases:
pydantic.BaseModel
Configuration for Graph RAG retriever with comprehensive customization options.
This configuration class defines all parameters needed to create and configure a GraphRAGRetriever, including core components, graph traversal settings, scoring weights, and query expansion parameters.
- Parameters:
data (Any)
- memory_store_manager¶
Manager for memory storage and retrieval operations
- memory_classifier¶
Classifier for analyzing query intent and memory types
- kg_generator¶
Knowledge graph generator for entity and relationship extraction
- default_limit¶
Default number of memories to retrieve per query
- max_limit¶
Maximum number of memories that can be retrieved
- max_traversal_depth¶
Maximum depth for graph traversal (prevents infinite loops)
- min_relationship_confidence¶
Minimum confidence score for relationships to traverse
- enable_bidirectional_traversal¶
Whether to traverse relationships in both directions
- similarity_weight¶
Weight for vector similarity score in final ranking (0.0-1.0)
- graph_weight¶
Weight for graph centrality score in final ranking (0.0-1.0)
- importance_weight¶
Weight for memory importance score in final ranking (0.0-1.0)
- recency_weight¶
Weight for memory recency score in final ranking (0.0-1.0)
- enable_query_expansion¶
Whether to enable query expansion with related terms
- max_expansion_terms¶
Maximum number of terms to add during query expansion
- llm_config¶
LLM configuration for query analysis and entity identification
Examples
Basic configuration:
config = GraphRAGRetrieverConfig(
    memory_store_manager=store_manager,
    memory_classifier=classifier,
    kg_generator=kg_generator,
    default_limit=10,
    max_traversal_depth=2,
)
Performance-optimized configuration:
config = GraphRAGRetrieverConfig(
    memory_store_manager=store_manager,
    memory_classifier=classifier,
    kg_generator=kg_generator,
    # Faster retrieval settings
    default_limit=5,
    max_limit=20,
    max_traversal_depth=2,
    min_relationship_confidence=0.7,
    enable_bidirectional_traversal=False,
    # Similarity-focused scoring
    similarity_weight=0.6,
    graph_weight=0.2,
    importance_weight=0.1,
    recency_weight=0.1,
    # Limited query expansion
    enable_query_expansion=True,
    max_expansion_terms=3,
    # Fast LLM
    llm_config=AugLLMConfig(
        model="gpt-3.5-turbo",
        temperature=0.1,
        max_tokens=500,
    ),
)
Quality-focused configuration:
config = GraphRAGRetrieverConfig(
    memory_store_manager=store_manager,
    memory_classifier=classifier,
    kg_generator=kg_generator,
    # Comprehensive retrieval settings
    default_limit=15,
    max_limit=100,
    max_traversal_depth=4,
    min_relationship_confidence=0.3,
    enable_bidirectional_traversal=True,
    # Balanced scoring
    similarity_weight=0.3,
    graph_weight=0.4,
    importance_weight=0.2,
    recency_weight=0.1,
    # Extensive query expansion
    enable_query_expansion=True,
    max_expansion_terms=8,
    # High-quality LLM
    llm_config=AugLLMConfig(
        model="gpt-4",
        temperature=0.2,
        max_tokens=1000,
    ),
)
Note
The scoring weights (similarity_weight, graph_weight, importance_weight, recency_weight) should sum to 1.0 for optimal result ranking. The system will normalize them if needed.
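The normalization mentioned in the note can be sketched as a proportional rescale. The exact normalization the library applies is an assumption here; this only shows the standard approach of dividing each weight by the total so the ratios are preserved.

```python
# Hypothetical sketch of weight normalization: rescale the four scoring
# weights so they sum to 1.0 while preserving their relative proportions.
# The library's actual normalization behavior is an assumption here.

def normalize_weights(similarity_weight, graph_weight,
                      importance_weight, recency_weight):
    """Return the weights rescaled to sum to 1.0."""
    total = similarity_weight + graph_weight + importance_weight + recency_weight
    if total == 0:
        raise ValueError("At least one weight must be non-zero")
    return (similarity_weight / total, graph_weight / total,
            importance_weight / total, recency_weight / total)
```

For example, weights of 2, 1, 1, 0 rescale to 0.5, 0.25, 0.25, 0.0, so specifying weights as relative proportions rather than exact fractions still yields a well-defined ranking.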
Create a new model by parsing and validating input data from keyword arguments.
Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.
self is explicitly positional-only to allow self as a field name.
- model_config¶
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].