agents.memory.graph_rag_retriever

Graph RAG Retriever for Memory System.

This module implements a Graph RAG retriever that combines knowledge graph traversal with vector similarity search to provide comprehensive memory retrieval with relationship context and semantic understanding.

Classes

GraphRAGResult

Comprehensive result from Graph RAG retrieval combining knowledge graph and vector search.

GraphRAGRetriever

Advanced Graph RAG retriever that combines knowledge graph traversal with vector similarity search.

GraphRAGRetrieverConfig

Configuration for Graph RAG retriever with comprehensive customization options.

Module Contents

class agents.memory.graph_rag_retriever.GraphRAGResult(/, **data)

Bases: pydantic.BaseModel

Comprehensive result from Graph RAG retrieval combining knowledge graph and vector search.

This class encapsulates all information from a Graph RAG retrieval operation, including retrieved memories, graph traversal results, scoring information, and performance metrics for analysis and optimization.

Parameters:

data (Any)

query

Original user query that was processed

memories

Retrieved memories from both vector and graph sources

start_entities

Initial entities identified in the query for graph traversal

traversed_entities

All entities explored during graph traversal

relationship_paths

Relationship paths discovered during graph traversal

graph_nodes_explored

Number of graph nodes explored (for testing compatibility)

graph_paths

Alias for relationship_paths (for backward compatibility)

similarity_scores

Vector similarity scores for each memory

graph_scores

Graph centrality scores for each memory

final_scores

Combined final scores used for ranking

total_time_ms

Total processing time for the entire operation

graph_traversal_time_ms

Time spent on graph traversal

vector_search_time_ms

Time spent on vector search

query_intent

Analyzed query intent and characteristics

expansion_terms

Query expansion terms used for enhanced retrieval

Examples

Accessing Graph RAG results:

result = await retriever.retrieve_memories(
    "What are the connections between Python and machine learning?"
)

print(f"Query: {result.query}")
print(f"Retrieved {len(result.memories)} memories")
print(f"Explored {result.graph_nodes_explored} graph nodes")
print(f"Found {len(result.relationship_paths)} relationship paths")
print(f"Total time: {result.total_time_ms:.1f}ms")

# Access individual memories with scores
for i, memory in enumerate(result.memories):
    sim_score = result.similarity_scores[i]
    graph_score = result.graph_scores[i]
    final_score = result.final_scores[i]

    print(f"Memory {i+1}: {memory['content'][:100]}...")
    print(f"  Similarity: {sim_score:.2f}, Graph: {graph_score:.2f}, Final: {final_score:.2f}")

Analyzing graph traversal results:

result = await retriever.retrieve_memories("machine learning algorithms")

print(f"Starting entities: {[e.name for e in result.start_entities]}")
print(f"Traversed entities: {[e.name for e in result.traversed_entities]}")

# Analyze relationship paths
for i, path in enumerate(result.relationship_paths):
    print(f"Path {i+1}:")
    for rel in path:
        print(f"  {rel.source_id} -> {rel.target_id} ({rel.relationship_type})")

Performance analysis:

result = await retriever.retrieve_memories("complex query")

print("Performance breakdown:")
print(f"  Graph traversal: {result.graph_traversal_time_ms:.1f}ms")
print(f"  Vector search: {result.vector_search_time_ms:.1f}ms")
print(f"  Total time: {result.total_time_ms:.1f}ms")

# Query expansion analysis
if result.expansion_terms:
    print(f"Query expanded with: {result.expansion_terms}")

Getting top memories:

result = await retriever.retrieve_memories("Python programming")

# Get top 5 memories by final score
top_memories = result.get_top_memories(limit=5)

for i, memory in enumerate(top_memories):
    print(f"Top {i+1}: {memory['content'][:50]}...")

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

get_top_memories(limit=10)

Return the top-ranked memories, ordered by final score.

Parameters:

limit (int) – Maximum number of memories to return

Returns:

Top memories sorted by final score

Return type:

List[Dict[str, Any]]

Examples

Get top memories:

result = await retriever.retrieve_memories("Python programming")
top_memories = result.get_top_memories(limit=5)

for i, memory in enumerate(top_memories):
    print(f"Top {i+1}: {memory['content'][:50]}...")
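
The ranking step can be illustrated with a minimal sketch. This is not the library's implementation: it assumes that memories and final_scores are index-aligned lists (as the examples above suggest), and the standalone function top_memories is a hypothetical stand-in for the method.

```python
from typing import Any, Dict, List


def top_memories(
    memories: List[Dict[str, Any]],
    final_scores: List[float],
    limit: int = 10,
) -> List[Dict[str, Any]]:
    """Return up to `limit` memories ordered by descending final score."""
    # Pair each memory with its score; the two lists are assumed index-aligned.
    ranked = sorted(
        zip(memories, final_scores),
        key=lambda pair: pair[1],
        reverse=True,
    )
    return [memory for memory, _ in ranked[:limit]]


# Hypothetical sample data for illustration only.
memories = [{"content": "a"}, {"content": "b"}, {"content": "c"}]
scores = [0.2, 0.9, 0.5]
top = top_memories(memories, scores, limit=2)
print([m["content"] for m in top])
```

Sorting on the score alone keeps the sort stable, so memories with equal scores stay in their original retrieval order.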

class agents.memory.graph_rag_retriever.GraphRAGRetriever(config)

Advanced Graph RAG retriever that combines knowledge graph traversal with vector similarity search.

The GraphRAGRetriever enhances traditional vector-based memory retrieval by leveraging knowledge graph structure to discover relevant memories through entity relationships and semantic connections. This approach provides more comprehensive and contextually rich retrieval results.

Key Features:
  • Intelligent entity identification from queries using LLM analysis

  • Multi-hop graph traversal to discover related entities and concepts

  • Hybrid scoring combining vector similarity, graph centrality, and importance

  • Query expansion with semantically related terms

  • Relationship path discovery for context understanding

  • Bidirectional graph traversal for comprehensive coverage

  • Performance optimization with configurable limits and thresholds

Retrieval Process:
  1. Query Analysis: Parse query intent and identify mentioned entities

  2. Entity Identification: Match query entities to knowledge graph nodes

  3. Graph Traversal: Explore relationships to find connected entities

  4. Memory Retrieval: Collect memories from both vector search and graph entities

  5. Scoring: Combine similarity, centrality, importance, and recency scores

  6. Ranking: Sort results by final score and return top memories

config

Configuration object with all retrieval settings

memory_store

Memory store manager for basic storage operations

classifier

Memory classifier for query intent analysis

kg_generator

Knowledge graph generator for entity and relationship data

llm

LLM runnable for query analysis and entity identification

entity_identification_prompt

Prompt template for entity identification

relationship_path_analysis_prompt

Prompt template for path analysis

Examples

Basic Graph RAG retrieval:

# Create retriever
retriever = GraphRAGRetriever(config)

# Retrieve memories with graph enhancement
result = await retriever.retrieve_memories(
    "What are the connections between Python and machine learning?"
)

print(f"Retrieved {len(result.memories)} memories")
print(f"Explored {result.graph_nodes_explored} graph nodes")
print(f"Found {len(result.relationship_paths)} relationship paths")

# Access memories with graph context
for memory in result.memories:
    graph_context = memory.get("graph_context", [])
    if graph_context:
        entities = [ctx["entity_name"] for ctx in graph_context]
        print(f"Memory connected to entities: {entities}")

Advanced retrieval with custom parameters:

# Retrieve with specific settings
result = await retriever.retrieve_memories(
    query="How do neural networks work?",
    limit=15,
    memory_types=[MemoryType.SEMANTIC, MemoryType.PROCEDURAL],
    namespace=("user", "ml", "concepts"),
    enable_graph_traversal=True,
    max_graph_depth=3
)

# Analyze scoring components
for i, memory in enumerate(result.memories):
    sim_score = result.similarity_scores[i]
    graph_score = result.graph_scores[i]
    final_score = result.final_scores[i]

    print(f"Memory {i+1}: Final={final_score:.2f} "
          f"(Sim={sim_score:.2f}, Graph={graph_score:.2f})")

Entity context exploration:

# Get comprehensive context for specific entity
context = await retriever.get_entity_context("Python")

print(f"Entity: {context['entity'].name}")
print(f"Connections: {context['total_connections']}")
print(f"Associated memories: {context['memory_count']}")

# Explore entity neighborhood
neighborhood = context['neighborhood']
for level, entities in neighborhood.get('levels', {}).items():
    print(f"Level {level}: {[e.name for e in entities]}")

Relationship path analysis:

# Find paths between entities
paths = await retriever.find_relationship_paths(
    "Python", "Machine Learning", max_depth=3
)

for i, path in enumerate(paths):
    print(f"Path {i+1}:")
    for rel in path:
        print(f"  {rel.source_id} -> {rel.target_id} ({rel.relationship_type})")

Performance monitoring:

result = await retriever.retrieve_memories("complex query")

print(f"Performance breakdown:")
print(f"  Graph traversal: {result.graph_traversal_time_ms:.1f}ms")
print(f"  Vector search: {result.vector_search_time_ms:.1f}ms")
print(f"  Total time: {result.total_time_ms:.1f}ms")

# Query expansion analysis
if result.expansion_terms:
    print(f"Query expanded with: {result.expansion_terms}")

Note

The retriever automatically balances graph traversal depth and performance based on the configuration. For large knowledge graphs, consider reducing max_traversal_depth and increasing min_relationship_confidence for better performance.
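
To see why a lower depth and a higher confidence threshold shrink the work, here is a small self-contained sketch of depth-limited traversal. It is an illustration under assumptions, not the retriever's code: the Edge type and the explore function are hypothetical, and only the parameter meanings (depth limit, minimum confidence, bidirectional traversal) come from the configuration described on this page.

```python
from collections import deque
from typing import Dict, List, NamedTuple, Set


class Edge(NamedTuple):
    """Hypothetical stand-in for a knowledge graph relationship."""
    source_id: str
    target_id: str
    confidence: float


def explore(
    edges: List[Edge],
    start: str,
    max_depth: int,
    min_confidence: float,
    bidirectional: bool = True,
) -> Dict[int, Set[str]]:
    """Group the entities reachable from `start` by hop distance."""
    # Build an adjacency map, dropping edges below the confidence threshold.
    neighbors: Dict[str, Set[str]] = {}
    for edge in edges:
        if edge.confidence < min_confidence:
            continue
        neighbors.setdefault(edge.source_id, set()).add(edge.target_id)
        if bidirectional:
            neighbors.setdefault(edge.target_id, set()).add(edge.source_id)

    levels: Dict[int, Set[str]] = {}
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, depth = queue.popleft()
        if depth == max_depth:
            continue  # depth limit reached; do not expand this node
        for nxt in neighbors.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                levels.setdefault(depth + 1, set()).add(nxt)
                queue.append((nxt, depth + 1))
    return levels


edges = [
    Edge("Python", "NumPy", 0.9),
    Edge("NumPy", "Machine Learning", 0.8),
    Edge("Python", "Snake", 0.2),
    Edge("Machine Learning", "Deep Learning", 0.9),
]

wide = explore(edges, "Python", max_depth=3, min_confidence=0.1)
narrow = explore(edges, "Python", max_depth=2, min_confidence=0.7)
print(wide)
print(narrow)
```

In this toy graph the stricter settings drop the low-confidence "Snake" edge and stop before "Deep Learning", so fewer nodes are visited.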

Initialize the Graph RAG retriever with comprehensive configuration.

Sets up all components needed for Graph RAG retrieval including memory stores, knowledge graph generators, LLM for query analysis, and prompt templates.

Parameters:

config (GraphRAGRetrieverConfig) – Configuration object providing all necessary components and settings

Raises:

ValueError – If required components are missing in config

Examples

Basic initialization:

config = GraphRAGRetrieverConfig(
    memory_store_manager=store_manager,
    memory_classifier=classifier,
    kg_generator=kg_generator
)

retriever = GraphRAGRetriever(config)

With custom LLM configuration:

config = GraphRAGRetrieverConfig(
    memory_store_manager=store_manager,
    memory_classifier=classifier,
    kg_generator=kg_generator,
    llm_config=AugLLMConfig(
        model="gpt-4",
        temperature=0.1,
        max_tokens=500
    )
)

retriever = GraphRAGRetriever(config)

Note

The retriever validates all required components during initialization and sets up optimized prompt templates for entity identification and relationship path analysis.

async find_relationship_paths(entity1, entity2, max_depth=3)

Find relationship paths between two entities in the knowledge graph.

This method performs a breadth-first search to discover relationship paths connecting two entities within the specified depth limit, returning at most 10 paths. It’s useful for understanding how entities are connected and for providing context about their relationships in query responses.

Parameters:
  • entity1 (str) – Name of the first entity (source entity)

  • entity2 (str) – Name of the second entity (target entity)

  • max_depth (int) – Maximum path length to explore (default: 3)

Returns:

List of relationship paths, where each path is a list of KnowledgeGraphRelationship objects representing the sequence of relationships connecting the two entities. Limited to 10 paths to prevent excessive computation.

Return type:

List[List[KnowledgeGraphRelationship]]

Examples

Find direct and indirect connections:

paths = await retriever.find_relationship_paths(
    "Python", "Machine Learning", max_depth=3
)

print(f"Found {len(paths)} paths between Python and Machine Learning")

for i, path in enumerate(paths):
    print(f"Path {i+1}:")
    for rel in path:
        print(f"  {rel.source_id} -> {rel.target_id} ({rel.relationship_type})")
        print(f"    Confidence: {rel.confidence:.2f}")

Analyze relationship strength:

paths = await retriever.find_relationship_paths("AI", "Ethics")

if paths:
    # Find strongest path (highest average confidence)
    strongest_path = max(paths, key=lambda p:
        sum(rel.confidence for rel in p) / len(p))

    avg_confidence = sum(rel.confidence for rel in strongest_path) / len(strongest_path)
    print(f"Strongest path has {len(strongest_path)} hops with confidence {avg_confidence:.2f}")

    # Analyze relationship types
    rel_types = [rel.relationship_type for rel in strongest_path]
    print(f"Relationship sequence: {' -> '.join(rel_types)}")

Find shortest path:

paths = await retriever.find_relationship_paths(
    "Neural Networks", "Deep Learning", max_depth=2
)

if paths:
    shortest_path = min(paths, key=len)
    print(f"Shortest path has {len(shortest_path)} hops")

    # Display path details
    for rel in shortest_path:
        print(f"{rel.source_id} --[{rel.relationship_type}]--> {rel.target_id}")

Check for no connection:

paths = await retriever.find_relationship_paths(
    "Unrelated Topic 1", "Unrelated Topic 2"
)

if not paths:
    print("No relationship paths found between these entities")
else:
    print(f"Found {len(paths)} connecting paths")

Note

The search is limited to 10 paths to prevent excessive computation on highly connected graphs. Paths are found using breadth-first search, so shorter paths are discovered first. For very large knowledge graphs, consider reducing max_depth for better performance.
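
The behavior described in this note (breadth-first order, a depth limit, and a cap on the number of paths) can be sketched as follows. This is a simplified illustration, not the method's actual code: the Rel type and find_paths function are hypothetical, the sketch follows relationships in their stored direction only, and it restricts itself to cycle-free paths.

```python
from collections import deque
from typing import Dict, List, NamedTuple


class Rel(NamedTuple):
    """Hypothetical stand-in for KnowledgeGraphRelationship."""
    source_id: str
    target_id: str
    relationship_type: str


def find_paths(
    rels: List[Rel],
    source: str,
    target: str,
    max_depth: int = 3,
    max_paths: int = 10,
) -> List[List[Rel]]:
    """Breadth-first search for cycle-free paths from `source` to `target`."""
    outgoing: Dict[str, List[Rel]] = {}
    for rel in rels:
        outgoing.setdefault(rel.source_id, []).append(rel)

    paths: List[List[Rel]] = []
    # Each queue item is (current node, relationships taken so far).
    queue = deque([(source, [])])
    while queue and len(paths) < max_paths:
        node, path = queue.popleft()
        if node == target and path:
            paths.append(path)
            continue
        if len(path) == max_depth:
            continue  # depth limit reached; do not extend this path
        visited = {source} | {rel.target_id for rel in path}
        for rel in outgoing.get(node, []):
            if rel.target_id not in visited:  # keep paths cycle-free
                queue.append((rel.target_id, path + [rel]))
    return paths


rels = [
    Rel("Python", "scikit-learn", "used_by"),
    Rel("scikit-learn", "Machine Learning", "implements"),
    Rel("Python", "Machine Learning", "used_for"),
]
paths = find_paths(rels, "Python", "Machine Learning")
print([len(p) for p in paths])
```

Because the queue is processed in first-in, first-out order, the one-hop path is recorded before the two-hop path, and the cap stops the search as soon as enough paths have been collected.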

async get_entity_context(entity_name)

Get comprehensive context information for a specific entity in the knowledge graph.

This method provides detailed information about an entity including its neighborhood, associated memories, and connection statistics. It’s useful for understanding the role and importance of an entity within the knowledge graph structure.

Parameters:

entity_name (str) – Name of the entity to get context for (e.g., “Python”, “Machine Learning”)

Returns:

Comprehensive entity context containing:
  • entity: The KnowledgeGraphNode object with entity details

  • neighborhood: Dictionary with entity’s neighborhood structure by depth levels

  • associated_memories: List of memories directly associated with this entity

  • total_connections: Number of entities connected to this entity

  • memory_count: Number of memories referencing this entity

  • error: Error message if entity not found

Return type:

Dict[str, Any]

Examples

Get context for a specific entity:

context = await retriever.get_entity_context("Python")

if "error" not in context:
    entity = context["entity"]
    print(f"Entity: {entity.name} ({entity.type})")
    print(f"Confidence: {entity.confidence:.2f}")
    print(f"Total connections: {context['total_connections']}")
    print(f"Associated memories: {context['memory_count']}")

    # Explore neighborhood structure
    neighborhood = context["neighborhood"]
    for level, entities in neighborhood.get("levels", {}).items():
        print(f"Level {level}: {[e.name for e in entities]}")

    # Access associated memories
    memories = context["associated_memories"]
    for memory in memories:
        print(f"Memory: {memory['content'][:100]}...")

Handle entity not found:

context = await retriever.get_entity_context("NonexistentEntity")

if "error" in context:
    print(f"Error: {context['error']}")
else:
    print(f"Found entity: {context['entity'].name}")

Analyze entity importance:

context = await retriever.get_entity_context("Machine Learning")

if "error" not in context:
    entity = context["entity"]
    connections = context["total_connections"]
    memories = context["memory_count"]

    # Calculate importance score
    importance = (connections * 0.6) + (memories * 0.4)
    print(f"Entity importance score: {importance:.2f}")

    # Analyze neighborhood diversity
    neighborhood = context["neighborhood"]
    entity_types = set()
    for level_entities in neighborhood.get("levels", {}).values():
        for neighbor in level_entities:
            entity_types.add(neighbor.type)

    print(f"Connected entity types: {list(entity_types)}")

Note

This method explores the entity’s neighborhood to depth 2 by default, which provides a good balance between comprehensiveness and performance. For very large knowledge graphs, consider the performance implications of deep neighborhood exploration.

async retrieve_memories(query, limit=None, memory_types=None, namespace=None, enable_graph_traversal=True, max_graph_depth=None)

Retrieve memories using the Graph RAG approach.

Parameters:
  • query (str) – User query

  • limit (int | None) – Maximum number of memories to retrieve

  • memory_types (list[haive.agents.memory.core.types.MemoryType] | None) – Specific memory types to focus on

  • namespace (tuple[str, ...] | None) – Memory namespace to search

  • enable_graph_traversal (bool) – Whether to use graph traversal

  • max_graph_depth (int | None) – Maximum depth for graph traversal (overrides config)

Returns:

GraphRAGResult with retrieved memories and graph context

Return type:

GraphRAGResult
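
The optional arguments interact with the configuration defaults. The sketch below shows one plausible way to resolve them. It is an assumption, not documented behavior: the page states only that max_graph_depth overrides the configured depth, so the clamping of limit to max_limit is a guess, and both helper functions are hypothetical.

```python
from typing import Optional


def resolve_limit(limit: Optional[int], default_limit: int, max_limit: int) -> int:
    """Pick the effective result count for one retrieval call (assumed behavior)."""
    chosen = default_limit if limit is None else limit
    # Clamp into [1, max_limit] so a caller cannot request an unbounded result set.
    return max(1, min(chosen, max_limit))


def resolve_depth(max_graph_depth: Optional[int], max_traversal_depth: int) -> int:
    """A per-call depth, when given, overrides the configured depth."""
    return max_traversal_depth if max_graph_depth is None else max_graph_depth


print(resolve_limit(None, default_limit=10, max_limit=50))
print(resolve_limit(200, default_limit=10, max_limit=50))
print(resolve_depth(3, max_traversal_depth=2))
```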

class agents.memory.graph_rag_retriever.GraphRAGRetrieverConfig(/, **data)

Bases: pydantic.BaseModel

Configuration for Graph RAG retriever with comprehensive customization options.

This configuration class defines all parameters needed to create and configure a GraphRAGRetriever, including core components, graph traversal settings, scoring weights, and query expansion parameters.

Parameters:

data (Any)

memory_store_manager

Manager for memory storage and retrieval operations

memory_classifier

Classifier for analyzing query intent and memory types

kg_generator

Knowledge graph generator for entity and relationship extraction

default_limit

Default number of memories to retrieve per query

max_limit

Maximum number of memories that can be retrieved

max_traversal_depth

Maximum depth for graph traversal (prevents infinite loops)

min_relationship_confidence

Minimum confidence score for relationships to traverse

enable_bidirectional_traversal

Whether to traverse relationships in both directions

similarity_weight

Weight for vector similarity score in final ranking (0.0-1.0)

graph_weight

Weight for graph centrality score in final ranking (0.0-1.0)

importance_weight

Weight for memory importance score in final ranking (0.0-1.0)

recency_weight

Weight for memory recency score in final ranking (0.0-1.0)

enable_query_expansion

Whether to enable query expansion with related terms

max_expansion_terms

Maximum number of terms to add during query expansion

llm_config

LLM configuration for query analysis and entity identification

Examples

Basic configuration:

config = GraphRAGRetrieverConfig(
    memory_store_manager=store_manager,
    memory_classifier=classifier,
    kg_generator=kg_generator,
    default_limit=10,
    max_traversal_depth=2
)

Performance-optimized configuration:

config = GraphRAGRetrieverConfig(
    memory_store_manager=store_manager,
    memory_classifier=classifier,
    kg_generator=kg_generator,

    # Faster retrieval settings
    default_limit=5,
    max_limit=20,
    max_traversal_depth=2,
    min_relationship_confidence=0.7,
    enable_bidirectional_traversal=False,

    # Similarity-focused scoring
    similarity_weight=0.6,
    graph_weight=0.2,
    importance_weight=0.1,
    recency_weight=0.1,

    # Limited query expansion
    enable_query_expansion=True,
    max_expansion_terms=3,

    # Fast LLM
    llm_config=AugLLMConfig(
        model="gpt-3.5-turbo",
        temperature=0.1,
        max_tokens=500
    )
)

Quality-focused configuration:

config = GraphRAGRetrieverConfig(
    memory_store_manager=store_manager,
    memory_classifier=classifier,
    kg_generator=kg_generator,

    # Comprehensive retrieval settings
    default_limit=15,
    max_limit=100,
    max_traversal_depth=4,
    min_relationship_confidence=0.3,
    enable_bidirectional_traversal=True,

    # Balanced scoring
    similarity_weight=0.3,
    graph_weight=0.4,
    importance_weight=0.2,
    recency_weight=0.1,

    # Extensive query expansion
    enable_query_expansion=True,
    max_expansion_terms=8,

    # High-quality LLM
    llm_config=AugLLMConfig(
        model="gpt-4",
        temperature=0.2,
        max_tokens=1000
    )
)

Note

The scoring weights (similarity_weight, graph_weight, importance_weight, recency_weight) should sum to 1.0 for optimal result ranking. The system will normalize them if needed.
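
As a rough illustration of this note, the sketch below rescales weights that do not sum to 1.0 and combines the four score components as a weighted sum. The linear combination and both function names are assumptions made for illustration; the library's actual scoring formula is not shown on this page.

```python
from typing import Dict


def normalize_weights(weights: Dict[str, float]) -> Dict[str, float]:
    """Rescale non-negative weights so they sum to 1.0."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("at least one weight must be positive")
    return {name: value / total for name, value in weights.items()}


def final_score(components: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted sum of per-memory score components, each assumed to lie in [0, 1]."""
    normalized = normalize_weights(weights)
    return sum(normalized[name] * components.get(name, 0.0) for name in normalized)


# Hypothetical weights that sum to 1.4 rather than 1.0.
weights = {"similarity": 0.6, "graph": 0.4, "importance": 0.2, "recency": 0.2}
components = {"similarity": 0.9, "graph": 0.5, "importance": 0.3, "recency": 0.1}
score = final_score(components, weights)
print(f"{score:.3f}")
```

Normalizing first keeps the final score on the same scale as its components, so scores stay comparable across configurations with different raw weights.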

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

model_config

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.