mcp.utils.extract_mcp_github_repos
==================================

.. py:module:: mcp.utils.extract_mcp_github_repos

.. autoapi-nested-parse::

   Enhanced MCP Repository Extractor with README Processing.

   This script:

   1. Extracts repository URLs from awesome-mcp-servers
   2. Downloads and processes README files
   3. Converts to LangChain Documents with metadata
   4. Organizes resources for agent access


Attributes
----------

.. autoapisummary::

   mcp.utils.extract_mcp_github_repos.console


Classes
-------

.. autoapisummary::

   mcp.utils.extract_mcp_github_repos.ExtractionStats
   mcp.utils.extract_mcp_github_repos.MCPCategory
   mcp.utils.extract_mcp_github_repos.MCPLanguage
   mcp.utils.extract_mcp_github_repos.MCPPlatform
   mcp.utils.extract_mcp_github_repos.MCPRepositoryExtractor
   mcp.utils.extract_mcp_github_repos.MCPScope
   mcp.utils.extract_mcp_github_repos.MCPServerDocument
   mcp.utils.extract_mcp_github_repos.MCPServerMetadata


Functions
---------

.. autoapisummary::

   mcp.utils.extract_mcp_github_repos.create_agent_loader
   mcp.utils.extract_mcp_github_repos.main


Module Contents
---------------

.. py:class:: ExtractionStats(/, **data: Any)

   Bases: :py:obj:`pydantic.BaseModel`

   Statistics for extraction process.

   .. py:attribute:: categories
      :type: dict[str, int]
      :value: None

   .. py:attribute:: extraction_duration
      :type: float | None
      :value: None

   .. py:attribute:: failed_extractions
      :type: int
      :value: 0

   .. py:attribute:: languages
      :type: dict[str, int]
      :value: None

   .. py:attribute:: successfully_extracted
      :type: int
      :value: 0

   .. py:attribute:: total_found
      :type: int
      :value: 0


.. py:class:: MCPCategory

   Bases: :py:obj:`str`, :py:obj:`enum.Enum`

   MCP Server Categories.

   .. py:attribute:: AGGREGATORS
      :value: 'Aggregators'

   .. py:attribute:: AI_SERVICES
      :value: 'AI Services'

   .. py:attribute:: ART_LITERATURE
      :value: 'Art & Literature'

   .. py:attribute:: CLOUD_PLATFORMS
      :value: 'Cloud Platforms'

   .. py:attribute:: CLOUD_STORAGE
      :value: 'Cloud Storage'

   .. py:attribute:: COMMUNICATION
      :value: 'Communication'

   .. py:attribute:: DATABASES
      :value: 'Databases'

   .. py:attribute:: DATA_VISUALIZATION
      :value: 'Data Visualization'

   .. py:attribute:: DEVELOPMENT_TOOLS
      :value: 'Development Tools'

   .. py:attribute:: FILE_SYSTEMS
      :value: 'File Systems'

   .. py:attribute:: FINANCE
      :value: 'Finance'

   .. py:attribute:: GAMING
      :value: 'Gaming'

   .. py:attribute:: IDENTITY
      :value: 'Identity'

   .. py:attribute:: IOT
      :value: 'IoT'

   .. py:attribute:: LANGUAGE_TRANSLATION
      :value: 'Language & Translation'

   .. py:attribute:: LOCATION_SERVICES
      :value: 'Location Services'

   .. py:attribute:: MARKETING
      :value: 'Marketing'

   .. py:attribute:: MONITORING
      :value: 'Monitoring'

   .. py:attribute:: NOTE_TAKING
      :value: 'Note Taking'

   .. py:attribute:: OTHER
      :value: 'Other'

   .. py:attribute:: RESEARCH_DATA
      :value: 'Research & Data'

   .. py:attribute:: SANDBOX_VIRTUALIZATION
      :value: 'Sandbox & Virtualization'

   .. py:attribute:: SEARCH_WEB
      :value: 'Search & Web'

   .. py:attribute:: SECURITY
      :value: 'Security'

   .. py:attribute:: SOCIAL_MEDIA
      :value: 'Social Media'

   .. py:attribute:: SYSTEM_AUTOMATION
      :value: 'System Automation'

   .. py:attribute:: VERSION_CONTROL
      :value: 'Version Control'

   .. py:attribute:: WORKFLOW_AUTOMATION
      :value: 'Workflow Automation'


.. py:class:: MCPLanguage

   Bases: :py:obj:`str`, :py:obj:`enum.Enum`

   Programming Languages.

   .. py:attribute:: CSHARP
      :value: 'C#'

   .. py:attribute:: C_CPP
      :value: 'C/C++'

   .. py:attribute:: GO
      :value: 'Go'

   .. py:attribute:: JAVA
      :value: 'Java'

   .. py:attribute:: OTHER
      :value: 'Other'

   .. py:attribute:: PYTHON
      :value: 'Python'

   .. py:attribute:: RUST
      :value: 'Rust'

   .. py:attribute:: TYPESCRIPT_JAVASCRIPT
      :value: 'TypeScript/JavaScript'


.. py:class:: MCPPlatform

   Bases: :py:obj:`str`, :py:obj:`enum.Enum`

   Supported Platforms.

   .. py:attribute:: CROSS_PLATFORM
      :value: 'Cross-Platform'

   .. py:attribute:: LINUX
      :value: 'Linux'

   .. py:attribute:: MACOS
      :value: 'macOS'

   .. py:attribute:: WINDOWS
      :value: 'Windows'


.. py:class:: MCPRepositoryExtractor(output_dir: str = 'agent_resources/mcp_servers')

   Enhanced MCP Repository Extractor.

   .. py:method:: extract_all() -> list[MCPServerDocument]
      :async:

      Main extraction method.

   .. py:method:: extract_repositories_from_readme() -> list[MCPServerMetadata]
      :async:

      Extract repository information from the awesome-mcp-servers README.

   .. py:method:: fetch_github_metadata(metadata: MCPServerMetadata) -> None
      :async:

      Fetch additional metadata from GitHub API.

   .. py:method:: fetch_readme_content(metadata: MCPServerMetadata) -> str | None
      :async:

      Fetch README content from GitHub.

   .. py:method:: generate_statistics_report(documents: list[MCPServerDocument]) -> None

      Generate statistics report.

   .. py:method:: process_repository(metadata: MCPServerMetadata) -> MCPServerDocument | None
      :async:

      Process a single repository.

   .. py:method:: save_documents(documents: list[MCPServerDocument]) -> None

      Save documents in various formats.

   .. py:attribute:: category_mappings

   .. py:attribute:: docs_dir

   .. py:attribute:: language_indicators

   .. py:attribute:: metadata_dir

   .. py:attribute:: output_dir

   .. py:attribute:: platform_indicators

   .. py:attribute:: raw_dir

   .. py:attribute:: scope_indicators

   .. py:attribute:: session
      :value: None

   .. py:attribute:: source_url
      :value: 'https://github.com/TensorBlock/awesome-mcp-servers'

   .. py:attribute:: stats


.. py:class:: MCPScope

   Bases: :py:obj:`str`, :py:obj:`enum.Enum`

   Server Scope.

   .. py:attribute:: CLOUD
      :value: 'cloud'

   .. py:attribute:: EMBEDDED
      :value: 'embedded'

   .. py:attribute:: LOCAL
      :value: 'local'


.. py:class:: MCPServerDocument(/, **data: Any)

   Bases: :py:obj:`pydantic.BaseModel`

   Complete MCP Server Document.

   .. py:method:: compute_content_hash() -> str

      Compute SHA256 hash of README content.

   .. py:method:: to_langchain_document() -> langchain_core.documents.Document

      Convert to LangChain Document.

   .. py:attribute:: content_hash
      :type: str | None
      :value: None

   .. py:attribute:: extracted_at
      :type: datetime.datetime
      :value: None

   .. py:attribute:: metadata
      :type: MCPServerMetadata
      :value: None

   .. py:attribute:: model_config

      Configuration for the model, should be a dictionary conforming to [`ConfigDict`][pydantic.config.ConfigDict].

   .. py:attribute:: readme_content
      :type: str | None
      :value: None


.. py:class:: MCPServerMetadata(/, **data: Any)

   Bases: :py:obj:`pydantic.BaseModel`

   Metadata for an MCP Server.

   .. py:method:: get_unique_id() -> str

      Generate unique ID for this server.

   .. py:method:: to_langchain_metadata() -> dict[str, Any]

      Convert to LangChain Document metadata format.

   .. py:method:: validate_repo_url(v: str) -> str
      :classmethod:

      Validate GitHub repository URL.

   .. py:attribute:: api_base_url
      :type: str | None
      :value: None

   .. py:attribute:: category
      :type: MCPCategory
      :value: None

   .. py:attribute:: description
      :type: str | None
      :value: None

   .. py:attribute:: is_official
      :type: bool
      :value: None

   .. py:attribute:: languages
      :type: list[MCPLanguage]
      :value: None

   .. py:attribute:: last_updated
      :type: datetime.datetime | None
      :value: None

   .. py:attribute:: license
      :type: str | None
      :value: None

   .. py:attribute:: model_config

      Configuration for the model, should be a dictionary conforming to [`ConfigDict`][pydantic.config.ConfigDict].

   .. py:attribute:: name
      :type: str
      :value: None

   .. py:attribute:: owner
      :type: str
      :value: None

   .. py:attribute:: platforms
      :type: list[MCPPlatform]
      :value: None

   .. py:attribute:: readme_url
      :type: str | None
      :value: None

   .. py:attribute:: repo_name
      :type: str
      :value: None

   .. py:attribute:: repo_url
      :type: str
      :value: None

   .. py:attribute:: scopes
      :type: list[MCPScope]
      :value: None

   .. py:attribute:: stars
      :type: int | None
      :value: None


.. py:function:: create_agent_loader(output_dir: str = 'agent_resources/mcp_servers') -> callable

   Create a loader function for agents to access MCP documents.


.. py:function:: main()
   :async:

   Main function.


.. py:data:: console
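
The reference above says only that ``MCPServerDocument.compute_content_hash`` computes a SHA256 hash of the README content. The following is a minimal, standalone sketch of that idea using the standard library, not the module's actual implementation. The helper name ``sha256_of_readme``, the UTF-8 encoding, and the ``None`` handling for a missing README are all assumptions made for illustration.

.. code-block:: python

   import hashlib


   def sha256_of_readme(readme_content):
       """Return the SHA256 hex digest of README text (hypothetical helper).

       Assumption: a missing README (``None``) yields ``None`` rather than
       a digest; the real method may behave differently.
       """
       if readme_content is None:
           return None
       # Assumption: the text is hashed as UTF-8 bytes.
       return hashlib.sha256(readme_content.encode("utf-8")).hexdigest()


   digest = sha256_of_readme("# Example MCP server\n")
   same_digest = sha256_of_readme("# Example MCP server\n")

A content hash of this kind lets a caller detect whether a README changed between two extraction runs by comparing digests instead of full text.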