prebuilt.tldr2.tools¶

Tools for the News Research Agent.

This module defines all tools used by the news research agent for web searching, content extraction, and analysis operations.

Tools are implemented as Pydantic models with proper typing and documentation for use with LangChain’s tool system.

Examples

>>> from prebuilt.tldr2.tools import web_search, extract_content
>>> results = web_search.invoke({"query": "AI news", "max_results": 5})
>>> content = extract_content.invoke({"url": "https://example.com/article"})

Note

All tools follow LangChain tool patterns and return structured data compatible with the agent’s models.

Functions¶

analyze_relevance(article_title, article_description, ...)

Analyze the relevance of an article to the research topic.

batch_process_articles(urls[, operation, max_concurrent])

Process multiple articles concurrently.

check_source_credibility(source_name)

Check the credibility rating of a news source.

extract_content(url[, timeout])

Extract full text content from a news article URL.

filter_by_date(articles[, days_ago])

Filter articles by publication date.

web_search(query[, max_results, sources, from_date, ...])

Search for news articles using NewsAPI.

Module Contents¶

prebuilt.tldr2.tools.analyze_relevance(article_title, article_description, search_query, research_topic)¶

Analyze the relevance of an article to the research topic.

This tool uses heuristics to score how relevant an article is to the research topic and search query.

Parameters:
  • article_title (str) – Title of the article

  • article_description (str) – Brief description of the article

  • search_query (str) – Query used to find the article

  • research_topic (str) – Main topic being researched

Returns:

Dictionary with relevance score and explanation

Return type:

Dict[str, Any]

Note

This is a simplified heuristic. In production, this could use an LLM or more sophisticated NLP techniques.
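The heuristic can be sketched as simple keyword overlap between the article text and the query/topic terms. The function and field names below (`score_relevance`, `matched_keywords`) are illustrative, not the module's actual implementation:

```python
from typing import Any, Dict


def score_relevance(
    article_title: str,
    article_description: str,
    search_query: str,
    research_topic: str,
) -> Dict[str, Any]:
    """Score an article's relevance as keyword overlap in [0.0, 1.0]."""
    text = f"{article_title} {article_description}".lower()
    # Pool keywords from both the search query and the broader topic.
    keywords = set(search_query.lower().split()) | set(research_topic.lower().split())
    hits = [kw for kw in keywords if kw in text]
    score = len(hits) / len(keywords) if keywords else 0.0
    return {
        "relevance_score": round(score, 2),
        "matched_keywords": sorted(hits),
        "explanation": f"{len(hits)} of {len(keywords)} keywords matched",
    }
```

As the note above says, a production version could replace this overlap score with an LLM judgment while keeping the same return shape.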

async prebuilt.tldr2.tools.batch_process_articles(urls, operation='extract', max_concurrent=5)¶

Process multiple articles concurrently.

This tool enables efficient batch processing of multiple articles for extraction or analysis operations.

Parameters:
  • urls (List[str]) – List of article URLs to process

  • operation (str) – Type of operation ('extract' or 'analyze')

  • max_concurrent (int) – Maximum number of concurrent operations

Returns:

Dictionary with results for each URL and summary statistics

Return type:

Dict[str, Any]

Examples

>>> results = await batch_process_articles(
...     urls=["url1", "url2", "url3"],
...     operation="extract"
... )
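The concurrency pattern can be sketched with an `asyncio.Semaphore` capping in-flight work. This is a minimal stand-in, not the module's implementation; the body of `process_one` is a placeholder for the real extract/analyze call:

```python
import asyncio
from typing import Any, Dict, List


async def process_batch(urls: List[str], max_concurrent: int = 5) -> Dict[str, Any]:
    """Process URLs concurrently, bounded by a semaphore."""
    semaphore = asyncio.Semaphore(max_concurrent)

    async def process_one(url: str) -> Dict[str, Any]:
        async with semaphore:
            # Placeholder for the real extraction or analysis call.
            await asyncio.sleep(0)
            return {"url": url, "status": "ok"}

    results = await asyncio.gather(*(process_one(u) for u in urls))
    return {
        "results": {r["url"]: r for r in results},
        "total": len(results),
        "succeeded": sum(1 for r in results if r["status"] == "ok"),
    }
```

The semaphore keeps at most `max_concurrent` requests open at once, which matters when hammering many article URLs against rate-limited or slow origins.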
prebuilt.tldr2.tools.check_source_credibility(source_name)¶

Check the credibility rating of a news source.

This tool provides credibility information about news sources based on a predefined list of ratings.

Parameters:

source_name (str) – Name of the news source to check

Returns:

Dictionary with credibility score and details

Return type:

Dict[str, Any]

Note

In production, this would connect to a media bias/credibility API.
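The predefined-ratings lookup can be sketched as a dictionary keyed by normalized source name. `SOURCE_RATINGS`, the example sources, and the returned field names are all hypothetical:

```python
from typing import Any, Dict

# Hypothetical ratings table; a production system would query a
# media bias/credibility API instead of hard-coding scores.
SOURCE_RATINGS = {
    "reuters": 0.95,
    "associated press": 0.95,
    "bbc news": 0.90,
}


def credibility(source_name: str) -> Dict[str, Any]:
    """Look up a source's credibility score, defaulting to 'unknown'."""
    score = SOURCE_RATINGS.get(source_name.strip().lower())
    if score is None:
        # Unknown sources get a neutral score rather than an error.
        return {"source": source_name, "credibility_score": 0.5, "rating": "unknown"}
    rating = "high" if score >= 0.8 else "medium"
    return {"source": source_name, "credibility_score": score, "rating": rating}
```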

prebuilt.tldr2.tools.extract_content(url, timeout=10)¶

Extract full text content from a news article URL.

This tool fetches the web page and extracts the main article content using BeautifulSoup.

Parameters:
  • url (str) – URL of the article to extract

  • timeout (int) – Request timeout in seconds

Returns:

Dictionary with extracted content and metadata

Return type:

Dict[str, Any]

Examples

>>> content = extract_content("https://example.com/article")
>>> print(f"Extracted {content['word_count']} words")
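For illustration, here is a stdlib-only sketch of the parsing step using `html.parser` in place of BeautifulSoup (which the real tool uses); fetching the page over HTTP is omitted, and the names and return fields are illustrative:

```python
from html.parser import HTMLParser
from typing import Any, Dict


class ParagraphExtractor(HTMLParser):
    """Collect text inside <p> tags (a crude stand-in for BeautifulSoup)."""

    def __init__(self) -> None:
        super().__init__()
        self._in_p = False
        self.paragraphs: list = []

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self._in_p = True
            self.paragraphs.append("")

    def handle_endtag(self, tag):
        if tag == "p":
            self._in_p = False

    def handle_data(self, data):
        if self._in_p:
            self.paragraphs[-1] += data


def extract_text(html: str) -> Dict[str, Any]:
    """Return the joined paragraph text and a word count."""
    parser = ParagraphExtractor()
    parser.feed(html)
    text = "\n".join(p.strip() for p in parser.paragraphs if p.strip())
    return {"content": text, "word_count": len(text.split())}
```

Real article pages need more care (boilerplate removal, `<article>` scoping, encoding detection), which is why a dedicated parser like BeautifulSoup is preferable in practice.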
prebuilt.tldr2.tools.filter_by_date(articles, days_ago=7)¶

Filter articles by publication date.

This tool filters a list of articles to include only those published within the specified number of days.

Parameters:
  • articles (List[Dict[str, Any]]) – List of article dictionaries with 'published_at' field

  • days_ago (int) – Number of days to look back

Returns:

Filtered list of articles

Return type:

List[Dict[str, Any]]
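The filter can be sketched as follows, assuming `published_at` holds an ISO 8601 timestamp (the `Z`-suffix handling matches the format NewsAPI typically returns); `filter_recent` is an illustrative name:

```python
from datetime import datetime, timedelta, timezone
from typing import Any, Dict, List


def filter_recent(
    articles: List[Dict[str, Any]], days_ago: int = 7
) -> List[Dict[str, Any]]:
    """Keep articles whose 'published_at' falls within the lookback window."""
    cutoff = datetime.now(timezone.utc) - timedelta(days=days_ago)
    recent = []
    for article in articles:
        raw = article.get("published_at")
        if not raw:
            # Skip articles with no usable date rather than guessing.
            continue
        # Timestamps like '2024-01-15T09:30:00Z' need the 'Z' mapped
        # to an explicit UTC offset for fromisoformat().
        published = datetime.fromisoformat(raw.replace("Z", "+00:00"))
        if published >= cutoff:
            recent.append(article)
    return recent
```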

prebuilt.tldr2.tools.web_search(query, max_results, sources=None, from_date=None, to_date=None)¶

Search for news articles using NewsAPI.

This tool searches for news articles based on the provided query and filters. It returns metadata about matching articles.

Parameters:
  • query (str) – Search query string

  • max_results (int) – Maximum number of results to return

  • sources (Optional[str]) – Comma-separated list of news sources

  • from_date (Optional[str]) – Start date for search (YYYY-MM-DD)

  • to_date (Optional[str]) – End date for search (YYYY-MM-DD)

Returns:

Dictionary containing article metadata and search info

Return type:

Dict[str, Any]

Examples

>>> results = web_search("AI healthcare", max_results=5)
>>> print(f"Found {results['total_results']} articles")
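A sketch of how the request parameters might be assembled for NewsAPI's `/v2/everything` endpoint; `build_search_url` is an illustrative helper, and the real tool would also attach an API key and perform the HTTP request:

```python
from typing import Dict, Optional
from urllib.parse import urlencode

NEWSAPI_ENDPOINT = "https://newsapi.org/v2/everything"


def build_search_url(
    query: str,
    max_results: int = 5,
    sources: Optional[str] = None,
    from_date: Optional[str] = None,
    to_date: Optional[str] = None,
) -> str:
    """Assemble the request URL from the tool's search parameters."""
    params: Dict[str, str] = {"q": query, "pageSize": str(max_results)}
    # Optional filters are only included when provided.
    if sources:
        params["sources"] = sources
    if from_date:
        params["from"] = from_date
    if to_date:
        params["to"] = to_date
    return f"{NEWSAPI_ENDPOINT}?{urlencode(params)}"
```

Keeping URL construction separate from the network call makes the query logic unit-testable without hitting the API.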