RSS Summarizer

Automated news pipeline that ingests articles from Portuguese/Brazilian sources, processes them with AI for summarization and categorization, and publishes optimized posts to Telegram.

Client:

Portugalist, Brazil

Project Overview

The client engaged Axis to design a cost-effective and resilient news automation system. The solution parses raw RSS feeds (Globo, Sapo, Sapo AF), filters out irrelevant entries, uses LLMs for extraction and summarization, and delivers formatted posts with titles, summaries, hashtags, and source links directly to Telegram. The system was engineered to reduce API costs while keeping output quality consistent.
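
The flow above can be illustrated with a minimal sketch in which cheap filtering runs first and the paid summarizer is called only for articles that will actually be posted. The `Article` type, the `MIN_BODY_CHARS` threshold, and the function names are hypothetical and are not taken from the delivered code.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class Article:
    title: str
    body: str
    url: str


MIN_BODY_CHARS = 400  # hypothetical cut-off for "short, ad-like" entries


def select_for_publishing(articles: Iterable[Article], slots: int) -> List[Article]:
    """Drop short entries, then keep the longest articles for the available slots."""
    candidates = [a for a in articles if len(a.body) >= MIN_BODY_CHARS]
    candidates.sort(key=lambda a: len(a.body), reverse=True)
    return candidates[:slots]


def run_cycle(
    articles: Iterable[Article],
    slots: int,
    summarize: Callable[[Article], str],
) -> List[str]:
    """Call the (paid) summarizer only for the articles selected for posting."""
    return [summarize(a) for a in select_for_publishing(articles, slots)]
```

With this shape, the number of LLM calls per cycle is bounded by the number of publishing slots rather than by the size of the feeds.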

Challenge

  • Parsing and formatting diverse RSS feeds with inconsistent structures.

  • Minimizing costly AI API calls without reducing post quality.

  • Handling content in both European and Brazilian Portuguese and filtering out noise (ads, stray characters).

  • Structuring reliable, JSON-like outputs from LLMs despite their inherent variability.

  • Meeting Telegram-specific constraints (post length limits, caption formatting).
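
As an illustration of the Telegram length constraint, the sketch below builds an HTML-formatted post and trims only the summary until the whole post fits. The 1024 and 4096 character limits are the Bot API limits for captions and text messages; the function names and the "Fonte" link label are assumptions for this example. Measuring the raw markup length is deliberately conservative, since Telegram counts characters after entity parsing.

```python
import html
import re
from typing import Iterable

CAPTION_LIMIT = 1024  # Telegram Bot API limit for media captions
MESSAGE_LIMIT = 4096  # Telegram Bot API limit for text messages


def normalize_hashtag(raw: str) -> str:
    """Strip everything except letters, digits and underscores, then prefix '#'."""
    cleaned = re.sub(r"[^\w]", "", raw)
    return f"#{cleaned}" if cleaned else ""


def build_post(
    title: str,
    summary: str,
    hashtags: Iterable[str],
    link: str,
    limit: int = CAPTION_LIMIT,
) -> str:
    """Render a post for parse_mode=HTML, shortening the summary if needed."""
    tags = " ".join(t for t in (normalize_hashtag(h) for h in hashtags) if t)

    def render(body: str) -> str:
        parts = [
            f"<b>{html.escape(title, quote=False)}</b>",
            html.escape(body, quote=False),
            tags,
            # "Fonte" (Portuguese for "source") is a hypothetical label.
            f'<a href="{html.escape(link, quote=True)}">Fonte</a>',
        ]
        return "\n\n".join(p for p in parts if p)

    body = summary.strip()
    text = render(body)
    excess = len(text) - limit
    if excess > 0:
        # Cut the unescaped summary so no HTML entity is split, leaving room for "…".
        body = body[: max(0, len(body) - excess - 1)].rstrip()
        text = render(body + "…" if body else "")
    if len(text) > limit:
        raise ValueError("post does not fit even without a summary")
    return text
```

Passing `limit=MESSAGE_LIMIT` covers text-only posts; the title, hashtags, and link are never truncated, so an oversized title surfaces as an error instead of a silently broken post.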

Tech Stack

  • Artificial Intelligence: Llama family (via Together.ai API) for summarization, keyword/hashtag generation, and content optimization.

  • Backend Logic: Python-based parsers and processing pipeline.

  • Data Management: Lightweight database for deduplication, freshness tracking, and automated cleanup.

  • Integration & Delivery: Telegram Bot API for final post publishing.

  • Optimization Tools: Prompt engineering and a config file of tunable system parameters for fast iteration.
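
The data-management layer could look roughly like the sketch below. It assumes SQLite through Python's standard `sqlite3` module (the case study does not name the database), and the table and function names are hypothetical. One table keyed by article URL covers deduplication, freshness tracking, and cleanup.

```python
import sqlite3
import time
from typing import Optional


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the store; the default keeps everything in memory."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS seen ("
        "url TEXT PRIMARY KEY, first_seen REAL NOT NULL)"
    )
    return conn


def mark_if_new(conn: sqlite3.Connection, url: str, now: Optional[float] = None) -> bool:
    """Record the URL and return True only the first time it is seen."""
    now = time.time() if now is None else now
    cur = conn.execute(
        "INSERT OR IGNORE INTO seen (url, first_seen) VALUES (?, ?)", (url, now)
    )
    conn.commit()
    return cur.rowcount == 1


def purge_older_than(
    conn: sqlite3.Connection, max_age_seconds: float, now: Optional[float] = None
) -> int:
    """Delete entries older than max_age_seconds and return how many were removed."""
    now = time.time() if now is None else now
    cur = conn.execute(
        "DELETE FROM seen WHERE first_seen < ?", (now - max_age_seconds,)
    )
    conn.commit()
    return cur.rowcount
```

The primary key does the deduplication, so a repeated feed entry costs a single ignored insert and no separate lookup.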

Solution

Axis delivered an optimized multi-stage system:

  • Smart Ingestion: Custom RSS parsers for Globo/Sapo with filters to skip short, ad-like content and prioritize longer, higher-value articles.

  • Cost Control: Moved the LLM call from every ingested article to only those articles scheduled for publication, substantially lowering API expenses.

  • Robust AI Prompts: Structured prompts enforced clean separation of output fields (Title, Description, Hashtags) and reduced errors like malformed JSON.

  • Language & Format Filtering: Logic added to remove unwanted symbols, normalize hashtags, and enforce consistency across dialects.

  • Post Formatting: Telegram-ready output with bold titles, proper paragraphing, hashtags, and link attribution.

  • Error Handling: Fail-safes for missing fields, oversized posts, and empty AI responses.
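
The fail-safes for malformed or incomplete model output could be sketched as follows. This assumes the model is prompted to reply with a JSON object; the lowercase field names are hypothetical stand-ins for the Title, Description, and Hashtags fields mentioned above.

```python
import json
import re
from typing import Optional

REQUIRED_FIELDS = ("title", "description", "hashtags")  # hypothetical field names


def extract_post_fields(reply: str) -> Optional[dict]:
    """Parse the JSON object embedded in an LLM reply, or return None if unusable."""
    # Models often wrap JSON in extra prose, so take the span from the
    # first "{" to the last "}" and try to parse only that.
    match = re.search(r"\{.*\}", reply, flags=re.DOTALL)
    if match is None:
        return None
    try:
        data = json.loads(match.group(0))
    except json.JSONDecodeError:
        return None
    if not isinstance(data, dict):
        return None
    # Treat missing or empty required fields as a failed generation.
    if any(not data.get(field) for field in REQUIRED_FIELDS):
        return None
    return data
```

Returning `None` instead of raising lets the caller decide whether to retry the prompt or skip the article for that cycle.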
