Call Data Extraction

Automating the processing of phone call recordings to extract key appointment data using LLMs and an optimized RAG-based approach.

Client:

2doc, Belarus

Project Overview

The client engaged Axis to automate the handling of phone calls in a medical center. Calls typically contain appointment details such as doctor specialization, doctor’s name, service, date, and patient’s name. Previously, this information was processed manually, leading to delays, errors, and high operational costs.

The project aims to automatically transcribe audio recordings, extract structured entities, and integrate the results into the client’s system. Special focus is placed on reducing hallucinations from the LLM, improving extraction accuracy through context optimization, and introducing clear logic for reprocessing and manual verification in edge cases.

Challenge

  • Context limitations: A standard RAG pipeline required large context windows, which increased costs and the risk of extraction errors.

  • Specialization accuracy: The system had to identify the correct medical specialization and doctor among many similar entries in a large database.

  • Integration barrier: Accessing call recordings via the call center’s API became the primary technical blocker.

  • Data quality: Some cases required manual validation and a controlled retry workflow to avoid incorrect outputs.

Tech Stack

  • LLM: Google Gemini (for transcription and entity extraction).

  • Databases: PostgreSQL (doctor and service data), pgvector for embedding storage.

  • RAG: Optimized multi-step pipeline with API-driven context narrowing.

  • Infrastructure: Google Cloud (Poland) for cost-efficient deployment.

  • Integration: Call center API for retrieving audio files; Discord notifications for error handling.

Solution

Axis designed a three-stage optimized extraction workflow:

  1. Transcription & Initial Extraction — Gemini transcribes audio and identifies patient name, date, time, call state, and specialization. To minimize hallucinations, prompts always include the current date and a complete list of valid specializations.

  2. Context Narrowing — Based on the chosen specialization, the system queries the API to retrieve a shortened list of doctors and services.

  3. Final Extraction — Gemini reprocesses the call with the reduced context to accurately determine the doctor’s name and the specific service.
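The three stages above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the `llm` and `lookup` callables stand in for the Gemini call and the client's doctor/service API, and every name here (`Appointment`, `build_stage1_prompt`, `extract_appointment`, the field names) is hypothetical. The sketch assumes the model returns its answer as a parsed dictionary.

```python
from dataclasses import dataclass
from datetime import date
from typing import Callable, List, Optional, Tuple

# Hypothetical stand-ins: llm(prompt, audio_ref) -> dict of extracted fields,
# lookup(specialization) -> (doctors, services) from the client's API.
LLMCall = Callable[[str, str], dict]
CatalogLookup = Callable[[str], Tuple[List[str], List[str]]]


@dataclass
class Appointment:
    patient_name: Optional[str] = None
    date: Optional[str] = None
    time: Optional[str] = None
    call_state: Optional[str] = None
    specialization: Optional[str] = None
    doctor: Optional[str] = None
    service: Optional[str] = None


def build_stage1_prompt(today: date, specializations: List[str]) -> str:
    # Stage 1 prompt: always carries the current date and the full list of
    # valid specializations, as described in the text.
    return (
        f"Today's date is {today.isoformat()}.\n"
        "Transcribe the call and extract: patient_name, date, time, "
        "call_state, specialization.\n"
        "specialization must be one of: " + ", ".join(specializations) + ".\n"
        "Use null for anything the caller did not state."
    )


def build_stage3_prompt(spec: str, doctors: List[str], services: List[str]) -> str:
    # Stage 3 prompt: only the doctors and services for one specialization.
    return (
        f"The call concerns the specialization: {spec}.\n"
        "doctor must be one of: " + ", ".join(doctors) + ".\n"
        "service must be one of: " + ", ".join(services) + ".\n"
        "Use null if the call does not mention one."
    )


def extract_appointment(
    audio_ref: str,
    today: date,
    specializations: List[str],
    llm: LLMCall,
    lookup: CatalogLookup,
) -> Appointment:
    # Stage 1: transcription and initial extraction.
    first = llm(build_stage1_prompt(today, specializations), audio_ref)
    appt = Appointment(
        patient_name=first.get("patient_name"),
        date=first.get("date"),
        time=first.get("time"),
        call_state=first.get("call_state"),
        specialization=first.get("specialization"),
    )
    # Reject a specialization that is not in the valid list.
    if appt.specialization not in specializations:
        appt.specialization = None
        return appt

    # Stage 2: context narrowing via the API.
    doctors, services = lookup(appt.specialization)

    # Stage 3: final extraction against the shortened lists.
    second = llm(build_stage3_prompt(appt.specialization, doctors, services), audio_ref)
    doctor, service = second.get("doctor"), second.get("service")
    appt.doctor = doctor if doctor in doctors else None
    appt.service = service if service in services else None
    return appt
```

Checking each returned value against the list that was placed in the prompt is one simple way to catch a hallucinated doctor or service name: anything outside the list is dropped and left for the QA step to handle.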

A QA layer validates each result: if data is missing or incorrect, the system retries processing up to two times. If the retries also fail, an error notification is sent to Discord for manual correction.
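A retry loop of this kind might look like the sketch below. It reads "up to two retries" as one initial attempt plus two more, which is an interpretation rather than something the description spells out. The required-field list, the `extract` and `notify` callables, and the function names are all hypothetical; in a real deployment `notify` could, for example, post the message to a Discord webhook.

```python
from typing import Callable, List, Optional

# Assumed set of fields that must be present for a record to pass QA.
REQUIRED_FIELDS = ("patient_name", "date", "specialization", "doctor", "service")
MAX_RETRIES = 2  # "retries processing up to two times"


def missing_fields(record: dict) -> List[str]:
    # A field counts as missing when it is absent, None, or empty.
    return [name for name in REQUIRED_FIELDS if not record.get(name)]


def process_with_retries(
    call_id: str,
    extract: Callable[[str], dict],
    notify: Callable[[str], None],
    max_retries: int = MAX_RETRIES,
) -> Optional[dict]:
    attempts = 1 + max_retries  # initial attempt plus retries
    problems: List[str] = []
    for _ in range(attempts):
        try:
            record = extract(call_id)
        except Exception as exc:  # a failed attempt is treated like bad data
            problems = [f"extraction error: {exc}"]
            continue
        problems = [f"missing {name}" for name in missing_fields(record)]
        if not problems:
            return record
    # All attempts failed: hand the call over for manual correction.
    notify(
        f"Call {call_id}: extraction failed after {attempts} attempts "
        f"({', '.join(problems)})"
    )
    return None
```

Returning `None` after notifying keeps an incomplete record from reaching the client's system, so only validated records or manually corrected ones are stored.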
