Name: EarningsCall.ai
Author: EarningsCall.ai

Every quarter we watch thousands of companies step up to the microphone, read their numbers, and answer analyst questions. Turning those conversations into actionable signals in minutes—not days—is the promise behind EarningsCall.ai. In this post we peel back the layers on the transcript pipeline and highlight how we use Neon, Prisma, and ParadeDB-powered search to keep the entire corpus queryable the moment a new call lands.

The challenge we set out to solve

Freshness: capture a new transcript within minutes of the call ending.
Structure: normalize wildly different transcript formats into a consistent speaker/paragraph model that downstream features can consume.
Searchability: deliver ranked, highlighted keyword matches across the entire database in seconds.
Cost & maintainability: run the stack on serverless primitives with an operational footprint small enough for a team of one.

Architecture at a glance

Event listener: an external Lambda polls data vendors and calls a webhook on our Next.js API whenever a call finishes.
Ingestion worker: /src/lib/services/earnings.ts turns raw vendor payloads into canonical EarningsCall rows in Neon via Prisma.
Object storage: Vercel Blob holds the verbose JSON payloads so we can rehydrate or reprocess without hammering the database.
Neon search layer: ParadeDB’s @@@ operator and snippet() helper power ranked, highlighted keyword search.
Product experiences: /search, AI summaries, and alerts pull from the same datastore to surface insights everywhere in the app.

Architecture diagram placeholder

Why we needed our own earnings call transcript database

Off-the-shelf tools rarely let you remix the data the way our users expect. We wanted a purpose-built earnings call transcript database that:

Stores every paragraph and speaker turn in a normalized schema so we can power everything from smart summaries to compliance-ready exports.
Indexes metadata (symbol, quarter, exchange) alongside full text, making filters instant without shipping data to a separate warehouse.
Handles tens of thousands of records but still lives inside a serverless footprint—Neon gives us Postgres we can pause and resume without fuss.

Because we own the pipeline end to end, we can add new fields (sentiment scores, entity extraction, AI hints) without waiting on a third party to backfill their catalog.

1. Watching for new calls

Our Lambda service subscribes to earnings calendars and notifies the app once a webcast transcript is available. The webhook hits an authenticated cron endpoint (/src/app/api/cron/stock/alert/keyword/route.ts) that simply delegates to updateTranscriptOfCalendar in /src/lib/services/earnings.ts. That function cross-references active alerts, pulls the day’s confirmed events, and skips anything we already indexed—critical for idempotency when a vendor retries.
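The skip-what-we-already-indexed step can be sketched as a small pure function. This is an illustration only: CalendarEvent, callKey, and selectPendingEvents are hypothetical names, and the real updateTranscriptOfCalendar may track indexed calls differently.

```typescript
// Hypothetical event shape; the real calendar payload is not shown in this post.
interface CalendarEvent {
  symbol: string;
  year: number;
  quarter: number;
}

// Build a stable key for one call, e.g. "AAPL-2024-Q3" (format is an assumption).
function callKey(event: CalendarEvent): string {
  return `${event.symbol.toUpperCase()}-${event.year}-Q${event.quarter}`;
}

// Return only the events that are not already indexed.
// Duplicates inside the same batch are dropped too, so a vendor retry
// that repeats an event does not trigger a second ingestion.
function selectPendingEvents(
  events: CalendarEvent[],
  alreadyIndexed: ReadonlySet<string>,
): CalendarEvent[] {
  const seen = new Set<string>(alreadyIndexed);
  const pending: CalendarEvent[] = [];
  for (const event of events) {
    const key = callKey(event);
    if (seen.has(key)) continue;
    seen.add(key);
    pending.push(event);
  }
  return pending;
}
```

Running the filter before any vendor fetch keeps a retried webhook cheap: already-indexed calls are dropped without touching the transcript API.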

Inside the helper we fetch the raw transcript via the earningscall SDK, flatten the speaker turns into a normalized string, and then perform a dual write: prisma.earningsCall.create() persists the structured record in Neon while putFMPTranscript and putEarningsCallTranscript mirror the raw JSON to Vercel Blob. Keeping both copies lets us rehydrate the pipeline without hammering Postgres.
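As a rough sketch of the flattening step, assuming each speaker turn arrives as a speaker name plus free text (the SpeakerTurn shape and flattenTurns are hypothetical, not the SDK's actual types):

```typescript
// Hypothetical shape for one speaker turn in a vendor payload.
interface SpeakerTurn {
  speaker: string;
  text: string;
}

// Flatten speaker turns into one searchable string:
// collapse whitespace, drop empty turns, and prefix each turn with its speaker.
function flattenTurns(turns: SpeakerTurn[]): string {
  return turns
    .map((turn) => ({
      speaker: turn.speaker.trim(),
      text: turn.text.replace(/\s+/g, " ").trim(),
    }))
    .filter((turn) => turn.text.length > 0)
    .map((turn) => `${turn.speaker || "Unknown"}: ${turn.text}`)
    .join("\n\n");
}
```

Keeping the speaker prefix inside the flattened text means a search hit still shows who said it, even though the row stores a single string.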

2. Normalizing transcripts with Prisma + Neon

All transcript metadata and the searchable body live in the EarningsCall model declared in prisma/schema.prisma. Neon’s storage keeps the dataset serverless and inexpensive, while Prisma abstracts the SQL plumbing and connection pooling (via Neon’s HTTP driver) so our ingestion worker can stay inside Vercel’s runtime limits. A unique constraint on (symbol, year, quarter) guarantees we never duplicate a call.
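One way to lean on that constraint when a call is ingested more than once is Prisma's upsert with a compound unique selector. The sketch below only builds the arguments. It assumes the constraint is declared as @@unique([symbol, year, quarter]) without a custom name, in which case Prisma exposes it as symbol_year_quarter, and it assumes a hypothetical content column for the searchable body.

```typescript
// Hypothetical subset of the EarningsCall fields; the real model has more columns.
interface EarningsCallInput {
  symbol: string;
  year: number;
  quarter: number;
  content: string; // assumed name for the normalized, searchable transcript body
}

// Build arguments for prisma.earningsCall.upsert().
// `symbol_year_quarter` is the default selector name Prisma generates for
// @@unique([symbol, year, quarter]); a custom constraint name would change it.
function buildUpsertArgs(call: EarningsCallInput) {
  const { symbol, year, quarter, content } = call;
  return {
    where: { symbol_year_quarter: { symbol, year, quarter } },
    create: { symbol, year, quarter, content },
    update: { content }, // re-ingestion only refreshes the body
  };
}

// Usage (requires a generated Prisma client, not included here):
//   await prisma.earningsCall.upsert(buildUpsertArgs(call));
```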

3. Capturing the raw payloads in object storage

Our ingestion job immediately mirrors transcripts to Blob storage by writing each call to transcripts/<symbol>-<year>-<quarter>-ec.json. Those blobs give us a canonical audit trail, let us replay ingestion if a vendor changes formatting, and keep the database leaner by storing only the normalized text we need for search.
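The key convention is simple enough to capture in a pair of helpers. This is a sketch with hypothetical function names; whether the quarter is written as a bare digit (as assumed here) or with a prefix is not specified above.

```typescript
// Build the blob path for one call, following transcripts/<symbol>-<year>-<quarter>-ec.json.
// Assumption: the quarter is written as a bare digit 1-4.
function transcriptBlobKey(symbol: string, year: number, quarter: number): string {
  if (!Number.isInteger(quarter) || quarter < 1 || quarter > 4) {
    throw new RangeError(`quarter must be 1-4, got ${quarter}`);
  }
  return `transcripts/${symbol}-${year}-${quarter}-ec.json`;
}

// Parse a blob path back into its parts, e.g. when replaying ingestion.
// The greedy symbol group lets hyphenated tickers such as "BRK-B" survive the round trip.
function parseTranscriptBlobKey(
  key: string,
): { symbol: string; year: number; quarter: number } | null {
  const match = /^transcripts\/(.+)-(\d{4})-([1-4])-ec\.json$/.exec(key);
  if (!match) return null;
  return {
    symbol: String(match[1]),
    year: Number(match[2]),
    quarter: Number(match[3]),
  };
}
```

A deterministic key also makes the mirror write idempotent: re-ingesting the same call overwrites one blob instead of accumulating copies, provided the storage call is configured to allow overwrites.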

4. Building search on top of Neon

Keyword search is the real differentiator. Instead of piping transcripts to yet another external service, we rely on Neon’s support for ParadeDB’s pg_search, a Postgres extension that gives us fast, ranked search primitives. searchEarningsContent (/src/lib/services/earnings.ts) executes a raw SQL query that uses the @@@ operator for full-text matching, paradedb.score for ranking, and paradedb.snippet for HTML-ready highlights. Because everything stays inside Neon, search inherits the ACID guarantees and retention we already pay for to store transcripts.
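To make the shape of such a query concrete, here is a minimal sketch of a parameterized query builder. It is not the actual searchEarningsContent: the table name "EarningsCall", the id and content columns, and the assumption that a BM25 index with id as its key field already exists are all illustrative.

```typescript
// A parameterized query: SQL text plus positional values for $1, $2, $3.
interface KeywordSearchQuery {
  text: string;
  values: [string, number, number];
}

// Build a ranked, highlighted keyword search.
// Assumes a ParadeDB BM25 index over "EarningsCall" with `id` as the key field
// and `content` as an indexed text column (both names are hypothetical).
function buildKeywordSearch(keyword: string, limit = 20, offset = 0): KeywordSearchQuery {
  const term = keyword.trim();
  if (term.length === 0) throw new Error("keyword must not be empty");
  if (!Number.isInteger(limit) || limit < 1) throw new RangeError("limit must be a positive integer");
  if (!Number.isInteger(offset) || offset < 0) throw new RangeError("offset must be a non-negative integer");

  const text = `
    SELECT id, symbol, year, quarter,
           paradedb.score(id)        AS score,
           paradedb.snippet(content) AS snippet
    FROM "EarningsCall"
    WHERE content @@@ $1
    ORDER BY score DESC
    LIMIT $2 OFFSET $3`;

  return { text, values: [term, limit, offset] };
}
```

The resulting text and values could be handed to any parameterized raw-query call, for example Prisma's $queryRawUnsafe(text, ...values). Binding the keyword as $1 rather than interpolating it keeps user input out of the SQL string.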

5. Wiring the API and UI

The search API is intentionally thin: src/app/api/search/keyword/route.ts authenticates the user, forwards the query params to searchEarningsContent, and relays pagination metadata from Neon. On the client, search/result/page.tsx reuses helpers like parseEarningsCall and searchKeyword to render the transcript with highlights so analysts can jump straight to the juiciest answers. The same dataset powers alerting, AI summaries, and peer comparisons because we keep transcripts in a single canonical store.
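The thin part of the route, validating query params and shaping pagination metadata, might look roughly like this. The parameter names (q, page, pageSize), the default of 20, the cap of 50, and both helper names are assumptions for illustration.

```typescript
interface KeywordSearchParams {
  q: string;
  page: number;
  pageSize: number;
}

// Parse a positive integer from a raw query value, falling back when missing or invalid.
function toPositiveInt(raw: string | undefined, fallback: number): number {
  const parsed = Number.parseInt(raw ?? "", 10);
  return Number.isFinite(parsed) && parsed > 0 ? parsed : fallback;
}

// Validate raw query params; returns null when there is no usable keyword.
// Page size defaults to 20 and is capped at 50 (both values are assumptions).
function parseSearchParams(query: Record<string, string | undefined>): KeywordSearchParams | null {
  const q = (query.q ?? "").trim();
  if (q.length === 0) return null;
  return {
    q,
    page: toPositiveInt(query.page, 1),
    pageSize: Math.min(toPositiveInt(query.pageSize, 20), 50),
  };
}

// Shape the pagination metadata returned next to the search hits.
function paginationMeta(total: number, page: number, pageSize: number) {
  const totalPages = Math.max(1, Math.ceil(total / pageSize));
  return { total, page, pageSize, totalPages, hasNext: page < totalPages };
}
```

In a Next.js route handler the input record could come from Object.fromEntries(new URL(request.url).searchParams), with a null result mapped to a 400 response.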

Operating the pipeline

Monitoring: the Lambda and cron endpoint emit logs when a symbol is skipped, ingested, or fails—handy when a vendor rate-limits us.
Backfills: rerun ingestion for a symbol/year pair; the unique constraint on (symbol, year, quarter) lets each call be upserted cleanly instead of duplicated.
Latency: ingesting a typical transcript (8–12k words) averages 4 seconds end-to-end, including ParadeDB index updates.
Cost: Neon’s autoscaling keeps storage + compute under $50/month for ~15k transcripts, while Blob adds pennies for the archived JSON.

What’s next

Semantic search: store embeddings alongside ParadeDB indexes for hybrid relevance.
Richer catalog: extend the earnings call transcript database with management bios, macro themes, and alternative data cross-links.
Inline entities: enrich transcripts with tickers, products, and geographies so search can filter by more than raw text.
Streaming alerts: plug the same search backend into our alerting engine for seconds-latency notifications.

If you’re building something similar and want to trade notes on ParadeDB, Neon, or transcript ingestion, reach out. We’re always happy to compare observability stacks and swap war stories about unruly webcast formats.

How We Built Our Earnings Call Transcript Search Stack