
NewsForge: the autonomous newsroom behind khoboria.com.

11-stage pipeline. .NET 10 plus a Python AI pipeline. Postgres with pgvector for content DNA. Groq Llama 70B for Master Intelligence. Bengali as a first-class output language with a named live customer.

Pipeline stages: 11
Output languages: 50+
Source feeds: 500+
khoboria.com: Live
One named live customer (khoboria.com, Bengali-first daily publishing). Source attribution and audit trails on every article.
What is the NewsForge case study?

A multilingual newsroom that publishes without a human editor.

The NewsForge case study documents an 11-stage pipeline that ingests global news, deduplicates by content DNA across languages, rewrites as original journalism in 50-plus languages, fact-verifies against trusted sources, selects a hero image from Professional Vault, and auto-publishes to a CMS or a native portal.

The reference customer is khoboria.com, a Bengali-first portal that publishes daily without a manual editor. Bengali is treated as a first-class language because named-entity transliteration is where most multilingual AI tools break, so the pipeline gives it a dedicated transliteration-correction step.

The Challenge

Manual newsrooms cannot scale.

Traditional newsrooms rely on human journalists for every step: monitoring sources, writing articles, fact-checking, translating, and publishing. The model breaks at scale. News cycles run 24/7, but human teams do not. Costs rise linearly with coverage, and every added language multiplies the problem.

The 24/7 news cycle

Breaking news does not wait for office hours. Manual newsrooms miss stories overnight, on weekends, and during holidays.

Language barrier at scale

Covering news in 50-plus languages requires teams of translators. Quality varies, speed suffers, costs balloon.

Linear cost scaling

Every new language, source, or publication channel needs more staff. Revenue does not scale the same way.

The Pipeline

11 stages, source to published.

Every article passes through 11 automated stages. Master Intelligence (stage 4) consolidates translation, categorisation, and the first rewrite into a single Groq Llama 70B call, collapsing what used to take seven calls into one.

01

Source Monitoring

500-plus global sources monitored in real time. RSS, APIs, social feeds, and web scraping with rate-limit compliance.

02

Normalisation

Raw inputs are coerced into a uniform article shape with provenance fields preserved for audit.
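
As a rough illustration of a uniform article shape with provenance, here is a minimal sketch. The field names, the `NormalisedArticle` class, and the `normalise_rss_entry` helper are assumptions made for this example, not the NewsForge schema.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class NormalisedArticle:
    """Hypothetical uniform shape; provenance fields are kept for audit."""
    title: str
    body: str
    language: str          # language code of the source text, "und" if unknown
    source_name: str       # provenance: who published it
    source_url: str        # provenance: where it was fetched from
    fetched_at: datetime   # provenance: when it was fetched (UTC)
    source_kind: str       # provenance: "rss", "api", "social" or "scrape"


def normalise_rss_entry(entry: dict, source_name: str,
                        fetched_at: datetime) -> NormalisedArticle:
    """Coerce one RSS-style entry (a plain dict here) into the uniform shape."""
    return NormalisedArticle(
        title=entry.get("title", "").strip(),
        body=entry.get("summary", "").strip(),
        language=entry.get("language", "und"),
        source_name=source_name,
        source_url=entry["link"],  # required: no link, no audit trail
        fetched_at=fetched_at.astimezone(timezone.utc),
        source_kind="rss",
    )
```

Each input type (API, social, scrape) would get its own small adapter that returns the same shape, so later stages never need to know where an article came from.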

03

Content DNA (pgvector)

Embeddings cluster duplicate stories across languages and sources. One event becomes one cluster, not five rewrites.
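
The clustering idea can be sketched in plain Python. This is an illustration only: it assumes each article already has an embedding from a multilingual model, uses a hypothetical similarity threshold of 0.9, and does the comparison in memory, whereas a pgvector deployment would run the nearest-neighbour search inside Postgres.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is zero."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def cluster_by_embedding(items, threshold=0.9):
    """Greedy clustering: join the first cluster whose seed is similar enough.

    items: list of (article_id, embedding) pairs.
    Returns a list of clusters, each a list of article ids.
    """
    clusters = []  # each entry: (seed_embedding, [article_ids])
    for article_id, embedding in items:
        for seed, members in clusters:
            if cosine_similarity(seed, embedding) >= threshold:
                members.append(article_id)
                break
        else:
            # No existing cluster matched, so this article seeds a new one.
            clusters.append((embedding, [article_id]))
    return [members for _, members in clusters]
```

With a multilingual embedding model, an English and a Bengali report of the same event land close together, so both fall into one cluster and only one rewrite is produced.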

04

Master Intelligence (Groq Llama 70B)

Translation, categorisation, and the first rewrite in a single call. ~80 percent cost reduction versus the original 7-call pipeline.
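
One way to picture the consolidation is a single prompt that asks for all three outputs as JSON. The sketch below is an assumption about the general shape, not the production prompt: `build_master_prompt`, `run_master_intelligence`, and the `complete` callable are hypothetical names, and `complete` stands in for whichever LLM client is used.

```python
import json

REQUIRED_KEYS = ("translation", "category", "rewrite")


def build_master_prompt(article_text: str, target_language: str) -> str:
    """Ask for translation, category and rewrite in one request."""
    return (
        f"Translate the article below into {target_language}, assign one news "
        "category, and rewrite it as an original article.\n"
        'Reply with JSON only, using the keys "translation", "category" '
        'and "rewrite".\n\n'
        f"ARTICLE:\n{article_text}"
    )


def run_master_intelligence(article_text, target_language, complete):
    """One model call instead of separate translate/categorise/rewrite calls.

    complete: any callable that takes a prompt string and returns the model's
    text reply (hypothetical; wire it to the LLM client of your choice).
    """
    reply = complete(build_master_prompt(article_text, target_language))
    result = json.loads(reply)
    missing = [key for key in REQUIRED_KEYS if key not in result]
    if missing:
        raise ValueError(f"model reply is missing keys: {missing}")
    # Keep only the expected keys so extra fields never leak downstream.
    return {key: result[key] for key in REQUIRED_KEYS}
```

Validating the reply before passing it on matters more in a consolidated call, because one malformed response now affects three downstream fields at once.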

05

Fact Verification

Multi-source cross-referencing with credibility scoring. Claims checked against trusted databases and prior reporting.
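
Credibility scoring could be as simple as a weighted agreement ratio. The following is a sketch under stated assumptions: the scoring formula, the thresholds, and both function names are hypothetical and are not taken from NewsForge.

```python
def claim_confidence(evidence):
    """Credibility-weighted agreement for one claim.

    evidence: list of (source_credibility, supports) pairs, where
    source_credibility is in [0, 1] and supports is True or False.
    Returns a score in [0, 1]; 0.0 when there is no usable evidence.
    """
    total = sum(cred for cred, _ in evidence)
    if total == 0:
        return 0.0
    supporting = sum(cred for cred, supports in evidence if supports)
    return supporting / total


def passes_verification(evidence, min_confidence=0.7, min_sources=2):
    """Require both enough independent support and a high weighted score."""
    supporting_sources = sum(1 for _, supports in evidence if supports)
    return (supporting_sources >= min_sources
            and claim_confidence(evidence) >= min_confidence)
```

Requiring a minimum number of supporting sources as well as a score keeps a single highly trusted outlet from verifying a claim on its own.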

06

Localisation Polish

Per-language editorial polish. Bengali gets a dedicated transliteration-correction prompt for named entities.

07

SEO Optimisation

Headline testing, meta descriptions, keyword placement, and structured data markup for every article.
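
For the structured data part, a common choice is schema.org `NewsArticle` markup emitted as JSON-LD. The helper below is a minimal sketch: the function name and the choice of fields are assumptions, and the example values in the usage note are placeholders.

```python
import json


def news_article_jsonld(headline, date_published, language, publisher_name, url):
    """Build a schema.org NewsArticle JSON-LD string for a page's <head>."""
    data = {
        "@context": "https://schema.org",
        "@type": "NewsArticle",
        "headline": headline,
        "datePublished": date_published,  # ISO 8601 string
        "inLanguage": language,           # e.g. "bn" for Bengali
        "publisher": {"@type": "Organization", "name": publisher_name},
        "mainEntityOfPage": url,
    }
    # ensure_ascii=False keeps Bengali and other scripts readable in the output.
    return json.dumps(data, ensure_ascii=False, indent=2)
```

Calling `news_article_jsonld("শিরোনাম", "2025-01-01T06:00:00+06:00", "bn", "Example Publisher", "https://example.com/a")` returns a string that can be placed inside a `script` tag of type `application/ld+json`.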

08

Image Selection (Professional Vault)

Face recognition and CLIP semantic search against the licensed Professional Vault asset library.

09

Quality Assurance

Automated editorial review covering grammar, tone, bias detection, and factual consistency before publish.

10

Multi-CMS Publishing

WordPress, Ghost, Drupal, Contentful, Strapi, or NewsForge native. Scheduled or instant, with rollback support.
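
The connector idea, where only the final write target changes, can be sketched as a small interface. Everything here is hypothetical: the `CmsConnector` protocol, the in-memory stand-in, and the interpretation of rollback as "unpublish what already went out if a later target fails" are assumptions for illustration.

```python
from typing import Protocol


class CmsConnector(Protocol):
    def publish(self, article: dict) -> str: ...    # returns a post id
    def unpublish(self, post_id: str) -> None: ...


class InMemoryConnector:
    """Stand-in target for the sketch; a real connector would call a CMS API."""

    def __init__(self):
        self.posts = {}
        self._next_id = 1

    def publish(self, article):
        post_id = str(self._next_id)
        self._next_id += 1
        self.posts[post_id] = article
        return post_id

    def unpublish(self, post_id):
        self.posts.pop(post_id, None)


def publish_everywhere(article, connectors):
    """Publish to every target; undo the earlier ones if any target fails."""
    done = []
    try:
        for connector in connectors:
            done.append((connector, connector.publish(article)))
    except Exception:
        # Roll back in reverse order, then let the caller see the failure.
        for connector, post_id in reversed(done):
            connector.unpublish(post_id)
        raise
    return [post_id for _, post_id in done]
```

Because the pipeline only talks to the two-method interface, adding another CMS means writing one more connector class rather than touching earlier stages.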

11

Analytics

Post-publish performance tracking. Engagement, SEO ranking, and feedback loops back into Master Intelligence prompts.

NewsForge vs the alternatives

Why a pipeline beats a content mill.

Here is what the 11-stage pipeline does differently from a manual newsroom, a generic content mill, or a wire-only feed.

Capability | NewsForge | Manual newsroom | Content mill | Wire service only
Publishes 24/7 with no overnight gap
Bengali as first-class output language: Depends
Cross-language deduplication (pgvector DNA)
Editorial QA gate before publish
Multi-CMS native publishing | 5 CMS connectors | Manual upload | API | Wire feed
Auditable source attribution per article
Live reference customer | khoboria.com | N/A | Varies | Varies
Live Customer

khoboria.com runs 100 percent on NewsForge.

khoboria.com is a live Bengali-first multilingual news site powered entirely by NewsForge, with no manual editor in the default loop. Every article is sourced, deduplicated, rewritten, fact-checked, image-selected, and published by the 11-stage pipeline.

Primary language: Bengali
Publishing cadence: Daily
Manual editors in default loop: 0
Pipeline stages on every article: 11

Frequently asked

NewsForge, asked plainly.

What is the NewsForge case study?
The NewsForge case study documents how an 11-stage autonomous pipeline becomes a working multilingual newsroom. The reference customer is khoboria.com, a Bengali-first news portal that publishes daily on NewsForge with no manual editor in the loop. NewsForge is built on .NET 10 Minimal APIs for the backend, a Python 3.11 asynchronous AI pipeline, PostgreSQL 18 with pgvector for content DNA deduplication, Redis 8 for queue state, MinIO for assets, and Groq Llama 70B for the Master Intelligence stage that handles translation, categorisation, and the first rewrite in a single call.
Who is khoboria.com and why is it called out?
khoboria.com is a Bengali-first multilingual news portal that runs 100 percent on NewsForge in production. We call it out by name because it is the live reference customer for the platform and publishes publicly every day. khoboria.com proves the harder claim: that the 11-stage pipeline can handle Bengali as a first-class output language, with a dedicated rewrite path and a separate transliteration-correction prompt for named entities. Mangled named entities are the failure mode most multilingual AI tools ship with.
How does the 11-stage NewsForge pipeline work?
Stages 1 and 2 ingest from 500-plus global sources and normalise them. Stage 3 (Content DNA) runs pgvector embeddings across languages so the same story is not rewritten twice. Stage 4 (Master Intelligence) is a single Groq Llama 70B call that handles translation, categorisation, and the first rewrite, cutting pipeline cost by roughly 80 percent versus the original 7-call design. Stages 5 to 11 handle fact verification, per-language localisation polish, SEO optimisation, image selection from Professional Vault, automated editorial review (grammar, tone, bias), multi-CMS publishing (WordPress, Ghost, Drupal, Contentful, Strapi, or NewsForge native), and post-publish analytics.
What is the customer status of NewsForge?
NewsForge is in production with one named live customer (khoboria.com, publishing daily in Bengali) and active conversations with publishers in additional languages. We describe the platform and the named customer as live. New publishers being onboarded are described as deployment-ready, not as installed customers, until each go-live is signed off. We do not show anonymous customer logos.
Is the NewsForge output truly autonomous or does a human review every story?
Truly autonomous in the default configuration. Quality Assurance is stage 9 of the pipeline and runs automated editorial review (grammar, tone, bias detection, factual consistency) before publishing. Publishers can opt into a human-in-the-loop gate before stage 10 publishing, in which case the pipeline pauses on a queue until an editor approves; this is configured per publisher and per topic taxonomy. khoboria.com runs in the autonomous default.
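
The per-publisher, per-topic gate could be expressed as a small routing rule. This is a sketch only: the rule format, the function names, and the queue labels are assumptions, not the NewsForge configuration.

```python
def needs_human_review(publisher: str, topic: str, review_rules: dict) -> bool:
    """True when this publisher has opted this topic into editor approval.

    review_rules maps publisher -> set of topics that require approval;
    "*" means every topic. Publishers with no entry stay fully autonomous.
    """
    topics = review_rules.get(publisher, set())
    return "*" in topics or topic in topics


def route_after_qa(article: dict, review_rules: dict) -> str:
    """Decide where an article goes once stage 9 has passed it."""
    if needs_human_review(article["publisher"], article["topic"], review_rules):
        return "review_queue"   # pause until an editor approves
    return "publish"            # autonomous default
```

A publisher that wants editors only on sensitive beats would list those topics, while a publisher with no rules at all keeps the autonomous default.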
Can NewsForge publish into an existing CMS or do publishers have to migrate?
NewsForge publishes into existing CMSes via stage 10 connectors for WordPress, Ghost, Drupal, Contentful, and Strapi, with scheduled or instant delivery and rollback support. Publishers that prefer a turnkey portal can run NewsForge native, which is what khoboria.com does. The pipeline is identical either way; only the final write target changes.

Explore NewsForge

Automate your newsroom, source to publish.

From a single source to a multilingual published article, fully automated and fully auditable. See how NewsForge maps to your CMS and your language plan.