NewsForge
The press release goes in.
The newspaper comes out.
NewsForge is the AI newsroom behind khoboria.com. An eleven-stage pipeline reads 500+ global wires, dedupes by content DNA, rewrites as original journalism in fifty-plus languages, fact-verifies against multiple sources, picks the hero image from Professional Vault, and auto-publishes. No copy-paste, no template farm, no human in the critical path.
- Sources: 500+
- Languages: 50+
- Stages: 11
- Live since: khoboria.com
- EN / Tech
Indo-Pacific data centre buildout outpaces grid capacity, operators warn
Reuters, Nikkei, TechCrunch
3-source consensus
- BN / Tech
চিপ রপ্তানি নিয়ন্ত্রণ: ঢাকার সরবরাহ চক্রে নতুন চাপ (Chip export controls: new pressure on Dhaka's supply chain)
Reuters, BSS, Prothom Alo
4-source consensus
- EN / Markets
Ringgit and rupiah firm as Fed minutes flag a softer landing
Bloomberg, FT, AP
3-source consensus
What is NewsForge
NewsForge is KaritKarma's autonomous newsroom infrastructure: a .NET 10 backend, four Python services, an eleven-stage AI pipeline in which Groq Llama 70B handles Master Intelligence in a single pass, and a Next.js reader site with a reporter dashboard. It runs in production at khoboria.com and publishes into WordPress, Drupal, or its own native portal. Compared with Arc XP, Brightspot, WordPress VIP, and Drupal, the four CMSs newsrooms usually evaluate, NewsForge is the only one that generates the journalism, dedupes story DNA across languages, and runs a multi-source credibility check before publication. Pricing is per portal, rather than the six-figure enterprise contracts the incumbents quote.
002 / What it does
Four jobs of an autonomous newsroom.
Each pillar maps to a real folder in the codebase. The product is the integration of these four, not a list of feature ticks.
500+ wires, watched in real time.
01 / Source
The Python crawler keeps a live list of newsroom-grade sources, refreshed every minute, with RSS, sitemap, and HTML fallbacks. IPv6 rotation and Cloudflare bypass keep ingestion alive when sites that block crawlers would refuse even a desktop browser.
Source / crawler/
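The RSS, sitemap, and HTML fallback order can be pictured as a simple chain that tries each discovery method until one returns links. This is a minimal sketch, not the crawler's actual code: `discover_links` and the stub fetchers are hypothetical names, and a real fetcher would perform network requests.

```python
from typing import Callable, Optional

# A fetcher takes a site URL and returns article links, or None if it finds nothing.
Fetcher = Callable[[str], Optional[list[str]]]


def discover_links(site: str, fetchers: list[tuple[str, Fetcher]]) -> tuple[str, list[str]]:
    """Try each fetcher in order and return the first non-empty result."""
    for name, fetch in fetchers:
        try:
            links = fetch(site)
        except Exception:
            # A failing method is treated like an empty one; fall through to the next.
            links = None
        if links:
            return name, links
    return "none", []


# Hypothetical stubs standing in for real RSS, sitemap, and HTML fetchers.
def rss_stub(site: str) -> Optional[list[str]]:
    raise ConnectionError("feed unavailable")


def sitemap_stub(site: str) -> Optional[list[str]]:
    return []


def html_stub(site: str) -> Optional[list[str]]:
    return [site + "/story-1", site + "/story-2"]


FETCHERS = [("rss", rss_stub), ("sitemap", sitemap_stub), ("html", html_stub)]
method, links = discover_links("https://example.com", FETCHERS)
```

Here the RSS stub fails and the sitemap stub is empty, so the chain settles on the HTML fallback.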
Eleven stages, deterministic.
02 / Pipeline
Each article passes the same eleven stages in order: source monitoring, extraction, DNA dedup, Master Intelligence rewrite, credibility, fact verification, copyright filter, story merge, image selection, SEO, and publish. No stage can be skipped. The audit trail is the schema.
Source / ai-pipeline/
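As an illustration of "no stage can be skipped", the sketch below runs a fixed list of stages in order and records each completed stage on the article itself. It is a simplified stand-in under stated assumptions: the `Article` fields, `run_pipeline`, and the two toy stages are hypothetical, and the real pipeline persists its audit trail in the database.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Article:
    url: str
    body: str
    audit: list[str] = field(default_factory=list)  # ordered names of completed stages


Stage = Callable[[Article], Article]


def run_pipeline(article: Article, stages: list[tuple[str, Stage]]) -> Article:
    """Run every stage in order. An exception stops the article; nothing is skipped."""
    for name, stage in stages:
        article = stage(article)
        article.audit.append(name)
    return article


# Two toy stages standing in for the real eleven (hypothetical).
def normalise(article: Article) -> Article:
    article.body = " ".join(article.body.split())
    return article


def require_body(article: Article) -> Article:
    if not article.body:
        raise ValueError("empty article body")
    return article


STAGES = [("normalise", normalise), ("require_body", require_body)]
result = run_pipeline(Article(url="https://example.com/a", body="  hello   world "), STAGES)
```

Because the stage list is data, the order is the same for every article, and the `audit` list shows exactly how far an article got before any failure.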
Multi-source consensus, never silent.
03 / Verify
An 85% consensus threshold across credible sources gates publication. Disagreements queue for human review with the conflicting passages highlighted, not auto-resolved by a hallucinating model.
Source / ai-pipeline/stage6_credibility.py
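A rough sketch of the gate follows. The 85% threshold comes from the text above; how consensus is actually computed is not stated, so treating it as the fraction of sources that corroborate a claim is an assumption, and `SourceCheck`, `consensus_ratio`, and `gate` are hypothetical names.

```python
from dataclasses import dataclass

CONSENSUS_THRESHOLD = 0.85  # from the text: 85% consensus gates publication


@dataclass
class SourceCheck:
    source: str
    agrees: bool  # assumption: each source either corroborates the claim or does not


def consensus_ratio(checks: list[SourceCheck]) -> float:
    """Fraction of sources that agree; zero when there are no sources."""
    if not checks:
        return 0.0
    return sum(c.agrees for c in checks) / len(checks)


def gate(checks: list[SourceCheck]) -> str:
    """Publish only at or above the threshold; otherwise route to human review."""
    if consensus_ratio(checks) >= CONSENSUS_THRESHOLD:
        return "publish"
    return "human_review"


unanimous = [SourceCheck("Reuters", True), SourceCheck("Nikkei", True), SourceCheck("TechCrunch", True)]
split = [SourceCheck("Reuters", True), SourceCheck("Nikkei", True), SourceCheck("TechCrunch", False)]
```

Under this reading, a three-source story needs all three sources to agree, since two out of three is only about 67%.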
Original journalism in 50+ languages.
04 / Localise
Master Intelligence rewrites rather than translates. A dedicated Bengali prompt corrects transliteration of named entities, the failure mode most multilingual AI ships with. Hindi, Arabic, Spanish, Indonesian, and Tagalog are covered in the same pass.
Source / ai-pipeline/stage2_master_intelligence.py
003 / The pipeline
Eleven stages. One press.
Every article moves through the same eleven stages in order. The schedule is deterministic: Master Intelligence (stage2_master_intelligence.py in the codebase) fires every 40 minutes, and stages 5 to 11 run every minute. The audit trail is the database, not a log file.
- Stage 01
Source monitor
Crawler watches 500+ RSS, sitemap, and HTML sources with IPv6 rotation and Cloudflare bypass.
- Stage 02
Universal extract
Beautiful Soup, Playwright, and a vision-language fallback parse the article body, byline, dateline, and lead image.
- Stage 03
Content DNA
pgvector embeddings cluster duplicates across languages so the same story never gets rewritten twice.
- Stage 04
Master Intelligence
Groq Llama 70B runs translation, categorisation, and the first rewrite in one pass. 80% cost reduction vs the original 7-call pipeline.
- Stage 05
Credibility score
Multi-source consensus check, 85% threshold. Disagreements are flagged for human review, never published silently.
- Stage 06
Fact verification
Named entities, quotes, and numerics are pinned back to the source URLs before the article can advance.
- Stage 07
Copyright filter
Quotes, opinion, and expressive prose are stripped or attributed. Facts are not copyrighted; expression is.
- Stage 08
Story merge
Updates over a 40-minute window are consolidated into a single canonical article, not a stream of duplicates.
- Stage 09
Visual selection
Semantic image search picks the lead photo from Professional Vault. Auto card generation if no licensed photo exists.
- Stage 10
SEO + Bengali polish
Bengali transliteration of named entities is corrected by a dedicated prompt. SEO title, slug, meta, and schema are emitted.
- Stage 11
Auto publish
Pushes to WordPress, the native portal, or your CMS. Reporter celebrations, share cards, and audit trail close the loop.
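To make the Content DNA stage concrete, the sketch below flags a new story as a duplicate when its embedding is close enough to one already stored. It is written in plain Python for readability; in production this comparison would presumably run inside Postgres through pgvector. The 0.9 similarity threshold, the toy three-dimensional vectors, and the function names are assumptions for illustration.

```python
import math
from typing import Optional


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def find_duplicate(
    candidate: list[float],
    seen: dict[str, list[float]],
    threshold: float = 0.9,  # assumed cut-off, not taken from the source
) -> Optional[str]:
    """Return the id of the most similar stored story at or above the threshold."""
    best_id, best_score = None, threshold
    for story_id, vector in seen.items():
        score = cosine_similarity(candidate, vector)
        if score >= best_score:
            best_id, best_score = story_id, score
    return best_id


# Toy embeddings; real ones would come from a multilingual embedding model.
seen_stories = {"story-a": [1.0, 0.0, 0.0], "story-b": [0.0, 1.0, 0.0]}
near_copy = find_duplicate([0.99, 0.05, 0.0], seen_stories)
fresh = find_duplicate([1.0, 1.0, 0.0], seen_stories)
```

Because the comparison happens in embedding space rather than on the text, the same story arriving in two languages can land in the same cluster, provided the embedding model is multilingual.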
004 / Architecture
Six services. One newsroom.
The whole platform is a .NET 10 backend, four Python 3.11 services, and one Next.js 15 frontend, behind a Traefik proxy with Let's Encrypt. Postgres 18 with pgvector underneath.
frontend/
backend/
crawler/
ai-pipeline/
customer-portal-ai-processing/
orchestrator/
005 / Humans, when it matters
Autonomous, with a reporter desk bolted on.
Field reporters file straight into NewsForge through a dedicated dashboard. Submissions skip the 40-minute build-up window and run on the priority lane. When the story publishes, a full celebration screen confirms the byline and shows live share links.
The hero image for every published story is selected from Professional Vault, KaritKarma's central DAM. Faces are matched against one shared person directory, so the same public figure resolves to the same entity across every article.
Reporter dashboard
File text and photos, scored for completeness. Real-time status from draft to published.
frontend/src/app/reporter/
Editor review queue
Failures of the 85% credibility check land here with the conflicting passages highlighted.
backend / review
Hero from Professional Vault
Semantic image search against the shared library. Person-aware matching.
ai-pipeline / stage8 + PV
WordPress auto-publisher
Publishes into existing WordPress sites alongside the native Next.js portal.
portal-auto-publisher/
006 / Comparison
NewsForge vs the CMS incumbents.
Compiled from the public product pages of Arc XP, Brightspot, WordPress VIP, and Drupal as of May 2026. Attributable capabilities only, no marketing claims. NewsForge sits above these as a content production engine that can publish into them or replace them entirely with its own native portal.
| Capability | NewsForge | Arc XP | Brightspot | WP VIP | Drupal |
|---|---|---|---|---|---|
| Autonomous content generation, not just CMS | Bolt-on | ||||
| Multi-source credibility scoring built in | Yes | No | No | No | No |
| Original-rewrite localisation in 50+ languages | Yes | Translation only | Translation only | Plugin | Module |
| Bengali named-entity transliteration handled in-engine | |||||
| pgvector DNA dedup across languages | Yes | No | No | No | No |
| Reporter app with live publish celebration | Yes | Editor only | Editor only | WP Admin | Admin UI |
| Self-hostable, owned hardware | On-prem available, six-figure licence | ||||
| Pricing model | Per portal, KaritKarma SaaS | Enterprise contract | Enterprise contract | From USD 25,000/mo | Free + integrator |
Sources / arcxp.com / brightspot.com / wpvip.com / drupal.org / May 2026
007 / In the wild
khoboria.com runs on this exact stack.
A Bengali-first news portal publishing daily without a manual editorial pipeline. Every article on the homepage has passed the eleven stages above, has a multi-source consensus, and links back to its sources in the audit log.
008 / Questions
Frequently asked.
Mirrored in FAQPage JSON-LD so search and answer engines can lift these verbatim.
What is NewsForge?
Does NewsForge replace my CMS, or sit alongside it?
How does the 11-stage pipeline work, and is it truly autonomous?
Does NewsForge handle Bengali content properly?
How does NewsForge stay copyright-compliant?
How does NewsForge compare to Arc XP, Brightspot, WordPress VIP, and Drupal?
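For readers unfamiliar with FAQPage markup, the sketch below builds the schema.org structure from question and answer pairs. The `faq_jsonld` helper is hypothetical, and the single answer shown is taken from the product description earlier on this page; the site's real answers are not reproduced here.

```python
import json


def faq_jsonld(pairs: list[tuple[str, str]]) -> dict:
    """Build a schema.org FAQPage object from (question, answer) pairs."""
    return {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in pairs
        ],
    }


faq = faq_jsonld([
    ("What is NewsForge?", "NewsForge is KaritKarma's autonomous newsroom infrastructure."),
])

# The serialised string is what would sit inside a <script type="application/ld+json"> tag.
markup = json.dumps(faq, ensure_ascii=False, indent=2)
```

Passing `ensure_ascii=False` keeps Bengali and other non-Latin answers readable in the emitted markup instead of escaping them.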
Launch your newsroom
One stack. Every wire. Tomorrow's edition, already on press.
Brief us on the language, the desk, and the publish target. You get a NewsForge tenant with the eleven-stage pipeline live, a reporter dashboard, a hero-image pool from Professional Vault, and a Bengali transliteration prompt that has shipped in production.