Utkarsh M — Principal Software Engineer

Problem

A sexual health education platform needed to reach teenagers in conservative school environments. Network filters blocked content. Bandwidth was unreliable. The content itself required careful moderation to stay accurate and age-appropriate.

Context

The platform had real medical content written by doctors. It also had to pass through ISP-level keyword filters that couldn't distinguish between clinical terminology and prohibited content. Schools were blocking the domain entirely.

Constraint

Must work on 2G connections. Must survive keyword filters without compromising medical accuracy. Content moderation had to be human-reviewed but scale to thousands of new submissions per month.

The decision

What we chose and why.

Rebuilt as an offline-first PWA with content stored as a semantic knowledge graph rather than keyword-indexed documents. Content was tagged with semantic relationships, not raw text. Search worked against the graph, not the content. Filtering systems saw structure, not sensitive terms.
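
As a rough illustration of searching the graph rather than the text, here is a minimal sketch. It assumes content is tagged with opaque concept IDs and that search is a bounded breadth-first walk over concept relationships. The IDs, the two dictionaries, and the `search` function are all hypothetical and do not reflect the platform's actual schema.

```python
from collections import deque

# Hypothetical data: opaque concept IDs and the relationships between them.
RELATED = {
    "c101": ["c102", "c103"],
    "c102": ["c104"],
    "c103": [],
    "c104": [],
}

# Hypothetical index: which article IDs are tagged with which concept.
ARTICLES_BY_CONCEPT = {
    "c101": ["a1"],
    "c102": ["a2", "a3"],
    "c104": ["a4"],
}


def search(start, max_hops=1):
    """Walk the concept graph breadth-first and collect tagged article IDs.

    Only concept and article IDs are handled here; no raw text is matched.
    """
    seen = {start}
    queue = deque([(start, 0)])
    results = []
    while queue:
        concept, hops = queue.popleft()
        for article in ARTICLES_BY_CONCEPT.get(concept, []):
            if article not in results:
                results.append(article)
        # Expand to related concepts only while within the hop budget.
        if hops < max_hops:
            for neighbour in RELATED.get(concept, []):
                if neighbour not in seen:
                    seen.add(neighbour)
                    queue.append((neighbour, hops + 1))
    return results
```

In this sketch, raising `max_hops` widens the search to more loosely related concepts. The query and the results stay as identifiers until the client resolves them against locally cached content.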

Tradeoffs

Semantic knowledge graph over CMS with keyword search

Keyword-based content is trivially blocked by network filters. Graph traversal exposes relationships, not sensitive terms.

Offline-first PWA over server-rendered pages

In low-bandwidth environments, connectivity is intermittent. Content cached during a good connection must be available during a bad one.

Human-in-the-loop moderation at confidence thresholds over fully automated moderation

Automated moderation on medical content either over-blocks legitimate content or under-catches harmful content. Humans review the ambiguous cases; automation handles the clear ones.

Architecture

[Interactive architecture diagram. Legend: Service, Data Store, Queue / Bus, Client.]

The failure

The first moderation pipeline was fully automated and used a conservative classifier. It wrongly flagged 31% of legitimate medical content as sensitive and removed it. Doctors who contributed content stopped submitting.

DISCOVERED — After two months of declining content quality. A doctor who stopped contributing emailed to explain why.

IMPACT — Lost 40% of content contributors. Had to rebuild the content library from scratch.

Iteration

Added confidence scores to every moderation decision. Content above 0.95 confidence is auto-approved. Content below 0.60 is auto-rejected. Everything in between goes to a queue reviewed by a qualified health professional within 24 hours. Contributor satisfaction returned to baseline within one month.
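
The threshold routing described above can be sketched as follows. This assumes the score is the classifier's confidence that a submission is acceptable, and that scores of exactly 0.95 or 0.60 fall into the human review band. Both are assumptions, and the names `Decision` and `route` are hypothetical.

```python
from enum import Enum


class Decision(Enum):
    APPROVE = "auto_approve"
    REJECT = "auto_reject"
    REVIEW = "human_review"


# Thresholds taken from the text above.
AUTO_APPROVE_ABOVE = 0.95
AUTO_REJECT_BELOW = 0.60


def route(confidence: float) -> Decision:
    """Map a classifier confidence score to a moderation outcome."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError(f"confidence must be in [0, 1], got {confidence}")
    if confidence > AUTO_APPROVE_ABOVE:
        return Decision.APPROVE
    if confidence < AUTO_REJECT_BELOW:
        return Decision.REJECT
    # Ambiguous cases (boundaries included, by assumption) go to the
    # health-professional review queue.
    return Decision.REVIEW
```

Keeping the thresholds as named constants makes them easy to recalibrate as reviewers' decisions accumulate.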

Outcome

200K+ students reached. Content stays online.

The platform is now used in school curricula across three states. Health teachers report that students engage with content outside classroom hours. The semantic search approach was later licensed to two other health education platforms.

Lessons

Know your adversary. Design for the network filters, the firewall rules, the conservative keyword blockers.

Automated moderation on sensitive content requires calibration, not confidence. Know what you're optimizing for.

Offline-first is not a feature. In low-connectivity environments, it's the product.

When domain experts stop contributing, the problem is the process, not the experts.