biopanel.io — AI-Powered Health Analytics Platform
Project Summary
Project: biopanel.io
Type: Personal product — built and owned end-to-end
Stack: Python · FastAPI · Celery · Redis · PostgreSQL · Supabase · React 18 · TypeScript · Docker · LLMs
What it does: Extracts structured biomarker data from blood-test PDFs across any laboratory, language, or document layout, with zero hardcoded parsing rules. Users upload a lab report; the system then normalizes all values, flags out-of-range markers, and renders an interactive health dashboard.
The Problem
Blood test results come in hundreds of different PDF formats — different labs, different languages, different layouts, different reference ranges. Traditional extraction approaches require lab-specific parsing rules that break the moment a format changes.
The challenge: build a system that extracts biomarkers from any PDF, without knowing the layout in advance, at production-grade reliability.
Architecture & Approach
I designed an asynchronous extraction pipeline that treats each PDF as a structured extraction problem for a language model — not a pattern-matching problem.
PDF Upload → Celery Task → PDF Parser → LLM Extraction → Normalization → PostgreSQL
                                              ↓
                                    Multi-provider fallback
                                    GPT-4o → Claude Sonnet
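The flow above can be sketched as a chain of typed stages. This is a minimal illustration in plain Python so it runs without a broker; it is not the production code. The dataclass names (`RawMarker`, `NormalizedMarker`), the `UNIT_FACTORS` table, and the stubbed `extract` stage are all hypothetical. In the real pipeline each function would presumably be a Celery task, and `extract` would call a language model.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RawMarker:
    """One biomarker as extracted from the report, before normalization."""
    name: str
    value: float
    unit: str
    ref_low: Optional[float]
    ref_high: Optional[float]


@dataclass(frozen=True)
class NormalizedMarker:
    """A biomarker converted to a canonical unit, with a range flag."""
    name: str
    value: float
    unit: str
    flag: str  # "low", "normal", "high", or "unknown"


# Hypothetical table: (marker, source unit) -> (canonical unit, factor).
# The glucose factor is approximate and for illustration only.
UNIT_FACTORS = {
    ("glucose", "mg/dL"): ("mmol/L", 1 / 18.016),
    ("glucose", "mmol/L"): ("mmol/L", 1.0),
}


def extract(pdf_text: str) -> list[RawMarker]:
    """Stand-in for the LLM extraction stage; returns fixed demo data."""
    return [RawMarker("glucose", 126.0, "mg/dL", 70.0, 99.0)]


def normalize(markers: list[RawMarker]) -> list[NormalizedMarker]:
    """Convert each marker to its canonical unit and flag it against its range."""
    out = []
    for m in markers:
        key = (m.name.lower(), m.unit)
        if key not in UNIT_FACTORS:
            # Fail loudly instead of passing an unconverted value downstream.
            raise ValueError(f"no unit conversion for {key}")
        unit, factor = UNIT_FACTORS[key]
        value = m.value * factor
        if m.ref_low is None or m.ref_high is None:
            flag = "unknown"
        elif value < m.ref_low * factor:
            flag = "low"
        elif value > m.ref_high * factor:
            flag = "high"
        else:
            flag = "normal"
        out.append(NormalizedMarker(m.name.lower(), round(value, 2), unit, flag))
    return out


def run_pipeline(pdf_text: str) -> list[NormalizedMarker]:
    """Chain the stages; each stage accepts exactly the previous stage's output type."""
    return normalize(extract(pdf_text))
```

Calling `run_pipeline("...")` on the demo data yields a single glucose marker in mmol/L flagged as high. Because the reference range is converted with the same factor as the value, the flag does not depend on which unit the lab reported.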
Key architectural decisions:
- Typed DAG via Celery + Redis: each stage is an explicit async task with typed inputs/outputs — no silent failures
- Multi-provider LLM fallback: GPT-4o primary, Claude Sonnet as fallback — resilient to provider outages
- Streaming via Server-Sent Events: extraction results stream to the frontend as they arrive — no polling
- pgvector for biomarker search: normalized biomarkers stored with vector embeddings for semantic search
- Zero hardcoded rules: the LLM handles layout variability — the pipeline handles structure and validation
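The multi-provider fallback can be sketched as an ordered list of callables tried in turn. This is a minimal sketch under assumptions: `extract_with_fallback`, `AllProvidersFailed`, and the two stand-in provider functions are hypothetical names, and the real SDK calls are replaced by stubs so the example runs offline.

```python
from typing import Callable, Sequence


class AllProvidersFailed(RuntimeError):
    """Raised when every provider in the chain has failed."""


def extract_with_fallback(
    prompt: str,
    providers: Sequence[tuple[str, Callable[[str], str]]],
) -> tuple[str, str]:
    """Try each provider in order; return (provider_name, response) from the first success."""
    errors = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except Exception as exc:  # a real version would catch provider-specific errors
            errors.append(f"{name}: {exc}")
    # Surface every failure so an outage is never silent.
    raise AllProvidersFailed("; ".join(errors))


# Hypothetical stand-ins for the real SDK calls.
def primary(prompt: str) -> str:
    raise TimeoutError("simulated outage")


def fallback(prompt: str) -> str:
    return '{"markers": []}'


if __name__ == "__main__":
    chain = [("gpt-4o", primary), ("claude-sonnet", fallback)]
    print(extract_with_fallback("Extract biomarkers as JSON.", chain))
```

Returning the provider name alongside the response makes it possible to record which model produced each extraction, which helps when comparing output quality between providers.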
Tech Stack
Backend
- FastAPI — async REST API with full OpenAPI docs
- Celery + Redis — async task queue with retry logic and typed DAG
- PostgreSQL + pgvector — persistence and vector search
- Supabase — auth and storage layer
- Alembic — database migrations
- Docker Compose — local development and deployment
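To illustrate what the vector search in the list above does, here is a small sketch of nearest-neighbour ranking by cosine distance, the quantity pgvector's `<=>` operator computes. It is written in plain Python rather than SQL so it runs without a database. The toy three-dimensional embeddings and the `nearest` helper are hypothetical; real embeddings would come from an embedding model and the ranking would happen inside PostgreSQL.

```python
import math


def cosine_distance(a: list[float], b: list[float]) -> float:
    """Return 1 - cosine similarity; assumes neither vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm


# Toy embeddings for illustration only.
EMBEDDINGS = {
    "hemoglobin": [0.9, 0.1, 0.0],
    "hematocrit": [0.8, 0.2, 0.1],
    "vitamin d": [0.0, 0.1, 0.9],
}


def nearest(query: list[float], k: int = 2) -> list[str]:
    """Return the k marker names whose embeddings are closest to the query."""
    ranked = sorted(EMBEDDINGS, key=lambda name: cosine_distance(EMBEDDINGS[name], query))
    return ranked[:k]
```

With these toy vectors, a query pointing along the first axis ranks the two blood-count markers ahead of vitamin D, which is the behaviour semantic search relies on: related markers sit close together regardless of how a lab spells them.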
Frontend
- React 18 + TypeScript — full type safety end-to-end
- Custom chart components for biomarker visualization
- Responsive mobile-first layout
- Real-time SSE for extraction progress
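On the server side, the extraction-progress stream comes down to emitting text frames in the Server-Sent Events wire format. The sketch below shows that framing only; the event names (`progress`, `done`) and payload fields are assumptions, not the product's actual schema.

```python
import json
from typing import Iterator


def sse_frame(event: str, payload: dict) -> str:
    """Format one SSE message: an event name, one data line, and a blank-line terminator."""
    # json.dumps emits no raw newlines by default, so a single data: line is enough.
    return f"event: {event}\ndata: {json.dumps(payload)}\n\n"


def progress_stream(stages: list[str]) -> Iterator[str]:
    """Yield one progress frame per pipeline stage, then a final done frame."""
    for step, stage in enumerate(stages, start=1):
        yield sse_frame("progress", {"stage": stage, "step": step, "of": len(stages)})
    yield sse_frame("done", {})


if __name__ == "__main__":
    for frame in progress_stream(["parse", "extract", "normalize"]):
        print(frame, end="")
```

In FastAPI, a generator like this can be returned through `StreamingResponse` with `media_type="text/event-stream"`, and the browser can consume it with the standard `EventSource` API.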
AI / LLM
- OpenAI GPT-4o — primary extraction model
- Anthropic Claude Sonnet — fallback model
- Custom prompt engineering for structured biomarker extraction
- LangChain for chain orchestration
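Structured extraction only works if the model's reply is checked before it enters the pipeline. The sketch below shows one way to do that with the standard library. The JSON shape (an object with a `markers` list of `name`, `value`, `unit` entries) and the names `Marker` and `parse_markers` are assumptions for illustration, not the actual prompt contract.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Marker:
    name: str
    value: float
    unit: str


def parse_markers(llm_output: str) -> list[Marker]:
    """Parse and validate a model response; raise ValueError on anything malformed."""
    try:
        data = json.loads(llm_output)
    except json.JSONDecodeError as exc:
        raise ValueError(f"model did not return valid JSON: {exc}") from exc
    if not isinstance(data, dict) or not isinstance(data.get("markers"), list):
        raise ValueError("expected an object with a 'markers' list")

    markers = []
    for i, item in enumerate(data["markers"]):
        if not isinstance(item, dict):
            raise ValueError(f"marker {i} is not an object")
        name, value, unit = item.get("name"), item.get("value"), item.get("unit")
        if not isinstance(name, str) or not name.strip():
            raise ValueError(f"marker {i}: missing name")
        # bool is a subclass of int, so exclude it explicitly.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            raise ValueError(f"marker {i}: value must be numeric")
        if not isinstance(unit, str):
            raise ValueError(f"marker {i}: missing unit")
        markers.append(Marker(name.strip(), float(value), unit))
    return markers
```

A validation failure raises an ordinary exception, so a caller could treat it the same way as a provider outage: retry, or hand the document to the next model in the chain.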
Results
- Extracts biomarkers from PDFs across multiple labs, languages, and layouts with no layout-specific configuration
- End-to-end extraction completes in under 30 seconds for a typical lab report
- Multi-provider fallback ensures availability when individual providers have outages
- Deployed as a public product at biopanel.io
What this demonstrates
This project exists to prove a specific capability: I can take an idea from zero to a production AI product, covering architecture, backend, frontend, and deployment. It is the most complete demonstration of my full-stack and AI engineering depth.
For data engineering clients: the same pipeline design patterns (async DAGs, typed stages, observability, fallback logic) are how I approach enterprise data workflows.
Interested in AI-powered data systems?
Let's talk about what this kind of architecture could look like for your use case.