🧠 AuditVision: Turning FDA Compliance Chaos into Clarity
Tags: Python, OCR, Django, OpenCV, FDA 483, Data Analytics, AI in Pharma
When I joined Zydus Lifesciences, I saw firsthand how painful and time-consuming it was to handle FDA Form 483 documents — critical post-audit observations that demand fast, accurate action.
Most of these PDFs weren’t even text-selectable. They were scanned image documents that QA teams had to retype manually. One audit could take up to a week to process — and there were dozens per year. It was slow, error-prone, and draining valuable resources.
🚨 The Pain: A Broken Process
QA teams would comb through these flat PDFs, retyping line by line, trying to extract key insights: auditor names, firm IDs, root causes, trends. The process was manual, fragmented, and inconsistent across sites. There was no centralized system, no dashboards — just Excel sheets and shared folders.
🧩 The Spark: Let’s Build Something Better
That’s when I built AuditVision — a platform that could turn these scanned FDA documents into clean, structured data and meaningful visual insights. From OCR to web dashboards, the goal was simple: automate what could be automated, and surface what mattered.
🔧 Building the System
📄 Step 1: Converting PDFs to Structured Data
- Used Poppler to convert multi-page PDFs into high-res images
- Applied OpenCV with custom morphological filters to:
  - Detect table lines
  - Remove scan noise and whitespace
  - Segment images into top (30%), middle (observations), bottom (footer)
- Ran Tesseract OCR on each zone independently for better accuracy
- Cleaned and parsed text using regex, converting it into Pandas DataFrames with fields like firm name, auditor, date, FEI number
- Merged with QA Excel data to enrich with metadata (plant, department, etc.)
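To make the zoning and regex-parsing steps concrete, here is a minimal sketch using only the standard library. It is an illustration under stated assumptions, not the production pipeline: the 10% footer share, the label spellings ("Firm Name:", "FEI Number:", "Date:", "Auditor:"), and the names zone_bounds, HeaderFields, and parse_header are all hypothetical. Real OCR output from a Form 483 would need patterns tuned to its actual layout.

```python
import re
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


def zone_bounds(height: int, top_pct: int = 30, footer_pct: int = 10) -> Dict[str, Tuple[int, int]]:
    """Return (start_row, end_row) pixel ranges for the three page zones.

    top_pct follows the 30% figure above; footer_pct is an assumed value.
    """
    top_end = height * top_pct // 100
    footer_start = height - height * footer_pct // 100
    return {
        "top": (0, top_end),
        "middle": (top_end, footer_start),
        "bottom": (footer_start, height),
    }


@dataclass
class HeaderFields:
    firm_name: Optional[str]
    fei_number: Optional[str]
    date: Optional[str]
    auditor: Optional[str]


def _labelled(label: str, value: str) -> "re.Pattern[str]":
    # Match "<label> : <value>" at the start of a line, ignoring case.
    return re.compile(rf"^[ \t]*{label}[ \t]*:[ \t]*{value}", re.IGNORECASE | re.MULTILINE)


# Hypothetical label patterns for the OCR text of the top zone.
PATTERNS = {
    "firm_name": _labelled("Firm Name", r"(.+)"),
    "fei_number": _labelled("FEI Number", r"(\d+)"),
    "date": _labelled("Date", r"([\d/.-]+)"),
    "auditor": _labelled("Auditor", r"(.+)"),
}


def parse_header(text: str) -> HeaderFields:
    """Extract header fields from OCR text; fields that are not found stay None."""
    values = {}
    for name, pattern in PATTERNS.items():
        match = pattern.search(text)
        values[name] = match.group(1).strip() if match else None
    return HeaderFields(**values)
```

Each page image would be sliced with the row ranges from zone_bounds before OCR, and the text of the top zone passed to parse_header. A list of such records converts directly to a table, for example with pd.DataFrame([asdict(r) for r in records]) if pandas is installed.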
💻 Step 2: Interactive Dashboards
Once the data pipeline was set, I built a secure Django web app with real-time analytics using Chart.js, Plotly, and DataTables. Users could:
- Track top auditors and their observation patterns
- View company-wise and year-wise compliance trends
- Compare auditor performance across regions and time
- Filter, search, and export relevant slices of data
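The aggregations behind views like these can be sketched in plain Python. The example below is illustrative only: the record shape (auditor, firm, year) and the function names are assumptions, and the real app may well have computed these in SQL or Pandas instead.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, List, Tuple

Row = Dict[str, object]


def top_auditors(rows: Iterable[Row], n: int = 5) -> List[Tuple[str, int]]:
    """Rank auditors by the number of observations attributed to them."""
    return Counter(row["auditor"] for row in rows).most_common(n)


def yearly_counts_by_firm(rows: Iterable[Row]) -> Dict[str, Dict[int, int]]:
    """Count observations per firm and year, with years in ascending order."""
    counts: Dict[str, Counter] = defaultdict(Counter)
    for row in rows:
        counts[row["firm"]][row["year"]] += 1
    return {firm: dict(sorted(years.items())) for firm, years in counts.items()}


# Hypothetical sample rows, shaped like the parsed observation table.
observations = [
    {"auditor": "Auditor A", "firm": "Firm X", "year": 2021},
    {"auditor": "Auditor A", "firm": "Firm X", "year": 2022},
    {"auditor": "Auditor B", "firm": "Firm X", "year": 2022},
    {"auditor": "Auditor A", "firm": "Firm Y", "year": 2022},
]

print(top_auditors(observations, n=1))
print(yearly_counts_by_firm(observations)["Firm X"])
```

Results in this shape (label/count pairs and per-year series) serialize cleanly to JSON, which is the form a charting library on the front end would typically consume.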
🔐 Step 3: Securing Compliance
- Role-based access control with separate user/admin views
- Full audit trail of logins, CRUD operations, exports with IPs and timestamps
- Locked Excel exports (row limits + embedded user metadata for traceability)
- Live session APIs for stats, termination, and timeout handling
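As a rough, framework-independent illustration of the export limits, audit trail, and session timeouts listed above, consider the sketch below. The 500-row cap, the 30-minute timeout, the field names, and the functions build_export and session_expired are all assumptions made for the example. The actual Django implementation and the Excel locking itself are not shown.

```python
from datetime import datetime, timedelta, timezone
from typing import Any, Dict, List, Optional, Tuple

MAX_EXPORT_ROWS = 500                   # assumed cap, not a figure from the project
SESSION_TIMEOUT = timedelta(minutes=30)  # assumed idle timeout


def build_export(
    rows: List[Dict[str, Any]],
    username: str,
    ip_address: str,
    max_rows: int = MAX_EXPORT_ROWS,
    now: Optional[datetime] = None,
) -> Tuple[List[Dict[str, Any]], Dict[str, Any]]:
    """Apply the row limit and build the audit-trail entry for one export."""
    now = now or datetime.now(timezone.utc)
    limited = rows[:max_rows]
    audit_entry = {
        "action": "export",
        "user": username,
        "ip_address": ip_address,
        "timestamp": now.isoformat(),
        "row_count": len(limited),
        "truncated": len(rows) > len(limited),
    }
    return limited, audit_entry


def session_expired(
    last_activity: datetime,
    now: datetime,
    timeout: timedelta = SESSION_TIMEOUT,
) -> bool:
    """Return True when the session has been idle for longer than the timeout."""
    return now - last_activity > timeout
```

The same audit entry can be written to the log table and embedded in the exported file, so that a leaked spreadsheet traces back to a user, an IP address, and a timestamp.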
⚡ The Impact
- Reduced processing time per audit from up to 1 week to 1 day
- Enabled centralized, real-time visibility across multiple manufacturing sites
- Adopted across both Corporate QA and plant-level teams
- Improved traceability, transparency, and audit readiness
💡 Reflection
What started as a small automation script turned into a full-scale internal product. AuditVision blended OCR, analytics, UX, and security into one unified platform — and made life easier for people on the ground.
To me, this project wasn’t just about writing code. It was about understanding a real-world problem and designing a solution that respected the complexity while simplifying the work.
Tech Stack: Python, Django, OpenCV, Tesseract, Poppler, Pandas, Chart.js, Plotly, DataTables, SQL
Company: Zydus Lifesciences | Status: Internal deployment