Bhinu Puvvala
Software Engineer & Data Scientist building systems that perform at scale — from distributed pipelines to on-device AI.
Background
Experience
Methix (Microsoft-backed) · New York, NY
- Reduced p95 API latency by 30% for 5,000+ weekly users via Redis caching across Python/Flask services
- Cut database query time by 52% (250ms → 120ms) by redesigning PostgreSQL indexes; sustained 2× peak traffic
- Shipped production observability: structured logging, distributed tracing, automated health checks
GLUE Lab, UW–Madison · Madison, WI
- Built distributed pipelines over 10TB+ of multispectral data with Dask & GeoPandas, cutting inference latency by 86%
- Improved scene classification accuracy by 20% via automated feature extraction modules
- Reduced false positive rate by 53% through ablation testing across geographic regions
Diyar United Company · Kuwait City
- Automated Python/SQL ETL pipelines processing 500K+ records/day, delivering 3× throughput and saving 15+ hrs/week
- Designed star-schema data warehouse + Power BI dashboards serving 10K+ clients
Education
University of Wisconsin–Madison
B.S. Computer Science & Data Science
Minor: Economic Analytics
Expected May 2027
Coursework
Technical Skills
Languages
Frameworks
Data & Processing
Infrastructure
Projects
View all on GitHub →
Videre
Fully offline, privacy-first AI video editor running entirely on-device with zero cloud dependency. Deployed Whisper & SigLIP2 via ONNX Runtime QNN for on-chip NPU acceleration — real-time transcription with word-level timestamps on edge hardware.
WhismurAI
Real-time audio translation pipeline supporting 20+ languages. Streaming STT → semantic translation → TTS under strict sub-second latency constraints, preserving speaker identity across passes.
Gesture
Real-time human-computer interaction via gesture recognition. 98% accuracy at 30ms latency with 12+ user-defined commands integrated with productivity tools.
Memory Allocator
Custom heap allocator in C with malloc/free/realloc, block splitting, coalescing, segregated free lists, and per-arena mutex locking to minimize lock contention under concurrent allocation.
Let's build something.
Open to SWE internships, research roles, and interesting side projects.