Available for opportunities · Summer 2026

Bhinu Puvvala

Software Engineer & Data Scientist building systems that perform at scale — from distributed pipelines to on-device AI.

Get in touch · github.com/bhinu · LinkedIn

Background

Experience

Software Engineering Intern · May – Aug 2025

Methix (Microsoft-backed) · New York, NY

—Reduced p95 API latency by 30% for 5,000+ weekly users via Redis caching across Python/Flask services
—Cut database query time by 52% (250ms → 120ms) by redesigning PostgreSQL indexes — sustained 2× peak traffic
—Shipped production observability: structured logging, distributed tracing, automated health checks
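
The Redis caching mentioned above is commonly done with a cache-aside read path. The following is a minimal sketch of that pattern, not the code used at Methix: `TTLCache` is a hypothetical in-memory stand-in for a Redis client (a real service would issue Redis GET/SETEX calls instead), and `get_with_cache`, `loader`, and the key format are illustrative names.

```python
import time
from typing import Any, Callable, Dict, Tuple


class TTLCache:
    """Hypothetical in-memory stand-in for a Redis client with per-key expiry."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._store: Dict[str, Tuple[float, Any]] = {}
        self._clock = clock

    def get(self, key: str) -> Any:
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            # Expired entries are dropped lazily on read.
            del self._store[key]
            return None
        return value

    def set(self, key: str, value: Any, ttl_seconds: float) -> None:
        self._store[key] = (self._clock() + ttl_seconds, value)


def get_with_cache(
    cache: TTLCache,
    key: str,
    loader: Callable[[str], Any],
    ttl_seconds: float = 60.0,
) -> Any:
    """Cache-aside read: try the cache, fall back to the loader, then populate the cache.

    Note: a loader result of None is never cached in this sketch.
    """
    cached = cache.get(key)
    if cached is not None:
        return cached
    value = loader(key)  # e.g. a slow database query (assumed)
    cache.set(key, value, ttl_seconds)
    return value
```

With this shape, repeated reads of the same key within the TTL skip the loader entirely, which is where a latency reduction on hot endpoints would come from.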

Undergraduate Researcher · Jan – May 2025

GLUE Lab, UW–Madison · Madison, WI

—Built distributed pipelines over 10TB+ of multispectral data with Dask & GeoPandas, cutting inference latency by 86%
—Improved scene classification accuracy by 20% via automated feature extraction modules
—Reduced false positive rate by 53% through ablation testing across geographic regions

Data Analyst Intern · Jun – Aug 2024

Diyar United Company · Kuwait City

—Automated Python/SQL ETL pipelines processing 500K+ records/day — 3× throughput, eliminating 15+ hrs/week
—Designed star-schema data warehouse + Power BI dashboards serving 10K+ clients

Education

University of Wisconsin–Madison

B.S. Computer Science & Data Science
Minor: Economic Analytics

Expected May 2027

3.74

GPA · Dean's List

Coursework

Operating Systems · Algorithms & DS · Big Data Systems · Computer Vision · Econometrics · Machine Org.

Technical Skills

Languages

Python · C/C++ · TypeScript · JavaScript · Java · SQL · Bash

Frameworks

React · FastAPI · Flask · Next.js · Electron · TensorFlow · PyTorch · ONNX Runtime

Data & Processing

Pandas · NumPy · Dask · GeoPandas · FFmpeg · OpenCV

Infrastructure

Docker · Redis · PostgreSQL · Kubernetes · MongoDB · MySQL · CI/CD

Projects

View all on GitHub →

🏆 Qualcomm Track Winner · MadData 2026

Videre

Fully offline, privacy-first AI video editor running entirely on-device with zero cloud dependency. Deployed Whisper & SigLIP2 via ONNX Runtime QNN for on-chip NPU acceleration — real-time transcription with word-level timestamps on edge hardware.

Zero cloud deps · real-time NPU inference

Qualcomm NPU · ONNX Runtime QNN · Whisper · SigLIP2 · FFmpeg · React + Electron

WhismurAI

Real-time audio translation pipeline supporting 20+ languages. Streaming STT → semantic translation → TTS under strict sub-second latency constraints, preserving speaker identity across passes.

20+

languages · sub-second latency

React · FastAPI · Web Audio API · Deepgram · OpenAI · Fish Audio

Gesture

Real-time human-computer interaction via gesture recognition. 98% accuracy at 30ms latency with 12+ user-defined commands integrated with productivity tools.

98%

accuracy · 30ms latency

TensorFlow · MediaPipe · OpenCV · Electron.js

Memory Allocator

Custom heap allocator in C implementing malloc/free/realloc with block splitting, coalescing, segregated free lists, and per-arena mutex locking to minimize lock contention under concurrent allocation.

memory leaks · thread-safe

C · pthreads · Linux

Let's build something.

Open to SWE internships, research roles, and interesting side projects.

bhinusantosh@gmail.com · LinkedIn · GitHub