Overview
PESU Venture Labs is a university-affiliated technology incubator where student developers work on real client projects. Unlike traditional internships or academic projects, the work meant delivering production-ready solutions to paying external clients: real businesses with real problems and tight deadlines.
Over two years as an Applied AI Developer, I worked across diverse domains: computer vision, semantic search, generative AI, and document processing. Each project taught me valuable lessons about bridging the gap between research and production.
Major Projects
1. Advertease - Real-Time Advertisement Detection
Client Need: A media analytics company needed to detect advertisements in live broadcast streams to verify ad placement and measure screen time.
Technical Approach:
- Built a YOLO-based object detection system trained to recognize TV commercial patterns
- Implemented confidence-scored event emission (probability scores for each detection)
- Optimized end-to-end video processing latency to sub-700ms (critical for real-time use)
Key Challenges:
- Latency Requirements: 700ms end-to-end meant every component needed optimization—video decoding, model inference, post-processing
- False Positive Control: Ads and regular content can look similar. Implemented temporal consistency checks (an "ad" should last 15-30 seconds, not flicker on/off)
- Model Size vs. Accuracy Trade-off: Smaller YOLO models run faster but sacrifice accuracy. Settled on YOLOv5-medium after extensive benchmarking
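The temporal consistency check above can be sketched as a simple segment filter over frame-level detections. This is a minimal illustration, not the production code; the frame rate, threshold, and example values here are made up.

```python
# Sketch of a temporal consistency filter for frame-level ad detections.
# A run of "ad" frames shorter than a minimum duration is treated as a
# false positive (flicker) and suppressed.

def filter_short_segments(frame_flags, fps, min_seconds=15.0):
    """Drop 'ad' segments shorter than min_seconds; return smoothed flags."""
    min_frames = int(min_seconds * fps)
    filtered = list(frame_flags)
    i, n = 0, len(filtered)
    while i < n:
        if filtered[i]:
            j = i
            while j < n and filtered[j]:
                j += 1
            if j - i < min_frames:          # too short to be a real ad break
                for k in range(i, j):
                    filtered[k] = False
            i = j
        else:
            i += 1
    return filtered

# Example: at 10 fps, a 5-frame blip (0.5 s) is removed,
# while a 200-frame run (20 s) survives.
flags = [False] * 10 + [True] * 5 + [False] * 10 + [True] * 200 + [False] * 5
smoothed = filter_short_segments(flags, fps=10)
```

In production the same idea was applied on top of per-frame confidence scores rather than hard booleans, but the filtering logic is the same.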
Outcome: Deployed system processed live streams with 95% detection accuracy and sustained sub-700ms latency. Client used it for ad placement verification across multiple TV channels.
2. Document Processing Pipeline
Client Need: Government contractor processing thousands of scanned forms daily needed automated detection of predefined visual markers (checkboxes, signatures, stamps).
Technical Approach:
- YOLO-based detection pipeline for visual markers in scanned documents
- Data pipeline to process and annotate training data
- Post-processing to extract structured data from detection results
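The post-processing step can be illustrated as mapping raw detector output (label, box, confidence) onto predefined form-field regions. The field names, coordinates, and confidence threshold below are hypothetical; the real templates were client-specific.

```python
# Illustrative post-processing: assign each detection to the form field
# whose region contains the detection's center point.

def box_center(box):
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2, (y1 + y2) / 2)

def contains(region, point):
    x1, y1, x2, y2 = region
    px, py = point
    return x1 <= px <= x2 and y1 <= py <= y2

def extract_fields(detections, field_regions, min_conf=0.5):
    """Return {field_name: label} for confident detections inside a region."""
    result = {}
    for label, box, conf in detections:
        if conf < min_conf:                 # drop low-confidence detections
            continue
        for name, region in field_regions.items():
            if contains(region, box_center(box)):
                result[name] = label
                break
    return result

# Hypothetical template regions and detections for one scanned form.
fields = {"consent_checkbox": (100, 100, 140, 140),
          "signature_area": (300, 500, 600, 560)}
dets = [("checkbox_checked", (105, 108, 132, 135), 0.93),
        ("signature", (310, 505, 590, 555), 0.88)]
structured = extract_fields(dets, fields)
```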
The Data Challenge:
The client provided only ~1,000 annotated samples, far too few for robust YOLO training. To close the gap, I:
- Built data augmentation pipeline (rotation, scaling, noise injection, perspective transforms)
- Expanded dataset from ~1K to ~25K annotations
- Implemented active learning loop—model predictions on unlabeled data, human review of uncertain cases, retrain
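The uncertainty-sampling step of the active learning loop above can be sketched in a few lines: surface the unlabeled samples whose model confidence sits closest to the decision boundary for human review. The thresholds and sample IDs are illustrative, not the production values.

```python
# Sketch of uncertainty sampling for the active learning loop:
# pick unlabeled samples with ambiguous confidence for human review.

def select_for_review(predictions, low=0.35, high=0.65, budget=100):
    """predictions: list of (sample_id, confidence). Return uncertain ids,
    most ambiguous (closest to 0.5) first, capped at the review budget."""
    uncertain = [(sid, c) for sid, c in predictions if low <= c <= high]
    uncertain.sort(key=lambda sc: abs(sc[1] - 0.5))
    return [sid for sid, _ in uncertain[:budget]]

preds = [("doc_001", 0.97), ("doc_002", 0.51), ("doc_003", 0.12),
         ("doc_004", 0.60), ("doc_005", 0.38)]
queue = select_for_review(preds, budget=2)
```

After review, the newly labeled samples go back into the training set and the model is retrained, closing the loop.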
Outcome: Achieved ~95% precision in deployment tests. System processed 10,000+ documents per day, reducing manual review time by 80%.
3. OneSKU - Vendor Catalog Alignment
(See dedicated OneSKU project page for full technical details)
Brief Summary: Built hybrid BM25 + embedding retrieval system for aligning heterogeneous vendor catalogs. Achieved sub-15s query latency across multi-million SKU inventories.
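The core hybrid-retrieval idea can be sketched as score fusion: normalize the lexical (BM25) and semantic (embedding similarity) score lists per query, then blend them with a tunable weight. The 0.5 weight and the toy scores are placeholders; the full system is described on the OneSKU page.

```python
# Minimal sketch of hybrid retrieval score fusion (BM25 + embeddings).

def minmax(scores):
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]

def hybrid_rank(bm25_scores, embed_scores, alpha=0.5):
    """Blend normalized score lists; return document indices, best first."""
    b, e = minmax(bm25_scores), minmax(embed_scores)
    fused = [alpha * bi + (1 - alpha) * ei for bi, ei in zip(b, e)]
    return sorted(range(len(fused)), key=lambda i: fused[i], reverse=True)

# Doc 2 is mediocre lexically but strong semantically; fusion surfaces it.
ranking = hybrid_rank([12.0, 3.0, 7.0], [0.10, 0.20, 0.95])
```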
4. Sustify - AI-Driven Sustainability Marketplace
(See dedicated Sustify project page for full details)
Brief Summary: Founded an AI-driven sustainability marketplace using transformer-based embeddings for automated vendor matching. Secured early pilot interest before pivoting.
Cross-Project Learnings
Production AI is Different from Research AI
Academic ML focuses on pushing state-of-the-art metrics. Production ML focuses on:
- Reliability: Can this run 24/7 without crashing?
- Latency: Does it respond fast enough for real users?
- Maintainability: Can someone else debug this when I'm gone?
- Cost: Can the client afford to run this at scale?
This mindset shift—from "what's the best model?" to "what's the right model for this production context?"—was transformative.
Data Quality > Model Sophistication
On multiple projects, I learned that:
- Cleaning and augmenting data improved results more than trying fancier models
- Domain-specific data engineering (e.g., separating categorical vs. numerical features in OneSKU) mattered more than hyperparameter tuning
- Active learning loops (model → predictions → human review → retrain) beat passive dataset collection
Latency Budgets Force Trade-offs
Real-time systems teach you about trade-offs fast. For Advertease:
- Could have used larger, more accurate YOLO model → missed latency target
- Could have used smaller model → accuracy too low
- Final solution: medium model + aggressive inference optimizations (TensorRT, batch size tuning, GPU memory management)
Lesson: Understand your constraints (latency, cost, accuracy) upfront and design around them.
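A latency budget can be made concrete with a simple per-stage accounting check, run against profiler measurements. The stage names and millisecond figures below are illustrative, not Advertease's actual measurements.

```python
# Toy latency-budget check in the spirit of the Advertease constraint:
# sum per-stage timings, compute headroom, and flag any stage that
# dominates the budget.

BUDGET_MS = 700
stage_ms = {"decode": 120, "preprocess": 40, "inference": 310,
            "postprocess": 60, "event_emit": 30}

total = sum(stage_ms.values())
headroom = BUDGET_MS - total
over_budget = [s for s, ms in stage_ms.items() if ms > 0.5 * BUDGET_MS]
```

Making the budget explicit like this is what drives the trade-offs: if `inference` blows past its share, the fix is a smaller model or faster runtime, not hope.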
Technical Skills Developed
Computer Vision
- Object Detection: YOLO family (v5, v7, v8), Faster R-CNN
- Image Processing: OpenCV, PIL, data augmentation techniques
- Video Processing: FFmpeg, frame extraction, real-time stream handling
- Model Optimization: TensorRT, ONNX, quantization, pruning
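To illustrate one of the optimization techniques listed above: symmetric int8 quantization maps float weights to an 8-bit range via a single scale factor. This toy version operates on a plain list; in practice the real pipelines used TensorRT/ONNX tooling rather than hand-rolled code.

```python
# Illustrative symmetric per-tensor int8 quantization of a weight list.

def quantize_int8(weights):
    """Map floats to int8 using a single symmetric scale factor."""
    scale = max(abs(w) for w in weights) / 127.0
    q = [max(-128, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    return [qi * scale for qi in q]

w = [0.02, -1.27, 0.5, 0.9999, -0.31]
q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)
# Round-trip error is bounded by about half a quantization step.
max_err = max(abs(a - b) for a, b in zip(w, w_hat))
```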
Natural Language Processing
- Embeddings: BERT, RoBERTa, Sentence Transformers, OpenAI embeddings
- Semantic Search: Vector databases (FAISS, Qdrant), BM25, hybrid retrieval
- LLMs: Prompt engineering, RAG systems, LlamaIndex
MLOps & Deployment
- Containerization: Docker, Kubernetes for ML workloads
- Model Serving: FastAPI, TorchServe, gRPC
- Monitoring: Prometheus metrics, logging, performance profiling
- CI/CD: Automated testing, model versioning, deployment pipelines
Client Collaboration Skills
Beyond technical skills, PESU Venture Labs taught me critical soft skills:
Requirements Gathering
Clients often don't know what's technically feasible. My job was to:
- Understand the business problem (not just the stated solution)
- Propose technically feasible approaches
- Set realistic expectations (accuracy, latency, cost)
Iterative Delivery
Rather than disappearing for months and delivering a final product, I learned to:
- Deliver MVPs quickly (2-3 weeks for initial prototype)
- Gather feedback early and often
- Iterate based on real user testing
Technical Communication
Clients don't care about model architectures. They care about:
- "Will this solve my problem?"
- "How accurate is it?"
- "What does it cost to run?"
- "When can I deploy it?"
I learned to translate technical details into business outcomes.
Impact & Outcomes
By The Numbers
- 6 production deployments across diverse domains
- 4 external clients served (media, e-commerce, government, sustainability)
- ~25K annotations created through data augmentation and active learning
- Sub-second latency achieved for real-time systems (Advertease: sub-700ms end-to-end)
What Made PESU Venture Labs Unique
Unlike traditional university research labs or internships, PESU Venture Labs offered:
- Real clients with real budgets: Accountability to paying customers taught me discipline
- Production deployments: Code actually ran in production, handling real user traffic
- Cross-domain exposure: Worked on computer vision, NLP, and search problems (not just one narrow area)
- Autonomy: Trusted to make technical decisions, not just execute predefined tasks
Transition to Full-Time Work
My time at PESU Venture Labs directly prepared me for my current role at Baxter International. Skills I use daily:
- Translating business requirements into technical solutions
- Building production-ready systems (not just research prototypes)
- Collaborating with cross-functional teams (product, security, DevOps)
- Making pragmatic trade-offs (perfect vs. good enough vs. shipping)
Interested in Applied AI Work?
I'm happy to discuss lessons learned from client projects, share insights about production ML, or provide guidance on navigating university incubators and applied research opportunities.
Let's Connect