Overview
PESU Venture Labs is a university-affiliated technology incubator where student developers work on real client projects. Unlike traditional internships or academic projects, the work meant delivering production-ready solutions to paying external clients: real businesses with real problems and tight deadlines.
Over two years as an Applied AI Developer, I worked across diverse domains: computer vision, semantic search, generative AI, and document processing. Each project taught me valuable lessons about bridging the gap between research and production.
Major Projects
1. Advertease - Real-Time Advertisement Detection
Client Need: A media analytics company needed to detect advertisements in live broadcast streams to verify ad placement and measure screen time.
Technical Approach:
- Built a YOLO-based object detection system trained to recognize TV commercial patterns
- Implemented confidence-scored event emission (probability scores for each detection)
- Optimized end-to-end video processing latency to sub-700ms (critical for real-time use)
Key Challenges:
- Latency Requirements: 700ms end-to-end meant every component needed optimization—video decoding, model inference, post-processing
- False Positive Control: Ads and regular content can look similar. Implemented temporal consistency checks (an "ad" should last 15-30 seconds, not flicker on/off)
- Model Size vs. Accuracy Trade-off: Smaller YOLO models run faster but sacrifice accuracy. Settled on YOLOv5-medium after extensive benchmarking
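The temporal consistency check above can be sketched as a simple segment filter over frame-level detections. This is a minimal illustration, not the production code; the frame rate, threshold, and example values here are made up.

```python
# Sketch of a temporal consistency filter for frame-level ad detections.
# A run of "ad" frames shorter than a minimum duration is treated as a
# false positive (flicker) and suppressed.

def filter_short_segments(frame_flags, fps, min_seconds=15.0):
    """Drop 'ad' segments shorter than min_seconds; return smoothed flags."""
    min_frames = int(min_seconds * fps)
    filtered = list(frame_flags)
    i, n = 0, len(filtered)
    while i < n:
        if filtered[i]:
            j = i
            while j < n and filtered[j]:
                j += 1
            if j - i < min_frames:          # too short to be a real ad break
                for k in range(i, j):
                    filtered[k] = False
            i = j
        else:
            i += 1
    return filtered

# Example: at 10 fps, a 5-frame blip (0.5 s) is removed,
# while a 200-frame run (20 s) survives.
flags = [False] * 10 + [True] * 5 + [False] * 10 + [True] * 200 + [False] * 5
smoothed = filter_short_segments(flags, fps=10)
```

In production the same idea was applied on top of per-frame confidence scores rather than hard booleans, but the filtering logic is the same.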
Outcome: Deployed system processed live streams with 95% detection accuracy and sustained sub-700ms latency. Client used it for ad placement verification across multiple TV channels.
2. Document Processing Pipeline
Client Need: Government contractor processing thousands of scanned forms daily needed automated detection of predefined visual markers (checkboxes, signatures, stamps).
Technical Approach:
- YOLO-based detection pipeline for visual markers in scanned documents
- Data pipeline to process and annotate training data
- Post-processing to extract structured data from detection results
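The post-processing step can be illustrated as mapping raw detector output (label, box, confidence) onto predefined form-field regions. The field names, coordinates, and confidence threshold below are hypothetical; the real templates were client-specific.

```python
# Illustrative post-processing: assign each detection to the form field
# whose region contains the detection's center point.

def box_center(box):
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2, (y1 + y2) / 2)

def contains(region, point):
    x1, y1, x2, y2 = region
    px, py = point
    return x1 <= px <= x2 and y1 <= py <= y2

def extract_fields(detections, field_regions, min_conf=0.5):
    """Return {field_name: label} for confident detections inside a region."""
    result = {}
    for label, box, conf in detections:
        if conf < min_conf:                 # drop low-confidence detections
            continue
        for name, region in field_regions.items():
            if contains(region, box_center(box)):
                result[name] = label
                break
    return result

# Hypothetical template regions and detections for one scanned form.
fields = {"consent_checkbox": (100, 100, 140, 140),
          "signature_area": (300, 500, 600, 560)}
dets = [("checkbox_checked", (105, 108, 132, 135), 0.93),
        ("signature", (310, 505, 590, 555), 0.88)]
structured = extract_fields(dets, fields)
```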
The Data Challenge:
The client provided only ~1,000 annotated samples, far too few for robust YOLO training. To close the gap, I:
- Built data augmentation pipeline (rotation, scaling, noise injection, perspective transforms)
- Expanded dataset from ~1K to ~25K annotations
- Implemented active learning loop—model predictions on unlabeled data, human review of uncertain cases, retrain
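The uncertainty-sampling step of the active learning loop above can be sketched in a few lines: surface the unlabeled samples whose model confidence sits closest to the decision boundary for human review. The thresholds and sample IDs are illustrative, not the production values.

```python
# Sketch of uncertainty sampling for the active learning loop:
# pick unlabeled samples with ambiguous confidence for human review.

def select_for_review(predictions, low=0.35, high=0.65, budget=100):
    """predictions: list of (sample_id, confidence). Return uncertain ids,
    most ambiguous (closest to 0.5) first, capped at the review budget."""
    uncertain = [(sid, c) for sid, c in predictions if low <= c <= high]
    uncertain.sort(key=lambda sc: abs(sc[1] - 0.5))
    return [sid for sid, _ in uncertain[:budget]]

preds = [("doc_001", 0.97), ("doc_002", 0.51), ("doc_003", 0.12),
         ("doc_004", 0.60), ("doc_005", 0.38)]
queue = select_for_review(preds, budget=2)
```

After review, the newly labeled samples go back into the training set and the model is retrained, closing the loop.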
Outcome: Achieved ~95% precision in deployment tests. System processed 10,000+ documents per day, reducing manual review time by 80%.
3. OneSKU - Vendor Catalog Alignment
(See dedicated OneSKU project page for full technical details)
Brief Summary: Built hybrid BM25 + embedding retrieval system for aligning heterogeneous vendor catalogs. Achieved sub-15s query latency across multi-million SKU inventories.
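The core hybrid-retrieval idea can be sketched as score fusion: normalize the lexical (BM25) and semantic (embedding similarity) score lists per query, then blend them with a tunable weight. The 0.5 weight and the toy scores are placeholders; the full system is described on the OneSKU page.

```python
# Minimal sketch of hybrid retrieval score fusion (BM25 + embeddings).

def minmax(scores):
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]

def hybrid_rank(bm25_scores, embed_scores, alpha=0.5):
    """Blend normalized score lists; return document indices, best first."""
    b, e = minmax(bm25_scores), minmax(embed_scores)
    fused = [alpha * bi + (1 - alpha) * ei for bi, ei in zip(b, e)]
    return sorted(range(len(fused)), key=lambda i: fused[i], reverse=True)

# Doc 2 is mediocre lexically but strong semantically; fusion surfaces it.
ranking = hybrid_rank([12.0, 3.0, 7.0], [0.10, 0.20, 0.95])
```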
4. Sustify - AI-Driven Sustainability Marketplace
(See dedicated Sustify project page for full details)
Brief Summary: Founded an AI-driven sustainability marketplace using transformer-based embeddings for automated vendor matching. Secured early pilot interest before pivoting.
Cross-Project Learnings
Production AI is Different from Research AI
Academic ML focuses on pushing state-of-the-art metrics. Production ML focuses on:
- Reliability: Can this run 24/7 without crashing?
- Latency: Does it respond fast enough for real users?
- Maintainability: Can someone else debug this when I'm gone?
- Cost: Can the client afford to run this at scale?
This mindset shift—from "what's the best model?" to "what's the right model for this production context?"—was transformative.
Data Quality > Model Sophistication
On multiple projects, I learned that:
- Cleaning and augmenting data improved results more than trying fancier models
- Domain-specific data engineering (e.g., separating categorical vs. numerical features in OneSKU) mattered more than hyperparameter tuning
- Active learning loops (model → predictions → human review → retrain) beat passive dataset collection
Latency Budgets Force Trade-offs
Real-time systems teach you about trade-offs fast. For Advertease:
- Could have used larger, more accurate YOLO model → missed latency target
- Could have used smaller model → accuracy too low
- Final solution: medium model + aggressive inference optimizations (TensorRT, batch size tuning, GPU memory management)
Lesson: Understand your constraints (latency, cost, accuracy) upfront and design around them.
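A latency budget can be made concrete with a simple per-stage accounting check, run against profiler measurements. The stage names and millisecond figures below are illustrative, not Advertease's actual measurements.

```python
# Toy latency-budget check in the spirit of the Advertease constraint:
# sum per-stage timings, compute headroom, and flag any stage that
# dominates the budget.

BUDGET_MS = 700
stage_ms = {"decode": 120, "preprocess": 40, "inference": 310,
            "postprocess": 60, "event_emit": 30}

total = sum(stage_ms.values())
headroom = BUDGET_MS - total
over_budget = [s for s, ms in stage_ms.items() if ms > 0.5 * BUDGET_MS]
```

Making the budget explicit like this is what drives the trade-offs: if `inference` blows past its share, the fix is a smaller model or faster runtime, not hope.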
Technical Skills Developed
Computer Vision
- Object Detection: YOLO family (v5, v7, v8), Faster R-CNN
- Image Processing: OpenCV, PIL, data augmentation techniques
- Video Processing: FFmpeg, frame extraction, real-time stream handling
- Model Optimization: TensorRT, ONNX, quantization, pruning
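To illustrate one of the optimization techniques listed above: symmetric int8 quantization maps float weights to an 8-bit range via a single scale factor. This toy version operates on a plain list; in practice the real pipelines used TensorRT/ONNX tooling rather than hand-rolled code.

```python
# Illustrative symmetric per-tensor int8 quantization of a weight list.

def quantize_int8(weights):
    """Map floats to int8 using a single symmetric scale factor."""
    scale = max(abs(w) for w in weights) / 127.0
    q = [max(-128, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    return [qi * scale for qi in q]

w = [0.02, -1.27, 0.5, 0.9999, -0.31]
q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)
# Round-trip error is bounded by about half a quantization step.
max_err = max(abs(a - b) for a, b in zip(w, w_hat))
```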
Natural Language Processing
- Embeddings: BERT, RoBERTa, Sentence Transformers, OpenAI embeddings
- Semantic Search: Vector databases (FAISS, Qdrant), BM25, hybrid retrieval
- LLMs: Prompt engineering, RAG systems, LlamaIndex
MLOps & Deployment
- Containerization: Docker, Kubernetes for ML workloads
- Model Serving: FastAPI, TorchServe, gRPC
- Monitoring: Prometheus metrics, logging, performance profiling
- CI/CD: Automated testing, model versioning, deployment pipelines
Client Collaboration Skills
Beyond technical skills, PESU Venture Labs taught me critical soft skills:
Requirements Gathering
Clients often don't know what's technically feasible. My job was to:
- Understand the business problem (not just the stated solution)
- Propose technically feasible approaches
- Set realistic expectations (accuracy, latency, cost)
Iterative Delivery
Rather than disappearing for months and delivering a final product, I learned to:
- Deliver MVPs quickly (2-3 weeks for initial prototype)
- Gather feedback early and often
- Iterate based on real user testing
Technical Communication
Clients don't care about model architectures. They care about:
- "Will this solve my problem?"
- "How accurate is it?"
- "What does it cost to run?"
- "When can I deploy it?"
I learned to translate technical details into business outcomes.
Impact & Outcomes
By The Numbers
- 6 production deployments across diverse domains
- 4 external clients served (media, e-commerce, government, sustainability)
- ~25K annotations created through data augmentation and active learning
- Sub-second latency achieved for real-time systems (Advertease: sub-700ms end-to-end)
What Made PESU Venture Labs Unique
Unlike traditional university research labs or internships, PESU Venture Labs offered:
- Real clients with real budgets: Accountability to paying customers taught me discipline
- Production deployments: Code actually ran in production, handling real user traffic
- Cross-domain exposure: Worked on computer vision, NLP, and search problems (not just one narrow area)
- Autonomy: Trusted to make technical decisions, not just execute predefined tasks
Transition to Full-Time Work
My time at PESU Venture Labs directly prepared me for my current role at Baxter International. Skills I use daily:
- Translating business requirements into technical solutions
- Building production-ready systems (not just research prototypes)
- Collaborating with cross-functional teams (product, security, DevOps)
- Making pragmatic trade-offs (perfect vs. good enough vs. shipping)
Interested in Applied AI Work?
I'm happy to discuss lessons learned from client projects, share insights about production ML, or provide guidance on navigating university incubators and applied research opportunities.
Let's Connect