Open to opportunities · Canada & remote

Hi, I'm Dvip.

AI Engineer — NLP & Intelligent Systems. I ship end-to-end intelligent systems — from LLM pipelines and retrieval stacks to production Laravel migrations — with a bias toward trustworthy, auditable design.

Get in touch → LinkedIn · GitHub

📍 Oshawa, Ontario, Canada · ✉ dvippatel.math@gmail.com

0 yrs

Building production systems end-to-end

85–90%

Autonomous migration issue resolution

80%+

Reduction in manual migration effort

<$80

Cost per autonomous migration run

Infrastructure

Cold Start Race

A live benchmark: four AWS deployment strategies run the exact same workload head-to-head and stream their real millisecond timings back as they finish. Hit start and watch actual infrastructure race — no mocks.

AWS Lambda · EC2 + Docker · SSE streaming · Redis rate limiting · Edge concurrency caps

Lambda Cold Start

AWS Lambda with forced cold start — no warm container reuse

Lambda Warm

AWS Lambda kept warm via EventBridge ping every 5 minutes

EC2 Direct

Fastify on bare EC2 (t3.micro) behind Nginx — port 3001

Docker on EC2

Containerized Fastify via docker-compose on EC2 — port 3002

Races all 4 strategies simultaneously — typically 1–3 seconds
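The race pattern can be sketched in a few lines. This is a minimal illustration, not the benchmark's own code: the strategy names and the `SIMULATED_DELAYS` values are made up, and `asyncio.sleep` stands in for the real HTTP call to each deployment. It shows how all four runs start together and report in finish order.

```python
import asyncio
import time

# Hypothetical stand-ins: each delay simulates one strategy's response time.
SIMULATED_DELAYS = {
    "lambda-cold": 0.04,
    "lambda-warm": 0.01,
    "ec2-direct": 0.02,
    "docker-ec2": 0.03,
}

async def run_strategy(name: str, delay: float) -> tuple[str, float]:
    """Run one simulated workload and return (name, elapsed milliseconds)."""
    start = time.perf_counter()
    await asyncio.sleep(delay)  # a real runner would await an HTTP request here
    return name, (time.perf_counter() - start) * 1000

async def race() -> list[tuple[str, float]]:
    """Start every strategy at once and collect results in finish order."""
    tasks = [
        asyncio.create_task(run_strategy(name, delay))
        for name, delay in SIMULATED_DELAYS.items()
    ]
    finished = []
    for next_done in asyncio.as_completed(tasks):
        name, elapsed_ms = await next_done
        finished.append((name, elapsed_ms))  # a server would emit one SSE event here
    return finished

results = asyncio.run(race())
```

Because the tasks run concurrently, the whole race takes about as long as the slowest strategy rather than the sum of all four.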

AI Demo

Autonomous Agent

A real AI agent you can run. Pick a task, give it an input, and watch it reason, choose tools, call real APIs, and write a structured report — every step streamed live as it happens.

Tool-use loop · SSE streaming · Redis rate limiting · Cost ceiling · Output sanitization · OpenRouter fallback

Select a task type (GitHub Repo Analyzer, API Health Inspector, or Job Description Decoder), provide input, and watch the agent reason in real time.
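A tool-use loop with a cost ceiling might look roughly like the sketch below. Everything in it is an assumption for illustration: the `TOOLS` table, the `AgentRun` record, and the scripted `plan` (a real agent would ask the model for the next step on each turn instead of reading a fixed list).

```python
from dataclasses import dataclass, field

@dataclass
class AgentRun:
    cost_ceiling_usd: float
    spent_usd: float = 0.0
    steps: list[dict] = field(default_factory=list)

# Hypothetical tools: plain functions standing in for real API calls.
TOOLS = {
    "fetch_status": lambda target: f"{target}: 200 OK",
    "summarize": lambda text: text.upper(),
}

def run_agent(plan: list[tuple[str, str, float]], ceiling: float) -> AgentRun:
    """Execute (tool, argument, cost) steps until the plan ends or the ceiling is hit."""
    run = AgentRun(cost_ceiling_usd=ceiling)
    for tool_name, argument, cost in plan:
        # Check the budget before acting, so a run can never overshoot it.
        if run.spent_usd + cost > run.cost_ceiling_usd:
            run.steps.append({"event": "stopped", "reason": "cost ceiling"})
            break
        result = TOOLS[tool_name](argument)
        run.spent_usd += cost
        run.steps.append({"event": "tool_result", "tool": tool_name, "result": result})
    return run
```

Each appended step is the kind of record that could be streamed to the browser as it happens.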

Experience

Where I've worked

Graduate Researcher (MSc, Software Engineering)
Ontario Tech University · Oshawa, ON
Sep 2024 — Present
- Designed TRAC-RE — a Traceable, Reliable, Auditable, and Contextual framework for automated Requirements Engineering, built from four interconnected phases that take raw stakeholder prose to implicit domain-knowledge discovery, grounded in an industrial FinTech/SaaS corpus of 110 requirements and 1,997 High-Level JSON (HLJ) artifacts.
- Phase 1 — Context-aware requirement identification: empirical study of sentence-level classification (all-mpnet-base-v2, DeBERTa-v3-base) across frozen / LoRA / full fine-tuning and context windows k=0–3, showing that structured local context lifts F1 from 0.664 to 0.894 (+0.230) and that 4K domain-aligned samples (F1=0.894) beat 15K mixed (F1=0.883). Supporting paper submitted to RE'26 Main.
- Phase 3 — Governed LLM parsing: built and benchmarked a multi-model parsing pipeline (GPT-4.1, Claude Opus 4, Meta-70B) producing confidence-scored HLJ artifacts with a versioned tag-governance stack (Harvest → Filter → Cluster → Validate → Whitelist → Audit → Drift); tag precision improved 0.657 → 0.897 at the strictest v2 stage. Supporting paper published at CASCON 2025.
- Phases 2 & 4 — Audit-grade extraction and implicit discovery: surfaced a structural coverage ceiling of 1.5 canonical keywords per artifact, then architected a multi-signal discovery engine combining neighbor transfer, graph walks over a 275,164-edge enriched co-occurrence graph, and UMAP+HDBSCAN cluster-gap detection, with a bounded LLM-as-judge tiebreaker and full per-keyword provenance. In revision at RE'26 RE@Next! with Amarachi Nwosu.
- Co-designed a domain dictionary of 13,725 entries / 35,799 lookup keys / 108 detected abbreviations used as synonym normalizer, confidence booster, stoplist, and novelty flagger across the framework.
- Translated the framework into 3 supporting papers across CASCON and RE'26; conducted literature reviews, experimental design, ablation planning (hop depth, signal composition, dictionary impact), and supervisory reporting.
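The four dictionary roles listed above (synonym normalizer, confidence booster, stoplist, novelty flagger) can be illustrated with a toy lookup. The entries, the `STOPLIST` terms, and the status labels below are invented for the example and are not taken from the real 13,725-entry dictionary; the confidence-boosting role is left out to keep the sketch short.

```python
# Hypothetical miniature dictionary: lookup key -> canonical entry.
LOOKUP = {
    "kyc": "know your customer",
    "know your customer": "know your customer",
    "2fa": "two-factor authentication",
}
# Hypothetical stoplist of terms too generic to keep as keywords.
STOPLIST = {"system", "user"}

def normalize(term: str) -> dict:
    """Classify a term as stopped, known (with its canonical form), or novel."""
    key = term.strip().lower()
    if key in STOPLIST:
        return {"term": term, "status": "stopped"}
    if key in LOOKUP:
        return {"term": term, "status": "known", "canonical": LOOKUP[key]}
    # Not in the dictionary: flag it for review instead of dropping it.
    return {"term": term, "status": "novel"}
```

For example, `normalize("KYC")` resolves the abbreviation to its canonical entry, while an unseen term comes back flagged as novel.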
AI Engineer · Part-time
Palomino Systems · Remote
Nov 2025 — Present
- Designed and shipped Laravel Upgrader, a fully autonomous CLI-driven AI agent that migrates legacy Laravel codebases end-to-end — a 10-step pipeline (analysis → planning → transformation → validation → self-healing) processing ~10k LOC/run at under $80/run, with 85–90% automated issue resolution via detect → fix → retry loops.
- Measured two full migrations by hand first, then automated those exact steps — cutting manual migration effort by ~80% (a baseline-grounded number, not an estimate).
- Owned the codebase end-to-end and was the primary technical contact with Palomino stakeholders, presenting the agent architecture and translating tradeoffs into prioritized recommendations.
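The detect → fix → retry idea can be sketched as a small loop. This is an illustration under stated assumptions, not the Laravel Upgrader's code: `self_heal`, `detect_issues`, and `apply_fix` are hypothetical names, and the rule table is a toy covering two global helpers that moved out of the framework core in Laravel 6.

```python
from typing import Callable

# Toy rule table (an assumption): legacy helper call -> replacement call.
REPLACEMENTS = {"str_random(": "Str::random(", "array_get(": "Arr::get("}

def detect_issues(code: str) -> list[str]:
    """Detect: list every legacy call still present in the code."""
    return [old for old in REPLACEMENTS if old in code]

def apply_fix(code: str, issue: str) -> str:
    """Fix: rewrite one detected legacy call."""
    return code.replace(issue, REPLACEMENTS[issue])

def self_heal(
    code: str,
    detect: Callable[[str], list[str]],
    fix: Callable[[str, str], str],
    max_attempts: int = 3,
) -> tuple[str, list[str]]:
    """Repeat detect -> fix until no issues remain or attempts run out."""
    for _ in range(max_attempts):
        issues = detect(code)
        if not issues:
            return code, []
        for issue in issues:
            code = fix(code, issue)
    return code, detect(code)  # anything still listed needs a human
```

Capping the attempts keeps the loop from spinning on an issue it cannot fix, which is where the unresolved remainder would be handed to a person.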
Founding Engineer · Part-time
Mediabridge · Remote
Apr 2025 — Present
- Co-built and launched a multi-tenant SaaS platform from scratch, owning architecture decisions across frontend, backend, and AWS deployment.
- Personally built the core systems — a custom role-based access control (RBAC) layer, a dynamic form builder, and an event-driven notification system.
- Owned the full lifecycle from requirements through production launch and served as primary technical contact for the client.
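For readers unfamiliar with role-based access control in a multi-tenant setting, here is a minimal sketch of the core check. The role names, permission strings, and the `Membership` type are assumptions made for the example; the platform's actual RBAC layer is not described in enough detail here to reproduce.

```python
from dataclasses import dataclass

# Hypothetical role -> permission mapping.
ROLE_PERMISSIONS = {
    "admin": {"form:create", "form:publish", "user:invite"},
    "editor": {"form:create"},
}

@dataclass(frozen=True)
class Membership:
    tenant_id: str
    role: str

def can(memberships: list[Membership], tenant_id: str, permission: str) -> bool:
    """Allow an action only if a role held in that specific tenant grants it."""
    return any(
        m.tenant_id == tenant_id and permission in ROLE_PERMISSIONS.get(m.role, set())
        for m in memberships
    )
```

Scoping every check to a tenant is what stops an admin of one tenant from acting inside another.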
Laravel Developer
Finserve Infotech · India
Jan 2024 — Dec 2024
- Built and delivered 4 ERP / POS systems in Laravel, covering data models, business logic, and client-facing workflows.

Research

Publications & thesis

Thesis · TRAC-RE · MSc, Software Engineering · Ontario Tech University

Toward Automated Requirements Engineering: Empirical and Architectural Foundations for Structured Parsing and Knowledge Discovery

TRAC-RE: Traceable, Reliable, Auditable, and Contextual

TRAC-RE is a framework for automated requirements engineering (RE) built from four interconnected phases. Together they take raw stakeholder prose to implicit domain-knowledge discovery — spanning context-aware requirement identification, audit-grade keyword extraction, governed LLM parsing into structured JSON, and multi-signal implicit keyword discovery, all grounded in an industrial FinTech and SaaS corpus.

Supervisor: Prof. Sanaa Alwidian

Published · CASCON 2025 · 2025

Improving Reliability of LLMs in RE with Structured Confidence & Tag Governance

A modular multi-model LLM pipeline that converts raw stakeholder requirements into High-Level JSON (HLJ) artifacts with confidence-scored fields, paired with a versioned tag-governance system that catches prompt-leak exploitation, hallucinated tags, and low-agreement outputs before they reach downstream stages.
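One way to picture the governance step is as a filter over tags proposed by several models. The sketch below is a simplified assumption, not the published pipeline: it keeps a tag only if it is on a whitelist and enough models agree on it, and it logs every rejection with a reason. The function name, the threshold, and the sample tags are all hypothetical.

```python
from collections import Counter

def govern_tags(
    model_outputs: dict[str, set[str]],
    whitelist: set[str],
    min_agreement: int = 2,
) -> tuple[list[str], list[tuple[str, str]]]:
    """Return (accepted tags, audit log of rejected tags with reasons)."""
    votes = Counter(tag for tags in model_outputs.values() for tag in tags)
    accepted, audit = [], []
    for tag, count in sorted(votes.items()):
        if tag not in whitelist:
            audit.append((tag, "not on whitelist"))  # e.g. hallucinated or leaked text
        elif count < min_agreement:
            audit.append((tag, "low agreement"))
        else:
            accepted.append(tag)
    return accepted, audit
```

Keeping the rejection reasons, rather than silently dropping tags, is what makes the output auditable later.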

LLM governance · Requirements Engineering · Tag validation

In Revision · RE'26 · 2026

From Explicit to Implicit: Towards Traceable Keyword Discovery in Requirements Engineering

Explicit keyword extraction — even at audit-grade precision — hits a structural ceiling of roughly 1.5 canonical keywords per HLJ artifact, with Jaccard agreement below 0.11 across KeyBERT, RAKE, and YAKE. This paper presents a 5-phase implicit keyword discovery engine combining a 13,725-entry domain dictionary, a 275,164-edge enriched co-occurrence graph, UMAP+HDBSCAN clustering, and a bounded LLM-as-judge that tiebreaks only the borderline scoring band. Every implicit keyword traces back to its discovery signal(s) with full per-signal evidence.
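The Jaccard agreement figure quoted above is the size of the overlap between two keyword sets divided by the size of their union. A minimal sketch follows; the extractor outputs in the usage note are invented, and averaging over all pairs is one reasonable way to summarize three extractors, not necessarily the paper's exact procedure.

```python
from itertools import combinations

def jaccard(a: set[str], b: set[str]) -> float:
    """Intersection over union; two empty sets count as full agreement."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0

def mean_pairwise_jaccard(outputs: dict[str, set[str]]) -> float:
    """Average Jaccard score over every pair of extractors."""
    pairs = list(combinations(outputs.values(), 2))
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)
```

With made-up outputs `{"a", "b"}`, `{"b", "c"}`, and `{"d"}`, the pairwise scores are 1/3, 0, and 0, giving a mean of 1/9. Low values like this mean the extractors rarely pick the same keywords.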

Implicit knowledge discovery · Graph-based NLP · UMAP / HDBSCAN

RE@Next!

Under Review · RE'26 · 2026

Towards Improving Sentence-Level Requirements Identification via Explicit Local Context Modeling

An empirical study of sentence-level requirement classification on 110 real-world FinTech and SaaS documents (~5,700 candidate sentences, balanced to 15K mixed / 4K domain-only). We compare all-mpnet-base-v2 (110M) and DeBERTa-v3-base (184M) across frozen / LoRA / full fine-tuning and context window sizes k=0–3, and show that structured local-context features — not raw concatenation — are the critical signal.
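To make the context window k concrete, the sketch below builds a classifier input from a target sentence plus up to k neighbours on each side. The `[PREV]`, `[TARGET]`, and `[NEXT]` markers are an assumption chosen to show one form "structured" context could take, as opposed to gluing sentences together unmarked; the paper's actual feature encoding may differ.

```python
def with_local_context(sentences: list[str], index: int, k: int) -> str:
    """Return the target sentence with up to k marked neighbours on each side."""
    before = sentences[max(0, index - k):index]
    after = sentences[index + 1:index + 1 + k]
    parts = [f"[PREV] {s}" for s in before]
    parts.append(f"[TARGET] {sentences[index]}")
    parts += [f"[NEXT] {s}" for s in after]
    return " ".join(parts)
```

With k=0 the model sees only the target sentence; raising k adds neighbours while the markers keep each sentence's role explicit.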

Requirements classification · Local context modeling · LoRA fine-tuning

Main Track

Capabilities

What I work with

Research Areas

Requirements Engineering · NLP for Software Engineering · LLM Governance · Knowledge Discovery · Empirical SE

Languages

Python · TypeScript / JavaScript · PHP (Laravel) · SQL · Bash

AI / ML

PyTorch · HuggingFace Transformers · sentence-transformers · SBERT / MPNet · MiniLM · DeBERTa-v3 · KeyBERT · RAKE · FAISS · UMAP + HDBSCAN · LoRA Fine-tuning · RAG Pipelines · Anthropic / Claude API · MCP · Aider

Frontend

React · Next.js · Tailwind CSS · Zustand

Backend / Cloud

Laravel · Flask · REST APIs · Microservices · RBAC · Docker · AWS · CI/CD · GitHub Actions

Testing

Vitest · fast-check (property-based) · Testing Library

Education

MASc, Software Engineering (Thesis)
2024 – Present
Ontario Tech University · Oshawa, ON, Canada

Certifications

✓
AWS Certified Cloud Practitioner
Amazon Web Services · 2024

Teaching & Authored

✎
OVIN EV Micro-Credential Course
Ontario Vehicle Innovation Network · 2024

Featured work

Things I've built

Two systems tell the story best: a production migration platform that pays for itself, and the research pipeline behind my thesis. Here's what each one does and what it's made of.

01 · 🛠 Product

Inbox Intelligence Platform

Gmail-native communication intelligence layer — one ingestion pipeline, six analysis modules.

Built at a 48-hour hackathon. Turns unstructured email into structured, actionable intelligence: a single ingestion + parsing core (Claude classifier, entity extractor, temporal indexer) feeds six independent modules — financial waste detection, relationship decay signals, contract/obligation tracking, follow-up commitments, a RAG-based institutional memory, and health-admin tracking. One parse, many consumers — no redundant API calls, and an entity graph that compounds over time. Currently being taken from hackathon build to a live product.
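The "one parse, many consumers" design can be shown with a toy fan-out. In this sketch `parse_email` stands in for the ingestion core and the two entries in `MODULES` stand in for analysis modules; the field names and matching rules are invented for illustration and are far simpler than a classifier-backed pipeline.

```python
from typing import Callable

def parse_email(raw: str) -> dict:
    """Stand-in for the ingestion core: runs once per message."""
    subject, _, body = raw.partition("\n")
    return {
        "subject": subject,
        "body": body,
        "mentions_invoice": "invoice" in body.lower(),
    }

# Hypothetical consumers: each reads the shared parse instead of re-parsing.
MODULES: dict[str, Callable[[dict], object]] = {
    "financial": lambda parsed: parsed["mentions_invoice"],
    "follow_up": lambda parsed: "by friday" in parsed["body"].lower(),
}

def analyze(raw: str) -> dict:
    """Parse a message once, then hand the same result to every module."""
    parsed = parse_email(raw)  # the only parsing step, however many modules exist
    return {name: module(parsed) for name, module in MODULES.items()}
```

Adding a seventh module would mean adding one entry to the table, with no extra parsing or API call per message.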

What it's built with

Claude Sonnet
Gmail MCP
Google Calendar MCP
RAG / pgvector
PostgreSQL
React
Tailwind
Node.js

02 · 🛠 Product

Laravel Upgrader

Autonomous 10-step AI migration agent for legacy Laravel codebases.

CLI-driven migration agent that ingests legacy Laravel code and runs analysis → planning → transformation → validation → self-healing loops. Processes ~10k LOC per run at under $80 cost, with 85–90% automated issue resolution.

What it's built with

Python
Claude / LLM Orchestration
Laravel
CLI
Self-healing retry loops

03 · 🔬 Research

TRAC-RE — Automated Requirements Engineering Framework

A Traceable, Reliable, Auditable, and Contextual framework for automated Requirements Engineering — structured parsing, context-aware classification, and implicit keyword discovery.

The research framework behind my MSc thesis at Ontario Tech University. **TRAC-RE** (Traceable, Reliable, Auditable, and Contextual) converts raw stakeholder prose into audit-grade knowledge artifacts across four interconnected phases:

1. **Phase 1: Context-aware requirement identification.** Sentence-level classification with structured local-context features. F1 improved 0.664 → 0.894 on 110 FinTech/SaaS documents; 4K domain-aligned samples (F1=0.894) beat 15K mixed (F1=0.883). Supporting paper under review at RE'26 Main.
2. **Phase 2: Audit-grade keyword extraction.** Perfect-precision extraction that surfaced a structural coverage ceiling of 1.5 canonical keywords per artifact, motivating the implicit-discovery phase. Supporting paper in revision at RE'26 (RE@Next! track).
3. **Phase 3: Governed LLM parsing (HLJ).** Multi-model parsing (GPT-4.1, Claude Opus 4, Meta-70B) into confidence-scored High-Level JSON with versioned tag governance (v0 → v2); tag precision 0.657 → 0.897. Supporting paper published at CASCON 2025.
4. **Phase 4: Implicit keyword discovery.** A 5-stage engine over a 275,164-edge enriched co-occurrence graph, with UMAP+HDBSCAN cluster-gap detection and a bounded LLM-as-judge tiebreaker. Supporting paper in revision at RE'26 (RE@Next! track).

**Cross-cutting infrastructure:** 1,997 HLJ artifacts, a 13,725-entry domain dictionary, a FAISS index over 768-dim SBERT vectors, full per-decision audit logging, and drift monitoring across pipeline runs.

The framework's central argument: governance and distributional alignment, not raw model scale, are the primary levers for trustworthy RE automation.
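As a rough illustration of the graph signal in Phase 4, the sketch below builds a keyword co-occurrence graph from artifacts and proposes one-hop neighbours of an artifact's explicit keywords, recording which keyword led to each proposal. The function names, the `min_weight` threshold, and the sample keywords are assumptions; the real engine enriches its graph and combines several signals, which this toy does not attempt.

```python
from collections import defaultdict
from itertools import combinations

def build_cooccurrence(artifacts: list[set[str]]) -> dict[str, dict[str, int]]:
    """Edge weight = number of artifacts in which two keywords appear together."""
    graph: dict[str, dict[str, int]] = defaultdict(lambda: defaultdict(int))
    for keywords in artifacts:
        for a, b in combinations(sorted(keywords), 2):
            graph[a][b] += 1
            graph[b][a] += 1
    return graph

def implicit_candidates(
    graph: dict[str, dict[str, int]],
    explicit: set[str],
    min_weight: int = 2,
) -> dict[str, list[str]]:
    """One-hop walk: map each proposed keyword to the explicit keywords behind it."""
    found: dict[str, list[str]] = defaultdict(list)
    for keyword in sorted(explicit):
        for neighbour, weight in graph.get(keyword, {}).items():
            if neighbour not in explicit and weight >= min_weight:
                found[neighbour].append(keyword)  # provenance for the proposal
    return dict(found)
```

The returned mapping is a small-scale version of per-keyword provenance: every proposed keyword carries the evidence that produced it.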

What it's built with

Python
PyTorch
HuggingFace
SBERT / MPNet
DeBERTa-v3
FAISS
UMAP
HDBSCAN
LoRA

Also on this page — live & playable

⚡ Infrastructure

Cold Start Race

Four AWS deployment strategies benchmarked head-to-head with live millisecond timings.

AWS Lambda · EC2 + Docker · SSE streaming · Redis

🤖 AI Demo

Autonomous Agent

A real tool-using AI agent that reasons, calls live APIs, and streams a structured report.

Tool-use loop · SSE streaming · Redis · Cost ceiling

Open source

GitHub at a glance

@dvip07 →

Top language: Python

Contact

Let's build something

Have an AI or engineering problem worth chasing?

Happy to chat about NLP pipelines, LLM governance, autonomous migration systems, or graduate research collaboration. Email is the fastest way through.

✉ dvippatel.math@gmail.com · LinkedIn · GitHub
