Open to opportunities · Canada & remote

Hi, I'm Dvip.

AI Engineer — NLP & Intelligent Systems. I ship end-to-end intelligent systems — from LLM pipelines and retrieval stacks to production Laravel migrations — with a bias toward trustworthy, auditable design.

📍 Oshawa, Ontario, Canada · dvippatel.math@gmail.com
85–90%
Autonomous issue resolution
+16
F1 lift from local context (k=1)
1,997
Research artifacts shipped
10k+
LOC per migration run

Infrastructure

Cold Start Race

Four AWS deployment strategies race head-to-head with real infrastructure — no mocks, no fakes. Click start and watch actual millisecond measurements stream in live.

Lambda Cold Start

AWS Lambda with forced cold start — no warm container reuse

Lambda Warm

AWS Lambda kept warm via EventBridge ping every 5 minutes

EC2 Direct

Fastify on bare EC2 (t3.micro) behind Nginx — port 3001

Docker on EC2

Containerized Fastify via docker-compose on EC2 — port 3002

Races all 4 strategies simultaneously — typically 1–3 seconds
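
A minimal sketch of how such a race could be timed, assuming each strategy is reachable through one awaitable call. The `stub_endpoint` helper and its delays are hypothetical placeholders that only keep the sketch self-contained; the live demo measures real infrastructure and its code is not shown here.

```python
import asyncio
import time
from typing import Awaitable, Callable, Dict, Tuple

Call = Callable[[], Awaitable[None]]


async def timed(name: str, call: Call) -> Tuple[str, float]:
    """Run one strategy and return (name, elapsed milliseconds)."""
    start = time.perf_counter()
    await call()
    return name, (time.perf_counter() - start) * 1000.0


async def race(strategies: Dict[str, Call]) -> Dict[str, float]:
    """Start every strategy at once and collect per-strategy latency."""
    results = await asyncio.gather(*(timed(n, c) for n, c in strategies.items()))
    return dict(results)


def stub_endpoint(delay_s: float) -> Call:
    """Hypothetical stand-in for an HTTP request to one deployment."""
    async def call() -> None:
        await asyncio.sleep(delay_s)
    return call


# Placeholder delays only; these are not measurements from the demo.
STRATEGIES = {
    "lambda-cold": stub_endpoint(0.05),
    "lambda-warm": stub_endpoint(0.01),
    "ec2-direct": stub_endpoint(0.005),
    "docker-ec2": stub_endpoint(0.008),
}

latencies = asyncio.run(race(STRATEGIES))
for name, ms in sorted(latencies.items(), key=lambda kv: kv[1]):
    print(f"{name}: {ms:.1f} ms")
```

Because the calls are gathered rather than awaited one by one, total wall time is bounded by the slowest strategy, not the sum of all four.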

AI Demo

Autonomous Agent

A live AI agent powered by Claude Haiku. Pick a task, provide input, and watch it reason, select tools, and produce a structured report — all streamed in real time.

Featured work

Things I've built

🛠 Product

Laravel Upgrader

Autonomous 10-step AI migration pipeline for Laravel codebases.

CLI-driven migration system that ingests legacy Laravel code and runs analysis → planning → transformation → validation → self-healing loops. Processes 10k+ LOC per run at under $80 cost, with 85–90% automated issue resolution.

Python · LLM Orchestration · Laravel · CLI
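
The detect → fix → retry idea behind the self-healing step can be sketched as a bounded loop. This is an illustration under assumptions, not the Laravel Upgrader implementation: `self_heal`, its callbacks, and the `max_rounds` budget are hypothetical names.

```python
from typing import Callable, List

Issue = str


def self_heal(
    detect: Callable[[], List[Issue]],
    fix: Callable[[Issue], None],
    max_rounds: int = 3,
) -> List[Issue]:
    """Run detect -> fix -> retry until clean or the round budget is spent.

    Returns the issues still open; an empty list means fully resolved.
    """
    issues = detect()
    for _ in range(max_rounds):
        if not issues:
            break
        for issue in issues:
            fix(issue)
        issues = detect()  # retry: re-validate after applying fixes
    return issues


# Toy example: each "fix" removes the issue from a hypothetical open set.
open_issues = {"deprecated helper", "renamed facade"}
remaining = self_heal(
    detect=lambda: sorted(open_issues),
    fix=open_issues.discard,
)
print(remaining)  # []
```

Capping the number of rounds keeps cost per run predictable; anything left in the returned list would be handed to a human.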
🔬 Research

RE NLP System — Thesis Pipeline

Four-study pipeline for automated Requirements Engineering — structured parsing, context-aware classification, and implicit keyword discovery.

End-to-end research system behind my MSc thesis at Ontario Tech University. The pipeline converts raw stakeholder prose into audit-grade knowledge artifacts across four studies:

  1. Studies 1 & 2 (CASCON 2025): multi-model LLM parsing (GPT-4.1, Claude Opus 4, Meta-70B) into High-Level JSON artifacts with versioned tag governance (v0 → v2; v2 precision 0.95 / F1 0.85).
  2. Study 3 (RE'26 Main, under review): sentence-level requirement classification with structured local-context features (+16 F1 at k=1; best F1 0.894 on 4K domain-aligned samples).
  3. Study 4 (RE'26 RE@Next!, in revision): 5-phase implicit keyword discovery over a 275,164-edge enriched co-occurrence graph, UMAP+HDBSCAN cluster-gap detection, and a bounded LLM-as-judge tiebreaker.
  4. Cross-cutting infrastructure: 1,997 HLJ artifacts, 13,725-entry domain dictionary, FAISS index over 768-dim SBERT vectors, full per-decision audit logging, and drift monitoring across pipeline runs.

Python · PyTorch · HuggingFace · SBERT / MPNet · DeBERTa-v3 · FAISS
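
As a rough illustration of what a keyword co-occurrence graph is, the sketch below counts how often two keywords appear in the same artifact. It is a plain, unenriched version built on hypothetical data; the thesis graph adds enrichment steps that are not described here, and `cooccurrence_edges` is an assumed name.

```python
from collections import Counter
from itertools import combinations
from typing import Dict, Iterable, List, Tuple

Edge = Tuple[str, str]


def cooccurrence_edges(artifacts: Iterable[List[str]]) -> Dict[Edge, int]:
    """Count how often each unordered keyword pair shares an artifact."""
    counts: Counter = Counter()
    for keywords in artifacts:
        unique = sorted(set(keywords))  # sorted so (a, b) and (b, a) collapse
        counts.update(combinations(unique, 2))
    return dict(counts)


# Hypothetical keyword lists, one per artifact.
artifacts = [
    ["payment", "refund", "ledger"],
    ["payment", "refund"],
    ["ledger", "audit"],
]
edges = cooccurrence_edges(artifacts)
print(edges[("payment", "refund")])  # 2
```

Edge weights like these are what a graph walk would follow when looking for keywords an artifact implies but never states.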

Open source

GitHub at a glance

@dvip07
Public repos
27
Total stars
1
Followers
8
Top language
Python

Experience

Where I've worked

  1. Graduate Researcher (MSc, Software Engineering)

    Ontario Tech University · Oshawa, ON

    Sep 2024 – Present
    • Designing a four-study thesis pipeline that automates Requirements Engineering from raw stakeholder prose to implicit domain-knowledge discovery, grounded in an industrial FinTech/SaaS corpus of 110 requirements and 1,997 High-Level JSON (HLJ) artifacts.
    • Built and benchmarked a multi-model LLM parsing pipeline (GPT-4.1, Claude Opus 4, Meta-70B) with a versioned tag-governance stack (Harvest → Filter → Cluster → Validate → Whitelist → Audit → Drift); Opus 4 and Meta-70B reach F1=0.85 / precision=0.95 at the strictest v2 stage — published at CASCON 2025.
    • Conducted an empirical study of sentence-level requirement classification (all-mpnet-base-v2, DeBERTa-v3-base) across frozen / LoRA / full fine-tuning and context windows k=0–3, showing that structured local context adds 16 F1 points and that 4K domain-aligned samples (F1=0.894) beat 15K mixed samples (F1=0.883). Submitted to RE'26 Main.
    • Architected a 5-phase implicit keyword discovery engine combining neighbor transfer, graph walks over a 275,164-edge enriched co-occurrence graph, and UMAP+HDBSCAN cluster-gap detection, with a bounded LLM-as-judge tiebreaker and full per-keyword provenance — in revision at RE'26 RE@Next! with Amarachi Nwosu.
    • Co-designed a domain dictionary of 13,725 entries / 35,799 lookup keys / 108 detected abbreviations used as synonym normalizer, confidence booster, stoplist, and novelty flagger across the pipeline.
    • Translated research into 3 papers across CASCON and RE'26; conducted literature reviews, experimental design, ablation planning (hop depth, signal composition, dictionary impact), and supervisory reporting.
  2. Migration Engineer

    Palomino Systems · Remote

    Nov 2025 – Present
    • Designed and shipped Laravel Upgrader, a fully autonomous CLI-driven AI migration system built on a 10-step pipeline (analysis → planning → transformation → validation → self-healing) processing 10k+ LOC/run.
    • Achieved end-to-end autonomous migration of small-to-medium Laravel applications at under $80 cost/run, with 85–90% automated issue resolution via detect → fix → retry loops.
    • Reduced migration effort by 80%+ and runtime by 40–60%; validation layers cut post-migration defects by 70%+.
    • Led client communication directly with Palomino stakeholders, presenting pipeline architecture and translating technical tradeoffs into prioritized recommendations.
    • Owned the full codebase end-to-end, driving iterative improvements under tight client feedback cycles.
  3. Software Engineer

    Mediabridge · Remote

    Apr 2025 – Present
    • Led development of a modular Ad Builder Canvas Engine and dynamic campaign system, reducing frontend effort by 60%+; served as primary technical point of contact for client-side feature discussions.
    • Engineered multi-tenant backend services with RBAC access control and event-driven notification delivery; reduced deployment time by 70% via CI/CD automation.
    • Proposed and prioritized feature roadmap improvements directly with client stakeholders, translating business needs into system decisions.
  4. Laravel Developer

    Finserve Infotech · India

    Jan 2024 – Dec 2024
    • Delivered 4 ERP/POS systems, improving workflow efficiency by 30%+.

Research

Publications & thesis

Thesis in progress · MSc, Software Engineering · Ontario Tech University

Toward Automated Requirements Engineering: Empirical and Architectural Foundations for Structured Parsing and Knowledge Discovery

A four-study pipeline that automates requirements engineering (RE) using NLP and AI — from raw stakeholder prose to implicit domain knowledge discovery. The work spans structured JSON parsing, transformer-based classification with local context, and multi-signal keyword discovery, grounded in an industrial FinTech and SaaS corpus.

Supervisor: Prof. Sanaa Alwidian

Published · CASCON 2025 · 2025

Improving Reliability of LLMs in RE with Structured Confidence & Tag Governance

A modular multi-model LLM pipeline that converts raw stakeholder requirements into High-Level JSON (HLJ) artifacts with confidence-scored fields, paired with a versioned tag-governance system that catches prompt-leak exploitation, hallucinated tags, and low-agreement outputs before they reach downstream stages.

LLM governance · Requirements Engineering · Tag validation
In Revision · RE'26 · 2026

From Explicit to Implicit: Towards Traceable Keyword Discovery in Requirements Engineering

Explicit keyword extraction — even at audit-grade precision — hits a structural ceiling of roughly 1.5 canonical keywords per HLJ artifact, with Jaccard agreement below 0.11 across KeyBERT, RAKE, and YAKE. This paper presents a 5-phase implicit keyword discovery engine combining a 13,725-entry domain dictionary, a 275,164-edge enriched co-occurrence graph, UMAP+HDBSCAN clustering, and a bounded LLM-as-judge that tiebreaks only the borderline scoring band. Every implicit keyword traces back to its discovery signal(s) with full per-signal evidence.

Implicit knowledge discovery · Graph-based NLP · UMAP / HDBSCAN
RE@Next!
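
The Jaccard agreement mentioned in the abstract compares the keyword sets that different extractors return for the same artifact. A minimal sketch, assuming agreement is averaged over all extractor pairs (the paper's exact aggregation is not stated here) and using made-up keyword sets, not real KeyBERT, RAKE, or YAKE output:

```python
from itertools import combinations
from typing import Dict, Set


def jaccard(a: Set[str], b: Set[str]) -> float:
    """|A ∩ B| / |A ∪ B|, defined as 0.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def mean_pairwise_jaccard(outputs: Dict[str, Set[str]]) -> float:
    """Average Jaccard similarity over every pair of extractors."""
    pairs = list(combinations(outputs.values(), 2))
    if not pairs:
        raise ValueError("need at least two extractors to compare")
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)


# Hypothetical keyword sets for one artifact.
outputs = {
    "extractor_a": {"refund", "payment gateway"},
    "extractor_b": {"refund", "customer"},
    "extractor_c": {"ledger entry", "customer"},
}
print(round(mean_pairwise_jaccard(outputs), 3))  # 0.222
```

A value near 0 means the extractors rarely pick the same keywords, which is the kind of disagreement the abstract reports.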
Under Review · RE'26 · 2026

Towards Improving Sentence-Level Requirements Identification via Explicit Local Context Modeling

An empirical study of sentence-level requirement classification on 110 real-world FinTech and SaaS documents (~5,700 candidate sentences, balanced to 15K mixed / 4K domain-only). We compare all-mpnet-base-v2 (110M) and DeBERTa-v3-base (184M) across frozen / LoRA / full fine-tuning and context window sizes k=0–3, and show that structured local-context features — not raw concatenation — are the critical signal.

Requirements classification · Local context modeling · LoRA fine-tuning
Main Track
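
One way to picture a context window of size k is to attach the k sentences before and after each target as separate fields instead of gluing them onto the target text. The sketch below is an assumption about the general shape of such features; `with_local_context` and its field names are hypothetical and not taken from the paper.

```python
from typing import Dict, List


def with_local_context(sentences: List[str], k: int) -> List[Dict[str, object]]:
    """Attach up to k preceding and k following sentences to each target.

    Neighbours stay in their own fields rather than being concatenated
    into the target text, so a model can treat them as distinct inputs.
    """
    rows: List[Dict[str, object]] = []
    for i, target in enumerate(sentences):
        rows.append({
            "target": target,
            "prev": sentences[max(0, i - k):i],
            "next": sentences[i + 1:i + 1 + k],
            "position": i,
        })
    return rows


doc = [
    "The portal serves retail customers.",
    "The system shall lock the account after five failed logins.",
    "Locked accounts are reviewed daily.",
]
rows = with_local_context(doc, k=1)
print(rows[1]["prev"])  # ['The portal serves retail customers.']
```

With k=0 both neighbour fields are empty, which gives the no-context baseline.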

Capabilities

What I work with

Research Areas

Requirements Engineering · NLP for Software Engineering · LLM Governance · Knowledge Discovery · Empirical SE

Languages

Python · TypeScript / JavaScript · PHP (Laravel) · SQL · Bash

AI / ML

PyTorch · HuggingFace Transformers · SBERT / MPNet · DeBERTa-v3 · FAISS · LoRA Fine-tuning · LLM Applications · RAG Pipelines · UMAP + HDBSCAN

Systems

Distributed Pipelines · Pipeline Orchestration · Feedback Loops · Constraint-based Validation · Agentic Systems

Backend / Cloud

Laravel · Flask · REST APIs · Microservices · RBAC · Docker · AWS · CI/CD · GitHub Actions

Education

  • MASc, Software Engineering (Thesis)
    2024 – Present
    Ontario Tech University · Oshawa, ON, Canada

Certifications

  • AWS Certified Cloud Practitioner
    Amazon Web Services · 2024
  • OVIN Microcredential
    Ontario Vehicle Innovation Network · 2024

Contact

Let's build something

Have an AI or engineering problem worth chasing?

Happy to chat about NLP pipelines, LLM governance, autonomous migration systems, or graduate research collaboration. Email is the fastest way through.