01

Ayaan Khan

Applied AI & ML

open to work
Chicago, IL · ayaanahmedkhan12@gmail.com

Building
VeritasLayer (deployed)

Evidence-traceable obligations, risk alerts & summaries from operational documents

Clutch (staging)

AI course generation platform

SyntecAgent (deployed)

Coding, Classification & Naming (CRUD) Agent

This site

continuously refining

Using

Hardware

MacBook Pro M4 Pro

iPhone 17 Pro

Samsung Z Fold 7

Software

VS Code

Claude

Codex

Gemini

Ghostty

Learning About

Agentic workflows

in construction and real estate.

Watching

Currently

House of Cards

Show · 2025

Past

Industry

Silicon Valley

Severance

Succession

02

About Me

AI engineer building production LLM systems — RAG pipelines, semantic search, multi-agent workflows. Originally from New Delhi, based in Chicago. Outside the terminal, I'm an avid football fan (the real kind), play guitar, and spend too much time thinking about music and art. I think good taste is a technical skill.

Education

B.S. in Artificial Intelligence

Minor in Architecture

Illinois Institute of Technology

Expected May 2026

DSA, AI, ML, NLP, DBMS, Assembly, Data Mining, Discrete Math, Linear Algebra, Probability, Statistics, OOP

Where I'm Headed

After graduation in May 2026, I want to join a team where I can ship AI systems that actually matter — whether that is at a fast-moving startup or a company building infrastructure for the next wave of intelligent software. Longer term, I want to build products of my own at the intersection of AI and systems design, and eventually contribute to research that makes LLM pipelines more reliable and interpretable.

03

Experience

AI & Digital Development Intern

The Syntec Group

  • Built and deployed a semantic RAG chatbot on Chatbase over firm documents, delivering cited, context-grounded answers to reduce lookup time and improve response consistency.
  • Developed an internal agentic system using OpenAI function calling to manage building module codes through natural language, with confirmation flows for destructive operations and ChromaDB sync for semantic search.
  • Implemented an ingestion and retrieval workflow (chunking, embeddings, indexing) across PDFs, CSVs, website pages, and WordPress blog posts, with embedding caching via Redis that reduced inference cost by approximately 65%.
  • Led an information architecture and website redesign that improved navigation and access to resources; used engagement analysis to iterate on content performance.
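
The embedding caching mentioned above can be illustrated with a minimal sketch. It assumes chunks are keyed by a hash of their content so identical text is embedded only once. A plain dictionary stands in for Redis here, and `fake_embed` is a hypothetical placeholder for a real embedding call; neither is taken from the actual system.

```python
import hashlib
from typing import Callable, Dict, List


class EmbeddingCache:
    """Cache embeddings by content hash so identical chunks are embedded once."""

    def __init__(self, embed_fn: Callable[[str], List[float]]) -> None:
        self._embed_fn = embed_fn
        # Assumption: an in-memory dict stands in for a Redis instance.
        self._store: Dict[str, List[float]] = {}
        self.hits = 0
        self.misses = 0

    @staticmethod
    def _key(text: str) -> str:
        # Hash the chunk text so the key is short and stable.
        return "emb:" + hashlib.sha256(text.encode("utf-8")).hexdigest()

    def get(self, text: str) -> List[float]:
        key = self._key(text)
        if key in self._store:
            self.hits += 1
            return self._store[key]
        # Cache miss: pay for the embedding call once, then store the result.
        self.misses += 1
        vector = self._embed_fn(text)
        self._store[key] = vector
        return vector


def fake_embed(text: str) -> List[float]:
    # Hypothetical placeholder for a real embedding API call.
    return [float(len(text)), float(sum(map(ord, text)) % 97)]


cache = EmbeddingCache(fake_embed)
for chunk in ["module A-101", "module B-202", "module A-101"]:
    cache.get(chunk)
print(cache.hits, cache.misses)
```

The repeated chunk is served from the cache, so only two of the three lookups reach the embedding function. Actual savings depend on how often content repeats across ingestion runs.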

Chicago, Illinois

May 2025 – Present

Co-Founder

Volunteers.Covihelp

  • Created and managed a 24/7 helpline during India's second COVID wave, connecting thousands of patients with critical resources like oxygen, beds, and medicines.

Remote (India)

May 2021 – July 2021

Project Manager

Excelerate (Globalshala)

  • Led a global team to organize an academic event with a $30,000 budget, managing documentation, risk assessment, and external outsourcing.

New Delhi, India

June 2023 – July 2023

04

Projects

Clutch (staging)

Built a SaaS, currently deployed to staging, that generates research-backed courses through a five-stage agent pipeline. Persisted job state, retry policies, and failure isolation keep long-running workflows reliable.

FastAPI, Postgres/pgvector, Redis, Inngest, SvelteKit, LiteLLM, PydanticAI, Docker, Sentry, PostHog
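
As a rough illustration of persisted job state with per-stage retries, the sketch below runs stages in order and records each completed stage in the job record, so a rerun skips finished work. It is plain Python, not Inngest code, and the stage names (`outline`, `research`) and job fields are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

Stage = Tuple[str, Callable[[dict], dict]]


def run_pipeline(stages: List[Stage], job: Dict, max_attempts: int = 3) -> Dict:
    """Run stages in order, recording progress in `job` so a rerun can resume."""
    done = job.setdefault("done", [])
    data = job.setdefault("data", {})
    for name, fn in stages:
        if name in done:
            continue  # stage already completed in an earlier run
        for attempt in range(1, max_attempts + 1):
            try:
                # The stage receives a copy, so a failed attempt cannot
                # leave partial writes in the shared job data.
                data.update(fn(dict(data)))
                done.append(name)
                break
            except Exception as exc:
                job["last_error"] = f"{name} attempt {attempt}: {exc}"
        else:
            # All attempts failed: stop here and keep earlier results intact.
            job["status"] = "failed"
            job["failed_stage"] = name
            return job
    job["status"] = "completed"
    return job


calls = {"research": 0}


def outline(data: dict) -> dict:
    return {"outline": ["intro", "basics"]}


def research(data: dict) -> dict:
    # Fails once to simulate a transient upstream error, then succeeds.
    calls["research"] += 1
    if calls["research"] == 1:
        raise RuntimeError("transient upstream error")
    return {"sources": len(data["outline"])}


job = run_pipeline([("outline", outline), ("research", research)], {"id": "job-1"})
print(job["status"], job["done"])
```

In a real deployment the job record would live in a database rather than a dictionary, and retries would usually include a backoff delay.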

SyntecAgent (deployed)

Built an agentic Coding, Classification & Naming system using OpenAI function calling that enables users to query, add, update, and delete BIM module codes through natural language with automated sub-code assignment.

Flask, OpenAI Function Calling, SQLite, ChromaDB, React, Docker
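
The Experience section mentions confirmation flows for destructive operations. One way such a gate might look is sketched below. The tool names (`query_code`, `add_code`, `update_code`, `delete_code`), the sample module code, and the choice to treat both updates and deletes as destructive are assumptions for illustration, not the deployed implementation.

```python
from typing import Dict, Optional

# Assumption: updates and deletes both require explicit confirmation.
DESTRUCTIVE = {"delete_code", "update_code"}


class CodeStore:
    """Tiny in-memory stand-in for the module-code database."""

    def __init__(self) -> None:
        self.codes: Dict[str, str] = {"A-101": "Exterior wall panel"}
        self.pending: Optional[dict] = None

    def dispatch(self, name: str, args: dict) -> str:
        """Handle a tool call requested by the model."""
        if name in DESTRUCTIVE:
            # Park the call instead of running it, and ask the user first.
            self.pending = {"name": name, "args": args}
            return f"Confirm {name} on {args['code']}? (yes/no)"
        return self._execute(name, args)

    def confirm(self, answer: str) -> str:
        """Run or discard the parked call based on the user's reply."""
        if self.pending is None:
            return "Nothing to confirm."
        call, self.pending = self.pending, None
        if answer.strip().lower() != "yes":
            return "Cancelled."
        return self._execute(call["name"], call["args"])

    def _execute(self, name: str, args: dict) -> str:
        code = args["code"]
        if name == "query_code":
            return self.codes.get(code, "not found")
        if name == "add_code":
            self.codes[code] = args["label"]
            return f"Added {code}."
        if name == "update_code":
            if code not in self.codes:
                return "not found"
            self.codes[code] = args["label"]
            return f"Updated {code}."
        if name == "delete_code":
            removed = self.codes.pop(code, None)
            return f"Deleted {code}." if removed is not None else "not found"
        return f"Unknown tool: {name}"


store = CodeStore()
prompt = store.dispatch("delete_code", {"code": "A-101"})
still_there = "A-101" in store.codes  # nothing is deleted before confirmation
result = store.confirm("yes")
print(prompt, still_there, result)
```

Keeping the gate in the dispatcher rather than in the prompt means the model cannot skip it, whatever it decides to call.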

VeritasLayer (deployed)

Built an evidence-traceable document intelligence platform that extracts obligations, risk signals, and structured summaries from operational documents using LLM pipelines with source-level citation.

FastAPI, LLMs, Vector Search, PDF Ingestion, SvelteKit

Syntec AI Chatbot (deployed)

Shipped a semantic search and retrieval-augmented chatbot using GPT-4o and ChromaDB embeddings to deliver question answering across PDF documents, CSVs, blog posts, and website content with automated source citations.

Flask, React, Vite, ChromaDB, OpenAI GPT-4o, DeepSeek, Redis, Docker, Nginx, Vercel

InvestoChat

RAG system for real estate investment queries with multi-path retrieval (pgvector + SQL fallbacks with MMR), OCR processing, and automated table extraction from PDF brochures.

FastAPI, Postgres/pgvector, OCR, WhatsApp Business API, Airtable
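
For readers unfamiliar with MMR (maximal marginal relevance), here is a small self-contained sketch of the selection step. It picks results that are relevant to the query but not redundant with results already chosen. The vectors and the `lam` default are made-up illustration values, and this is not the project's code.

```python
import math
from typing import List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def mmr(query: Sequence[float], docs: List[Sequence[float]],
        k: int, lam: float = 0.5) -> List[int]:
    """Return indices of up to k docs, trading relevance against redundancy.

    lam=1.0 ranks by relevance only; lower values penalise near-duplicates.
    """
    selected: List[int] = []
    candidates = list(range(len(docs)))
    while candidates and len(selected) < k:

        def score(i: int) -> float:
            relevance = cosine(query, docs[i])
            # Redundancy is similarity to the closest already-selected doc.
            redundancy = max(
                (cosine(docs[i], docs[j]) for j in selected), default=0.0
            )
            return lam * relevance - (1.0 - lam) * redundancy

        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected


query = [1.0, 0.0]
docs = [
    [1.0, 0.10],   # highly relevant
    [1.0, 0.12],   # near-duplicate of the first
    [0.6, -0.80],  # less relevant but different
]
print(mmr(query, docs, k=2))           # diversity-aware selection
print(mmr(query, docs, k=2, lam=1.0))  # relevance-only selection
```

With the default `lam`, the near-duplicate is skipped in favour of the more distinct third document; with `lam=1.0` the two most similar documents are returned.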

Trend Analyzer for Raw Materials

Cotton price forecasting using Facebook Prophet with external regressors (oil, gas, soybeans); evaluated with MAPE.

Python, Prophet, Pandas, Matplotlib
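
MAPE (mean absolute percentage error) is the average of the absolute errors expressed as a percentage of the actual values. A minimal sketch follows; the numbers are hypothetical and are not real cotton prices or results from this project.

```python
from typing import Sequence


def mape(actual: Sequence[float], forecast: Sequence[float]) -> float:
    """Mean absolute percentage error, returned in percent."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("inputs must be non-empty and the same length")
    if any(a == 0 for a in actual):
        # MAPE is undefined when an actual value is zero.
        raise ValueError("actual values must be non-zero")
    total = sum(abs((a - f) / a) for a, f in zip(actual, forecast))
    return 100.0 * total / len(actual)


# Hypothetical values for illustration only.
actual = [80.0, 100.0, 125.0]
forecast = [84.0, 95.0, 125.0]
print(round(mape(actual, forecast), 2))
```

Because each error is scaled by the actual value, MAPE is easy to compare across price levels, but it becomes unstable when actual values are close to zero.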

Sports and Metrics Tracker

Built an end-to-end soccer video analysis pipeline using YOLOv8 detection and ByteTrack tracking, with temporal smoothing for stable motion analytics, CPU-only local processing, and robust cross-platform video input and output.

Python, YOLOv8, ByteTrack, OpenCV
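
The description above does not say which smoothing method is used. One common option is an exponential moving average over each track's position, sketched below under that assumption. The class name, `alpha` value, and coordinates are hypothetical.

```python
from typing import Dict, Tuple


class TrackSmoother:
    """Exponential moving average of (x, y) positions, kept per track ID."""

    def __init__(self, alpha: float = 0.5) -> None:
        if not 0.0 < alpha <= 1.0:
            raise ValueError("alpha must be in (0, 1]")
        self.alpha = alpha
        self._state: Dict[int, Tuple[float, float]] = {}

    def update(self, track_id: int, x: float, y: float) -> Tuple[float, float]:
        """Blend the new detection with the track's previous smoothed position."""
        prev = self._state.get(track_id)
        if prev is None:
            smoothed = (x, y)  # first sighting: nothing to blend with yet
        else:
            a = self.alpha
            smoothed = (a * x + (1 - a) * prev[0], a * y + (1 - a) * prev[1])
        self._state[track_id] = smoothed
        return smoothed


smoother = TrackSmoother(alpha=0.5)
# Jittery detections for one hypothetical player track.
for x, y in [(100.0, 50.0), (110.0, 50.0), (90.0, 54.0)]:
    print(smoother.update(7, x, y))
```

A lower `alpha` gives steadier trajectories at the cost of lagging behind fast movement, so the value is usually tuned against the frame rate.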

05

Communication

My Approach to Technical Communication

Technical work only matters if others can act on it. My approach: meet the audience where they are — precise specs for engineers, concrete outcomes for stakeholders. A confusing interface is a communication failure. In AI especially, closing the gap between what a model does and what people think it does is one of the highest-value skills on a team.

01 · Academic Poster

IPRO Innovation Day — Research Poster

Audience: Faculty, industry judges & peers

Sole designer of this poster for Illinois Tech's IPRO Innovation Day. The format demands compression — every word earns its place — and the mixed audience of faculty and industry judges forced me to layer technical depth with accessible framing. Fielding live questions sharpened my ability to adapt on the fly.

02 · Public Awareness & Operations

Volunteers.Covihelp — Crisis Communication Campaign

Audience: General public, patients & families across India

Co-founded a volunteer helpline during India's second COVID wave, writing outreach and coordination copy under real pressure. The work was high-stakes, time-sensitive, and directed at people in distress, so clarity and empathy couldn't be traded off. It taught me that good communication is about removing friction for the reader, not earning credit for the writer.

03 · Technical Write-up

SyntecAgent — Technical Documentation & Stakeholder Handoff

Audience: Engineering team & non-technical stakeholders at The Syntec Group

Delivered SyntecAgent with docs written for two audiences simultaneously: developers maintaining the code and managers approving its use. Modularizing the document into clearly separated layers — rather than blending jargon with prose — made both versions sharper. Good technical writing is information architecture.

04 · Slide Deck

Clutch — Pitch Deck & Product Narrative

Audience: Potential users, collaborators & early investors

Pitching Clutch meant arguing why the product should exist to people with no reason to care yet. I led with the problem, not the tech — a deliberate choice that respects the audience's skepticism. Compressing a multi-stage AI pipeline into one compelling page made me a sharper communicator across every medium.

06

Systems Thinking

  1. Ingestion & preprocessing

    Clean, chunk, and index messy docs so retrieval starts from ground truth

  2. Hybrid retrieval + reranking

    Vector search alone is unreliable on structured data, so combine it with lexical matching and rerank

  3. Agentic workflows & guardrails

    Function-calling agents with confirmation gates before anything destructive touches production

  4. Structured outputs & validation

    Every LLM response is parsed through Pydantic schemas; if it doesn't conform, it doesn't ship

  5. Latency/cost routing

    LiteLLM across models, Redis caching, and fallback LLMs, because not every query needs GPT-4o

  6. Observability & failure modes

    Inngest job tracing and eval harnesses to catch breaks before users do
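
To make the hybrid retrieval step concrete, here is a minimal sketch of one way to combine a vector ranking with a lexical ranking: reciprocal rank fusion (RRF). RRF is a technique chosen for this example, not something the list names, and the document IDs are hypothetical. A reranking model would normally run on the fused list afterwards.

```python
from collections import defaultdict
from typing import Dict, List


def reciprocal_rank_fusion(rankings: List[List[str]], k: int = 60) -> List[str]:
    """Fuse several ranked lists of document IDs into one ranking.

    Each document earns 1 / (k + rank) from every list it appears in,
    so items ranked well by more than one retriever rise to the top.
    """
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by score (descending), breaking ties by ID for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical results from two retrievers for the same query.
vector_hits = ["doc_a", "doc_b", "doc_c"]
lexical_hits = ["doc_c", "doc_a", "doc_d"]
fused = reciprocal_rank_fusion([vector_hits, lexical_hits])
print(fused)
```

Documents found by both retrievers (`doc_a`, `doc_c`) end up ahead of those found by only one, which is the behaviour a hybrid setup is after.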

© 2026 Ayaan Khan

Built with SvelteKit