applied ai · systems engineering

I build AI-powered operational systems that automate organizations.

Applied AI engineer focused on workflow orchestration, self-hosted infrastructure, and production automation. I design systems where deterministic logic and LLM reasoning replace repetitive human operations — measured in tickets routed, cost reduced, and hours returned.

View systems → · Resume · GitHub · LinkedIn

based: Bangalore / Pune · IN

stack: n8n · Gemma · React · Node

status: open to founding-engineer roles

live · argus.routing node-04 · uptime 41d 06h

throughput: 312/day

accuracy: 99.6%

p50 latency: 820ms

infra cost: ₹438/mo


operational leverage

snapshot · last 90 days

300+ tickets routed / day (argus sentinel · production)

99.6% routing accuracy (hybrid rules + llm reasoning)

₹438 monthly infra cost (down from ₹50,000, api alt.)

20K+ plugins audited (visual headless cms · 5 yrs)

600 reels / month capacity (autonomous content pipeline)

₹0.73 cost per reel (end-to-end orchestration)

engineering philosophy

/notes/01.md

I'm interested in systems where human effort is repeatedly wasted. Most operational inefficiencies exist because workflows evolve faster than tooling.

My work focuses on identifying those bottlenecks and replacing them with AI-assisted operational infrastructure. I move across AI integration, backend systems, orchestration pipelines, automation, and product engineering — whatever the system requires.

The goal is simple: reduce operational friction, increase leverage, build systems that scale human capability.

“The best operational systems remove repetitive decisions from humans without removing human control.”

production systems

Three systems. Each one replaces a recurring operation with measurable leverage.

Case studies, not project cards. Each entry covers the problem, the tradeoffs taken, and where it goes next.

sys.01

Argus Sentinel

AI-powered operational ticket routing infrastructure

production

An autonomous support ticket routing system that eliminates manual triage. It combines a deterministic rules engine with a self-hosted Gemma 3 LLM to classify and assign operational tickets at production scale — running on self-hosted infrastructure orchestrated through n8n.

Problem

A support operation was burning two full-time triagers on ticket assignment. Routing was inconsistent, SLA breaches were rising, and the cost of managed AI APIs (~₹50,000/mo at projected scale) made the obvious solution non-viable.

Constraints

Predictable routing — no hallucinated assignees
Self-hosted only — no ticket data leaving infra
Operate at 300+ tickets/day with sub-second decisions
Overridable by a human in any edge case; full audit log required

Tradeoffs

Hybrid rules + LLM

Deterministic for the 80% predictable cases; LLM only when rule confidence falls below threshold. Cuts hallucination risk and inference cost.

Self-hosted Gemma 3

Ticket bodies contain customer PII. Self-hosting removes the privacy and latency tax of managed APIs.

Rules evaluated first

Predictability is a feature, not a limitation. Rules give the team an auditable decision boundary.

n8n for orchestration

A visual graph means operations can inspect and edit flows without me. Less bus-factor risk than custom code.
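
A minimal sketch of the rules-first flow described in these tradeoffs, not the production implementation. The type names, the two threshold values, the `manual-triage` queue, and the `classify` callback (standing in for the self-hosted model call) are all assumptions made for illustration.

```typescript
// Hypothetical shapes: the real Argus schema is not published here.
interface Ticket {
  id: string;
  subject: string;
  body: string;
}

interface Decision {
  queue: string;
  confidence: number; // 0..1
  source: "rule" | "llm" | "human";
  reason: string; // what would be written to the audit log
}

interface Rule {
  name: string;
  queue: string;
  confidence: number;
  matches: (t: Ticket) => boolean;
}

// Stand-in for the self-hosted model call.
type LlmClassifier = (t: Ticket) => Promise<{ queue: string; confidence: number }>;

const RULE_THRESHOLD = 0.9; // assumed value
const LLM_THRESHOLD = 0.7; // assumed value

async function route(
  ticket: Ticket,
  rules: Rule[],
  allowedQueues: Set<string>,
  classify: LlmClassifier,
): Promise<Decision> {
  // 1. Rules first: take the highest-confidence rule that matches.
  const hit = rules
    .filter((r) => r.matches(ticket))
    .sort((a, b) => b.confidence - a.confidence)[0];
  if (hit && hit.confidence >= RULE_THRESHOLD) {
    return { queue: hit.queue, confidence: hit.confidence, source: "rule", reason: `rule:${hit.name}` };
  }

  // 2. LLM only when no rule is confident enough. Its answer must be on the
  //    allow-list, so an invented assignee can never be returned.
  const guess = await classify(ticket);
  if (allowedQueues.has(guess.queue) && guess.confidence >= LLM_THRESHOLD) {
    return { queue: guess.queue, confidence: guess.confidence, source: "llm", reason: "llm:classified" };
  }

  // 3. Anything else goes to a human.
  return { queue: "manual-triage", confidence: 0, source: "human", reason: "fallback:low-confidence" };
}
```

Every branch returns a `reason`, so each decision can be logged and read back later, whichever path produced it.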

Scaling thoughts

Current ceiling: ~1.2k tickets/day on a single EC2 node before LLM queue depth grows
Next: Redis-backed job queue + horizontal worker pool with shared embedding cache
Adding observability via OpenTelemetry → Grafana stack for routing drift detection
Quantization to 4-bit Q4_K_M gives ~2× throughput at 1.3% accuracy cost

Architecture notes → GitHub

/architecture/argus-sentinel.svg v 2.4 · detailed

sys.02

Autonomous Content Pipeline

AI workflow system for research, script generation & content ops

production

A self-hosted workflow that converts a single Telegram message into three research-backed video scripts. Trend search, context ranking, generation, and delivery happen autonomously — built in under 18 hours, running for ₹0.73 per reel.

Problem

A content team was spending 4–6 hours per reel on trend research and scripting. The work was repetitive enough to automate, but creative enough that off-the-shelf tooling produced generic output.

Tradeoffs

Telegram as entry point

Operators already lived there. Adding a new dashboard would have lowered adoption.

Serper over scraping

Reliable, low-latency, structured. Spending ₹0.40/run on search is worth not maintaining a scraper.

JSON-structured output

Forces the LLM to commit to a schema. Downstream parsing needs no string matching.
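
A minimal sketch of what schema-first parsing can look like, assuming the model is asked for a JSON array of three scripts. The `Script` field names (`hook`, `body`, `cta`) are hypothetical; the pipeline's real schema is not shown on this page.

```typescript
// Hypothetical schema: field names are illustrative only.
interface Script {
  hook: string;
  body: string;
  cta: string;
}

// Runtime type guard: the only place that inspects the model's output shape.
function isScript(v: unknown): v is Script {
  if (typeof v !== "object" || v === null) return false;
  const o = v as Record<string, unknown>;
  return (
    typeof o["hook"] === "string" &&
    typeof o["body"] === "string" &&
    typeof o["cta"] === "string"
  );
}

// Returns the parsed scripts, or null so the caller can retry the generation.
function parseScripts(raw: string, expected: number = 3): Script[] | null {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return null; // not JSON at all, e.g. the model added chatty prose
  }
  if (!Array.isArray(data) || data.length !== expected) return null;
  return data.every(isScript) ? (data as Script[]) : null;
}
```

A `null` result is a clear retry signal, which keeps malformed output from reaching delivery.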

Scaling thoughts

Adding voice generation + auto-cut would close the loop end-to-end
Per-creator context profiles for tone consistency across runs

Architecture notes → GitHub

/architecture/content-pipeline.svg v 1.2 · linear

sys.03

Visual Headless CMS

Custom CMS platform built for scalable hotel website operations

production

A headless CMS purpose-built for a portfolio of hotel sites. Visual editor with live preview, automatic media optimization, GitHub-versioned content, and a reusable plugin system that lets non-engineers compose pages without touching a deploy pipeline.

Problem

Operating dozens of hotel sites meant content edits either waited on engineering time or shipped as risky direct-to-prod changes. A generic CMS couldn't model the inheritance patterns the design system needed.

Tradeoffs

GitHub as source of truth

Every change is a commit. Reversion, audit, and review come for free.

Plugin-as-block model

Marketers compose pages from sanctioned blocks. Engineers ship the blocks once.

Design inheritance

A property override only writes the diff. Brand changes propagate without bulk edits.
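
A minimal sketch of diff-only overrides, assuming themes are flat key/value maps. The `Theme` type, the function names, and the sample values are hypothetical; nested design tokens would need a recursive diff.

```typescript
type Theme = Record<string, string>;

// Store only the keys whose value differs from the brand defaults.
function diffFromBrand(brand: Theme, edited: Theme): Theme {
  const override: Theme = {};
  for (const key of Object.keys(edited)) {
    if (brand[key] !== edited[key]) override[key] = edited[key];
  }
  return override;
}

// Resolve at read time: brand defaults first, property overrides on top.
function resolveTheme(brand: Theme, override: Theme): Theme {
  return { ...brand, ...override };
}

const brand: Theme = { primary: "#0a3d62", font: "Inter", radius: "8px" };

// A property editor changes only the font, so only the font is stored.
const override = diffFromBrand(brand, { primary: "#0a3d62", font: "Playfair", radius: "8px" });

// A later brand-wide colour change reaches the property without a bulk edit.
const rebranded: Theme = { ...brand, primary: "#14532d" };
const resolved = resolveTheme(rebranded, override);
// resolved.primary is the new brand colour; resolved.font is still the override.
```

Because the stored override never contains inherited values, a brand change cannot be shadowed by a stale copy of the old default.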

Architecture notes → GitHub

/architecture/visual-cms.svg wireframe · editor view

decision logs

The tradeoffs behind each system, in plain language.

Architectures are choices. Here are mine and the reasoning behind them.

01 argus

Why hybrid rules + LLM, not LLM-only?

→ Deterministic flows are auditable; LLMs are not.
→ Rules cover the predictable 80% at near-zero cost.
→ LLM is invoked only when rule confidence falls below threshold, which bounds both spend and hallucination surface.
→ Operators trust systems they can debug. A rule that fired is a rule you can read.

02 infra

Why self-host instead of API wrappers?

→ Ticket bodies, content drafts, internal data: none of it should leave infra.
→ API costs scale with usage. Self-hosted costs scale with hardware, which stays fixed until the next capacity step.
→ Latency at the edge of a VPS is more predictable than latency to a third-party provider.
→ Long-term: every API-wrapped product becomes a margin-compression problem.

03 orchestration

Why n8n over a custom orchestrator?

→ Visual workflows reduce bus-factor risk: operations can read and edit them.
→ Iteration speed beats elegance when systems are still finding their shape.
→ Custom code earns its place once a workflow is load-bearing and stable, not before.

04 economics

Why measure cost per ticket / per reel?

→ A system that works but costs more than the human it replaces is a science project.
→ Unit economics force honest design conversations early.
→ Cost becomes a forcing function for architectural decisions (quantization, batching, caching).
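
As a worked example of the unit-cost arithmetic, here is a small sketch using figures quoted on this page (₹438/mo, 312 tickets/day, ₹50,000/mo for the API alternative). The 30-day month is an assumption. The API figure was quoted at projected scale, so dividing it by today's volume is only illustrative.

```typescript
// Cost per unit of work = monthly infra cost / units processed per month.
function unitCost(monthlyCost: number, unitsPerMonth: number): number {
  if (unitsPerMonth <= 0) throw new RangeError("unitsPerMonth must be positive");
  return monthlyCost / unitsPerMonth;
}

const DAYS_PER_MONTH = 30; // assumption
const ticketsPerMonth = 312 * DAYS_PER_MONTH; // 9,360 at the current daily throughput

const perTicket = unitCost(438, ticketsPerMonth);
// Illustrative only: assumes the API bill would be incurred at the same volume.
const perTicketApi = unitCost(50000, ticketsPerMonth);

console.log(perTicket.toFixed(3)); // 0.047
console.log(perTicketApi.toFixed(2)); // 5.34
```

Under these assumptions a routed ticket costs under five paise self-hosted, against roughly ₹5 through a managed API.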

what broke

Production teaches faster than design docs.

Three failure modes I've hit running these systems, and what they taught me.

incident.01

Webhook storms during deploy

Jira retried failed webhooks aggressively during a 90-second deploy window — queue depth spiked from 0 to 1,400 inside two minutes. Added an idempotency layer keyed by ticket id + payload hash and a Redis-backed dedupe TTL.
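
A minimal sketch of that idempotency check, with an in-memory map standing in for Redis so the example runs on its own. In Redis the same claim can be made atomically with `SET key 1 NX EX <seconds>`. The class name, key format, and ticket ids are hypothetical.

```typescript
import { createHash } from "node:crypto";

// In-memory stand-in for a Redis key with a TTL.
class DedupeStore {
  private readonly expiry = new Map<string, number>();
  private readonly ttlMs: number;

  constructor(ttlMs: number) {
    this.ttlMs = ttlMs;
  }

  // Returns true only the first time a key is seen inside the TTL window.
  claim(key: string, now: number = Date.now()): boolean {
    const expiresAt = this.expiry.get(key);
    if (expiresAt !== undefined && expiresAt > now) return false; // duplicate delivery
    this.expiry.set(key, now + this.ttlMs);
    return true;
  }
}

// Key = ticket id + hash of the raw payload: a retry of the same event
// collapses to one key, while a genuine update to the ticket does not.
function idempotencyKey(ticketId: string, rawPayload: string): string {
  const hash = createHash("sha256").update(rawPayload).digest("hex");
  return `${ticketId}:${hash}`;
}

const store = new DedupeStore(60_000);
const key = idempotencyKey("OPS-1", '{"status":"open"}');
const first = store.claim(key); // true: enqueue the ticket
const retry = store.claim(key); // false: drop the retried webhook
```

Hashing the raw body means a re-serialized payload with different key order counts as a new event, which errs on the side of processing rather than dropping.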

incident.02

LLM drift on edge categories

Routing accuracy on a rare category quietly fell from 98% to 91% over six weeks as the support team coined new internal terms. Added a weekly drift report on misroutes flagged by ops, and a one-shot fine-tune pipeline.
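
A minimal sketch of how such a drift report could be computed from ops-flagged misroutes, assuming each routed ticket carries a category and a misroute flag. The record shape and the 3-point alert threshold are assumptions, not the report's actual definition.

```typescript
interface RoutedTicket {
  category: string;
  misrouted: boolean; // flagged by ops
}

interface DriftRow {
  category: string;
  baseline: number; // accuracy in the reference window, 0..1
  current: number; // accuracy in the latest window, 0..1
  drop: number;
}

function accuracyByCategory(tickets: RoutedTicket[]): Map<string, number> {
  const totals = new Map<string, { n: number; ok: number }>();
  for (const t of tickets) {
    const row = totals.get(t.category) ?? { n: 0, ok: 0 };
    row.n += 1;
    if (!t.misrouted) row.ok += 1;
    totals.set(t.category, row);
  }
  const accuracy = new Map<string, number>();
  totals.forEach((row, category) => accuracy.set(category, row.ok / row.n));
  return accuracy;
}

// Lists categories whose accuracy fell by more than maxDrop, worst first.
function driftReport(
  baseline: RoutedTicket[],
  current: RoutedTicket[],
  maxDrop: number = 0.03, // assumed alert threshold
): DriftRow[] {
  const before = accuracyByCategory(baseline);
  const rows: DriftRow[] = [];
  accuracyByCategory(current).forEach((cur, category) => {
    const base = before.get(category);
    if (base === undefined) return; // new category: nothing to compare against
    const drop = base - cur;
    if (drop > maxDrop) rows.push({ category, baseline: base, current: cur, drop });
  });
  return rows.sort((a, b) => b.drop - a.drop);
}
```

Reporting per category matters here: a fall from 98% to 91% on a rare category barely moves the overall accuracy number.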

incident.03

n8n single-node ceiling

Hit a CPU ceiling around 900 tickets/day on a t3.medium. Vertical scaling bought time; the real fix is decoupling ingest (cheap) from inference (expensive) behind a queue.
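
A minimal in-process sketch of the decoupling idea: ingest enqueues and returns at once, while inference runs with bounded concurrency. The class and its names are hypothetical. A production version would put the queue in a separate store such as Redis, so workers can scale across machines.

```typescript
type Job<T> = () => Promise<T>;

class InferenceQueue {
  private readonly pending: Array<() => void> = [];
  private readonly concurrency: number;
  private active = 0;

  constructor(concurrency: number) {
    this.concurrency = concurrency;
  }

  // Queue depth is the number to watch: it grows when inference falls behind ingest.
  get depth(): number {
    return this.pending.length;
  }

  // Ingest path: cheap. Registers the job and returns a promise immediately.
  enqueue<T>(job: Job<T>): Promise<T> {
    return new Promise<T>((resolve, reject) => {
      this.pending.push(() => {
        this.active += 1;
        job()
          .then(resolve, reject)
          .then(() => {
            this.active -= 1;
            this.drain(); // a slot freed up: start the next waiting job
          });
      });
      this.drain();
    });
  }

  // Inference path: expensive. Never more than `concurrency` jobs at once.
  private drain(): void {
    while (this.active < this.concurrency && this.pending.length > 0) {
      const next = this.pending.shift();
      if (next) next();
    }
  }
}
```

With this split, a burst of webhooks raises `depth` instead of saturating the CPU, and the worker count becomes the one knob that sets inference load.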

systems timeline

Frontend → orchestration → AI infrastructure.

The thread is the same: replacing repetitive work with systems that scale.

2020

Frontend engineering

React, design systems, hotel website portfolio

2021

Plugin architecture

Reusable block system, dynamic design inheritance

2022

Custom headless CMS

Visual editor, live preview, GitHub-versioned content

2023

Workflow automation

n8n pipelines, Apps Script orchestration, internal tools

2024

AI integration

LLM routing, classification systems, hybrid decisioning

2025

Self-hosted AI infra

Argus Sentinel, content pipeline, operational dashboards

operational stack

Tools grouped by the layer they operate on.

Not a skill grid — the actual stack these systems run on, day to day.

AI Infrastructure

self-hosted reasoning · llm deployment

Gemma 3 · LLaMA · Ollama · vLLM · Q4 quantization · embedding stores

Automation Layer

orchestration · pipeline ops

n8n · Google Apps Script · cron · webhooks · Telegram Bot API · Serper

Engineering Layer

product & systems code

React · TypeScript · Node.js · MongoDB · Express · REST / webhook design

Infrastructure

compute · network · ops

AWS EC2 · Docker · Nginx · VPS · PM2 · GitHub Actions

proof layer

Repositories. Architecture notes. Things you can read.

The work exists. Code, READMEs, deployment notes — no marketing wrapper.

argus-sentinel

AI-powered ticket routing system. Hybrid rules + LLM, self-hosted on EC2.

TypeScript

n8n · gemma · production

content-pipeline

Telegram → research → script generation pipeline. ₹0.73 per reel.

JavaScript

n8n · serper · llm

visual-cms

Headless CMS with visual editor, plugin system, GitHub versioning.

TypeScript

react · platform

ops-dashboards

Self-hosted operational dashboard primitives. Queue depth, routing state, drift.

TypeScript

observability

why teams hire me

Two-minute version, for the recruiter who's already on slide four of a stack.

If you're sourcing for a founding engineer, applied AI, or automation infrastructure role, these are the six things I'm consistently good at.

Get in touch →

Operational automation

Replacing recurring human ops with systems that measure their own impact.

AI workflow systems

Hybrid rules + LLM architectures with bounded cost and audit trails.

Infra ownership

From EC2 provisioning to Nginx config to production observability.

Rapid prototyping

Ship working systems in days, not quarters. Iterate based on production data.

Cost optimization

Architectures designed against unit economics, not theoretical scale.

End-to-end architecture

Ingest → processing → decisioning → action → feedback. One person, one mental model.

contact

Building something operational?
Let's talk.

I'm open to founding-engineer roles, applied-AI engineering positions, and selective contract work where the system genuinely changes how an organization operates.

email: hello@abhisheksonawane.dev

github: github.com/abhishek-sonawane

linkedin: linkedin.com/in/abhishek-sonawane

resume: abhishek-sonawane.pdf
