Argus Sentinel
AI-powered operational ticket routing infrastructure
An autonomous support ticket routing system that eliminates manual triage. It combines a deterministic rules engine with a self-hosted Gemma 3 LLM to classify and assign operational tickets at production scale — running on self-hosted infrastructure orchestrated through n8n.
A support operation was burning two full-time triagers on ticket assignment. Routing was inconsistent, SLA breaches were rising, and the cost of managed AI APIs (~₹50,000/mo at projected scale) made the obvious solution non-viable.
- Predictable routing — no hallucinated assignees
- Self-hosted only — no ticket data leaving infra
- Operate at 300+ tickets/day with sub-second decisions
- Replaceable by a human in any edge case — full audit log required
- Current ceiling: ~1.2k tickets/day on a single EC2 node before LLM queue depth grows
- Next: Redis-backed job queue + horizontal worker pool with shared embedding cache
- Adding observability via OpenTelemetry → Grafana stack for routing drift detection
- Quantization to 4-bit Q4_K_M gives ~2× throughput at 1.3% accuracy cost