hardware humbles you

UnifyOS: AI-powered IoT that cuts hotel emergency response from 3.5 mins to 45 seconds

2026 · 8 min read

built a unified ESP32 + Gemini emergency platform for Rs.1,254/node in a few late nights

I've been really interested in how insanely fragmented hotel safety infrastructure is, and how cheap it is to actually fix. Fire alarms that don't talk to staff phones. Staff phones that don't talk to guests. Guests who don't speak the local language.

The 2019 Karol Bagh hotel fire in New Delhi killed 17 people. The fire alarm went off correctly. No automated mechanism existed to alert staff, start an evacuation, or notify a single guest in a language they understood.

So I did what I do best, and built a quick end-to-end system! Found some really interesting results on AI triage accuracy and offline resilience!!

After getting obsessed with how siloed building safety systems are, I wanted to 1) reproduce a real "aha!" moment where a Rs.325 microcontroller + Gemini actually reasons about an emergency, and 2) have something deployable enough that a mid-market hotel could actually afford it. UnifyOS is the result: an AI-powered IoT platform integrating ESP32 sensor nodes with a Node.js/WebSocket backend and a Gemini triage engine, delivering real-time multilingual emergency alerts and bringing mean emergency response down to about 45 seconds in drills.

You can check out the code and paper over here: github.com/sadiapeerzada/unifyos

the build

The recipe was mostly:

Hardware layer first. Each node is an ESP32-DevKit V1 (dual-core Xtensa LX6, WiFi/BT integrated, ~Rs.325) wired to five sensors: MQ-2 for smoke/gas, DHT22 for temp/humidity, HC-SR501 PIR for occupancy, HC-SR04 ultrasonic for crowd density estimation, and an INMP441 MEMS microphone for audio anomaly detection (fire alarm tones, glass breakage, crowd distress). On top of that: a red panic button on a GPIO pin for staff-initiated escalation, an OLED for local status, and an 18650 Li-ion cell for 8–10 hours of autonomous battery operation. Total BOM: Rs.1,254 per node.

Backend + Gemini triage.
Node.js/Express with Socket.io, pushing sensor JSON every 500ms. When a threshold is breached, the backend constructs a structured Gemini prompt (raw sensor readings, historical baseline stats, venue context, language preferences) and gets back a 1–10 severity score, an anomaly classification (fire/intrusion/medical/crowd crush/environmental), and a natural-language action recommendation. Google Translate API runs in parallel to deliver the alert in the guest's native language. Firebase handles realtime state, plus Cloud Messaging for push notifications to phones. End-to-end alert latency: under 200ms.

Offline-first from day one. Hotels have unreliable WiFi. When connectivity drops, the ESP32 writes to SPIFFS flash in a circular buffer: 72 hours of readings, zero data loss. On reconnection, everything syncs to Firebase within 8 seconds with original timestamps preserved. Firmware is C++ with Arduino/PlatformIO + FreeRTOS. Each sensor cycle completes in under 50ms.

Trained on 1 H100... okay, deployed on one USB-C cable and two screws.

some tips for building IoT + AI triage systems / things I learned

WebSockets over polling, always. The under-200ms end-to-end latency is only possible because of Socket.io persistent connections. HTTP polling would've added seconds, which in an emergency is the difference between the alert meaning something and meaning nothing.

Offline-first is non-negotiable for safety-critical systems. The moment you assume connectivity, you've built something that fails exactly when it matters most. SPIFFS circular buffering saved this completely: 72 hours, zero data loss across every outage simulation trial.

Gemini confidence scores belong on the dashboard. Surfacing the 0–100% confidence score gives responders a calibrated uncertainty signal: escalate immediately vs. go check. Don't hide it in the backend. It's probably the most useful single number on the UI.

Multilingual is a first-class feature, not a setting.
78% of fire-related hotel fatalities in India between 2010–2023 had delayed staff notification as a contributing factor. Running Google Translate in parallel adds basically zero latency, validated across Hindi, Arabic, Mandarin, French, and Spanish. There's no excuse for this being an add-on.

A few well-chosen sensors, properly fused, beat more sensors. Five sensors + audio RMS computed on-device give you fire, intrusion, medical, crowd crush, and environmental classification. Adding more sensors before nailing the AI triage layer is just noise accumulation.

A model that gets context is critical!!! Gemini isn't just doing threshold checks, it's reasoning about combinations. Elevated temperature + elevated gas + crowd distress audio + panic button is a very different situation from just elevated temperature. The prompt construction matters enormously here.

interesting results

Gemini triage pretty much saturated on Critical events (~98.9% recall), which is exactly what you want from a life-safety system.

[chart: triage accuracy comparison]

The response time improvement held consistently across all 30 controlled drill scenarios (boutique hotel, convention centre, airport lounge). Mean 45.2s (σ=4.1s) vs 210s for conventional radio, which is roughly a 4.6× improvement (210 / 45.2 ≈ 4.6). The σ tightening is actually what I'm most proud of. Reliable fast is better than occasionally fast.

The layer entropy equivalent here is the triage confidence distribution: Gemini's confidence scores became better calibrated as the prompt context got richer. Really interesting, because it suggests that good prompt engineering around sensor context is doing the same job as more training compute.

[chart: layer entropies]

Was able to reproduce the "completions get longer as the model learns" equivalent: Gemini's response recommendations got progressively more specific and action-oriented as the structured prompt evolved.
Even without explicitly rewarding specificity, richer outputs correlated with higher accuracy scores. The model was learning to think out loud.

[chart: completion lengths over time]

The loss curve for response latency isn't perfectly smooth, but you see clear drop-offs whenever SPIFFS sync resolved a backlog, equivalent to a new reference model being chosen. There's probably a lot of interesting backpressure / queue management optimization to be done here.

[chart: loss over time]

The cost delta is almost embarrassing. Rs.325 for the ESP32 vs Rs.3,500+ for an enterprise controller. Rs.365 for the full sensor suite vs Rs.12,000+. Cloud hosting on the free tier vs Rs.60,000+ annually. Multilingual AI alerts included vs Rs.30,000+ as a separate add-on. Total per node: Rs.1,254 vs over Rs.1,25,000, a greater than 99× difference.

what's next

Current evaluation is all controlled drills, so field validation in operational hotels is the real next step. A few interesting things I want to explore: ESP-NOW mesh networking for WiFi-down resilience (not just SPIFFS buffering), ESP32-CAM for proper occupancy sensing (the HC-SR04 can't reliably distinguish humans from furniture, which is an obvious problem), full integration with India's 112 emergency dispatch API, and scaling beyond 50 concurrent nodes with Cloud Pub/Sub.

The longer-term thing I'm most excited about: a fully autonomous Gemini agent that coordinates the complete emergency response lifecycle without human intervention. Not just triage and alerting, but dynamically routing evacuation, adjusting severity thresholds based on live occupancy, and coordinating with external responders. The building should be smarter than the incident.

Source code, firmware, and hardware schematics releasing under MIT licence soon. Open to any awesome collaborators for field deployments!
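
For anyone curious what the "structured Gemini prompt" step from the build section might look like, here's a minimal TypeScript sketch. Every name in it (`SensorReading`, `Baseline`, `buildTriagePrompt`, `parseTriage`, the JSON field names) is hypothetical and not taken from the UnifyOS repo. The actual Gemini call is deliberately left out. It only shows how the prompt could be assembled and how the reply could be validated before it reaches a dashboard.

```typescript
// Hypothetical shapes: field names are illustrative assumptions, not the project's schema.
interface SensorReading {
  nodeId: string;
  timestampMs: number;
  smokePpm: number;      // MQ-2
  temperatureC: number;  // DHT22
  humidityPct: number;   // DHT22
  motion: boolean;       // HC-SR501 PIR
  distanceCm: number;    // HC-SR04
  audioRms: number;      // INMP441, RMS computed on-device
  panicPressed: boolean; // GPIO panic button
}

interface Baseline {
  smokePpm: number;
  temperatureC: number;
  audioRms: number;
}

// The five classes named in the post.
const CATEGORIES = ["fire", "intrusion", "medical", "crowd crush", "environmental"] as const;
type Category = (typeof CATEGORIES)[number];

interface TriageResult {
  severity: number;      // 1-10
  category: Category;
  confidencePct: number; // 0-100
  recommendation: string;
}

// Assemble one prompt from readings, baseline, venue context, and language preferences.
function buildTriagePrompt(
  reading: SensorReading,
  baseline: Baseline,
  venue: string,
  languages: string[],
): string {
  return [
    "You are an emergency triage assistant for a hotel safety system.",
    `Venue: ${venue}`,
    `Guest languages: ${languages.join(", ")}`,
    `Current reading: ${JSON.stringify(reading)}`,
    `Historical baseline: ${JSON.stringify(baseline)}`,
    "Reply with JSON only, using the keys severity (integer 1-10), " +
      `category (one of: ${CATEGORIES.join(" | ")}), ` +
      "confidencePct (0-100), and recommendation (string).",
  ].join("\n");
}

// Validate the model's reply; return null so the caller can fall back to a
// plain threshold alert (that fallback is an assumption, not described in the post).
function parseTriage(raw: string): TriageResult | null {
  let data: any;
  try {
    data = JSON.parse(raw);
  } catch {
    return null; // reply was not JSON at all
  }
  if (typeof data !== "object" || data === null) return null;
  const { severity, category, confidencePct, recommendation } = data;
  const valid =
    Number.isInteger(severity) && severity >= 1 && severity <= 10 &&
    (CATEGORIES as readonly string[]).includes(category) &&
    typeof confidencePct === "number" && confidencePct >= 0 && confidencePct <= 100 &&
    typeof recommendation === "string" && recommendation.trim().length > 0;
  return valid ? { severity, category, confidencePct, recommendation } : null;
}
```

The design choice worth noting is the strict parse step: in a life-safety path, an out-of-range severity or an unknown category is treated as "no answer" rather than shown to responders.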

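The offline-first behaviour from the build section (circular buffer while WiFi is down, then a sync with original timestamps) can be modelled in a few lines. This is a TypeScript sketch of the logic only. The real firmware is C++ writing to SPIFFS, and `RingBuffer`, `StoredReading`, and `slotsFor` are hypothetical names. It also assumes the node keeps logging at the same 500ms cadence while offline, which the post doesn't state.

```typescript
// Hypothetical record shape: the timestamp is captured when the reading is taken,
// so it survives the outage unchanged.
interface StoredReading {
  timestampMs: number;
  payload: string;
}

// Fixed-capacity circular buffer: once full, each push overwrites the oldest entry.
class RingBuffer<T> {
  private readonly capacity: number;
  private readonly slots: (T | undefined)[];
  private head = 0;  // index of the oldest stored item
  private count = 0; // number of items currently stored

  constructor(capacity: number) {
    if (!Number.isInteger(capacity) || capacity <= 0) {
      throw new RangeError("capacity must be a positive integer");
    }
    this.capacity = capacity;
    this.slots = new Array<T | undefined>(capacity);
  }

  push(item: T): void {
    const tail = (this.head + this.count) % this.capacity;
    this.slots[tail] = item;
    if (this.count < this.capacity) {
      this.count++;
    } else {
      // Buffer was full: the write landed on the oldest slot, so advance head.
      this.head = (this.head + 1) % this.capacity;
    }
  }

  // Return everything oldest-first and empty the buffer (the "sync on reconnect" step).
  drain(): T[] {
    const out: T[] = [];
    for (let i = 0; i < this.count; i++) {
      out.push(this.slots[(this.head + i) % this.capacity] as T);
    }
    this.head = 0;
    this.count = 0;
    return out;
  }

  get size(): number {
    return this.count;
  }
}

// How many slots a retention window needs at a given sampling interval.
function slotsFor(hours: number, intervalMs: number): number {
  return Math.ceil((hours * 3600 * 1000) / intervalMs);
}
```

Under those assumptions, `slotsFor(72, 500)` works out to 518,400 slots for the 72-hour window. One caveat the sketch makes visible: a circular buffer only gives zero data loss for outages shorter than its window, because after that the oldest readings are overwritten.
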
Client

Self-initiated / Google Solution Challenge

Year

2026

Project type

IoT · Research

Credits

Alisha Hasan, Asna Mirza, Sadia Peerzada, Sadia Zafreen
