3:17 AM - an SMS just woke you up. You stare at Nagios, half asleep, and panic: production is down! You've got 20 minutes to do... something.
This talk will cover strategies for handling crises with minimum pain and maximum results, and will touch on related subjects such as disaster planning, fault tolerance, SLAs, postmortems, and production debugging.
An ops veteran and a survivor of many prod skirmishes. Currently masquerading as the CTO of Fewbytes - a consulting company for ops and architecture.
Ops, Devs and Biz (bizdev, product, VPs) - anybody who cares about production and uptime.