Nicky has extensive experience delivering large-scale cloud-native architectures, previously at the Financial Times and now at Skyscanner. She passionately promotes operability as a first-class concern in developing these large distributed systems. She now works on the data platform at Skyscanner, where the huge scale means a whole different set of problems to solve while still striving to be operable, cost-effective and maintainable.
As organisations look to empower engineers more and embrace DevOps practices, we have seen the support role change quite a bit too. Developers are moving from being purely third-line support to working more collaboratively with engineers and operational staff. Also, as we move to cloud-native microservice solutions, the increased complexity and diversity of our production landscape means operational staff may well rely more heavily on the engineers, in particular out of hours.
I have spent the last 18 years working across a plethora of industries, using a myriad of technologies and approaches. Having worked on everything from trading applications to content enrichment APIs, I have seen many approaches and processes that try to minimise operational support for developers.
In this talk, I will explore some of my top approaches and techniques to help reduce the risk of that dreaded 3am call! You will gain some practical insight into how to handle failure in today’s more complex distributed microservice systems. This will include looking at approaches to resiliency, understanding your system, understanding the requirements for fault tolerance, and the developer mindset all of this requires. I will be peppering this talk with real-world examples, and an occasional war story along the way too.