Note: Meeting Room 7 will be available as an On-Call Room for attendees.

Thursday, August 31 • 14:30 - 15:00
Reducing MTTR and False Escalations: Event Correlation at LinkedIn

LinkedIn’s production stack is made up of over 900 applications, 2,200 internal APIs, and hundreds of databases. Because any given application has many interconnected pieces, it is difficult to escalate an incident to the right person in a timely manner.


To combat this, LinkedIn built an Event Correlation Engine that monitors service health and maps dependencies between services, so that incidents are escalated to the SREs who own the unhealthy service.
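
The idea of combining service health with a dependency map can be illustrated with a minimal sketch. This is not LinkedIn's implementation, which the abstract does not describe. It only assumes that each service has a health flag, a list of dependencies, and an owner. The names `Service`, `find_root_causes`, and the service and owner aliases are all hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Service:
    name: str
    owner: str                      # hypothetical on-call alias for the owning SREs
    healthy: bool = True
    depends_on: list[str] = field(default_factory=list)


def find_root_causes(alerting: str, services: dict[str, Service]) -> list[Service]:
    """Follow unhealthy dependencies from the alerting service and return the
    unhealthy services that have no unhealthy dependency of their own."""
    roots: list[Service] = []
    seen: set[str] = set()
    stack = [alerting]
    while stack:
        name = stack.pop()
        if name in seen:
            continue
        seen.add(name)
        svc = services[name]
        if svc.healthy:
            continue
        unhealthy_deps = [d for d in svc.depends_on if not services[d].healthy]
        if unhealthy_deps:
            # The problem likely originates further down the dependency chain.
            stack.extend(unhealthy_deps)
        else:
            # Nothing below this service is unhealthy: treat it as a root cause.
            roots.append(svc)
    return roots


# Illustrative data only: three services in a chain, all currently unhealthy.
services = {
    "frontend": Service("frontend", "frontend-sre", False, ["profile-api"]),
    "profile-api": Service("profile-api", "profile-sre", False, ["profile-db"]),
    "profile-db": Service("profile-db", "db-sre", False),
}

owners_to_page = [svc.owner for svc in find_root_causes("frontend", services)]
print(owners_to_page)
```

In this toy example the alert fires on `frontend`, but the escalation goes to the owner of `profile-db`, the deepest unhealthy service in the chain, rather than to the owners of every service that looks unhealthy.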


We’ll discuss the approach we used in building a correlation engine and how it has been used at LinkedIn to reduce incident impact and provide a better quality of life for LinkedIn’s on-call engineers.

Speakers
Michael Kehoe

Staff SRE, LinkedIn
Michael Kehoe is a Staff SRE at LinkedIn who works on building scalable monitoring infrastructure, reliability principles, and incident management. Michael previously interned at NASA Ames on their PhoneSat project. Michael's key interests lie in network engineering and automatio...


Thursday August 31, 2017 14:30 - 15:00 IST
Pembroke Room