Incident management

An incident is a disruption to an IT service: an unplanned outage of your application, or a degradation in its performance or functional quality. When a genuine incident is raised for your application, which will inevitably happen, the goal is to restore normal service as quickly as possible and with as little further disruption as possible. The most basic and common remedy is a simple restart of the application or its infrastructure.

The distinction between an event and an incident is probably one of the most common areas of confusion. Not all events are incidents. For convenience, however, many monitoring tools are configured so that every significant event automatically raises an incident. Over time, because of that setup, people come to refer to everything as an incident, even when the service has not been interrupted or degraded.
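
To make the distinction concrete, here is a minimal sketch in Python. It is an illustration only and is not taken from any particular monitoring tool: the Event class, its field names, and both rule functions are hypothetical. It contrasts the convenient rule (every significant event raises an incident) with a stricter rule that raises an incident only when the service is actually disrupted or degraded.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """A monitoring event. All field names here are hypothetical."""
    source: str
    message: str
    significant: bool = False        # noteworthy enough to alert on
    service_disrupted: bool = False  # unplanned outage
    service_degraded: bool = False   # reduced performance or functional quality


def raises_incident_naive(event: Event) -> bool:
    # The convenient setup: every significant event becomes an incident.
    return event.significant


def raises_incident(event: Event) -> bool:
    # Stricter rule: an incident requires real disruption or degradation.
    return event.service_disrupted or event.service_degraded


events = [
    Event("disk-monitor", "disk usage at 80%", significant=True),
    Event("health-check", "application not responding",
          significant=True, service_disrupted=True),
    Event("latency-probe", "response times doubled",
          significant=True, service_degraded=True),
    Event("deploy", "configuration reloaded"),
]

naive_count = sum(raises_incident_naive(e) for e in events)
strict_count = sum(raises_incident(e) for e in events)
print(f"naive rule: {naive_count} incidents, strict rule: {strict_count} incidents")
# prints "naive rule: 3 incidents, strict rule: 2 incidents"
```

In this sketch the disk usage warning is a significant event but not an incident, because the service is still running normally. Under the naive rule it would be counted as one anyway, which is how the two terms come to be used interchangeably.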