ITIL OSA Incident Management – Flashcards
question
What is the purpose of Incident Management
answer
1. restore normal service operation as quickly as possible 2. minimize the adverse impact on business operations
question
Name the objectives of Incident Management
answer
-ensure standardized methods and procedures are used -increase visibility and communication of incidents -enhance business perception of IT -align activities to the priorities of the business -maintain user satisfaction with IT service quality
question
incident
answer
-unplanned interruption of an IT service -reduction in the quality of an IT service -failure of a CI that has not yet impacted a service
question
normal service operation
answer
the operational state where services and CIs are performing within their agreed service and operational levels
question
What is considered "in scope" of the incident management process?
answer
-any events that indicate disruption to an IT service -any events that could disrupt an IT service
question
What is considered "out of scope" of the incident management process?
answer
-informational events that indicate normal service operation -service requests
question
Business value of incident management
answer
-reduction of IT and business labor costs related to incidents -improved incident resolution, leading to higher levels of service availability -better alignment of IT and business priorities -increased ability to identify potential service improvements -identification of additional service or training requirements
question
Policies of Incident Management
answer
1. incidents and their status must be timely and effectively communicated 2. incidents must be resolved within agreed and acceptable timeframes 3. Customer satisfaction must be maintained at all times 4. Incident handling should be aligned to the priorities of the business 5. Incidents should be stored and managed in a single system 6. a standard classification scheme should be used for all incidents 7. Incident records should be audited on a regular basis 8. All incident records should follow a standard format 9. Prioritization and categorization should be done according to a common agreed set of criteria
question
timescales of incident management
answer
-must be agreed for all incident handling stages -will differ based on incident priority -must be based on agreed response and resolution targets within SLAs -captured as targets within OLAs and UCs as appropriate -communicated to all support groups -service management tools should be used to automate timescales based on predefined rules
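The idea that timescales differ by priority and can be automated from predefined rules can be sketched as a simple lookup. The priority codes, target durations, and function names below are hypothetical placeholders, not values taken from ITIL or from any SLA.

```python
from datetime import datetime, timedelta

# Hypothetical resolution targets per priority code; real values come from agreed SLAs.
RESOLUTION_TARGETS = {
    1: timedelta(hours=1),
    2: timedelta(hours=8),
    3: timedelta(hours=24),
}

def resolution_due(logged_at: datetime, priority: int) -> datetime:
    """Return the time by which the incident should be resolved."""
    return logged_at + RESOLUTION_TARGETS[priority]

def is_breached(logged_at: datetime, priority: int, now: datetime) -> bool:
    """True when the resolution target for this priority has passed."""
    return now > resolution_due(logged_at, priority)
```

A service management tool would typically run a check like `is_breached` on a schedule and notify the relevant support group.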
question
incident models
answer
-predefined steps to handle known types of incidents in an agreed way -input into incident support tools to support automation -stored in the SKMS
question
service knowledge management system
answer
a set of tools and databases used to manage knowledge, information, and data; includes the CMS (configuration management system) as well as other databases and information systems; includes tools for collecting, storing, managing, updating, analyzing, and presenting all the knowledge, information, and data that an IT service provider needs to manage the full lifecycle of IT services
question
incident models should include
answer
-chronological steps to handle the incident -responsibilities; who should do what -any precautions that need to be taken -timescales and thresholds for completion -escalation procedures -any necessary evidence preservation activities
question
major incident
answer
-highest priority category of impact for an incident; results in significant disruption to the business -must be clearly defined and mapped into incident prioritization scheme
question
major incident procedure
answer
-a separate procedure, with shorter timescales and greater urgency, which must be used for major incidents -establishment of a separate major incident team -led and managed by the incident manager; there is a risk of conflicting priorities when the incident manager is also the service desk manager -involve problem manager as necessary; incident manager ensures focus remains on restoration -service desk is accountable for recording all activities; responsibility for recording may be delegated; users kept up to date
question
incident manager (major incident role)
answer
this role is responsible for leading and managing the organizational response. Care should be taken when the incident manager is also the service desk manager, as there may be conflicting priorities between those two roles during major incidents. As necessary, a separate person may be designated to lead the major incident response team
question
problem manager (major incident role)
answer
this role participates as needed if the cause needs to be investigated at the same time as incident resolution. The incident manager should ensure that priority is placed on incident resolution and that the investigation of cause is kept separate.
question
service desk (major incident role)
answer
this role is responsible for keeping users up to date as the incident progresses through resolution activities. While it is ultimately accountable for keeping the incident record up to date, responsibility for this activity may be delegated as necessary to other support teams
question
incidents should be tracked throughout their lifecycle:
answer
-support proper handling and escalation -facilitate accurate reporting of incident status -capture within the incident management system
question
incident status examples
answer
-open -in progress -resolved -closed
question
Open
answer
an incident has been recognized but not yet assigned to a support resource for resolution
question
in progress
answer
-the incident is in the process of being investigated and resolved
question
resolved
answer
-a resolution has been put in place for the incident but normal state service operation has not yet been validated by the business or end user
question
closed
answer
the user or business has agreed that the incident has been resolved and that normal state operations have been restored
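The four example statuses can be read as a small state machine. The sketch below is one possible reading: the transition table is an assumption, in particular the path from resolved back to in progress for the case where the user does not confirm the fix, which the cards do not describe.

```python
from enum import Enum

class Status(Enum):
    OPEN = "open"
    IN_PROGRESS = "in progress"
    RESOLVED = "resolved"
    CLOSED = "closed"

# Assumed transitions. Moving a resolved incident back to in progress
# (user does not confirm the fix) is an assumption, not from the cards.
ALLOWED = {
    Status.OPEN: {Status.IN_PROGRESS},
    Status.IN_PROGRESS: {Status.RESOLVED},
    Status.RESOLVED: {Status.CLOSED, Status.IN_PROGRESS},
    Status.CLOSED: set(),
}

def advance(current: Status, target: Status) -> Status:
    """Return the new status, or raise if the transition is not allowed."""
    if target not in ALLOWED[current]:
        raise ValueError(f"cannot move from {current.value} to {target.value}")
    return target
```

Rejecting a jump from open straight to closed keeps the record consistent with the rule that closure requires the user or business to agree that service has been restored.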
question
Incident management process activities
answer
1. identification 2. logging 3. categorization 4. prioritization 5. initial diagnosis 6. escalation 7. investigation and diagnosis 8. resolution and recovery 9. closure
question
incident identification
answer
1. event management 2. web interface 3. phone call 4. email
question
incident logging
answer
-all incidents must be fully logged and date and time stamped -a unique incident record must be logged for each unique incident, whether it is raised by the service desk, automated incident creation, submission over the web or e-mail, or any other responsible group -incident records must capture all relevant information: updated as the incident progresses through the lifecycle; full historical record
question
incident categorization
answer
-supports trending and analysis -can change or evolve through the incident lifecycle -multilevel categorization -confirmed at incident close
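As a rough illustration of how multilevel categorization supports trending, categories can be stored as tuples from the top level down and counted at any depth. The category names and the `count_by_level` helper are hypothetical examples.

```python
from collections import Counter

# Hypothetical multilevel categories, written from top level to most specific.
incidents = [
    ("hardware", "server", "memory board"),
    ("hardware", "printer", "paper jam"),
    ("software", "email", "cannot send"),
    ("other",),
]

def count_by_level(categories, depth):
    """Count incidents by the first `depth` category levels."""
    return Counter(cat[:depth] for cat in categories)

top_level = count_by_level(incidents, 1)
```

Here `top_level` groups incidents by the top-level category only, while `count_by_level(incidents, 2)` breaks the same data down one level further.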
question
Defining Categories Approach
answer
1. brainstorm with relevant support groups (service desk supervisor, incident and problem managers) 2. best-guess the top-level categories from a user perspective (include an "other" category) 3. set up relevant tools and trial the categories 4. analyze incidents captured 5. perform a breakdown analysis of each high-level category; define the lower-level categories for each 6. implement the new categories and review after 1 to 3 months; ongoing review; changes could affect incident trending and should be made only when genuinely required
question
incident prioritization
answer
-priority should be based on impact and urgency -some factors contributing to impact: risk to life or limb; number of users or services affected; level of financial loss; effect on business reputation; regulatory or legislative breaches -clear guidance with practical examples should be provided to all staff -priority can be dynamic -occasionally, priority may be overridden
question
impact
answer
relates to the overall impact the incident is having on the business
question
urgency
answer
related to how quickly the business needs a resolution to the incident
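Deriving priority from impact and urgency is often done with a lookup matrix. The sketch below assumes three-level scales and priority codes from 1 (highest) to 5 (lowest). The scale names, the codes, and the `override` parameter are illustrative assumptions, and each organization agrees its own scheme.

```python
from typing import Optional

# Assumed three-level scales and a 1 (highest) to 5 (lowest) priority code.
# Keys are (impact, urgency).
PRIORITY_MATRIX = {
    ("high", "high"): 1,
    ("high", "medium"): 2,
    ("medium", "high"): 2,
    ("high", "low"): 3,
    ("medium", "medium"): 3,
    ("low", "high"): 3,
    ("medium", "low"): 4,
    ("low", "medium"): 4,
    ("low", "low"): 5,
}

def priority(impact: str, urgency: str, override: Optional[int] = None) -> int:
    """Derive priority from impact and urgency; an explicit override wins."""
    if override is not None:
        return override
    return PRIORITY_MATRIX[(impact, urgency)]
```

The `override` argument reflects the point that priority may occasionally be overridden, and recalculating after impact or urgency changes reflects that priority can be dynamic.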
question
initial diagnosis
answer
-typically performed by the service desk -uses diagnostic scripts and known error records -can be potentially closed over the phone with the user -if service desk cannot resolve incident over the phone, but can within an agreed timeframe: give the reference number to the user; inform the user of service desk intentions; escalate as necessary
question
functional escalation
answer
-transferring an incident, problem, or change to a technical team with a higher level of expertise to assist in an escalation -technical escalation -service desk: escalate when it is clear it cannot restore service within agreed timeframes; always owns the incident and responsibility for user communication -can involve internal and external teams -may be done multiple times for an individual incident -rules for escalations must be part of OLAs and UCs
question
hierarchic escalation
answer
-informing or involving more senior levels of management to assist with an escalation -management escalation -relay information, for example, major incidents -make required decisions, for example, resource allocation or incident assignment -settle disagreements
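The two escalation types can be contrasted in a short decision sketch. The 30-minute first-line limit, the function name, and the rule that every unresolved major incident triggers hierarchic escalation are assumptions made for illustration. Real rules are agreed in OLAs and UCs.

```python
from datetime import timedelta
from typing import List

# Hypothetical limit on how long first-line support keeps an incident
# before passing it on; real limits come from OLAs and UCs.
FIRST_LINE_LIMIT = timedelta(minutes=30)

def escalations(elapsed: timedelta, service_restored: bool, is_major: bool) -> List[str]:
    """Return which escalation types apply to an incident right now."""
    needed = []
    if not service_restored and elapsed >= FIRST_LINE_LIMIT:
        needed.append("functional")   # hand over to a more expert team
    if is_major and not service_restored:
        needed.append("hierarchic")   # inform or involve senior management
    return needed
```

Both types can apply at once, and the service desk keeps ownership of the incident and of user communication in either case.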
question
investigation and diagnosis
answer
might include: -establishing what has gone wrong -understanding chronological order of events -confirming full impact of the incident -identifying any events that may have triggered the incident -performing detailed knowledge searches. Coordination is critical when simultaneous activities are occurring. All activities should be fully documented in the incident record.
question
resolution and recovery
answer
solutions applied and tested as they are identified: -asking the user to undertake directed activities on their own desktop or remote equipment -the service desk implementing the resolution either centrally or remotely using software to take control of the user's desktop to diagnose and implement a resolution -specialist support groups being asked to implement specific recovery actions -a third-party supplier or maintainer being asked to resolve the fault. Actions must be coordinated by incident management. Sufficient testing should be performed to validate resolution. The incident is passed back to the service desk for closure.
question
Closure
answer
service desk responsible for incident closure: -user confirmation/acceptance -closure categorization -user satisfaction survey -incident documentation -ongoing or recurring problem? -formal closure. Automated incident closure: -may not be appropriate for VIPs and major incidents -must be discussed, agreed on, and communicated
question
closure categorization
answer
check and confirm that the initial incident categorization was correct or, where the categorization subsequently turned out to be incorrect, update the record so that a correct closure categorization is recorded for the incident - seeking advice or guidance from the resolving group(s) as necessary
question
user satisfaction survey
answer
carry out a user satisfaction callback or email survey for the agreed percentage of incidents
question
incident documentation
answer
chase any outstanding details and ensure that the incident record is fully documented so that a full historic record at a sufficient level of detail is complete
question
ongoing or recurring problem?
answer
determine (in conjunction with resolver groups) whether the incident was resolved without the root cause being identified. In this situation, it is likely that the incident could recur and require further preventive action to avoid this. In all such cases, determine if a problem record related to the incident has already been raised. If not, raise a new problem record in conjunction with the problem management process so that preventive action is initiated
question
formal closure
answer
formally close the incident record
question
triggers of incident management process
answer
-user calls service desk phone -user submits Web-based incident -event management sends automated alerts -technical staff -suppliers
question
inputs of incident management process
answer
-information about CIs and their status -known errors and workarounds -communication about incident symptoms -communication about RFCs and releases -communication of events -operational and service level objectives -customer feedback on incident resolution -agreed criteria for escalating incidents
question
outputs of incident management process
answer
-resolved/updated incidents -updated classifications -raised problem records -validation that incidents have not recurred -feedback on incidents related to changes and releases -identification of related CIs -customer feedback -feedback to event management on monitoring levels -communication of incident and resolution history
question
service design
answer
-service level management -information security management -capacity management -availability management
question
service transition
answer
-service asset and configuration management -change management
question
service operation
answer
-problem management -access management
question
service level management interaction with incident management
answer
-ability to resolve incidents in a specified time is a key part of delivering an agreed level of service -enables SLM to define measurable responses to service disruptions -provides reports that enable SLM to review SLAs objectively and regularly -incident management is able to assist in defining where services are at their weakest so that SLM can define actions as part of the SIP (service improvement plan)
question
SLM defines....
answer
the acceptable levels of service within which incident management works, including: incident response times, impact definitions, target fix times, service definitions, which are mapped to users, rules for requesting services, and expectations for providing feedback to users
question
Information security management
answer
providing security-related incident information as needed to support service design activities and gain a full picture of the effectiveness of the security measures as a whole, based on an insight into all security incidents. This is facilitated by maintaining log and audit files and incident records
question
capacity management
answer
incident management provides a trigger for performance monitoring where there appears to be a performance problem; capacity management may develop workarounds for incidents
question
availability management
answer
will use incident management data to determine the availability of IT services and look at where the incident lifecycle can be improved
question
service asset and configuration management
answer
This process provides the data used to identify and progress incidents. One of the uses of the CMS is to identify faulty equipment and to assess the impact of an incident. The CMS also contains information about which categories of incident should be assigned to which support group. In turn, incident management can maintain the status of the faulty CIs. It can also assist service asset and configuration management to audit the infrastructure when working to resolve an incident
question
change management
answer
where a change is required to implement a workaround or resolution, this will need to be logged as an RFC and progressed through change management. In turn, incident management is able to detect and resolve incidents that arise from failed changes
question
problem management
answer
for some incidents, it will be appropriate to involve problem management to investigate and resolve the underlying cause to prevent or reduce the impact of recurrence. Incident management provides a point where these are reported. Problem management, in return, can provide known errors for faster incident resolution through workarounds that can be used to restore service
question
access management
answer
incidents should be raised when unauthorized access attempts and security breaches have been detected. A history of incidents should also be maintained to support forensic investigation activities and resolution of access breaches
question
Incident management tools provide the following types of information
answer
-incident and problem history -incident categories -action taken to resolve incidents -diagnostic scripts that can help first-line analysts to resolve the incident, or at least gather information that will help second- or third-line analysts resolve it faster
question
the following types of data are contained in the service catalog
answer
-key service delivery objectives, levels and targets -information about the service in terms that the customer and users understand -information that can be used for communication with customers and users
question
incident management should have access to the CMS to be able to identify relationship information such as the following about CIs:
answer
-identification of affected CIs -ability to estimate the scope and impact of the incident
question
resolution information, such as the following, can be found in the known error database (KEDB)
answer
information about workarounds that may be used to potentially restore service for the incident
question
incident records should contain the following types of data
answer
-unique reference number -incident categorization -incident urgency -incident impact -incident prioritization -date and time recorded -name or ID of the person recording the incident -method of notification -name, dept., phone, and location of user -call-back method -description of symptoms -incident status -related CI -support group or person incident is assigned to -related problem or known error -activities undertaken to resolve the incident -resolution date and time -closure category -closure date and time
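The fields listed above map naturally onto a record type. The following is a minimal sketch: the class name, field names, types, and defaults are assumptions chosen for illustration rather than a schema from ITIL or any particular tool.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Optional

@dataclass
class IncidentRecord:
    reference: str                      # unique reference number
    category: str
    urgency: str
    impact: str
    priority: int
    recorded_at: datetime
    recorded_by: str                    # name or ID of the person recording
    notification_method: str
    user_contact: str                   # name, dept., phone, and location of user
    callback_method: str
    symptoms: str
    status: str = "open"
    related_ci: Optional[str] = None
    assigned_to: Optional[str] = None   # support group or person
    related_problem: Optional[str] = None
    activities: List[str] = field(default_factory=list)
    resolved_at: Optional[datetime] = None
    closure_category: Optional[str] = None
    closed_at: Optional[datetime] = None

    def log_activity(self, note: str) -> None:
        """Append to the historical record of actions taken."""
        self.activities.append(note)
```

Fields that are only known later in the lifecycle, such as the resolution and closure details, default to empty so the record can be created at logging time and updated as the incident progresses.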
question
IM CSF: Resolve incidents as quickly as possible, minimizing impacts to the business
answer
KPI: Mean elapsed time to achieve incident resolution or circumvention, broken down by impact code KPI: Breakdown of incidents at each stage KPI: percentage of incidents closed by the service desk without reference to other levels of support KPI: number and percentage of incidents resolved remotely, without the need for a visit KPI: number of incidents resolved without impact to the business
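Two of these KPIs can be computed directly from resolved incident data. The sample rows, the tuple layout, and the function names below are hypothetical and serve only to show the arithmetic.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical sample: (impact code, hours to resolution, closed by first line?)
resolved = [
    ("high", 2.0, False),
    ("high", 4.0, True),
    ("low", 10.0, True),
    ("low", 6.0, True),
]

def mean_resolution_by_impact(rows):
    """Mean elapsed hours to resolution, broken down by impact code."""
    hours = defaultdict(list)
    for impact, elapsed, _ in rows:
        hours[impact].append(elapsed)
    return {impact: mean(values) for impact, values in hours.items()}

def pct_closed_by_service_desk(rows):
    """Percentage of incidents closed without reference to other support levels."""
    return 100.0 * sum(1 for _, _, first_line in rows if first_line) / len(rows)
```

With this sample, the high-impact mean is (2 + 4) / 2 = 3 hours, the low-impact mean is (10 + 6) / 2 = 8 hours, and three of the four incidents (75 percent) were closed by the service desk.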
question
IM CSF: Maintain quality of IT services
answer
KPI: total number of incidents KPI: size of current incident backlog for each IT service KPI: number and percentage of major incidents for each IT service
question
IM CSF: Maintain user satisfaction with IT services
answer
KPI: average user/customer survey score KPI: percentage of satisfaction surveys answered versus total number of surveys sent
question
IM CSF: Increase visibility and communication of incidents to business and IT support staff
answer
KPI: average number of service desk calls or other contacts from business users for incidents already reported KPI: number of business user complaints or issues about the content and quality of incident communications
question
IM CSF: Align incident management activities and priorities with those of the business
answer
KPI: percentage of incidents handled within agreed response time KPI: average cost per incident
question
IM CSF: Ensure that standardized methods and procedures are used for efficient and prompt response, analysis, documentation, ongoing management and reporting of incidents to maintain business confidence in IT capabilities
answer
KPI: number and percentage of incidents incorrectly assigned KPI: number and percentage of incidents incorrectly categorized KPI: number and percentage of incidents processed per service desk agent KPI: number and percentage of incidents related to changes and releases
question
challenges of incident management
answer
-early detection of incidents -convincing staff to log all incidents -information about problems and known errors -integration into the CMS -integration into SLM
question
risks of incident management
answer
-being inundated with incidents -unmanaged backlog of incidents -inadequate information -poorly aligned OLAs and UCs causing mismatched objectives
question
incident management process owner
answer
-carrying out the generic process owner role for incident management -designing incident models and workflows -working with other process owners to ensure proper integration of ITSM processes
question
incident management process manager
answer
-carrying out the generic process manager role for incident management -planning and managing support for incident management tools and processes -coordinating interfaces between incident management and other service management processes -ensuring and monitoring incident management efficiency and effectiveness -producing management information -managing the work of incident management staff -suggesting improvements for incident management -developing and maintaining incident management process, procedures, and systems -managing major incidents
question
first-line analyst
answer
-recording incidents -routing incidents to support specialists as needed -prioritizing, categorizing, and providing initial support for incidents -providing resolution and recovery of incidents not escalated -closing incidents -monitoring the status of incidents -engaging in ongoing communication with users about incident progress -escalating incidents as necessary
question
second-line analysts
answer
more specialized than first-line: -have additional time for diagnosis and resolution -resolve less complicated incidents to allow third-line to focus on the most difficult -advantages to locating them close to first-line support
question
third-line analysts
answer
-specialized level support includes a number of teams, such as: network support; application support; suppliers