
ITIL - Availability formula for Service Level Agreements (SLAs) [5 comments]

Establishing effective Service Level Management (SLM) requires Service Level Agreements (SLAs) to be defined and agreed with the Business or Customer community. The contents of the SLA should be clear, concise, objective and measurable. Once the SLA has been agreed, the next stage is to report against it.

To obtain the necessary buy-in and confidence from the Business / Customers, the report needs to accurately reflect the availability of the relevant services provided and agreed to in the Service Catalogue. The Service Catalogue may be attached to the SLA as an Appendix.

The formulas below offer one approach to establishing the availability of a service. The first formula calculates the availability of a service as a whole. Depending on the level of detail contained within the Service Catalogue, together with the information collected when an Incident occurs (Incident Management), the second formula provides the opportunity to determine the availability for a Customer / Business group, region or location:

Availability (%) = ((Agreed Hours - (Incident(s) x Duration)) / Agreed Hours) x 100

The above formula provides a means whereby the availability of a service can be calculated for a specified period.
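As an illustration only, the first formula could be sketched in Python as follows. The function name `service_availability` and the sample figures are assumptions made for this example; "Incident(s) x Duration" is read here as the total downtime, in hours, of all incidents in the period.

```python
def service_availability(agreed_hours, incident_durations):
    """Return the availability of a service as a percentage.

    agreed_hours: hours the service was agreed to be available in the period.
    incident_durations: downtime of each incident in the period, in hours.
    """
    if agreed_hours <= 0:
        raise ValueError("agreed_hours must be positive")
    # "Incident(s) x Duration" is interpreted as the summed incident downtime.
    downtime = sum(incident_durations)
    return (agreed_hours - downtime) * 100 / agreed_hours


# Hypothetical period: 200 agreed hours, two incidents of 1.5 and 2.5 hours.
print(service_availability(200, [1.5, 2.5]))  # 98.0
```

With no incidents in the period the function returns 100.0, which matches the intent of the formula.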

The second formula provides a means of calculating the availability for a specific group, region or location where the service is delivered. It is important to remember that its accuracy is dependent, firstly, upon the Incident being reported to IT in the first place and, secondly, upon the data captured whilst recording the Incident:

Availability (%) = ((Agreed Hours x No. of Users - sum of (Incident x Duration x % of Users Affected)) / (Agreed Hours x No. of Users)) x 100
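A minimal Python sketch of the second formula is shown below. It rests on one assumption that the formula does not state explicitly: the percentage of users affected is applied to the number of users in the group, so that both the agreed capacity and the lost time are expressed in user-hours. The function name `group_availability` and the sample figures are hypothetical.

```python
def group_availability(agreed_hours, users, incidents):
    """Return availability (%) for a group, region or location.

    agreed_hours: agreed service hours in the reporting period.
    users: number of users in the group.
    incidents: iterable of (duration_hours, fraction_of_users_affected),
        where the fraction is between 0.0 and 1.0.
    """
    if agreed_hours <= 0 or users <= 0:
        raise ValueError("agreed_hours and users must be positive")
    capacity = agreed_hours * users  # total agreed user-hours
    # Assumption: fraction affected x users = number of users affected.
    lost = sum(duration * fraction * users for duration, fraction in incidents)
    return (capacity - lost) * 100 / capacity


# Hypothetical period: 160 agreed hours, 50 users, one 2-hour incident
# affecting everyone and one 4-hour incident affecting a quarter of users.
print(group_availability(160, 50, [(2.0, 1.0), (4.0, 0.25)]))  # 98.125
```

Because the calculation is driven by duration and share of users affected, the result is the same whether an outage was logged as one incident or as several, provided the impact is recorded once.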

Hopefully the above formulas will assist you in constructing your Service Level Agreements. If you have different formulas or approaches to determining the availability of your services, why not share them with ITILnews.com and its community?

5 VISITOR COMMENTS

2011-07-25 by "kapoor.anshul36"

I have a question here. Why do we need the count of incidents to calculate the availability? For example, if 10 users work on the same application and all are affected at the same time, it is quite possible that they raise only 1 incident, but that incident would cover 10 users. If each one had called separately, it would have resulted in 10 incidents. In both scenarios the issue and the impact remain the same.

Reply on 2011-08-01

Many thanks for your comment.

The formula can be modified or customized to suit a particular organization or circumstances. Regarding your example of incident reporting you are quite correct in the fact that the 'impact' of the incident remains the same. The important aspect here is that the 'impact' of the incident is captured.

In some organizations I have established 'key service contacts' who act as a focal point of communication between IT and, in this case, the internal customer community. When incidents affecting multiple customers have occurred, the key service contact has reported to the Service Desk the impact of the incident upon those using the particular service, exactly as per your example.

At the end of the month or reporting period the incident is reviewed and the 'impact' of the incident inputted into the formula, enabling the availability figure for the service to be determined.

We hope this helps.

2012-02-29 by "msrajesh12"

How many people would be required in a general shift if there are 1000 calls per day? Shift timings are from 9am to 6pm and average handling time is 15 minutes.

2012-10-30 by "olivier.lavaux"

How do you consider the planned downtime (i.e. maintenance)?

Is it considered as an incident?

Are the Agreed Hours adjusted for the planned downtime?

Reply on 2012-11-15

Planned Maintenance windows / slots / periods are normally agreed with the business in advance and often captured in a Service Level Agreement and underpinned by Operational Level Agreements.

Planned Maintenance is always captured as Requests For Change, which are approved by the appropriate stakeholders both in the Business and in IT. Even though the slots may have been agreed in advance, they still need to be reviewed in light of current events. An example could be that a bank has, for whatever reason, suffered a major incident that has impacted its customer community. Although the service may have been recovered, the business may deem it too risky to apply a Maintenance Change, which could adversely affect the same customer community. In such cases the planned maintenance would be re-scheduled for a later date and once again reviewed at that time.

Planned maintenance should not count against availability in the formula, because it is not unplanned downtime.
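One way this might look in practice is sketched below in Python. The `Outage` record, its `planned` flag and the function name are hypothetical, and the sketch assumes the Agreed Hours are left unchanged while planned outages are simply not subtracted; an organization could equally choose to reduce the Agreed Hours by the maintenance window instead.

```python
from dataclasses import dataclass


@dataclass
class Outage:
    hours: float   # length of the outage in hours
    planned: bool  # True for an agreed maintenance window (assumed flag)


def availability_excluding_planned(agreed_hours, outages):
    """Return availability (%) counting only unplanned outages as downtime."""
    if agreed_hours <= 0:
        raise ValueError("agreed_hours must be positive")
    unplanned = sum(o.hours for o in outages if not o.planned)
    return (agreed_hours - unplanned) * 100 / agreed_hours


# Hypothetical month: a 3-hour agreed maintenance window and a 2-hour incident.
outages = [Outage(3.0, planned=True), Outage(2.0, planned=False)]
print(availability_excluding_planned(200, outages))  # 99.0
```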

2014-08-06 by "brandcl1"

When determining Availability, what impact, if any, do routine maintenance and emergency changes have on the measurement? In other words, what's best practice in this regard? :-)

Reply on 2014-08-08

In answer to your questions: provided routine maintenance has been specified in your Service Level Agreement, together with an agreed notification period, such outages should be excluded from the Availability measurement. In addition, with my ITIL best practice hat on, I would also expect or encourage that even routine maintenance outages follow the Change Management process, so that the 'Change' can be assessed against all other changes taking place and against the demands of your user or customer base at that time. Using your second question as an example: if you are implementing, or have implemented, an emergency change that has affected the availability of a service to its customers, and you have a 'routine maintenance' change planned around the same time, you should consider re-scheduling it to a later date if possible, with the agreement of the customer/business representative.

As for emergency changes, you need to consider each one on a case-by-case basis as to whether or not the customer will accept that it is excluded from the availability measurement. Common sense should hopefully prevail. If you have a potential security risk that involves taking infrastructure or components offline, perhaps to apply an emergency patch or an anti-virus update, it would hopefully be considered on its merits of protecting the business/customers. If an emergency change was expected not to impact service availability but did so when implemented, it may potentially have an impact on the availability measurement.

2016-02-24 by "novastar55"

Would downtime be calculated as the time from outage -> agreement for a solution (adding in the solution -> actual fix ONLY if the date is missed)?

Or...

Downtime would be calculated as the time from outage -> actual fix? And how would this relate to availability?

Asking for a friend...

Reply on 2016-02-26

Thank you for your question and hopefully we have understood it correctly.

Regarding the downtime: if the Supplier is expected to be monitoring the Service, then one could potentially expect to measure the unavailability from the first moment it occurred. Alternatively, if the Supplier is not tasked with monitoring and is simply expected to react to incidents when they are assigned to it, then the unavailability can be measured from the point the Supplier is made aware of the outage. I would expect the Service Desk not only to assign the incident to the Supplier Support Group but also to telephone the Support Group to inform them of the assigned incident.
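The two starting points described above could be sketched in Python as follows. The function name, its parameters and the sample timestamps are assumptions for illustration; the resulting duration is what would be fed into the availability formula.

```python
from datetime import datetime


def downtime_hours(occurred_at, supplier_notified_at, resolved_at, supplier_monitors):
    """Return the downtime in hours attributed to the Supplier.

    If the Supplier monitors the service, the clock starts when the outage
    occurred; otherwise it starts when the Supplier was made aware of it.
    """
    start = occurred_at if supplier_monitors else supplier_notified_at
    return (resolved_at - start).total_seconds() / 3600


# Hypothetical outage: began 09:00, Supplier informed 09:30, fixed 11:00.
occurred = datetime(2016, 2, 24, 9, 0)
notified = datetime(2016, 2, 24, 9, 30)
resolved = datetime(2016, 2, 24, 11, 0)
print(downtime_hours(occurred, notified, resolved, supplier_monitors=True))   # 2.0
print(downtime_hours(occurred, notified, resolved, supplier_monitors=False))  # 1.5
```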

Regarding the last question, I would suggest that the Incident Management process stipulates response and target resolution times for the various Incident severities or priorities. I have on several occasions stipulated a response time of 15 minutes for the highest Severity / Priority incidents, with a target resolution time of 120 minutes. The 'Response time' means that the Support Group goes back to the Service Desk to acknowledge that it has received the incident and to give the initial plan of action, which in turn is relayed to the relevant Key Service Contacts, Management and Stakeholders, together with when the next update can be expected.
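As a rough sketch, targets of this kind could be checked in Python as shown below. The 15 and 120 minute values are the ones quoted above; the function name, the sample timestamps and the assumption that both clocks start when the incident is assigned to the Support Group are illustrative only.

```python
from datetime import datetime, timedelta

# Targets quoted above for the highest Severity / Priority incidents.
RESPONSE_TARGET = timedelta(minutes=15)
RESOLUTION_TARGET = timedelta(minutes=120)


def check_targets(assigned_at, responded_at, resolved_at):
    """Report whether the response and resolution targets were met.

    Assumption: both targets are measured from the moment of assignment.
    """
    return {
        "response_met": responded_at - assigned_at <= RESPONSE_TARGET,
        "resolution_met": resolved_at - assigned_at <= RESOLUTION_TARGET,
    }


# Hypothetical incident: assigned 10:00, acknowledged 10:10, resolved 12:30.
result = check_targets(
    datetime(2016, 2, 24, 10, 0),
    datetime(2016, 2, 24, 10, 10),
    datetime(2016, 2, 24, 12, 30),
)
print(result)  # {'response_met': True, 'resolution_met': False}
```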

A pragmatic approach is advised regarding availability and penalizing the Supplier. For example, if the Supplier has taken ownership of the incident, committed resources, provides regular informative updates and is striving to resolve the incident as quickly as possible, then the organization must consider what value and benefit there is in lambasting the Supplier and applying financial penalties / service credits. It can be extremely demoralising for the Supplier and its workforce; furthermore, it could sour relationships between the parties.

We hope this response helps.
