A guide to data centre maintenance

25 October 2021 By Bailey Walker - Maintenance Support Manager at 2bm

Whether you’re operating a private or colocation facility, taking the appropriate steps to avoid a data centre outage should be amongst the highest priorities for all data centre managers – with maintenance at the very top of the list.

Over the next few months, we aim to publish a series of blogs that form a comprehensive guide to ‘best practice’ in data centre and comms room maintenance. Much of the emphasis tends to fall on the technical challenges of maintaining continuous service, and substantial research by the Uptime Institute has identified the top causes of data centre downtime:

 

[Pie chart: the top causes of data centre downtime, as identified by the Uptime Institute]

 

By understanding the key areas of risk, a comprehensive maintenance schedule can be drawn up and implemented to help reduce the likelihood of a critical failure. Through our series of blogs, we intend to highlight the areas of concern and show how a maintenance contract can help to mitigate the risk of future failures.

What can happen?

There is so much that could happen. For example, a recent outage at the content delivery network Fastly resulted in total system failure, taking down thousands of company websites across multiple countries for almost one hour, including household names such as Amazon, Twitter and Spotify.

Some failures, however, can be measured by pure financial impact. One outage affected over two million bank customers and resulted in the bank paying over £370 million in compensation. Another, at a major airline, left 75,000 passengers stranded and the airline facing a compensation bill of £150 million.

These examples highlight the extreme impacts that can occur, with almost every outage involving a multitude of costs, including:

  • Detection costs
  • Recovery costs
  • Replacement equipment costs
  • User productivity loss
  • Containment costs
  • Ex-post response costs
  • IT productivity loss
  • Third party costs

Choosing the right contract for you!

We recommend that, as an urgent priority, all clients create an asset register for their organisation. This should list serial numbers, positions and room numbers, together with a clear indication of installation dates and any equipment guarantees. From such a list, you can then identify the required SLAs and PPM requirements, so that a bespoke contract can be formulated with comprehensive and reactive options, dependent on budgets and support requirements.
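As a rough illustration of what such an asset register might look like in practice, here is a minimal Python sketch. The field names, the `out_of_warranty` helper and the example entries are all hypothetical and made up for this post; they are not taken from any 2bm system. The sketch assumes one record per piece of equipment and shows how the register could be used to flag items whose guarantee has lapsed, which may help when deciding what a contract needs to cover.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Asset:
    """One row of a hypothetical asset register (field names are illustrative)."""
    serial_number: str
    description: str
    room: str
    position: str
    installed_on: date
    warranty_expires: Optional[date] = None  # None if no guarantee is recorded


def out_of_warranty(assets, today):
    """Return assets whose guarantee has lapsed or was never recorded."""
    return [a for a in assets
            if a.warranty_expires is None or a.warranty_expires < today]


# Made-up example entries, for illustration only
register = [
    Asset("UPS-0001", "UPS system", "Comms Room 1", "Rack A1",
          date(2016, 5, 1), date(2019, 5, 1)),
    Asset("GEN-0001", "Standby generator", "Plant Room", "Bay 2",
          date(2021, 3, 15), date(2024, 3, 15)),
]

# List the equipment that is no longer covered on a given date
for asset in out_of_warranty(register, date(2021, 10, 25)):
    print(asset.serial_number, asset.room, asset.position)
```

In a real organisation the register is more likely to live in a spreadsheet or asset management tool, but the same fields apply.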

Not all contracts are comprehensive enough to cover everything. Contracts often contain exclusions, which is why 2bm recommends that you always study the content carefully. For example, UPS contracts often exclude consumables, batteries and capacitors, whilst fire suppression/VESDA contracts frequently exclude gas bottles and VESDA filters.

Response times and Planned Preventative Maintenance (PPM) visits vary too and can be tailored to suit an individual organisation’s requirements.

Throughout this series of blogs, we will provide a guide to ‘best practice’, with updates containing the relevant information to help you protect your data centre infrastructure, including the following:

  • Generators
  • Clinical Cleaning
  • UPS Systems including batteries
  • LV Panels/electrical installations/power strips
  • Fire suppression / VESDA
  • Monitoring and CDM maintenance planning
  • Cooling

The 2bm maintenance department is always available to provide consultancy as well as technical and physical support for ALL data centre and comms room infrastructure – for more information call 0115 9256000.

 

If you have any questions or would like to know more about how our Data Centres, Maintenance, Cloud Hosting and Security framework can help, please get in touch with the team.
