As more brands pivot to selling online and embrace digital transformation, they become reliant on cloud providers and software-as-a-service (SaaS) companies to get their work done.
This gives them some independence, but also makes them more vulnerable to problems like outages and other uptime incidents. In 2020, server outages cost up to a staggering $400,000 per hour of downtime globally. And when your customers are interacting with you online, an outage can make a world of difference.
And server incidents are becoming a recurring news story. Last year, Fastly’s brief server outage took out Reddit, The New York Times, and even the UK government’s website. Not to mention the infamous Facebook, Instagram, and WhatsApp outage when we all realized just how much time we really spend online.
Futures markets took a brief dip in the minutes after the outage, hinting at how much of the world economy is now threatened by these incidents. It’s troubling, then, that we saw a 142% increase in server outages in 2021 alone.
Among other things, this shows how important it is to have an incident communication plan in place. This is a plan for how you’ll let your customers know what the problem is and what they can expect in the meantime. Done well, incident communication can be an asset to your company and help you prepare for the worst that *hopefully* never comes.
The service recovery paradox is a well-known phenomenon in which customers think more highly of you after you recover well from a failure than they would have if you’d simply provided the service flawlessly.
Let’s go over seven best practices for incident communication, some of which you can start implementing now, so that when the time comes, you’ll be ready to recover in the eyes of your customers.
1. Define severity
The first step when an incident occurs is defining the severity of the situation. You might want to group the possible incidents into different classes of severity. Are your product images not loading or is the entire ecommerce store down?
This will help the IT team determine the best course of action in a high-pressure situation. The Uptime Institute has its own classification system you can use as a start. This covers everything from unnoticeable “Category 1” outages to total catastrophes.
If you have unexpected downtime lasting more than a couple of minutes, it’s time to use one of your pre-written action plans. After all, you don’t want your customers heading to your competitors’ site out of frustration.
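As a rough illustration, severity classes can be written down in code so that everyone applies the same definitions. The sketch below is only an assumption about how a small store might do it: the three class names, the two health signals, and the two-minute threshold are all hypothetical, and they are not the Uptime Institute’s categories.

```python
from enum import IntEnum


class Severity(IntEnum):
    """Hypothetical severity classes, from least to most serious."""
    MINOR = 1     # e.g. product images not loading
    MAJOR = 2     # e.g. checkout is failing
    CRITICAL = 3  # e.g. the entire store is down


# Assumed stand-in for "more than a couple of minutes" of downtime.
ACTION_PLAN_THRESHOLD_MINUTES = 2


def classify(store_reachable: bool, checkout_working: bool) -> Severity:
    """Map two simple health signals to a severity class during an incident."""
    if not store_reachable:
        return Severity.CRITICAL
    if not checkout_working:
        return Severity.MAJOR
    return Severity.MINOR


def needs_action_plan(downtime_minutes: float) -> bool:
    """Return True once downtime exceeds the assumed threshold."""
    return downtime_minutes > ACTION_PLAN_THRESHOLD_MINUTES
```

Your own classes will likely need more signals than these two, but agreeing on even a simple mapping ahead of time saves debate during the incident.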
2. Use an information checklist and action plan
Incident communication should begin long before there’s even an incident. You know what they say–boy scouts and ecommerce store owners are always prepared!
Information checklists and an action plan should be drawn up and shared with the team so they can refer to them when something unexpected occurs. An information checklist should cover all of the questions that need to be answered in an incident. This should include information like:
- What time did the incident start?
- How long do you think this could last?
- Are all of your customers affected, or is it just a few regions?
- Are there any temporary solutions or alternatives we can direct customers to?
Employees in different parts of the company could gather this information over a text channel or in a face-to-face live chat, whichever is quickest. Once your team has gathered the information they need, it’s time to move on to your action plan.
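One way to keep the checklist honest is to store it as a small record that can report which questions are still unanswered. This is a minimal sketch, assuming the four questions above are the whole checklist; the class name and field names are made up for illustration.

```python
from dataclasses import dataclass, fields
from datetime import datetime
from typing import List, Optional


@dataclass
class IncidentChecklist:
    """Hypothetical record of the checklist answers; None means 'not answered yet'."""
    started_at: Optional[datetime] = None
    expected_duration_minutes: Optional[int] = None
    affected_regions: Optional[List[str]] = None  # e.g. ["EU"] or ["all"]
    workaround: Optional[str] = None              # use "none available" if there isn't one

    def missing(self) -> List[str]:
        """Names of the questions nobody has answered yet."""
        return [f.name for f in fields(self) if getattr(self, f.name) is None]

    def is_complete(self) -> bool:
        """True when every question has an answer."""
        return not self.missing()
```

Calling `missing()` on a half-filled checklist gives the team a short list of what still needs to be found out before moving on.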
Make sure everyone in the company knows where to find your action plan. If necessary, arrange it like a flowchart with different paths based on different findings. Depending on the incident, it could be anyone from web designers and engineers to customer service team members who are communicating with customers about it, so it’s important that everyone knows what happens in an outage situation.
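If a flowchart feels hard to keep up to date, the same branching can be sketched as a short function. The two questions and the steps below are assumptions chosen for illustration, not a recommended plan; your own paths will depend on your store and your customers.

```python
from typing import List


def choose_path(entire_store_down: bool, all_customers_affected: bool) -> List[str]:
    """Walk a tiny, hypothetical flowchart and return the steps to take, in order."""
    steps = ["update the status page"]  # assumed first step on every path
    if entire_store_down:
        steps.append("post on social media")
        steps.append("email customers")
    elif all_customers_affected:
        steps.append("post on social media")
    else:
        steps.append("notify only the affected regions")
    return steps
```

Writing the plan this way makes each path explicit, so whoever is on duty follows the same route for the same findings.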
3. Communicate the scope of the incident
Your incident communication messages should begin with the scope of the outage. Describing outage issues can get abstract quickly, so focus your communications on the customer impact. Which specific features are down? Are they down for everyone, or is it just customers in the EU?
Incident communication demands quick action. There shouldn’t be endless debating in conference rooms. You should have a communications lead selected ahead of time: one person who can make snap decisions about what goes out to customers and when.
You can have as many people draft your message templates as you like. But in the moment, there needs to be one person choosing which path in the action plan you go down.
Which channels you use for incident communication should be specified in the action plan. The most appropriate channel will vary based on the incident. If there’s noticeable downtime for regular customers, you should use your targeted email marketing channels and update people on social media depending on where your customers engage with you.
Whether your incident communications come from the engineering team or customer service will depend on your brand. But in any case, your marketing team knows the brand voice and how to build rapport with loyalty program members. Have them look over your incident communication templates and add your brand personality to them while keeping it professional.
4. Take responsibility
Even if the fault lies with a third party you rely on, such as an app you use, your company is the one responsible for your customers’ experience. After all, your customers don’t care if Shopify or BigCommerce’s servers are down. They’ll care that they can’t buy your new product line.
In all your incident communication, take responsibility for the problems your customers are having and apologize. Taking responsibility doesn’t mean blaming yourself. It means putting yourself in charge of fixing the problem for the customers who put their trust in you.
On the other hand, full transparency means describing the issues in detail, which can mean naming the problem service. One tactful way to navigate this is to name the downed service on your status page while referring to it in other communications as “one of our third-party service providers”.
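If your messages are built from templates, this rule can be applied automatically. The sketch below assumes a channel identifier called "status_page" and a made-up provider name; both are placeholders rather than anything from a real tool.

```python
GENERIC_LABEL = "one of our third-party service providers"


def provider_label(provider_name: str, channel: str) -> str:
    """Name the provider on the status page; stay generic everywhere else."""
    if channel == "status_page":  # assumed channel identifier
        return provider_name
    return GENERIC_LABEL


def incident_line(provider_name: str, channel: str) -> str:
    """Build one apologetic sentence suited to the given channel."""
    label = provider_label(provider_name, channel)
    return f"We're sorry. Checkout is unavailable because of an issue with {label}."
```

For example, `incident_line("ExamplePay", "email")` keeps the hypothetical provider’s name out of the email, while the same call with `"status_page"` includes it.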
5. Provide regular updates
Some of your users will be refreshing your site again and again. It’s essential to provide regular updates, especially in a situation that can change so quickly.
Small, regular updates are better than no updates at all, even if it’s just “We’re still working on the issue.” Adding a fresh timestamp to your status page or sending out regular social media posts lets users know that this is the most current information you have, and it keeps you from being seen to provide no updates at all.
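A small helper can stamp each update and remind the team when the next one is due. This is only a sketch: the 30-minute interval is an assumption, not a recommendation from any standard, and the function names are invented for illustration.

```python
from datetime import datetime, timedelta, timezone

# Assumed cadence; pick whatever interval suits your customers.
UPDATE_INTERVAL = timedelta(minutes=30)


def format_update(message: str, now: datetime) -> str:
    """Prefix a status message with a timestamp (expects a UTC datetime)."""
    return f"[{now.strftime('%Y-%m-%d %H:%M UTC')}] {message}"


def update_overdue(last_update: datetime, now: datetime) -> bool:
    """True when at least one full interval has passed since the last update."""
    return now - last_update >= UPDATE_INTERVAL


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    print(format_update("We're still working on the issue.", now))
```

Even when there is nothing new to say, posting the stamped line again shows customers that someone is still watching the problem.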
In a high-pressure situation, people want to hear from whichever source feels trustworthy and relevant. If your site is served across multiple domains, consider serving your incident communication page across those sites so that a user in Australia is getting information from your .au domain.
6. Write to the audience’s technical level
Whatever channel you’re on, incident communications should be simple and to the point. You can’t baffle already-stressed customers with technical detail. However, you might need to give IT managers or third-party tech support companies more detailed information than other users.
If you don’t have a dedicated status page on your website, Twitter is a great platform to keep your customers informed with the latest news. Take a page from Instagram’s book when they updated their customers that the app was back up and running. No technical jargon. No long-winded explanation. Just a quick apology and a turkey gif.
Another way to navigate this is to put out simple messages that link to your website status page, which should be a dedicated channel for this kind of information. See GitHub’s status page for an example of a page that’s clear and well-designed but also has enough detail to keep people in the know.
7. Follow up
Once the incident is resolved, consider writing a post-mortem blog post along the lines of GrooveHQ’s framework. Alternatively, a simple social media post or email could suffice. Either way, it should cover what happened, how you solved it, and what you’re going to do to stop it from happening again.
You could use this to find out how your most important customers were affected by the incident and how you might navigate it better next time. It’ll help you refine your action plans and create a good customer experience in the wake of a problem.
The importance of incident communication
Good incident communication requires planning. But that planning is essential to helping your team confidently navigate a high-pressure situation. Good communication can make the difference between customers who admire your dedication to solving the problem and feel like part of a valued brand community, and customers who start looking around for other providers.
This is a guest post from John Allen, Director of SEO for 8×8.