Problem Management Best Practices
Understanding and implementing problem management best practices for effective incident resolution.
Problem management is an ITSM process that deals with the underlying causes of multiple incidents. It aims to prevent service interruptions by eliminating these causes. We elaborated on problem management in another article. Resolving one problem can prevent many incidents at once, which reduces the cost of reactive incident management, improves the IT team’s reputation with users, and boosts the team’s morale because they no longer need to cope with repeat issues.
That is why establishing a problem management process in your organization is of utmost importance. We’ve put together this list of problem management recommendations for those who understand the value of problem management but don’t know how to start implementing it. Most apply to companies that are only beginning their problem management practices, but you might find valuable tips even if you already have an established problem management process.
A critical point about problem management is its intersection with incident management. To understand their mutual dependency, think of problem management as a broader, deeper way of working with IT issues. Both problem management and incident management deal with the same thing: an IT service interruption. But incident management handles individual occurrences of these interruptions, while problem management addresses the underlying causes behind them.
Here are some problem management practices for supporting the connection between incident management and problem management:
Although incident management and problem management intersect, remember that they are two independent processes, and both need to be treated that way to work well. When you dedicate effort specifically to problem management, you can rely on it to prevent incidents and eventually deliver the value you expect. In practice, this means that you should:
One of the vital problem management best practices is maintaining a KEDB. KEDB stands for known error database. Known errors, or known issues, are problems that were identified earlier but have not been permanently resolved at the technical level. This might happen because a permanent resolution is too expensive or too complex. Instead, teams find temporary solutions for the problem. Such solutions are called workarounds.
To illustrate what a known error is, let’s look at the Grammarly Support Portal. One of Grammarly’s known errors is that its browser extension doesn’t work on some websites. To the Grammarly team, this is a known issue. On their help portal, they’ve collected some tips on dealing with this problem, such as deactivating other extensions or restarting the browser. If none of this helps, they advise the reader to switch to Grammarly for Windows or Mac. The team can’t resolve the issue permanently, so instead they document workarounds and publish them.
The KEDB contains all known errors and the workarounds for them. It should be designed so that its contents can be accessed and searched quickly. Keyword tagging will help incident owners at the help desk check whether there is a known error matching the symptoms they’re facing.
Once you’ve built the KEDB, both the incident and the problem support teams should consult it whenever they have doubts about the origins of an incident or a problem.
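The keyword tagging described above can be sketched in a few lines of Python. This is only an illustration of the idea, not a recommendation for a particular tool: the `KnownError` and `KEDB` names, the fields, and the sample entries (including the entrance-pass workaround) are assumptions made up for this example.

```python
from dataclasses import dataclass, field


@dataclass
class KnownError:
    """One KEDB entry: a known error, its workaround, and search tags."""
    title: str
    workaround: str
    tags: set[str] = field(default_factory=set)


class KEDB:
    """A tiny in-memory known error database with a keyword (tag) index."""

    def __init__(self) -> None:
        self._entries: list[KnownError] = []
        self._index: dict[str, list[int]] = {}  # tag -> positions in _entries

    def add(self, entry: KnownError) -> None:
        position = len(self._entries)
        self._entries.append(entry)
        # Normalize tags to lowercase so lookups are case-insensitive.
        for tag in {t.lower() for t in entry.tags}:
            self._index.setdefault(tag, []).append(position)

    def search(self, *symptoms: str) -> list[KnownError]:
        """Return entries ranked by how many symptom keywords they match."""
        hits: dict[int, int] = {}
        for symptom in symptoms:
            for position in self._index.get(symptom.lower(), []):
                hits[position] = hits.get(position, 0) + 1
        ranked = sorted(hits, key=lambda p: (-hits[p], p))
        return [self._entries[p] for p in ranked]


# Hypothetical sample entries, for illustration only.
kedb = KEDB()
kedb.add(KnownError(
    title="Browser extension fails on some websites",
    workaround="Deactivate other extensions or restart the browser",
    tags={"browser", "extension"},
))
kedb.add(KnownError(
    title="Entrance pass scanner rejects valid passes",
    workaround="Ask the front desk for a temporary pass",
    tags={"scanner", "entrance"},
))

for match in kedb.search("extension", "browser"):
    print(match.title, "->", match.workaround)
```

A help desk agent would pass in the symptoms from the incident ticket and see any matching known errors first, ranked by the number of matching keywords. A real KEDB would also need persistence and free-text search, which this sketch leaves out.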
It’s hard to say in general which outcome of problem resolution should take priority: a workaround or a permanent solution. In your problem management practices, try to balance the two with your business objectives in mind. Workarounds save time, while permanent resolutions are better for repeat incidents and frequently used IT services. For example, imagine every visitor to your office building hits an error when scanning their entrance pass. In peak hours, this can lead to people waiting in line to get into the office, missed meetings, and tasks not being completed on time. Clearly, a permanent resolution to the error at the entrance is preferable in this case, even if it is complex and expensive.
Before going into detail, let’s make sure we’re on the same page about the terms. Reactive measures respond to a situation once it has occurred, while proactive measures aim to prevent a situation before it happens. In incident and problem management, reactive activities are all activities that deal with issues reported by users. Reactive problem management doesn’t try to prevent issues; instead, it responds to issues that have already been identified. Proactive measures, by contrast, include analyzing incident and problem management data and interviewing users to see the big picture: what’s happening and what problems might arise.
It’s vital to distinguish between these two methods and apply both in your problem management practices.
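As one possible illustration of the proactive side, the sketch below scans an incident log for symptoms that keep recurring on the same service and flags them as candidates for a problem investigation. The `recurring_issues` function, the `service` and `symptom` keys, the threshold of three, and the sample log are all assumptions for this example; real incident data would come from your help desk tool.

```python
from collections import Counter


def recurring_issues(incidents, min_count=3):
    """Return (service, symptom) pairs seen at least `min_count` times.

    Each incident is a dict with hypothetical "service" and "symptom" keys.
    The result is sorted from most to least frequent.
    """
    counts = Counter((i["service"], i["symptom"]) for i in incidents)
    return [(key, n) for key, n in counts.most_common() if n >= min_count]


# Made-up incident log, for illustration only.
incident_log = [
    {"service": "door-access", "symptom": "pass scan error"},
    {"service": "email", "symptom": "login timeout"},
    {"service": "door-access", "symptom": "pass scan error"},
    {"service": "vpn", "symptom": "dropped connection"},
    {"service": "door-access", "symptom": "pass scan error"},
    {"service": "vpn", "symptom": "dropped connection"},
]

candidates = recurring_issues(incident_log, min_count=3)
for (service, symptom), count in candidates:
    print(f"{service}: '{symptom}' reported {count} times")
```

Counting repeats is the simplest form of this analysis. It only surfaces issues that users have already reported, so it complements user interviews rather than replacing them.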
Usually, companies are fine with reactive problem management. To be proactive and prevent potential issues, in contrast, you have to approach things with more creativity and an analytical mindset. In addition, proactive problem management requires a significant investment of time and resources. This may be a low priority for many organizations that currently focus on short-term results. The tips for jumpstarting proactive problem management are similar to the recommendations for promoting any innovation in an organization:
Still, for all the benefits of proactive problem management, don’t forget about reactive measures, because they keep day-to-day operations running.
Organizations choose different tools for problem management. The simplest would be a spreadsheet, while larger companies might need something more complex. With any tool, you’ll likely need to design the problem record form on your own, which includes deciding on the list of record fields. To build a comprehensive form, think of all the stakeholder groups that use the records. They are likely to pay attention to different parts of the record. For example, an executive gathering data for a report will want to see the business outcomes, while a second-level support technician will need all the raw numbers about the problem to perform an analysis. Present the columns, or fields, in a format that suits the purposes of the main user groups.
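To make the idea of one record serving several audiences more concrete, here is a minimal sketch of a problem record with a separate view per stakeholder group. The field names, the two audiences, and the sample values are assumptions chosen for illustration; your own form should reflect the fields your stakeholders actually need.

```python
from dataclasses import asdict, dataclass
from datetime import date


@dataclass
class ProblemRecord:
    """A hypothetical problem record; every field name is an assumption."""
    problem_id: str
    title: str
    status: str
    opened_on: date
    business_impact: str
    affected_service: str
    related_incident_count: int
    root_cause: str
    workaround: str


# Which fields each stakeholder group sees, in display order.
VIEWS = {
    "executive": ["problem_id", "title", "status", "business_impact"],
    "technician": [
        "problem_id",
        "affected_service",
        "related_incident_count",
        "root_cause",
        "workaround",
    ],
}


def view_for(record: ProblemRecord, audience: str) -> dict:
    """Return only the fields relevant to the given audience."""
    data = asdict(record)
    return {name: data[name] for name in VIEWS[audience]}


# Made-up sample record, for illustration only.
record = ProblemRecord(
    problem_id="PRB-001",
    title="Entrance pass scanner errors",
    status="open",
    opened_on=date(2024, 1, 15),
    business_impact="Queues at the entrance during peak hours",
    affected_service="door-access",
    related_incident_count=3,
    root_cause="under investigation",
    workaround="Front desk issues temporary passes",
)

print(view_for(record, "executive"))
print(view_for(record, "technician"))
```

The same approach works in a spreadsheet: keep one sheet with every column and add filtered views for each group.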
Well-chosen training content for your support employees can help your problem management practices as well. Ensure they learn not only about the technical environments of your organization but also about the business processes these environments support. By learning about the business implications of your IT systems, your employees gain a deeper understanding of how incidents and problems affect the organization’s operations and overall success.