Practical Vulnerability Remediation - When Should You Act?

Nir Dagan, DevSecOps Engineer
December 21, 2023

The Challenge 

During vulnerability remediation, the number of security issues discovered by the different scanners often exceeds the capacity of security teams and their existing toolsets to resolve them.

When looking at a real-world vulnerability remediation lifecycle, the security team must consider several significant constraints. These include the ability to review each finding, prioritize it and find the relevant stakeholder for the remediation process. Additional constraints that are beyond the security team's control are the capacity of system owners or software developers to fix the issues and the risk that a change to a system might introduce. A software package update might break functionality, making the fix much more complex. Once detected, each vulnerability should be assigned an SLA that sets a time to fix. This acknowledges the vulnerability and ensures that measures are taken to remediate it.

The time to remediate should be derived from the known constraints and the risk the vulnerability introduces to the environment. Therefore, a complete remediation timeline accounts for the risk the vulnerability introduces, the operational risk the change might introduce, and the availability of the team that owns the system or service.

Assessing Risk 

Assessing the risk is a critical next step of the prioritization process. It begins with assigning a risk level to the vulnerability so that vulnerabilities with higher risk will be prioritized for remediation.   

The most common approach to assessing risk is using the “impact-to-likelihood” matrix:  

Figure: Impact-to-Likelihood Matrix

The likelihood is the probability of an exploit actually taking place, and the impact is the potential damage if the exploit does occur. 
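
As a rough illustration of how such a matrix can be applied, the sketch below combines the two axes into a single risk label. The three-level scale, the multiplication, and the thresholds are all assumptions made for this example; each organization should define its own matrix.

```python
from enum import IntEnum


class Level(IntEnum):
    """Illustrative three-step scale used for both axes (an assumption)."""
    LOW = 1
    MEDIUM = 2
    HIGH = 3


def risk_level(likelihood: Level, impact: Level) -> str:
    """Combine likelihood and impact into one risk label.

    The product ranges from 1 to 9; the cut-offs below are hypothetical
    and only show the idea of reading a cell from the matrix.
    """
    score = int(likelihood) * int(impact)
    if score >= 6:
        return "high"
    if score >= 3:
        return "medium"
    return "low"


if __name__ == "__main__":
    # A likely exploit against a moderately important system.
    print(risk_level(Level.HIGH, Level.MEDIUM))
```

With these example thresholds, a vulnerability only lands in the "high" cell when both axes are at least medium and one of them is high.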

CVSS is not enough

One common parameter for assessing the severity of a vulnerability is the CVSS score, which is usually provided by the different scanners.

This scoring system provides a way to capture the principal characteristics of a vulnerability and produce a numerical score reflecting its severity. The score is calculated by considering the likelihood and the impact of the vulnerability, and most organizations default to prioritizing the vulnerabilities with the highest CVSS scores.

This, however, is not the most efficient or recommended way to prioritize, as many factors that affect the risk of a vulnerability are not reflected in this score. The following factors describe situations where CVSS falls short in assessing the overall risk of a vulnerability.

On the impact side:

The impact score - The real impact depends on the system's importance and its connectivity to other systems. Exploitation of a publicly exposed service is far more impactful than exploitation of a stand-alone service used only for testing. Understanding the system or service on which the vulnerability is found, and what that system does, is critical for assessing the vulnerability's risk.

On the likelihood side:

  1. The other axis of the matrix is the likelihood that the vulnerability will be exploited. Likelihood is often hard to determine without deep technical analysis, so it is useful to employ a model that takes into account factors such as historical data and the availability of tools that include an exploit for the vulnerability. EPSS (Exploit Prediction Scoring System) is such a model, and security teams can use the score it provides to assess the likelihood of a vulnerability being exploited.
  2. Environmental considerations are also essential when generating a likelihood score. Factors such as the exposure of a system and mitigating controls such as a WAF can dramatically affect the likelihood score.

The factors that are not included in the CVSS score may significantly change where a vulnerability falls on the impact-to-likelihood matrix compared with looking at the CVSS score alone.
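
To make this concrete, here is a minimal sketch of how two findings with the same CVSS score can end up with very different risk once context is applied. The `Finding` fields, the `asset_criticality` weight, and the 0.5 and 0.7 multipliers are hypothetical values chosen for illustration; they are not part of CVSS or EPSS, and using the CVSS score as a stand-in for technical severity is a simplification.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    cvss_base: float          # 0.0-10.0, as reported by the scanner
    epss: float               # 0.0-1.0, exploitation probability from EPSS
    internet_exposed: bool    # environmental factor
    behind_waf: bool          # mitigating control
    asset_criticality: float  # 0.0-1.0, hypothetical business-importance weight


def likelihood(f: Finding) -> float:
    """Start from EPSS and discount for environment (assumed multipliers)."""
    score = f.epss
    if not f.internet_exposed:
        score *= 0.5  # assumption: internal-only systems are harder to reach
    if f.behind_waf:
        score *= 0.7  # assumption: a WAF blocks some exploitation attempts
    return score


def impact(f: Finding) -> float:
    """Scale technical severity by how much the asset matters."""
    return (f.cvss_base / 10.0) * f.asset_criticality


def contextual_risk(f: Finding) -> float:
    """Likelihood times impact, both on a 0.0-1.0 scale."""
    return likelihood(f) * impact(f)


if __name__ == "__main__":
    public_api = Finding(9.8, 0.90, True, False, 1.0)
    test_box = Finding(9.8, 0.02, False, True, 0.2)
    for name, f in [("public_api", public_api), ("test_box", test_box)]:
        print(name, round(contextual_risk(f), 4))
```

Both findings would sit at the top of a CVSS-sorted list, yet under these assumptions the isolated test system scores orders of magnitude lower than the exposed production service.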

Ease of Fix

Another factor to consider, besides the security risk that the vulnerability introduces, is the operational risk of changing the system. An updated package might break functionality, and a configuration change might cause secondary services to lose connectivity. This operational risk should be assessed when determining the SLA for remediation.

Organizations that do not consider these additional factors are most likely assigning remediation resources to vulnerabilities that should not be prioritized as urgent, and therefore diverting resources from problems and issues that pose a real threat to the organization.  

Putting it all together

Once a vulnerability’s risk score is generated, it is time to determine in practical terms when to remediate. This means setting an SLA for remediation, which should be based on the overall risk (including the environmental factors described previously), the operational complexity of the fix, and the resources available to remediate the vulnerability.

Setting the Appropriate SLA

Here are four possible SLA categories for vulnerability fixes, based on the overall assessed risk and the operational constraints:

  1. Immediate fix: For high-risk cases in which critical systems are considered to be in imminent danger of exploitation. A good example is Log4j, a widely used library with an easy-to-exploit vulnerability that can allow an attacker to run code on an affected system. Such a case necessitates redirecting all available resources to remediating or mitigating the vulnerability immediately.
  2. Next maintenance window: This is a high-risk case in which the vulnerability can be exploited, but the exploitation path is not clear to a potential attacker. It might require additional steps from the attacker, such as exploiting another service first. This requires assigning out-of-band engineering resources to address the vulnerability. The issue is expected to be fixed at the next maintenance window.
  3. Next version update: This is for cases with fairly low severity, in which vulnerabilities can be part of an exploit chain but cannot lead to a serious exploit on their own. Remediation will be part of the task pipeline for the current development phase or part of the next version update.
  4. Next major version update: This is for low-severity vulnerabilities, or for vulnerabilities that are not high severity and whose remediation might break compatibility with other packages.
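
The four categories above can be sketched as a simple decision function. The input names (`risk`, `imminent_exploitation`, `breaks_compatibility`) and the mapping of "fairly low" to a medium risk label are assumptions made for this example, not a definitive policy.

```python
from enum import Enum


class Sla(Enum):
    IMMEDIATE = "immediate fix"
    NEXT_MAINTENANCE_WINDOW = "next maintenance window"
    NEXT_VERSION = "next version update"
    NEXT_MAJOR_VERSION = "next major version update"


def assign_sla(risk: str, imminent_exploitation: bool,
               breaks_compatibility: bool) -> Sla:
    """Map an overall risk label and operational constraints to an SLA.

    `risk` is expected to be "high", "medium" or "low" (hypothetical labels).
    """
    if risk not in ("high", "medium", "low"):
        raise ValueError(f"unknown risk label: {risk!r}")
    if risk == "high" and imminent_exploitation:
        return Sla.IMMEDIATE
    if risk == "high":
        # Exploitable, but the path is not clear to an attacker.
        return Sla.NEXT_MAINTENANCE_WINDOW
    if risk == "low" or breaks_compatibility:
        # Low severity, or not high severity with a risky fix.
        return Sla.NEXT_MAJOR_VERSION
    return Sla.NEXT_VERSION


if __name__ == "__main__":
    print(assign_sla("high", True, False).value)
    print(assign_sla("medium", False, True).value)
```

Keeping the rules in one small function makes the policy easy to review and adjust as the organization refines its own thresholds.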

The process of prioritizing and determining the SLA can be challenging for a single vulnerability, and attempting to do it at scale is an incredible challenge. Vulnerabilities, systems and teams all have different aspects that should be considered and can differ depending on the specific case. 

Vulnerability remediation platforms were created to help security teams deal with this challenge. These platforms take the factors above into consideration and automate the prioritization and SLA assignment for vulnerabilities, as well as orchestrating the full vulnerability remediation lifecycle with the different teams. Such platforms help categorize, assess and determine risk for vulnerabilities in an automated manner and can scale those processes for an enterprise-size organization.

If you want to learn more about tools and techniques that will help you tackle vulnerability remediation at scale, feel free to contact Opus Security for a demo.