Trends and tips about auto-remediation of security issues

Developers should primarily focus on the features that support the core business of the organizations they work for. In DevOps, however, security can’t be forgotten, especially when operating in a public cloud environment. Business representatives and security gurus often clash over priorities and scarce resources while trying to find the right balance between business functionality and stable, secure software. Security tools help developers find, classify and remediate security issues early in the software development process. Still, developers often find it painful to shift their focus from business logic to fixing security issues. Modern tools go a step further and fix, or reduce the impact of, the security issues that are introduced. This is one of the latest trends in software security tooling: auto-remediation of security issues. This article explains the trend and provides tips to deal with it.

How it was in the past

Consider a developer who writes a software application. The application consists of source code to set up the (public cloud) infrastructure resources, the business logic itself, and the database or other data-related configuration files. Errors can creep into every one of these parts, and security errors are no exception. It’s easy to include an AWS secret key in your Terraform script or to introduce a piece of source code that is vulnerable to cross-site scripting attacks. Under tight deadlines, developers can easily ignore this kind of issue. One can think “I’ll fix it in the next release”. But the next release can be within the hour.

So the developer rushes to delete the AWS secret and fixes the cross-site scripting weakness in the next commits. But before getting that far, the developer needs to understand the problem space and find an alternative solution (externalize the AWS secret). He/she also needs to understand the concept of cross-site scripting and follow the instructions to prevent the application from being vulnerable to it. Doing all this in a timely manner requires thorough experience.
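As a minimal sketch of the two manual fixes described above, assuming a Python application: the secret is read from the environment instead of being hard-coded, and user-controlled text is escaped before it ends up in HTML. The function names are made up for illustration; `AWS_SECRET_ACCESS_KEY` is the environment variable name that AWS tooling conventionally reads.

```python
import html
import os


def load_aws_secret(env=os.environ):
    """Read the secret from the environment instead of hard-coding it."""
    secret = env.get("AWS_SECRET_ACCESS_KEY")
    if not secret:
        raise RuntimeError("AWS_SECRET_ACCESS_KEY is not set")
    return secret


def render_greeting(user_input):
    """Escape user-controlled text before embedding it in HTML."""
    return "<p>Hello, " + html.escape(user_input) + "!</p>"
```

Neither fix is hard once you know it, but both require the developer to recognize the problem first, which is exactly the step that tooling tries to take over.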

Tools exist to automate these steps and speed them up. But what exactly does that automation look like?

Auto-remediation

Security tools nowadays often have the option to switch “auto-remediation” on or off for specific security issues. Not all security issues can be auto-remediated, either because they’re too complex or because there are multiple solutions for the same problem. Auto-remediation only works for security issues that have a clear solution. In addition, the security tool needs the right permissions to change the underlying infrastructure resource or application. It needs “write” access to these resources; otherwise, it can only “signal” the security issue and not actually push the remediation.

Often a security issue triggers an alert, which indicates that something is wrong. The alert shows information about the classification of the issue, for example malicious, suspicious or no threat found. As soon as someone decides to respond to the alert, he/she can choose to “auto-remediate” the issue. The system then updates the resource so that the issue is solved.
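The conditions described above can be sketched in a few lines of Python. This is an illustration of the decision logic only, not the behavior of any specific product; `Finding`, `handle` and `make_bucket_private` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Finding:
    resource_id: str
    issue: str
    # A fix exists only when the issue has one clear solution.
    fix: Optional[Callable[[dict], None]] = None


def handle(finding, resource, has_write_access, auto_remediate):
    """Return "remediated" or "alert-only" for a finding."""
    if auto_remediate and has_write_access and finding.fix is not None:
        finding.fix(resource)  # push the change to the resource
        return "remediated"
    return "alert-only"  # the tool can only signal the issue


def make_bucket_private(resource):
    """Example fix: switch off public access on a (simulated) bucket."""
    resource["public"] = False
```

With read-only access, or for a finding without a clear fix, `handle` falls back to signaling the issue and leaves the resource untouched.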

Practical implementations

We’ve tackled the theoretical aspects so far. What does this look like in reality? To answer this question, we’ll take a look at two popular security tools: Palo Alto’s Prisma Cloud and Sysdig Secure DevOps Platform.

Prisma Cloud

Palo Alto Networks offers its Prisma Cloud Enterprise tool to enable Cloud Security Posture Management for your cloud-based resources. Every major cloud provider is covered. It offers two modes to connect your cloud accounts to the SaaS-based product: read mode and read-write mode. Only the read-write mode enables you to actually auto-remediate specific cloud security issues. The process is as follows:

Monitored cloud resources are evaluated against the security policies, rules, and frameworks, which generates one or more alerts
You can view those alerts in one long list. Some of them can be remediated “with the click of a button”
When you click the button, a command is sent to the cloud provider, which actually updates the resource
Your source code is left intact; the only thing that changes is the actual resource, which becomes “compliant” with the security policies and rules

The remediation rules are configured upfront based on the security policy that triggers the security alert. It’s also possible to write your own remediation steps/commands.
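To make the idea of a preconfigured remediation command more concrete, here is a minimal Python sketch. It does not use Prisma Cloud's API: the policy name, the alert fields and the `REMEDIATIONS` mapping are assumptions made for illustration. The command in the template is a regular AWS CLI call that sets a bucket ACL to private.

```python
from string import Template

# Hypothetical mapping from a policy name to a remediation command template.
REMEDIATIONS = {
    "s3-bucket-publicly-readable": Template(
        "aws s3api put-bucket-acl --bucket ${resource_name} --acl private"
    ),
}


def build_remediation_command(alert):
    """Return the command for an alert, or None if it is not remediable."""
    template = REMEDIATIONS.get(alert["policy"])
    if template is None:
        return None
    return template.substitute(resource_name=alert["resource_name"])
```

Alerts whose policy has no entry in the mapping return `None`, which mirrors the situation where an alert can only be viewed and not fixed with a click. A real implementation would also validate the resource name before building a shell command from it.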

Sysdig Secure

Sysdig Secure takes a different approach and offers auto-remediation as part of its DevOps platform. First of all, you should know that Sysdig acquired Apolicy a couple of years ago. Apolicy was built on top of Open Policy Agent, and its capabilities are now integrated into Sysdig Secure.

One of the key strengths of this solution is that it detects specific security issues in IaC templates (such as Kubernetes manifests), reports them back to the DevOps team and also changes the actual source code that is the root cause of the security issue.

The platform creates a pull request per security issue, and anyone from the DevOps team can review and/or accept it to propagate the fix. And it does not stop there: it’s also possible to detect runtime issues and suggest fixes for security issues that pose problems in runtime environments.
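The following Python sketch illustrates the general idea of fixing the source and presenting the change for review. It is not Sysdig's implementation: the function names are hypothetical, the manifest is represented as a plain dictionary (real manifests are usually YAML), and the single rule shown, switching off privileged containers, is just an example.

```python
import copy
import difflib
import json


def fix_privileged_containers(manifest):
    """Return a patched copy in which no container runs privileged."""
    patched = copy.deepcopy(manifest)
    for container in patched["spec"]["containers"]:
        security = container.setdefault("securityContext", {})
        security["privileged"] = False
    return patched


def proposed_diff(original, patched, path="pod.json"):
    """Render the change as a unified diff, as a reviewer would see it."""
    before = json.dumps(original, indent=2, sort_keys=True).splitlines()
    after = json.dumps(patched, indent=2, sort_keys=True).splitlines()
    return "\n".join(
        difflib.unified_diff(before, after, "a/" + path, "b/" + path, lineterm="")
    )
```

The original manifest is never modified in place; the diff is what a pull request would show, so a team member can still accept or reject the change.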

Challenges

The examples above come with challenges that should make you think about what auto-remediation means for your workflow and for other areas of your software development activities.

Failing tests

What happens when the tool suggests a change to something that is also covered by a unit test, a component test, or even an integration test? Perhaps a test assumes a specific value or outcome for a piece of infrastructure, and after the auto-remediation that test fails. You need to carefully evaluate the suggestion, then rebuild your pipeline and re-run your tests as soon as possible.

Dependencies

Imagine your tool creates a pull request with a proposed change. Say the tool upgrades a package to a newer version since your current version has a major security problem. Suppose this new version is not backward compatible. This will cause problems in other areas of your application that could not be foreseen by the tool. In this situation, you really need to validate the pull request manually and not just click “accept”, carry on, and hope for the best.
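One simple guard, sketched below in Python, is to flag every proposed upgrade that crosses a major version boundary for manual review. This assumes the package follows semantic versioning, which is not guaranteed; a minor or patch release can break things too, so treat the check as a first filter only. The function names are hypothetical.

```python
def major_version(version):
    """Return the major component of a version such as "2.14.1"."""
    return int(version.lstrip("v").split(".")[0])


def needs_manual_review(current, proposed):
    """Flag upgrades that cross a major version boundary."""
    return major_version(proposed) != major_version(current)
```

For example, an upgrade from `1.9.3` to `2.0.0` is flagged, while an upgrade from `1.9.3` to `1.10.0` is not.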

Rollback?

Sometimes, a tool suggests a solution that makes your system more secure but also more strict, for example by automatically applying less permissive permissions. Suppose a service account is restricted so that it can only access a certain Kubernetes deployment, and after the remediation it does not have access to that deployment anymore. Your application is broken and you need to roll back as soon as possible and fix the issue in another way.

Therefore, it’s important to carefully evaluate every type of solution that enables auto-remediation for security issues.
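The rollback scenario can be sketched as follows, assuming the resource can be represented as a dictionary and that some health check is available. Both callbacks and the function name are hypothetical; a real tool would talk to the cloud provider or cluster API instead.

```python
import copy


def remediate_with_rollback(resource, apply_fix, is_healthy):
    """Apply a fix in place; restore the previous state if the check fails."""
    snapshot = copy.deepcopy(resource)
    apply_fix(resource)
    if is_healthy(resource):
        return True
    resource.clear()
    resource.update(snapshot)  # roll back to the state before the fix
    return False
```

The important design choice is that the snapshot is taken before the fix is applied, so a failed health check always has a known-good state to return to.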

Configuration drift

It’s tempting to let tools auto-fix any issues that they find in your runtime environments. This might prove to be a quick and efficient way to solve issues, even in production systems. There is no need to recreate your application and infra components to roll out a new version, and there is no downtime if your system has been set up properly. But the devil is in the details.

Suppose your runtime system is “patched” on the fly. This does not mean your source code is fixed as well, since not all products offer that capability. If you do not fix the security issues in your actual source code, the problem pops up again after you redeploy your application and/or infrastructure resources. Besides patching things at runtime level, you need to “reverse engineer” the applied fix into your source so that your IaC templates and application stay “in sync” with what is running. Otherwise, you have a false sense of security until your next release sees the light.
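A basic drift check compares what your IaC templates declare with what is actually running. The Python sketch below assumes both sides can be flattened into simple key/value settings; the function name and the example settings are hypothetical.

```python
def find_drift(declared, runtime):
    """Return {key: (declared_value, runtime_value)} for settings that differ."""
    keys = set(declared) | set(runtime)
    return {
        key: (declared.get(key), runtime.get(key))
        for key in sorted(keys)
        if declared.get(key) != runtime.get(key)
    }
```

If a tool switched on encryption at runtime while the template still declares it as off, `find_drift` reports that setting, which tells you the template needs the same fix before the next deployment undoes it.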

More to explore

Developers and other business representatives might be enthusiastic about these solutions. The following links provide more information and practical examples about auto-remediation.

Auto-Remediation in Prisma Cloud: 4 steps that guide you through the entire process.
AWS and the Cloud Conformity Platform, which teaches you how auto-remediation works for your AWS-based resources. It’s part of the “well-architected framework” and AWS partners with Cloud Conformity.
Microsoft Azure teams up with Aqua CSPM. You can create custom policies to scan your Azure-based resources and automatically let Azure apply those policies to fix security issues that are found.
ASecureCloud offers a tool with a set of templates you can add to your existing cloud resource stacks. These templates consist of rules to auto-remediate issues that might pop up. It’s a cloud-independent solution and you can select which templates to use, which makes the templates powerful “add-ons” to your cloud resources.

Of course, there are many more tools and (standalone) solutions. These are just the tip of the iceberg but are worthwhile to explore.

Conclusion

Auto-remediation of security issues is an interesting topic for DevSecOps practitioners as well as (cloud) security engineers. The implementation of this concept differs per tool and cloud provider, but there are great ways to support developers and speed up their coding efforts. You should be well informed about the pros and cons and carefully evaluate which tool fits your situation best.

I hope this article offered you valuable insights to make your (security) life a bit easier.

If you have questions related to this topic, feel free to book a meeting with one of our solutions experts or send an email to sales@amazic.com.