As part of a recent initiative at ForAllSecure to analyze more open source software with Mayhem, a next-generation fuzzing solution, we decided to investigate some cryptographic libraries.
Why Crypto Libraries?
Why look at crypto libraries? First there is the obvious: SSL, TLS, and related protocols managed by these libraries power much of the secure web. Every day these sorts of software are relied upon to secure everything from usernames and passwords to emails to banking information.
Although there have been some cryptographic issues in these protocols, the majority of vulnerabilities in codebases like these end up being more “standard” coding errors. It is easy to focus on the pure math and cryptographic aspects of secure communications, but establishing a secure connection also requires a lot of certificate and message parsing, which is easy to get wrong.
Despite some excellent recent efforts to write cryptographic libraries in memory safe languages, demands for performance, scalability, and portability have led many people to develop in C, which is a notoriously difficult language to use for secure software development. Further complicating matters, even small bugs that might not be security sensitive in some applications can be disastrous for cryptographic scenarios.
For all these reasons, cryptographic libraries are popular targets for security analysis, whether by fuzzing, static analysis, or even manual audits. Despite the large amount of review these codebases receive, Mayhem has still been able to identify issues in the past. For example, Mayhem discovered CVE-2016-7053 in OpenSSL, simply by using symbolic execution on the existing harness used by the maintainers and by OSS-Fuzz.
MatrixSSL (now known as Inside Secure TLS Toolkit) is an open-source library aimed at IoT and lightweight scenarios, with minimal dependencies for portability. As is the case with many of these libraries, much of the code could be considered part of the attack surface. We chose to look at parsing X.509 certificates. Because this is often done with untrusted parties on the web when first establishing keys, it’s very important from a security standpoint. Moreover, it is incredibly difficult to do correctly. These certificates are specified in ASN.1 and serialized with its DER encoding rules. Despite its use in many places, ASN.1 is notoriously difficult to implement without bugs. Combine the security-sensitive role of certificate parsing in key exchange with its error-prone nature, and you have a great target for security analysis.
Because MatrixSSL is open-source, we were able to put together a simple fuzzing harness that tests just certificate decoding without much effort. In fact, we only had to write two lines of project-specific code, plus a small shim we reuse across projects. As we recommend to our customers when source code is easy to access and modify, we built two versions: one compiled with libFuzzer and AddressSanitizer (ASan), and one without. This makes it easy for Mayhem to both fuzz and run symbolic execution over the program.
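To give a feel for how small such a harness can be, here is a minimal sketch in C. It is not the harness we used: parse_cert_under_test is a hypothetical stand-in for the library's certificate parser (it is not a MatrixSSL function), and run_test_case_file and the STANDALONE_HARNESS macro are names made up for this illustration. Only LLVMFuzzerTestOneInput is the real libFuzzer entry point.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* HYPOTHETICAL stand-in for the library's certificate parser; in a real
 * harness this would be the project's own parsing call. Returns 0 if the
 * buffer starts with an ASN.1 SEQUENCE tag (0x30), -1 otherwise. */
static int parse_cert_under_test(const uint8_t *data, size_t size)
{
    if (size < 2 || data[0] != 0x30) {
        return -1;
    }
    return 0;
}

/* libFuzzer entry point. The project-specific part is the single call into
 * the parser; parse failures are expected and are not treated as errors. */
int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size)
{
    (void)parse_cert_under_test(data, size);
    return 0;
}

/* Reusable shim (assumed shape): read one file into memory and hand it to
 * the same entry point, so a non-libFuzzer build can replay a test case.
 * Returns -1 if the file cannot be read. */
int run_test_case_file(const char *path)
{
    FILE *f = fopen(path, "rb");
    if (f == NULL) {
        return -1;
    }
    if (fseek(f, 0, SEEK_END) != 0) {
        fclose(f);
        return -1;
    }
    long n = ftell(f);
    if (n < 0) {
        fclose(f);
        return -1;
    }
    rewind(f);
    uint8_t *buf = malloc(n > 0 ? (size_t)n : 1);
    if (buf == NULL) {
        fclose(f);
        return -1;
    }
    size_t got = fread(buf, 1, (size_t)n, f);
    fclose(f);
    int rc = LLVMFuzzerTestOneInput(buf, got);
    free(buf);
    return rc;
}

#ifdef STANDALONE_HARNESS
/* Only compiled for the build without libFuzzer, which supplies its own main. */
int main(int argc, char **argv)
{
    if (argc != 2) {
        fprintf(stderr, "usage: %s <test-case-file>\n", argv[0]);
        return 2;
    }
    return run_test_case_file(argv[1]) == 0 ? 0 : 1;
}
#endif
```

With clang, the fuzzing build of a file like this would typically be compiled with -fsanitize=fuzzer,address, and the plain build with -DSTANDALONE_HARNESS and no sanitizer flags. The plain binary can also replay a saved test case when reproducing a finding.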
Shortly after starting the run, we found some issues with bounds checking of ASN.1 decoding values, which we immediately disclosed to the maintainers. Because Mayhem uses dynamic analysis rather than static analysis, we already had a test case ready to demonstrate the issue, and the maintainers were able to fix it quickly.
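For readers unfamiliar with this class of bug, the sketch below shows the kind of bounds checking a DER length decoder needs. It is an illustration only, not the code from either library and not the specific issue we reported; der_read_length is a hypothetical helper written for this post, and it does not enforce DER's minimal-length-encoding rule.

```c
#include <stddef.h>
#include <stdint.h>

/* HYPOTHETICAL helper: decode a DER length field starting at buf[*pos].
 * On success, stores the content length in *out_len, advances *pos past the
 * length octets, and returns 0. On malformed or out-of-bounds input,
 * returns -1 and leaves *pos unchanged. */
int der_read_length(const uint8_t *buf, size_t buf_len,
                    size_t *pos, size_t *out_len)
{
    size_t p = *pos;
    size_t len;

    if (p >= buf_len) {
        return -1;                      /* no room for the first length octet */
    }
    uint8_t first = buf[p++];

    if ((first & 0x80) == 0) {
        len = first;                    /* short form: length is 0..127 */
    } else {
        size_t n = first & 0x7F;        /* long form: n length octets follow */
        if (n == 0 || n > sizeof(size_t)) {
            return -1;                  /* indefinite form, or would overflow */
        }
        if (n > buf_len - p) {
            return -1;                  /* length octets run past the buffer */
        }
        len = 0;
        for (size_t i = 0; i < n; i++) {
            len = (len << 8) | buf[p++];
        }
    }

    /* The check that is easy to forget: the declared content length must
     * fit in what actually remains of the buffer. */
    if (len > buf_len - p) {
        return -1;
    }

    *pos = p;
    *out_len = len;
    return 0;
}
```

Every comparison is written as a subtraction from buf_len rather than an addition to the position, so an attacker-controlled length cannot wrap around and slip past the check.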
Seeing these sorts of issues in one library, we decided to look at another similar project. WolfSSL is a relatively lightweight cryptographic library, available as open source software or commercially, that supports a wide variety of platforms, including embedded devices. With an almost identical approach, we were able to fuzz WolfSSL as well. Again, only a few lines of code needed to be written for the different codebase.
And just as before, we were able to find a similar issue with ASN.1 parsing with an easily reproducible test case we could give to the maintainers, who quickly fixed it.
There are several takeaways from this quick journey analyzing some cryptographic libraries, which apply to software built by anyone for any purpose.
- Developers should fuzz their code! It’s a great way to identify bugs early, before they make it into the hands of customers.
- You don’t need to write a lot of code to add dynamic analysis harnesses to a piece of software. As we saw, it was only 2-3 lines of actual code to write a harness here. Although there was a bit that needed to be done to properly build and link the project together, this would be very easy to do for the maintainers of the code.
- Reporting and fixing bugs found with dynamic analysis is easy. With a concrete input that demonstrates the problem, developers can quickly reproduce the issue with their debugging tools of choice. Once a proposed fix is in place, it’s easy to check that it worked by re-running the example and making sure it behaves correctly.
The developers of both WolfSSL and MatrixSSL were great and quickly fixed what we reported. Hopefully we’ll have more chances in the future to run experiments like this and help analyze and secure open source software.