This month, as interns at ForAllSecure, we participated in a contest to test the beta version of Mayhem on various open source projects. If you’re not familiar with Mayhem, it’s a software security tool that uses behavior testing, a patented technique that combines guided fuzzing and symbolic execution, to uncover defects in software with zero false positives. In applying Mayhem for our summer project, one open-source project in particular yielded some exciting results. To jump right to the punchline: using Mayhem, we discovered that the Netflix DIAL server (found in the Netflix GitHub repository) had a remote out-of-bounds read.
It is easy to overlook the severity of this bug because it lives in an older piece of code from Netflix, but its potential impact is serious. Any device that runs this code, including TVs from 2010, can be crashed remotely through the out-of-bounds read, and the read could be combined with other bugs to create a much more serious situation. Imagine this code running on a connected smart TV: every such device could be in trouble.
When considering potential Mayhem targets, we specifically searched for web servers. They are a strong suit for Mayhem, since many other fuzz testers are unable to fuzz network communication efficiently. As a result, web servers often contain bugs that other tools have missed and that Mayhem can find.
In early 2013, Netflix was working on a project called Discovery And Launch, better known as DIAL. The DIAL server was one of the first prototypes for streaming Netflix and YouTube directly from your smartphone to your television, and it came preinstalled on most TVs of the time. If this all sounds familiar, it's because Google created a device called Google Chromecast that does the same thing.
Google Chromecast originally used DIAL for its communications but eventually switched to mDNS, and thus began the end of DIAL's usage. Although DIAL stopped shipping with new televisions, it’s not clear whether DIAL has been purged from modern Netflix and YouTube applications. After a small investigation, we found that the DIAL server could still communicate with an iPhone, which made it an interesting target.
We started by doing a bit more research on DIAL. The dial-reference README says: "The DIAL client uses CURL to send HTTP REST commands to the DIAL server." This is right up Mayhem's alley, since Mayhem is good at finding bugs in network services. Quickly reversing some of the dial-server code showed that the server listens on a fixed IP address and port.
The main component of the DIAL server is the mongoose server, which is based on the original mongoose embedded server released around the same time as DIAL.
This code is significantly modified to be as compact as possible.
Weighing in at about 960 lines, the mongoose.c file had the potential to be buggy, so we decided this would be a good starting point for Mayhem.
Harnessing is usually one of the harder tasks when fuzzing web servers, but luckily the engineers at ForAllSecure are highly experienced at fuzzing and have spent the last two years on a mission to make it both more powerful and easier to use in the Mayhem product. Simply running the Mayhem packaging system on the DIAL server binary collected all the linked libraries and placed them into a well-formatted file. The last part of harnessing was simply specifying which networking protocol (TCP or UDP), which port, and which IP to fuzz on, and presto, we had a fuzzer.
Mayhem did surprisingly well on the binary without any modification to the source code or customized compilation. It ran at around 1400 executions per second on a slightly old workstation, which is a respectable speed. Mayhem explored branches in the server relatively quickly and found a crash within the first 24 hours.
Triaging Rogue Requests
The DIAL server crashed with a SIGSEGV, and the backtrace pointed to the discard_current_request_from_buffer function. After spending some time tracing how bytes and requests are passed around, we confirmed that the problem came from the mongoose.c code.
When the dial-server program is launched, it spawns multiple threads and calls into mongoose.c to process incoming connections.
The function process_new_connection is called to process the current request. The requested content-length, specified in the GET request, is placed into the variable conn->request_len and later used in a memmove call for an address offset.
The vulnerability is found in the check on the request length. On line 734 an assert is called: assert(conn->data_len >= conn->request_len). Although the assert verifies that conn->request_len does not exceed the allocated data_len, it does not check whether conn->request_len is negative. Because nothing rejects a negative value, it is passed to the discard_current_request_from_buffer function as the new content_len.
Once content_len makes it to the discard function, another check is made on line 712 to confirm that content_len is smaller than the buffered size:
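To see why the assert lets a negative length through, here is a minimal C sketch of its predicate. The struct conn_lengths and the function names are hypothetical stand-ins rather than code from mongoose.c, and the int field types are an assumption; only the comparison itself comes from the assert quoted above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the two connection fields named in the text. */
struct conn_lengths {
    int data_len;     /* bytes currently buffered (type assumed) */
    int request_len;  /* parsed request length (type assumed) */
};

/* The comparison from the quoted assert: it enforces an upper bound only. */
static bool original_predicate(const struct conn_lengths *c) {
    return c->data_len >= c->request_len;
}

/* Assert form of the same check: aborts only when the predicate is false. */
static void original_guard(const struct conn_lengths *c) {
    assert(original_predicate(c));
}
```

With data_len set to 1024 and request_len set to -1, original_predicate returns true, so original_guard does not abort even though the length is nonsensical. A request_len of 2048 is rejected as expected.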
else if (conn->content_len < (int64_t) buffered_len)
This check passes for a negative content_len, and the code then executes:
body_len = (int) conn->content_len;
This cast truncates a 64-bit negative value whose magnitude does not fit in 32 bits down to its low 32 bits, which can drop the sign. So a passed length of -13377777777777 (0xfffff3d53e4ec38f) gets converted to 1045349263 (0x3e4ec38f), allowing an arbitrary-length read of a memory location.
If the resulting size is large enough, the memmove touches a non-readable memory location, which crashes the program.
Sending a request with such a length to the server causes a SIGSEGV crash.
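The arithmetic can be reproduced in a few lines of C. This is a sketch, not the mongoose.c code: the helper names are hypothetical, a 32-bit int is assumed, and the low 32 bits are extracted through unsigned types because the result of an out-of-range cast to int is implementation-defined (typical compilers simply keep the low bits, which is the behavior described here).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

static_assert(sizeof(int) == 4, "this sketch assumes a 32-bit int");

/* The comparison quoted from the discard function: any negative
 * content_len is smaller than a non-negative buffered length. */
static bool passes_buffer_check(int64_t content_len, int buffered_len) {
    return content_len < (int64_t) buffered_len;
}

/* Low 32 bits of a 64-bit value, computed with well-defined unsigned
 * conversions. This is the bit pattern a typical compiler keeps when
 * the value is cast to a 32-bit int. */
static uint32_t low_32_bits(int64_t value) {
    return (uint32_t) (uint64_t) value;
}
```

For the value from the example, passes_buffer_check(INT64_C(-13377777777777), 4096) is true, and low_32_bits(INT64_C(-13377777777777)) is 0x3e4ec38f, which is 1045349263 in decimal and fits in a positive int.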
Simple Fix, Complex Benefit
Since the bug is found in the length check, a fix could be as simple as changing:
assert(conn->data_len >= conn->request_len);
to:
assert(conn->data_len >= conn->request_len && conn->request_len > 1);
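One design consideration: the standard assert macro is compiled out when NDEBUG is defined, so a release build might also want the same condition as an ordinary runtime check. The sketch below assumes int lengths and uses hypothetical helper names (request_len_is_valid, patched_guard) that do not appear in mongoose.c; the condition is the one from the proposed fix.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper: the patched condition as a plain boolean test. */
static bool request_len_is_valid(int data_len, int request_len) {
    return data_len >= request_len && request_len > 1;
}

/* Assert form, as in the proposed fix; a no-op when NDEBUG is defined. */
static void patched_guard(int data_len, int request_len) {
    assert(request_len_is_valid(data_len, request_len));
}
```

A caller could, for example, drop the connection when request_len_is_valid returns false instead of continuing on to the memmove.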
Here at ForAllSecure, we take responsible disclosure very seriously. Before making this post, we contacted Netflix and informed them of the bug and the simple way to fix it. The bug was publicly fixed on the Netflix GitHub in commit d1b1aa2636f89df95e57aa0c68836ce8d52f4638. Netflix also published credit for the report in the security-bulletins section of their reference repository. After confirming the fix had been pushed, we created this blog post.
Credit Where Credit is Due
Finding and triaging this bug would not have been possible without the hard work that engineers put into making Mayhem awesome, or without my coworker Paul Emge, who spent as much time as I did on finding this bug.