Verified Executables for NetBSD



What is it?

In a nutshell, this is functionally the same as a kernel-level Tripwire for NetBSD. The integrity of specified executables and files is verified before they are run or read. This makes it much more difficult to insert a trojan horse into the system and also makes it more difficult to run binaries that are not supposed to be running, for example, packet sniffers, DDoS clients and so on.

The concept sprang from frustration with an observation made on the Bugtraq mailing list: there was no way of knowing whether the binary you were running had been tampered with. At the time there had been a spate of public servers that had been broken into by subverting a binary on the system or, more simply, by arranging the login PATH variable to include a trojaned binary.

At this point it occurred to me that there should be a way of protecting the "Trusted Computing Base" (the TCB) and ensuring that the binaries executed had not been tampered with in any way. As I saw it, simply applying more stringent permissions did not solve the problem because you still had no verification that what you thought you were running was what you were actually running.

Having had experience with Tripwire in the past, I knew that scan frequency versus system impact was always a trade-off: scan too frequently and the machine is too loaded to perform useful work; scan too infrequently and you give a cracker a wider window of opportunity to get into the system and do damage. What was needed was something that scanned files on an as-needed basis to detect modifications to the system. This line of thinking brought about what I initially called signed exec. I recently changed the name to verified exec as it was pointed out to me that what I have done is not really signing anything; it is more a verification of integrity.

How does it work?

Firstly, I analysed the NetBSD kernel code and found that the exec subsystem lent itself quite readily to the insertion of an extra verification routine. By placing a function in the exec path that evaluated the fingerprint of the file to be executed and compared it to a previously stored value, I could allow or deny execution depending on the result of that comparison.
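
To make that flow concrete, here is a small user-space sketch of the check. The names (fp_entry, evaluate_fingerprint, verify_exec) are made up for this article and a trivial checksum stands in for the real hash, so this is an illustration of the idea rather than the actual kernel code.

    /*
     * Illustrative sketch only -- not the NetBSD kernel code.  A real
     * implementation hashes the file with MD5/SHA1; a trivial additive
     * checksum stands in for the hash to keep the example short.
     */
    #include <stdio.h>
    #include <stdint.h>

    struct fp_entry {
            const char *path;         /* file the fingerprint belongs to */
            uint32_t    fingerprint;  /* previously stored value */
    };

    /* evaluate the fingerprint of a file (stand-in for MD5/SHA1) */
    static int
    evaluate_fingerprint(const char *path, uint32_t *result)
    {
            FILE *f = fopen(path, "rb");
            uint32_t sum = 0;
            int c;

            if (f == NULL)
                    return -1;
            while ((c = fgetc(f)) != EOF)
                    sum = sum * 31 + (uint32_t)c;
            fclose(f);
            *result = sum;
            return 0;
    }

    /* return 0 to allow the exec, -1 to deny it */
    static int
    verify_exec(const struct fp_entry *entry)
    {
            uint32_t fp;

            if (evaluate_fingerprint(entry->path, &fp) != 0)
                    return -1;
            return (fp == entry->fingerprint) ? 0 : -1;
    }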

It was apparent to me that simply evaluating the fingerprint every time exec was called was going to hurt the performance of the machine on several levels. Most obviously the CPU impact would be significant, but other things such as demand paging would also be affected and cause a major slowdown. I believed this to be unacceptable and investigated ways of addressing the slowdown. An obvious answer was to cache the fingerprint result somehow so that the calculation would not need to be repeated.

As it turns out, the kernel already keeps a lot of information about a file; since this information is expensive to generate, the kernel tends to hold onto it until demand forces a recycling of the data structure holding it. By adding an extra field to this structure I could keep the result of the fingerprint comparison. Thus, if the same file was executed again, the previous result of the comparison could be used to decide whether or not the exec should be allowed, instead of recalculating the fingerprint.
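
Continuing the sketch above, the cached verdict can be pictured as an extra field hung off the kernel's per-file structure; the file_node structure and fp_status field below are again names invented for illustration, not the real kernel data structures.

    /* cached verdicts: the expensive comparison is done at most once */
    enum fp_status { FP_UNKNOWN, FP_VALID, FP_INVALID };

    struct file_node {              /* stand-in for the kernel's per-file data */
            /* ...the information the kernel already keeps about the file... */
            enum fp_status fp_status;   /* extra field: cached comparison result */
    };

    static int
    verify_exec_cached(struct file_node *node, const struct fp_entry *entry)
    {
            if (node->fp_status == FP_UNKNOWN)  /* first exec: do the real work */
                    node->fp_status =
                        (verify_exec(entry) == 0) ? FP_VALID : FP_INVALID;
            return (node->fp_status == FP_VALID) ? 0 : -1;
    }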

This caching mechanism took a technique that made the machine run 70% slower (i.e. things took 1.7 times longer to run) to the point where the impact on the system cannot realistically be measured.

With the performance issue solved, a revisit of the exec path gave me an interesting inspiration. I noticed that the path through the exec code for a shell script interpreter differed slightly from that of a normal exec, which allowed me to distinguish how a shell interpreter had been invoked. The upshot was that I was able to add a feature whereby direct execution of a shell interpreter could be blocked while the same binary could still be used to interpret shell scripts. This provides a unique opportunity not offered by other mechanisms: a powerful, feature-rich scripting language, for example Perl, that can only be used to run scripts that match a fingerprint.
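
A rough sketch of that distinction, with made-up flag names: the exec code knows whether it reached a binary directly or via the "#!" line of a script, so the two cases can be treated differently.

    /* per-fingerprint flag (illustrative name, not the real one) */
    #define FP_INDIRECT_ONLY 0x01   /* may only run as a script interpreter */

    /*
     * called_via_script is nonzero when the exec code reached this binary
     * through the "#!" line of a fingerprinted script rather than directly.
     */
    static int
    check_exec_kind(int flags, int called_via_script)
    {
            if ((flags & FP_INDIRECT_ONLY) && !called_via_script)
                    return -1;  /* deny direct execution of the interpreter */
            return 0;           /* otherwise fall through to the fingerprint check */
    }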

How was it implemented?

At this point I had a working proof of concept running on NetBSD, which I used as the basis for a paper presented at the 2000 Australian Unix Users Group summer conference. I was still calling the idea signed exec then, and the paper focuses largely on the execution control aspects of the idea. After this I back-burnered the idea for a while until another NetBSD developer rekindled my interest in making it go.

With the help of that developer, we set about addressing some of the points on the todo list in the paper I presented. First on the list was a method of verifying the integrity of the shared libraries used by dynamically linked objects. This was done by modifying the file open function at the VFS layer to fingerprint a file when it is opened for read and to deny the open if the fingerprint comparison fails. Only files that have fingerprints associated with them are checked, which avoids having to fingerprint every file on the system.
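
In sketch form (continuing the earlier illustrative code, with a hypothetical lookup_fingerprint() helper searching the list of fingerprints loaded into the kernel), the open-time check only acts on files that have a fingerprint on record:

    #include <string.h>

    /* hypothetical in-kernel list of loaded fingerprints */
    static struct fp_entry fp_list[] = {
            { "/lib/libc.so", 0 /* placeholder value */ },
    };

    static const struct fp_entry *
    lookup_fingerprint(const char *path)
    {
            size_t i;

            for (i = 0; i < sizeof(fp_list) / sizeof(fp_list[0]); i++)
                    if (strcmp(fp_list[i].path, path) == 0)
                            return &fp_list[i];
            return NULL;
    }

    /* called when a file is opened for read; 0 allows the open, -1 denies it */
    static int
    verify_open(struct file_node *node, const char *path)
    {
            const struct fp_entry *entry = lookup_fingerprint(path);

            if (entry == NULL)
                    return 0;   /* no fingerprint for this file: do nothing */
            return verify_exec_cached(node, entry);
    }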

By protecting the shared libraries in this manner we could not only verify the integrity of dynamically linked executables but, as a bonus, we now had the facility to fingerprint and verify any file on the system, not just executables. This means that critical control files can be verified as correct, giving confidence not only in the binaries and scripts but also in the configuration of those binaries or scripts.

Once we had covered the shared library issue, the next important thing was to add support for more fingerprint methods. When I first worked on this idea I was aware that MD5 had been broken, but at the time it was more important to me to make sure the idea was workable than to worry about the integrity of the hash function. Now that we were more serious about this, it was time to look at what needed to be done to add other fingerprinting methods.

This was pretty much a total rewrite of the original code, as it was heavily tied to the MD5 hash and the actual checking was tied into the exec path in such a way that it could not simply support other fingerprints. We rewrote things so that the fingerprint evaluation and checking were split out into small functions called by a switcher function that selected the appropriate methods depending on the fingerprint used. This laid the framework for adding more fingerprint methods by writing a couple of small functions and tying them into the switcher function. To test this we added the SHA1 fingerprint method, which meant that we could freely intermix MD5 and SHA1 fingerprints in the in-kernel list on a file-by-file basis.
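
The shape of that framework can be sketched as a table of per-method operations consulted by the switcher; the layout and names below are illustrative only, with stub routines standing in for the real MD5 and SHA1 hashing code.

    #include <string.h>

    /* one entry per supported fingerprint method */
    struct fp_ops {
            const char *name;        /* method name as it appears in the list */
            size_t      digest_len;  /* bytes of digest the method produces */
            int       (*evaluate)(const char *path, unsigned char *digest);
    };

    /* stubs standing in for the real MD5/SHA1 file-hashing routines */
    static int
    md5_evaluate(const char *path, unsigned char *digest)
    {
            (void)path;
            memset(digest, 0, 16);
            return 0;
    }

    static int
    sha1_evaluate(const char *path, unsigned char *digest)
    {
            (void)path;
            memset(digest, 0, 20);
            return 0;
    }

    static const struct fp_ops fp_methods[] = {
            { "md5",  16, md5_evaluate  },
            { "sha1", 20, sha1_evaluate },
    };

    /* the "switcher": pick the right method for a given fingerprint entry */
    static const struct fp_ops *
    find_fp_ops(const char *name)
    {
            size_t i;

            for (i = 0; i < sizeof(fp_methods) / sizeof(fp_methods[0]); i++)
                    if (strcmp(fp_methods[i].name, name) == 0)
                            return &fp_methods[i];
            return NULL;
    }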

At this point we felt the code was in a good enough state to merge into the NetBSD current source tree. The merge was done on the 29th of October 2002 and I posted an announcement to the tech-security and current-users NetBSD mailing lists. There is still much work to be done and I look forward to refining and improving the code that has been committed.

Thanks

My thanks go to the NetBSD community, who are a very technically demanding group of people but still manage to be friendly. Special thanks go to Jason R. Fink for helping more than he believes he has.

Feedback

If anyone has comments or feedback they can mail them to me. I am blymn and the mail domain is netbsd dot org (apologies for the obfuscation, but I get enough spam already, thank you very much).