KuangPlus: Automating Vulnerability Detection
Warren Toomey & Jeff Howard
School of Computer Science, Australian Defence Force Academy

Abstract

System administrators need good tools to detect system security vulnerabilities (bugs and configuration errors) on a timely basis. This paper examines a tool called KuangPlus, which helps to automate vulnerability detection, and which keeps its database of known problems up to date via interpreted information downloaded from vendors.

Introduction

One of the dilemmas facing systems administrators today is the amount of time they need to spend finding and fixing known security deficiencies in their systems. Information about new security deficiencies is made available in a timely fashion from operating systems vendors, application vendors, computer emergency response teams and other groups interested in computer security. A diligent sysadmin could spend every working hour monitoring these sources, determining if the local system is affected, and taking the steps to rectify any holes found.

Currently, deficiency reports are usually written in a human language, e.g. English, and describe what the problem is and how it affects a system's security. In some cases, exploits or other programs are available to test if a system has a given weakness. These reports and programs are often digitally signed with a public key cryptosystem, so that the system administrator can verify that they did come from a particular vendor, and that the report or program has not been tampered with.

In many cases, newly-found security holes give an attacker full system rights, e.g. to become `root' under Unix or `administrator' under NT. In other cases, the holes give an attacker limited system rights. However, an attacker may combine several existing system deficiencies to gain greater system rights than any single hole gives by itself. The vendor reports about individual security holes obviously cannot describe the effect of combined deficiencies.

Existing tools like COPS, Satan and Nessus allow a sysadmin to scan a system for known software and configuration vulnerabilities. The Kuang tool, part of the COPS package, can detect chains of configuration mistakes which, when combined, can be used to penetrate a system's security. However, all these tools rely on a database of known problems which is only updated when new releases are made; these tools do not keep up with the daily round of new vulnerabilities.

We have a situation where:

a tool like Kuang can provide an automated way of determining a system's vulnerability to known security holes and their combination, but the tool does not track newly-found security deficiencies; and
vendors and computer security groups provide timely reports of newly-found security deficiencies in a tamper-proof fashion, but only in a format which must be processed by a human.

It seems obvious that, if the computer security community could be persuaded to provide details of security deficiencies in a rule-based format, then these rules could be processed by a Kuang-like inference engine to automatically test a system's vulnerability to the deficiencies.

For such a scheme to be taken up by both the providers and the end-users of the rulesets, it must have a number of characteristics:

Vendors must produce security reports of new holes in ruleset form.
One or more mechanisms must allow end-users to obtain new rules quickly and automatically.
End-users must be able to trust the rules obtained: they must be able to verify who created the rules, and verify that the rules have not been tampered with.
As the rules must be executed on the end-user's system, they must be written in a fashion that is relatively easy to read and understand. The rules must therefore be transmitted in source form.
A Kuang-like inference engine should form part of the system, as this can determine the effect of deficiency combinations. Again, the engine must be distributed in source code form, in a way that identifies the author and shows that it has not been tampered with.
The copy of the software and rulesets on the end-user's system must be able to verify its own `intactness' before it is used each time. This prevents attackers from exploiting existing system holes, and modifying the system to prevent detection of the deficiency. Therefore, sections of the system should be designed to be rarely modified, and to be placed on read-only media.
The system and the rules should be implemented in a machine- and system-independent language which has access to each system's APIs.
The system should be able to obtain new rules from various rule sources, verify their author and integrity, integrate them into the local system, apply them, and report new system deficiencies to the system administrator in one operation.
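The `intactness' requirement in the list above can be illustrated with a simple digest comparison. This is only a sketch under stated assumptions: the routine name verify_files and the manifest layout are hypothetical, the Digest::SHA module from the standard Perl library is assumed to be available, and a plain digest check is weaker than the digital signatures the criteria call for, since it says nothing about who produced the files.

```perl
use strict;
use warnings;
use Digest::SHA;

# Hypothetical manifest: file name => expected SHA-256 hex digest.
# For the check to mean anything, the manifest itself would have to
# live somewhere an attacker cannot modify, e.g. read-only media.
sub verify_files {
    my ($manifest) = @_;
    my @bad;
    for my $file (sort keys %$manifest) {
        # addfile() croaks if the file cannot be read; treat that as a failure
        my $actual = eval { Digest::SHA->new(256)->addfile($file)->hexdigest };
        push @bad, $file
            unless defined $actual && $actual eq $manifest->{$file};
    }
    return @bad;    # an empty list means every file matched
}
```

A front-end could call such a routine before loading anything else and refuse to continue if the returned list is non-empty.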

KuangPlus is a tool which has been designed to meet the criteria listed above. It was originally specified as the topic for a Masters' project. Warren Toomey constructed the initial design for the tool in early 1999. This was passed to Jeff Howard, who improved the design significantly and implemented the prototype of KuangPlus at the end of 1999.

Design of KuangPlus

Before the authors sat down and constructed KuangPlus, we drew up a set of design guidelines that would ensure we met the criteria outlined above.

Flexibility:

Separate the tool from the database of vulnerabilities. In this way the tool will be flexible and will be able to detect new vulnerabilities as soon as they are added to the database.

Generality:

Implement the tool in such a way that it is independent of the platform and operating system it is running on.

Maintenance:

Have a central core of static (unchanging) code with dynamic rules representing the ``database'' of vulnerabilities loaded at runtime. The central core will not require maintenance as new vulnerabilities are discovered.

Trust:

The tool should be simple and obvious in its design and secure in its operation.

The source for the tool and the rules must be in an easy to read format that gives the system administrator the confidence to use them.

Downloaded rules should only be executed if their author can be determined with certainty, and if the sysadmin permits rules from that author to be executed.

Completeness:

Use an inference engine to reveal complex vulnerabilities as well as simple ones. This is a direct development of the Kuang approach, where a backward chaining, goal based, breadth first inference engine was used.

Ease of use:

Have a well described language for ``rules'' and have clear instructions for creating them, in order to make it easy for vendors and other interested parties to generate rules.

The tool should generate useful information suitable to a wide range of system administrators, from the novice to the experienced.

Choice of Implementation Language

Perl was chosen for the language to implement KuangPlus for several reasons. It is commonly available on a broad range of systems, and has a rich standard library which will obviate the need to ``re-code'' the interface between the tool and each specific platform.

KuangPlus is going to run in an environment where trust is very important. Perl executes interpreted code: this will allow the system administrator to read the code of KuangPlus' core and any downloaded rulesets. Perl also provides features such as `taint' mode and `use strict', which help to prevent the evaluation of information from an untrusted source. In version 5.004 of Perl, the `Safe' module was further refined, and this has been used in the prototype to prevent the rules interfering with either the KuangPlus core or the environment in which the tool is running.

Some Terminology

Before we look at the overall structure for KuangPlus, we need to introduce some terminology to describe what is downloaded on the fly from vendors, and what is used by the KuangPlus core inference engine to determine the existence of security vulnerabilities.

A maxim is a small piece of Perl code which is written by a vendor, security organisation, or security interest group to detect a security vulnerability on a system. The maxim will be digitally signed by its author and when downloaded, will be run within a safe `sandbox' environment within KuangPlus.

If the maxim detects a system problem during execution, it will produce one or more rules which describe the problem. Each rule has an initial state, an end state, and an operation which will allow the transition from one state to another.

For example, imagine that a junior sysadmin on a Unix system has left their home .cshrc file world-writable. Insertion of csh commands into this file would allow any user to masquerade as this sysadmin. A maxim written to detect this vulnerability might produce these rules.

Initial State	Operation	Final State
Any user	Write fred's `.cshrc`	User fred
Any user	Write fred's `.cshrc`	Group operator
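One way to picture these rules is as small records with three fields. The sketch below is illustrative only: the field names from, op and to and the routine describe_rule are assumptions made for this example, and the prototype stores its rules in a more compact encoded form.

```perl
use strict;
use warnings;

# Each rule records a starting state, the operation performed,
# and the state reached as a result.
my @rules = (
    { from => 'Any user', op => "Write fred's .cshrc", to => 'User fred' },
    { from => 'Any user', op => "Write fred's .cshrc", to => 'Group operator' },
);

# Render one rule as a readable transition.
sub describe_rule {
    my ($rule) = @_;
    return "$rule->{from} --[$rule->{op}]--> $rule->{to}";
}

print describe_rule($_), "\n" for @rules;
```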

When all of the maxims available to KuangPlus have been executed and generated a set of rules, the inference engine in KuangPlus will attempt to chain them together to create one or more plans. A plan is a single instance of a chain of rules which allows progression from a `known' state to a `goal' state. For example, the desired plan might be Unknown external user -> Root/Administrator.

KuangPlus Structure

KuangPlus will be composed of three modules (refer to Figure 1). The first module will provide the `front end' to the tool. It will provide the user interface, handle the loading of maxims and will build a search space of rules. The second module will contain the `inference engine', which will be invoked with a reference to the search space of rules and will return any successful plans (i.e. exploits) found. The third module is suggested by the use of the Perl `Safe' module and will encompass any routines which should be available to the rules as they are evaluated within the `Safe' compartment.

Use of the Perl `Safe' Module

The existence of the `Safe' module was a minor revelation during the development of the prototype. It was stumbled across when one author was perusing the security related items discussed in the ``Programming Perl'' book (Wall, Christiansen & Schwartz, 1996).

There are three implicit properties which the process of loading and evaluating maxims should exhibit:

1. The KuangPlus main logic should be guaranteed to be isolated from any interference caused by the evaluation of the maxims.
2. When each maxim is evaluated, there should be no residual effect on the evaluation environment from a previously evaluated maxim. Put the other way round, a maxim that has been evaluated should not be able to leave behind anything that might affect the evaluation of later maxims.
3. In the process of evaluating a maxim, there should be no unplanned interaction with the system on which it is being evaluated.

The use of the `Safe' module, together with a set of routines which allow and control interaction with the system, satisfies the above three requirements. The Safe module is part of the standard Perl library in version 5.004 of Perl. The Safe module enables Perl code to be evaluated in a restricted environment where the only variables and routines which it can `see' are those explicitly `shared' into its environment. This should satisfy properties 1 and 3 presented above. By loading each piece of code into a new `compartment', the possibility of rules interfering with each other should be eliminated, which will satisfy property 2 above. Similar sandbox execution environments exist in other languages such as Java and SafeTcl.
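A minimal sketch of this arrangement is given below, assuming the standard Safe module. The names run_maxim and %new_rules, and the simplified pdebug routine, are illustrative and are not taken from the prototype's source. Each call creates a fresh compartment and shares only one hash and one subroutine into it.

```perl
use strict;
use warnings;
use Safe;

# Only package variables (not lexicals) can be shared into a compartment.
our %new_rules;

# Hypothetical helper made visible to maxims.
sub pdebug { my ($msg) = @_; return "debug: $msg" }

# Evaluate one maxim, given as a string of Perl code, in its own compartment.
sub run_maxim {
    my ($code) = @_;
    my $cpt = Safe->new;                     # fresh compartment per maxim
    $cpt->share('%new_rules', '&pdebug');    # the only names the maxim can see
    $cpt->reval($code);
    my $err = $@;                            # empty string on success
    return $err;
}

my $err = run_maxim('$new_rules{"u -1 v Linux-2.0.34"} = "CIAC_J035";');
print $err ? "maxim rejected: $err" : "rules collected: " . keys(%new_rules) . "\n";
```

Because the hash is shared rather than copied, anything the maxim stores in it is visible to the caller once the evaluation has finished.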

Whilst the use of the `Safe' module should give the users of the tool confidence that the operation of KuangPlus is reliable, there are some things which it can't protect against which are worth noting. The potential for code to consume the CPU or memory of the host system is identified as a means by which clumsy or malicious rules could prevent the calling script from ever finishing. There are also complex issues surrounding the possibility of disclosure of environment variables and side effects that might occur if the compartment is able to access variables within the namespace of routines `shared' into it. Whilst the consumption of memory and CPU is beyond our control at this stage, we have modified the design of our prototype so as to avoid these other situations.

Implementation of the KuangPlus Prototype

There were a number of time and other constraints placed on the implementation of KuangPlus by Jeff Howard's Masters' project. The project was therefore limited to the development of a working prototype which would prove the KuangPlus concept. One notable omission forced by these constraints was that of digital signatures for maxims. Despite these constraints, Jeff produced a working system that can be easily extended to become a final version of KuangPlus.

The design of the KuangPlus prototype is shown in Figure 2 below. It is composed of four logical units: the front-end, the inference engine, the `safe' operations and the maxims themselves. Ignoring the maxims, the prototype consists of 620 lines of well-commented Perl code.

The front-end is the program which the user invokes. It handles resolving the command line options; setting up the run-time environment as required including initialising variables and loading additional modules; evaluating or rejecting each of the maxims; invoking the inference engine; and handling the results in some meaningful way. Of course, KuangPlus can be invoked automatically at a set time without any manual involvement.

The inference engine is of the ``backwards chaining, goal based, breadth first'' type. In simple terms this means that, based on a nominated `goal', the logic will look through the search space of rules to see which rules can be combined (`chained') to achieve the goal given some initial condition. The logic is such that it starts with the goal (hence the descriptive terms `backwards' and `goal based'), and attempts to find a non-empty chain of rules which will achieve the goal. Having found a non-empty set of matching rules, these rules will then act as temporary goals, for which the search space will be re-examined to see which rules will allow these new goals to be achieved (hence the application of the term `breadth first'). The process repeats until a rule is found which represents an initial condition or there are no matching rules. At this point the work of the inference engine is complete.
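The search just described can be sketched in a few lines of Perl. This is an illustration of backward-chaining, breadth-first search in general, not the prototype's engine: the routine find_plan, the list-of-triples rule layout and the state names are all assumptions made for the example, and it stops at the first plan found rather than collecting every plan.

```perl
use strict;
use warnings;

# Rules as [initial state, operation, final state]; illustrative values only.
my @rules = (
    [ 'user any',       'overwrite /home/staff/fred/.cshrc', 'group operator' ],
    [ 'group operator', 'overwrite /etc/group',              'group wheel'    ],
    [ 'group wheel',    'overwrite /etc/passwd',             'user root'      ],
);

# Breadth-first, backward-chaining search from $goal towards $start.
# Returns the first (shortest) chain of rules found, or an empty list.
sub find_plan {
    my ($rules, $start, $goal) = @_;
    my @queue = ( [ $goal, [] ] );    # [state still to be reached, rules so far]
    my %seen  = ( $goal => 1 );
    while (my $item = shift @queue) {
        my ($state, $chain) = @$item;
        # Every rule whose final state matches becomes a candidate step.
        for my $rule (grep { $_->[2] eq $state } @$rules) {
            my @plan = ($rule, @$chain);        # new rule precedes the chain
            return @plan if $rule->[0] eq $start;
            next if $seen{ $rule->[0] }++;      # expand each sub-goal once
            push @queue, [ $rule->[0], \@plan ];
        }
    }
    return;
}

my @plan = find_plan(\@rules, 'user any', 'user root');
print join(' -> ', map { $_->[1] } @plan), "\n";
```

The queue holds temporary goals in the order they were discovered, which is what makes the search breadth first; the %seen table keeps a cycle of rules from being expanded for ever.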

The general properties of the `Safe' module have been discussed already. For the purpose of the prototype, the default set of permitted operations in Perl has been used. This decision was based on a need to develop a working prototype, and on the knowledge that the methods available in the `Safe' module and the associated `Opcode' module will allow this decision to be easily reviewed and further restrictions added as necessary.

An example of the error message which is generated at runtime by a maxim trying to access an operation to which it has not been given access is shown below. In this example, the stat operation was not available to the maxim:

    Use of uninitialized value at ./safe.pl line 31.

    $KuangPlus::rtn = "" (stat trapped by operation mask at
                /home/Kuangplus/unsafe.pl line 16.)

If a given maxim attempts to access an operation which is not available to it, the attempt will be detected at runtime, an error similar to that shown above will be written to the screen and the maxim in question will not be evaluated.
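The behaviour can be reproduced with a few lines of Perl, assuming the standard Safe module and its default operator mask, which does not permit stat. The routine name try_in_compartment is hypothetical, and the exact wording and location details of the error text vary between Perl versions.

```perl
use strict;
use warnings;
use Safe;

# Evaluate a code string in a default compartment and report what happened.
sub try_in_compartment {
    my ($code) = @_;
    my $cpt = Safe->new;    # default operator mask
    $cpt->reval($code);
    my $err = $@;
    return $err ? "rejected: $err" : "ok";
}

print try_in_compartment('2 * 21'), "\n";
print try_in_compartment('my @info = stat("/etc/passwd");'), "\n";
```

The second call is refused when the code is compiled, so none of the offending maxim is executed.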

Interface Between Front-End and Maxims

Each maxim is loaded from a file and evaluated in its own sandbox created by the `Safe' module. The interaction between the maxim and the environment in which it is evaluated is restricted to two types: the creation of rules in a specific associative array which is `shared' into the compartment, and the use of a set of subroutines also `shared' into the compartment.

In the KuangPlus prototype, the subroutines available to the maxims are caching emulations of normal Perl system calls: stat(), uname(), getpwent(), getpwnam() and getpwuid(). These give a maxim the ability to derive account and system specific information. In the full-blown KuangPlus, many other safe routines will be added to the sandbox.

Because the subroutines have the same name as the operating system equivalents, maxims can be tested outside of the `Safe' environment. The emulated subroutines also cache information: system information such as user-ids need only be obtained once, and will then be served to maxims from the cache. The biggest advantage though is that maxims must use these routines and so there is a tight control over what information about the system is available to them and how they can get at it.
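The caching idea can be sketched as a generic wrapper around a lookup routine. The names make_cached and $cached_getpwuid are hypothetical and the prototype's own emulations may be organised quite differently; the sketch also assumes a Unix-like system on which getpwuid() is implemented, and it always returns results in list form.

```perl
use strict;
use warnings;

# Wrap a lookup routine so each distinct argument list is fetched only once.
sub make_cached {
    my ($lookup) = @_;
    my %cache;
    return sub {
        my $key = join "\0", @_;
        $cache{$key} = [ $lookup->(@_) ] unless exists $cache{$key};
        return @{ $cache{$key} };
    };
}

# Cached stand-in for getpwuid(); $calls counts the real system lookups.
my $calls = 0;
my $cached_getpwuid = make_cached(sub { $calls++; return getpwuid($_[0]) });

my @first  = $cached_getpwuid->($<);    # fetched from the system
my @second = $cached_getpwuid->($<);    # served from the cache
print "system lookups: $calls\n";
```

Sharing only wrappers of this kind into the compartment is what gives the front-end its control over which system information a maxim can reach.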

Interface Between Front-End and Inference Engine

The inference engine is passed a reference to an associative array which contains the accumulated rules from the evaluation of the various maxims. The inference engine returns to the front-end an array of successful exploits, if any were found, in the form of plans. If no exploits were discovered, then a message is printed by the front-end stating as much. If the return value is non-empty, then a subroutine within the front-end is invoked which will cause each exploit to be printed as a chain of states and a description of how an intruder would transition from one state to the next.

Syntax of Rules

The generated rules that are produced when a security deficiency is found must be able to express that deficiency. At present, the KuangPlus prototype has borrowed many of the details from the original Kuang tool; we expect that other states and transitions will be required to represent more sophisticated system security deficiencies.

To review: a rule describes a security deficiency, and has an initial state, a final state, and a method of transitioning from the initial to the final state. The state types available in the prototype are:

user-id: A particular numeric user-id on a Unix system
group-id: A particular numeric group-id on a Unix system
pathname: A full pathname for a file on a Unix system
version: The version details for a particular piece of software

The transition operators in the prototype are:

``can obtain user-id''
``can obtain group-id''
``can overwrite''
``can replace''
``is version''

The KuangPlus prototype encodes both state information and the transition operators in order to reduce the storage size of each rule, and to make rule chaining more efficient.

Some example rules (in initial state, operator, final state format) are given below. Symbolic user/group identifiers have been substituted for numeric ones.

1. User-id any, is version Sendmail 5.64, User-id root
2. Group-id wheel, can overwrite /etc/passwd, User-id root
3. Group-id operator, can overwrite /etc/group, Group-id wheel
4. User-id any, can overwrite /home/staff/fred/.cshrc, User-id fred
5. User-id any, can overwrite /home/staff/fred/.cshrc, Group-id operator

Having knowledge of the above rules, any user on the system can gain root privileges via two routes. Because Sendmail 5.64 is installed, a user can exploit bugs in this service to become root immediately.

Alternatively, a user could chain rules 5, 3 and 2 together as follows: overwrite /home/staff/fred/.cshrc to obtain operator group permissions, overwrite /etc/group to obtain group wheel permissions, then finally overwrite /etc/passwd to obtain root privileges.

An Example Maxim

The following is an example of a simple KuangPlus maxim to be executed in the `Safe' environment. The maxim detects a security problem if the running Linux kernel is too old.

package Linux;

main::pdebug "Loading Linux rule now ...", 6.1;

# CIAC_J035 - 
# ESB-1999.039 -- CIAC Bulletin J-035
# Linux Blind TCP Spoofing
# 22 March 1999
sub CIAC_J035 {
        my ($description) = "CIAC_J035";
        
        # The uname array has something like this in it:
        # Linux (none) 2.0.34 #1 Fri May 8 16:05:57 EDT 1998 i486 unknown
        my ($version) = (main::uname())[2];
        my @frag = split(/\./, $version);
        # Test the major, minor and patch numbers in order of significance
        if (($frag[0] < 2) ||
            (($frag[0] == 2) && ($frag[1] == 0) && ($frag[2] <= 36))) {
                # This rule is triggered if the version of the 
                # Linux kernel running on this machine is less
                # than a known level. Note that we record the
                # string "Linux-$version" using the kernel
                # version of this system so that the "known_facts" 
                # will match it when the "rule_engine" chews over all 
                # the plans.
                $main::new_plans{"u -1 v Linux-$version"} = $description;
        } 
} 

my (@uname) = main::uname();
if ( "$uname[0]" =~ /Linux/ ) {
    main::pdebug "Invoking Linux rule set", 6.3;
    CIAC_J035();
}  else {
    main::pdebug "Skipping run_Linux rules", 6.3;
}

This maxim, once loaded by the KuangPlus front-end, will determine if the cached system uname information contains the word `Linux'. If so, the CIAC_J035 subroutine is invoked. This tests to see if the version of the Linux kernel is below 2.0.36. If so, the rule `u -1 v Linux-$version' is generated and exported back to the front-end. The meaning of the encoded rule is: if the system is using a Linux kernel below 2.0.36, then any external user can become any real user-id on the system.
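Maxims that test version numbers will need to compare dotted version strings field by field, with the most significant field first. A small helper for this might look as follows. The routine compare_versions is hypothetical and is not part of the prototype; it assumes Perl 5.10 or later for the // operator, and it ignores any non-numeric suffix on a field.

```perl
use strict;
use warnings;

# Compare dotted version strings numerically, field by field.
# Returns -1, 0 or 1 in the manner of the <=> operator.
sub compare_versions {
    my ($left, $right) = @_;
    # Keep only the leading digits of each field; missing digits count as 0.
    my @l = map { /^(\d+)/ ? $1 : 0 } split /\./, $left;
    my @r = map { /^(\d+)/ ? $1 : 0 } split /\./, $right;
    while (@l || @r) {
        my $cmp = (shift(@l) // 0) <=> (shift(@r) // 0);
        return $cmp if $cmp;    # first differing field decides
    }
    return 0;
}

print "2.0.34 is older than 2.0.36\n"
    if compare_versions('2.0.34', '2.0.36') < 0;
```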

Sample Output from Prototype

When a plan (a chain of rules) has been found that leads to a desired final state (such as obtaining root privileges), KuangPlus can print out the plan in encoded format, or with textual details provided by the maxims themselves.

The following example shows a verbose-style plan, where a Linux system has been ``seeded'' with a world-writable /etc/group file and is running a 2.0.34 Linux kernel.

Success: "u 0 w /etc/passwd g 0 w /etc/group u .* v Linux-2.0.34"
The verbose breakdown follows:
        The goal is "u 0"
        Plan: "u 0 w /etc/passwd"
        (root access via. writeable /etc/passwd)
        (verbose) you can get access to userid 0 if
        (verbose) you can overwrite file /etc/passwd if
        Plan: "w /etc/passwd g 0"
        (/etc/passwd is writeable by gid 0)
        (verbose) you can overwrite file /etc/passwd if
        (verbose) you can get access to the gid 0 if
        Regex Plan: g .* w /etc/group == g 0w /etc/group
        (/etc/group can be overwritten)
        Plan: "w /etc/group u .*"
        (/etc/group is world writeable)
        (verbose) you can overwrite file /etc/group if
        (verbose) you can get access to userid .* if
        Regex Plan:u -1 v Linux-2.0.34 == u .* v Linux-2.0.34
        (CIAC_J035)
        Known: "v Linux-2.0.34"
        (From POSIX::uname() call.)

With a Linux 2.0.34 kernel and a world-writable /etc/group file, any external user can obtain root privileges on this system.
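The encoded plan strings in this output can be unpacked mechanically. The sketch below is inferred from the sample output alone and is an assumption about the encoding, not a description of the prototype's source: it treats each rule as two code/value pairs, the goal first and the precondition second, and it knows only the four codes that appear above. The routines decode_rule and verbose_rule are hypothetical, and values containing spaces are not handled.

```perl
use strict;
use warnings;

# Letter codes as they appear in the sample output (inferred, not definitive).
my %meaning = (
    u => 'you can get access to userid',
    g => 'you can get access to the gid',
    w => 'you can overwrite file',
    v => 'the system is version',
);

# Split "u 0 w /etc/passwd" into a goal pair and a precondition pair.
sub decode_rule {
    my ($encoded) = @_;
    my @f = split ' ', $encoded;
    return unless @f == 4
        && exists $meaning{ $f[0] }
        && exists $meaning{ $f[2] };
    return ( [ @f[0, 1] ], [ @f[2, 3] ] );
}

# Produce a one-line English reading of an encoded rule.
sub verbose_rule {
    my ($goal, $need) = decode_rule($_[0]) or return;
    return "$meaning{$goal->[0]} $goal->[1] if "
         . "$meaning{$need->[0]} $need->[1]";
}

print verbose_rule('u 0 w /etc/passwd'), "\n";
```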

Current Status

At present, no further work has been done on KuangPlus since the completion of Jeff Howard's Masters' project. The KuangPlus prototype that he developed is available at http://minnie.tuhs.org/Seminars/KuangPlus, along with his project report.

We expect that the prototype will form the basis of a fully-formed version of KuangPlus. One issue that is yet unresolved is the choice of an appropriate system of digital signatures to determine a maxim's author. The `Pretty Good Privacy' package developed by Phil Zimmermann was considered a likely candidate for this role. Whilst developing the prototype, the authors became aware of the `Penguin' module for Perl which seems tailor-made for this role. Penguin is described as having ``vastly simplified, superior, and innate methods of ensuring safety and security''. To date, Penguin is not part of the standard Perl distribution, but it is considered a very likely candidate for future inclusion.

In order for KuangPlus to be valuable, not only must a complete version of the tool exist, but a critical mass of vendors must supply vulnerability reports in KuangPlus maxim format. During the production of the complete KuangPlus, we will need to encourage vendors and other security interest groups to provide maxim-format reports. We will also produce a base set of KuangPlus maxims to detect many of the common and well-known Unix security vulnerabilities.

Although the prototype of KuangPlus was developed on a Linux system, one prime goal was for it to be highly portable. The prototype core works `as is' on FreeBSD systems, and during the development of the complete version, we want to ensure that the core executes on such diverse platforms as Linux, FreeBSD, Solaris, NT and MacOS. Of course, some maxims can be shared across two or more platforms, but many maxims will be specific to a single operating system.

Conclusion

A new computer security tool, KuangPlus, has been designed and prototyped. The design uses `on the fly' loading of a database of known security vulnerabilities, which is interpreted to produce a set of `rules' describing the vulnerabilities that exist on the system. The rules can then be assessed by a backward chaining, goal based, breadth first inference engine. Any vulnerabilities detected would be reported to the system administrator. The existing prototype meets most of the design goals chosen for KuangPlus.

Future development of the prototype includes developing automatic methods for authenticating the author of maxims, and promoting the tool in the security community, so as to open the prototype to scrutiny and to encourage the writing of maxims.

The development of KuangPlus along the lines presented above has the capacity to create a tool which is general enough to run on just about any computing platform. Combined with a rich and timely set of maxims, KuangPlus would enable any publicly identified vulnerability, in either the configuration or the software running on the system, to be exposed pro-actively by the system administrator.

Warren Toomey
2000-05-25