Terry Brugger's Graduate School Work

I'm now officially a Doctor of Philosophy (2009) in Computer Science from the University of California at Davis. I started my coursework in the fall of 2000, was officially admitted to the program in the fall of 2001, and finished my coursework in the spring of 2004.

The way the PhD program works in CS at UC Davis is you do your coursework, hopefully pulling off A's for the courses in the four core areas (Theory, Systems, Applications, and Architecture), which thankfully I did, because the alternative is to take the preliminary exam for each area (and I don't test well). You then prepare some preliminary research and a dissertation proposal into a paper, which you present to your qualifying committee, who will grill you over it. And I mean grill you (more on that in a moment). Eventually, they pass you, you do the research you said you would in the proposal, put it together into a dissertation, and have your dissertation committee sign off on it, pass go (I originally said that to be funny, but it turns out that the steps between having your committee sign off and finishing are nontrivial) and collect your PhD.

Notice that I didn't say anything about a dissertation defense. That's right -- no dissertation defense. Instead, you do your defense up front, in the form of your qualifying examination. The qualifying examination is as grueling as, if not more grueling than, most dissertation defenses. I think this is great, because it keeps students from traveling down a long path on a dissertation only to have their committee tell them that their research is fundamentally flawed. What's more, it keeps committees from rejecting the research just because they don't like the findings; for example, discovering that an idea is a bad approach to a problem is a valid (and rather common) finding for research, but some committees don't like that and send students back to the drawing board. The approach at Davis ensures that students don't waste time with bad approaches to research.

Shortly after starting grad school, I knew that I wanted to research data mining approaches to network intrusion detection. This was prompted by a project at work where they came to me and said, "We've got all this connection log data from a firewall -- we want a tool that will look at it and flag the suspicious connections." I figured that there were probably some off-the-shelf tools we could grab and, with a little integration work, kick this project out. Well, there weren't any. Okay, so I figured that some academics had probably solved the problem, and we just needed to build a system that implemented their techniques. Indeed, there was a little research, but it was obvious that it was far from a solved problem. In fact, at the time (late 2000), it was a very hot research area. Dissertation city, here I come!

After spending a few years surveying the field, I developed a presentation

which I gave as a seminar for the UC Davis Seclab and for our College Cyber Defenders (CCD) program at LLNL. That presentation was a dry run for my Qualifying Examination. The presentation is based on my original dissertation proposal. The feedback from my committee was that the survey was excellent; however, they had some reservations about the proposal itself. Based on this, I extracted the survey portion of the paper:

Now then, having actually survived the Qualifying Examination proper, allow me to offer some advice to other UC Davis CS Qualifying Exam takers.

So, I got through the exam with a "Conditional Pass". Instead of waiting for the chair of the committee to get back to me with the changes they wanted before continuing, I forged ahead. The first step was to baseline against Snort with the DARPA IDS Eval data. What I found was that the data was very good at modeling attacks that signature-based IDSs, such as Snort, wouldn't easily detect; however, there was no basis for the background traffic generated in the dataset. In other words, the data could tell you if your IDS had a good true-positive rate for non-signature-detectable attacks; however, it was useless for evaluating the false-positive rate of the system. These results were written up as

which I submitted to a number of conferences, where it was consistently rejected because reviewers felt the results were nothing new -- it's been commonly accepted in the network intrusion detection community for years now that the DARPA IDEval data is flawed. I think the research was interesting in that it showed that the dataset did have at least one redeeming quality. After it became clear that no one wanted to publish the paper, I released it as a UC Davis CS Department Tech Report. To give credit where credit is due, Jedidiah, one of my College Cyber Defenders students, put together the actual scripts for the assessment -- quite impressive for someone just out of high school. Jed's now in the CS program at UC Berkeley.
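For anyone curious what that assessment boils down to in practice, here's a minimal sketch of the scoring step, not the actual scripts Jed wrote. It assumes hypothetical CSV files of the dataset's labeled attacks and of parsed Snort alerts, keyed the same way; the file names and keying scheme are illustrative.

    # A minimal sketch of the scoring step, assuming hypothetical CSV files:
    # one listing the dataset's labeled attacks and one listing parsed Snort
    # alerts, both keyed on (src_ip, dst_ip, timestamp). These file names and
    # the keying scheme are illustrative, not the actual assessment scripts.
    import csv

    def load_keys(path):
        """Load (src_ip, dst_ip, timestamp) keys from a CSV with those columns."""
        with open(path, newline="") as f:
            return {(row["src_ip"], row["dst_ip"], row["timestamp"])
                    for row in csv.DictReader(f)}

    truth = load_keys("ideval_attacks.csv")   # labeled attacks in the dataset
    alerts = load_keys("snort_alerts.csv")    # connections Snort alerted on

    detected = truth & alerts
    tpr = len(detected) / len(truth) if truth else 0.0
    print(f"True-positive rate against labeled attacks: {tpr:.2%}")

    # Note: alerts that don't match a labeled attack are NOT necessarily false
    # positives, because the synthetic background traffic has no basis in real
    # traffic -- which is exactly why the data can't measure a false-positive
    # rate.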

Despite these problems with the DARPA dataset, I still see it widely used in KDD research. As a result, I wrote the following:

Some people have written me and asked if there are any other datasets for doing data mining for network intrusion detection (not that any are known to the wider network security community), and what they can do if they want to apply data mining methods to intrusion detection. Here's how I responded to one person:
  1. Don't use network intrusion as an application area. Yes, it's harsh, but it's the most honest answer I have.
  2. Mix in real network data as Mahoney & Chan did; the problem is, that only corrects some of the known flaws in the data. Personally, I wouldn't put much stock in any results from this approach.
  3. Grab some real network data and run it through a signature-based NIDS like Snort or Bro (or both) to identify known attacks. Treat anything they did not alert on as unknown (as opposed to assuming it's normal). The real data mining task becomes malicious use detection generalization (as opposed to either strict signature matching or anomaly detection): by training on the known attacks in the training portion of the data, can the method identify variants or different types of attacks in the test data that were not present in the training portion (in addition to the attacks that were present in both)? For such a task, you couldn't reliably report the false-positive rate, only the true-positive rate (since an apparent false positive may actually be an attack Snort didn't detect -- again, that's why the non-attack connections are unknown, not normal). See the sketch after this list for what this looks like in practice.
  4. Grab some real network data and start labeling. The problem is that oftentimes we can't infer the intent of a given connection. One option here would be to score each connection by how suspicious it is, say from 1 (you're sure it's benign) to 10 (known network exploit). At this point, you can apply data mining methods to see how they compare to the human analyst's assessment. The results of the data mining may also cause you to reconsider some of the labels. Making such a labeled dataset available to the data mining & network intrusion communities (after appropriate anonymization) would be a huge boon to both.
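Here's the sketch promised in option 3: a rough illustration of the generalization task, assuming per-connection feature vectors have already been extracted and any connection that Snort or Bro alerted on has been labeled a known attack. The random stand-in data, the feature count, and the choice of a random forest are all assumptions for the example, not a recommendation.

    # A rough illustration of the generalization task from option 3, assuming
    # per-connection feature vectors have already been extracted and any
    # connection that Snort or Bro alerted on has been labeled a known attack.
    # The stand-in data, feature count, and classifier are assumptions.
    import numpy as np
    from sklearn.ensemble import RandomForestClassifier

    rng = np.random.default_rng(0)
    X = rng.random((1000, 8))                # stand-in for real connection features
    snort_alerted = rng.random(1000) < 0.05  # stand-in for signature-based labels

    # Chronological split: train on the first half, test on the second.
    mid = len(X) // 2
    X_train, X_test = X[:mid], X[mid:]
    y_train, y_test = snort_alerted[:mid], snort_alerted[mid:]

    clf = RandomForestClassifier(n_estimators=100, random_state=0)
    clf.fit(X_train, y_train)
    pred = clf.predict(X_test)

    # Only the true-positive rate is meaningful: connections the NIDS did not
    # alert on are "unknown", not "normal", so apparent false positives may in
    # fact be attacks the signature matcher missed.
    tpr = (pred & y_test).sum() / y_test.sum() if y_test.sum() else 0.0
    print(f"Flagged {tpr:.2%} of the signature-labeled attacks in the test half")

With real data you'd swap in actual connection features and split chronologically on the capture, and the interesting question becomes whether any of the test-half detections are attack variants that never appeared in the training half.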

By the time I finished the DARPA data assessment, I still didn't have the requirements to pass the qualifying examination. I did, though, understand why my committee had serious reservations about my proposal: I couldn't effectively test my data mining methods for network intrusion detection without data. Now, you have to understand that one cannot use real network data to test IDSs, because one does not know the intent behind every connection. While many connections may appear obvious, there are still many that could be either malicious attacks or benign misconfigurations. I think the chair of my committee, who is not only notorious for being overworked and nonresponsive, but is also one of the nicest gentlemen you'll ever meet, just didn't have it in his heart to tell me that there was no way my proposal would ever work, for lack of good test data.

So, the network intrusion detection community needs a better test dataset. Well, the Lincoln Labs DARPA project to create the IDEval dataset generated numerous PhD and MS theses, so surely this was a dissertation-worthy endeavor. A month or so down that road and I said, "Okay, how do I validate that the traffic I'm generating looks like real network data?" This was, after all, the central problem with the DARPA IDEval dataset.

Nothing.

There were no established methods to do this. Long story short (I know, too late): a new dissertation proposal:

which includes my assessment of the DARPA data using Snort to establish the need for such a methodology. By the time I got that proposal done, the chair of my committee was serving as an area manager or some such for NSF. So I volunteered for a project at work that needed to send people to DC, whereupon I kicked off a job and headed over to NSF headquarters. Allow me to note that security at NSF is as good as at any other government facility I've visited, so I wasn't able to just go camp outside my chair's door. Calling from the lobby did actually get me a response, though, so I was able to set up an appointment for the following day, where I presented my new proposal. After a couple more weeks for him to read it and take care of the paperwork, he finally signed off on it and I advanced to candidacy. I'm pretty sure that two years between the qual exam and advancing to candidacy is a record.
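To make the validation question from a few paragraphs back a bit more concrete: at its simplest, "does my synthetic traffic look like real traffic?" means asking whether some connection metric is distributed the same way in a real trace and in the generated one. Here's a minimal sketch of that comparison, using made-up duration samples and a two-sample Kolmogorov-Smirnov test purely as an illustration; it is not the methodology developed in the dissertation.

    # A minimal sketch, assuming made-up duration samples: compare one
    # connection metric between a "real" and a "synthetic" trace with a
    # two-sample Kolmogorov-Smirnov test. An illustration of the kind of
    # question involved, not the dissertation's validation methodology.
    import numpy as np
    from scipy.stats import ks_2samp

    rng = np.random.default_rng(1)
    real_durations = rng.lognormal(mean=0.0, sigma=1.5, size=5000)       # stand-in
    synthetic_durations = rng.lognormal(mean=0.3, sigma=1.0, size=5000)  # stand-in

    stat, p_value = ks_2samp(real_durations, synthetic_durations)
    print(f"KS statistic: {stat:.3f}, p-value: {p_value:.3g}")
    if p_value < 0.01:
        print("The synthetic durations are distinguishable from the real ones")
    else:
        print("No evidence the duration distributions differ")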

Three years of pounding hard on this research, and I'm done! Here's my recommended version:

It turns out that to file the dissertation electronically, it must be under 100 MiB, so I had to convert all of my high-resolution vector graphics to bitmaps (because I really didn't want to print 722 pages). Here's the high-resolution version:

And if, for some reason, you want to see the official version -- which has more of the graphs downsampled into bitmaps -- here it is:

I've frequently found myself defending the need for such a work. Indeed, before I started down this path, I figured it was a solved problem. In the hopes of raising awareness of the need for research in this area (and, let's face it, to hopefully get some funding), I've put this together:

I've also discovered that many connection metrics I've seen used over the years are somewhat ambiguously defined. I'm hoping that the networking community can agree on more concrete definitions for these metrics, and in order to spur such work, I've put together:

And to go with it, a set of proposed definitions for connection metrics:

As noted, the three papers above are drafts, and I welcome any and all feedback on them. I already tried publishing the last one as an RFC, however it was rejected -- apparently engaging the IETF working groups (WGs) or Area Directors is more than just "recommended". So besides a Technical Report (it's really much too dry to be a conference or journal paper), I might try the RFC route again if I can engage the proper people.
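As a small illustration of why these metrics need pinning down, here's a sketch over a made-up six-packet connection showing how two perfectly reasonable readings of "duration" and "bytes transferred" give different numbers. The particular variants are assumptions for the example, not the definitions proposed in the drafts above.

    # A small illustration of the ambiguity, using a made-up six-packet
    # connection of (time, direction, payload_bytes, bytes_on_wire) tuples.
    # The two readings of "duration" and "bytes transferred" below are
    # assumptions for the example, not the definitions proposed in the drafts.
    packets = [
        (0.000, "c2s", 0,   74),   # SYN
        (0.012, "s2c", 0,   74),   # SYN/ACK
        (0.013, "c2s", 120, 186),  # request
        (0.051, "s2c", 900, 966),  # response
        (0.052, "c2s", 0,   66),   # FIN
        (0.063, "s2c", 0,   66),   # FIN/ACK
    ]

    # Is "duration" first packet to last packet, or only the span that carried payload?
    duration_full = packets[-1][0] - packets[0][0]
    data_times = [t for t, _, payload, _ in packets if payload > 0]
    duration_payload = max(data_times) - min(data_times)

    # Are "bytes transferred" payload bytes or bytes on the wire (headers included)?
    payload_bytes = sum(p for _, _, p, _ in packets)
    wire_bytes = sum(b for _, _, _, b in packets)

    print(f"duration: {duration_full:.3f}s (full) vs {duration_payload:.3f}s (payload only)")
    print(f"bytes:    {payload_bytes} (payload) vs {wire_bytes} (on the wire)")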


"Zow" Terry Brugger
Last modified: Sun Jan 7 2007