Skip to main content

more options


IAI@Rome Summer 2003

Current Research Topics in Systems and Information Assurance

Lecture Titles and Abstracts

Peer to Peer: The Big Picture
Speaker: Ken Birman
Slides

Peer to Peer protocols are a hot topic for the distributed computing community. Such protocols are widely believed to offer better scalability and adaptivity under stresses that can disable more conventional styles of protocols. In this talk we'll review some of the major styles of P2P computing, look at where the trends might take the field over the next few years, and ask how such techniques might be applied to Air Force and other military computing needs. We'll see that P2P systems raise serious performance and security issues, and are sometimes far less robust than one might have expected. Yet, despite these concerns, such systems also that they have the potential to be very useful in some important settings.

APSS: Asynchronous Proactive Secret Sharing
Speaker: Fred B. Schneider
Slides

A proactive secret sharing protocol enables a set of secret shares to be periodically refreshed with a new, independent set, thereby thwarting so-called mobile adversary attacks. This talk discusses APSS, a proactive secret sharing protocol for asynchronous systems. APSS resists denial of service attacks, which slow processor execution or impede message delivery and thus violate the defining assumptions of a synchronous system. An informal derivation of the APSS protocol will be given along with protocol transformations that have general applicability in the construction of secure and fault-tolerant distributed computations.

Wave scheduling--an approach to energy-efficient routing in ad-hoc sensor networks
Speaker: Alan Demers
Slides

Recently there has been considerable interest in sensor networks, deployments of many small computers called nodes, equipped with measuring devices to monitor their environment and with low power radios to report measurements to a central site for querying and analysis. Sensor nodes are battery powered, and radio communication is orders of magnitude more expensive than local computation. Thus, in-network storage and query processing can greatly reduce energy consumption and extend the lifetime of the network. In our sensor database architecture, a designated subset of the nodes, called view nodes, store sensor data for later retrieval or for in-network query processing. The communication patterns are not random, but are generated by the ongoing transmission of sensor readings to view nodes and by distributed query processing there. This talk presents wave scheduling, a technique that exploits knowledge of view node locations and communication patterns, together with topology control techniques and an application-specific MAC layer, to achieve scalable, energy-efficient communication. For a given collection of view nodes, we schedule node transmissions to avoid all interference at the MAC layer and to maximize the time during which node radios can be powered down. The schedules achieve a graceful tradeoff between energy consumption and message latency. Under reasonable assumptions about message locality, wave schedules can scale to arbitrarily large networks with constant background power per node and constant energy per message.

Ensemble Selection
Speaker: Rich Caruana
Slides

In this talk I'll present a new approach to machine learning called ensemble selection. In this approach, we generate many different models using a variety of learning procedures: SVMs, decision trees, neural nets, k-nearest neighbor, logistic regression, etc. For each of these learning methods, we try as many different settings of the control parameters as possible. From the 1000's of models generated, we use forward selection to select models to maximize the performance of an ensemble on the problem. Models in the ensemble are combined by simple averaging. Forward stepwise model selection is performed as follows:

  1. Start with the empty ensemble.
  2. Select the model with best performance on the held-out test set and add it to the ensemble.
  3. Repeatedly add to the ensemble the model which when combined with the other models already in the ensemble yields to best performance on the test set.

Adding or removing a model from the ensemble very fast; the cost of recalculating the performance criterion after each update dominates the cost of updating the predictions themselves. The performance of the new method is excellent: ensemble selection yields performance that we cannot achieve using any other learning procedures. One key advantage of ensemble selection is that we can optimize the performance of the ensemble to almost any performance metric, even though we may not know how to train the individual models to the metric. Another advantage of the method is that it is easy to make fully automatic, making it possible to achieve world-class machine learning performance without extensive expertise.

Secure Program Partitioning
Speaker: Andrew Myers
Slides

A challenging unsolved security problem is how to specify and enforce system-wide security policies; this problem is even more acute in distributed systems with mutual distrust. This talk describes a way to enforce policies for data confidentiality and integrity in such an environment. Programs annotated with security specifications are statically checked and then transformed by the compiler to run securely on a distributed system with untrusted hosts. The code and data of the computation are partitioned across the available hosts in accordance with the security specification. The key contribution is automatic replication of code and data to increase assurance of integrity---without harming confidentiality, and without placing undue trust in any host. The compiler automatically generates secure run-time protocols for communication among the replicated code partitions. We will describe results from a prototype implementation. This is joint work with Lantian Zheng, Steve Chong, and Steve Zdancewic.

Using First-Order Logic to Reason about Policies
Speaker: Joe Halpern
Slides

I show that a fragment of (multi-sorted) first-order logic can be used to represent and reason about policies. This fragment is expressive enough to express many policies of interest, while still remaining tractable. More precisely, questions about entailment, such as `May Alice access the file?', can be answered in quadratic time, as can the consistency question `Do the policies permit and forbid the same action?'. If time permits, I'll also discuss a prototype that we have built whose reasoning engine is based on the logic and whose interface is designed for non-logicians, allowing them to enter both policies and background information, such as `Alice is a student', and to ask questions about the policies. Finally, I'll discuss the extent to which our logic can capture policies written in XrML, a popular language for writing policies. This is joint work with Vicky Weissman.

On the Correctness of Java Bytecode Verification
Speaker: Dexter Kozen
Slides

The Java bytecode verification algorithm as described in the Java Virtual Machine (JVM) specification [Lindholm and Yellin 1996] is a standard worklist algorithm commonly used in dataflow analysis. It derives a least-fixpoint type annotation representing a conservative approximation to the types of values in local variables and on the operand stack at any point in the program.

The bytecode verification algorithm and its specification have come under considerable scrutiny, since the security of Java depends on it. The specification has been faulted by many authors as being ambiguous and incorrect. One major criticism raised by Abadi and Stata [1998] concerns jsr/ret instructions, which are used to implement the Java try/catch/finally construct. As the criticism goes, because jsr and ret are asymmetric---jsr pushes a return address on the stack, ret returns to a return address in a local variable---there is no guarantee that jsr subroutines exhibit LIFO (last-in-first-out) behavior; in other words, there is no guarantee that a ret instruction returns to the most recent call site. The analysis is further complicated by the polymorphism of jsr subroutines over untouched local variables. Abadi and Stata [1998] (see also O'Callahan [1999]) devise a fairly involved type system and show that it guarantees LIFO behavior. However, they avoid issue of whether the original worklist algorithm is correct.

In this talk I will describe how to interpret the JVM specification so as to resolve the minor ambiguities in the description of the treatment of jsr/ret and prove that the original algorithm as described in [Lindholm and Yellin 1996] is correct and guarantees LIFO behavior.

Type-Safe Memory Management in Cyclone
Speaker: Greg Morrisett
Slides

Cyclone is a type-safe dialect of C intended for systems applications where security and reliability are a prime concern. The type-safety guarantee ensures that a wide class of common C problems, such as buffer overruns and format string attacks, cannot happen. The most novel aspect of Cyclone is its support for programmer-controlled memory management. We support a flexible combination of garbage collection, stack allocation, lexical regions, arenas, linear pointers, and reference counted pointers. The primary technical challenge has been a smooth integration of these features while retaining a type-safety guarantee.

NUTSS: The De-facto Next Generation Internet
Speaker: Paul Francis
Slides

Recent announcements by Defense Department CIO John Stenbit call for transition to IPv6 by 2008. This aggressive push towards IPv6 entails a certain amount of risk, because the military may very likely find itself well ahead of the COTS curve, thus exposing itself to the costs and security risks of being an early adopter. At the same time, the IETF is quietly and comprehensively (and grudgingly) solving the problems associated with the use of NATs in IPv4. This work has progressed to a point where all of the goals of IPv6 are satisfied. Indeed the key components of this progression may be identified and expressed as a cohesive architecture, which we call NUTSS after its key components: NAT, URI, Tunnels, SIP, and STUN. This talk describes this emerging architecture, discusses its pros and cons, and suggests future directions.