Intelligent Agents for Complex, Uncertain Environments
In both control theory and AI, there is now a consensus that probabilistic and decision-theoretic methods provide a rigorous foundation for optimal decision making in environments with partial and uncertain sensory data and uncertain dynamics. Work in stochastic optimal control relates directly to AI work on rational agent design. Significant methodological differences have arisen from differences in assumptions about the environment: in control theory, offline design of control laws is used to address continuous-domain problems; in AI, online decision making is used to handle environments with large numbers of discrete variables. The aim of the proposed work is to broaden the relationship between AI and control-theoretic approaches by developing a solid theoretical and technological base for handling the representational and inferential complexity inherent in large, hybrid environments.Structured representations for complex environments
In rational agent design, a key concept is what might be called the belief state: the current joint probability distribution over states of the environment, conditioned on all prior observations. With incomplete and noisy sensors, optimal decisions must be computed from the current belief state. Kalman filtering, for example, is essentially a technique for maintaining belief states describable as multivariate Gaussian distributions. In the general case of hybrid domains with both discrete and continuous variables, the belief state if explicitly represented grows exponentially with the number of variables. Avoiding this exponential growth is essential.Probabilistic networks or PNs (also known as belief networks) take advantage of the locality of causal influence to produce compact, structured representations for complex environments. PNs represent the joint distribution over a collection of variables as a directed acyclic graph whose nodes represent variables and whose edges represent direct causal influences. The semantics of such a network is that each node is conditionally independent of its non-descendants given values for its parents. This allows the joint probability distribution over these random variables to be concisely represented by a conditional probability table (CPT) associated with each node. The CPT for node X specifies the probability distribution over the values of X given each combination of values for its parents. In typical domains, the size of the representation grows only linearly in the number of variables rather than exponentially.
Modeling complex stochastic processes
Dynamic probabilistic networks (DPNs) extend PNs by including multiple connected copies (called time slices) of a static PN, thereby enabling the modeling of stochastic temporal processes. The following figure shows the coarse structure of a generic DPN. The CPTs for a DPN include a state evolution model, which describes the transition probabilities between states, and a sensor model, which describes the observations that can result from a given state. Typically, one assumes stationarity of these evolution and sensor models. As a way of modeling a stochastic, partially observable process, DPNs compete directly with hidden Markov models (HMMs). DPNs allow the decomposition of the hidden state into several variables, thereby revealing additional structure in the process being modeled, Whereas a DPN represent the belief state via the assignment of values to O(n) state variables, an HMM represents the state simply as one of O(2^n) states. If the state evolution model can be described compactly in terms of the the CPTs for the state variables, DPNs provide a significant improvement over HMMs in both representation and inference.
![]()
Generic Stucture of a dynamic probabilistic network. In an actual network, there may be
many state and sensor variables in each time slice.
DPNs serve a number of purposes. They can be used for maintaining and updating the belief state over time, thereby providing the necessary information for optimal decision making. DPNs can be used to project possible future evolutions of the observed system by adding slices into the future. When decision nodes are added, they enable approximately rational decision-making with a limited horizon. We have used them for freeway surveillance [42] and for controlling an autonomous vehicle [43]. We will investigate the use of DPNs as the basic substrate for rational agent design. DPNs can also serve as the basis for powerful adaptive control methods (see Project 7).Hierarchical modeling
Recent progress in verifying large control systems has taken advantage of hierarchical modeling to reduce complexity. Similar ideas can be applied in the context of using DPNs to model stochastic dynamical systems. Just as complexity can be ``hidden'' in a control system by a layered hierarchy of controllers, we can use hierarchical layered DPNs to represent and control complex stochastic processes. We will investigate conditions under which such a hierarchical abstraction does not lose information, and analyze the tradeoff between error and computational savings in those cases where the abstracted model is only an approximation of the true model.Fully expressive languages for dynamically changing structure
Complex, dynamic environments with changing structure present serious problems for both control-theoretic and AI techniques. Standard representational methods such as state equations and attribute-based models (including PNs) assume a fixed, enumerated set of variables. This is is a major barrier to knowledge sharing and reuse, since we are forced to manually construct a model for each different system configuration. Even minor adjustments (e.g., switching from a six-lane to a four-lane freeway, or modeling the changing collection of vehicles in the field of view) may require major changes to the model structure. Currently, such problems are handled outside the modeling formalism.We propose to augment existing modeling languages (such as dynamic probabilistic networks) with some of the expressive power of first-order logic. The resulting declarative language (an extension of [44] combines sound probabilistic semantics with the ability to formulate general rules. A single knowledge base in this language models many different stochastic dynamical systems. Specific models then be generated as needed, using a process called knowledge-based model construction. We propose to use this representation language as the basic tool for dealing with situations where the domain qualitatively changes over time.
Dynamic reallocation of computational resources
In addition to dynamically adapting the structure of the model, complex environments require dynamic reallocation of computational resources to focus on the most relevant aspects. In recent work, we have started to experiment with this idea in the context of Monte Carlo sampling algorithms. There, we use the current evidence, as well as our beliefs about the state of the world at time $t$, in order to guide the sampling process for the state at time t+\Delta. The algorithm, called survival of the fittest devotes a higher fraction of the stochastic samples to areas that currently appear to be more likely. This gives us a better approximation to the belief state at time t+\Delta. The resulting real-time algorithm [45] seems (in practice) to maintain a good approximation of the belief state indefinitely, with no accumulation of errors over time.These ideas can be extended further to allow real-time modulation of hierarchical abstraction in models of stochastic systems. For example, we may need to reason accurately and at 30 Hz about the location of a vehicle that is changing lanes in our immediate vicinity, but only at 10^-3 Hz about the weather. These are instances of the general problem of metareasoning, that is, the control of reasoning processes. In [46] , a theoretical foundation and efficient algorithms are given for decision-theoretic metareasoning, showing how an intelligent agent can rationally control its own computations on the basis of relevance to objectives. We propose to deepen and extend this foundation to handle the kinds of complex representations described above.
Modeling hybrid domains
Both the representations and the algorithms described above are particularly suitable for reasoning in hybrid domains, involving both discrete and continuous variables. Current technology does not provide an adequate solution for such domains. Exact representations and solutions exist only for a very restricted class of domains, where the distribution can be modeled as a mixture of Gaussians. In other cases, continuous variables are discretized, resulting either in very large or very imprecise representations. We will develop new representations and inference algorithms to handle general hybrid domains. In hybrid networks, the purely discrete conditional probability table is replaced by a general parameterized conditional density function. We propose to use families of canonical density functions, such as noisy thresholds for discrete variables with continuous parents, to provide the necessary representation language. (Such a language can also be embedded in the the first-order representation language of[44] .) The functional specification of conditional probabilities allows us to reduce the problem of probabilistic inference in hybrid networks to that of computing an integral over the posterior probability distribution. The random sampling algorithms described above can thus be interpreted as performing stochastic integration, allowing them to be analyzed using standard techniques. The survival of the fittest algorithm, which focuses attention on those values of a node that are most likely, is particularly appropriate in this case, as it seems to focus on the dominant part of the integral.Design of Intelligent Agent Architectures
The techniques described in the preceding paragraphs support a wide variety of complex reasoning and decision-making tasks. These must be integrated into an intelligent agent architecture. The architecture must address issues of real-time decision-making, adaptation, and hierarchical decomposition. Real-time control is handled by metareasoning (see above) and by the integration of multiple execution architectures ranging from compiled control laws to online planning. Adaptation is discussed in Project 7. Hierarchical decomposition has two key aspects: choice of decomposition and allocation of computational resources within the hierarchy. We have derived preliminary results showing that under certain conditions, an agent can be decomposed into distributed components, each of which handles a portion of the overall sensing and control problem. This can be done without compromising global optimality, the extent of the allowable decomposition depending on the structure of the domain and the multiattribute structure of the utility function. Thus, we believe that the process of designing optimal agent decompositions can be at least partially automated. The problem of resource allocation among the components is addressed in [47], which shows that optimal allocation can be achieved in linear time for tree-structured decompositions.