Dynamic Bayesian Networks:
Representation, Inference and Learning
by
Kevin Patrick Murphy
B.A. Hon. (Cambridge University) 1992
M.S. (University of Pennsylvania) 1994
A dissertation submitted in partial satisfaction
of the requirements for the degree of
Doctor of Philosophy
in
Computer Science
in the
GRADUATE DIVISION
of the
UNIVERSITY OF CALIFORNIA, BERKELEY
Committee in charge:
Professor Stuart Russell, Chair
Professor Michael Jordan
Professor Peter Bickel
Professor Jeffrey Bilmes
Fall 2002
Dynamic Bayesian Networks:
Representation, Inference and Learning
Copyright 2002
by
Kevin Patrick Murphy
ABSTRACT
Dynamic Bayesian Networks:
Representation, Inference and Learning
by
Kevin Patrick Murphy
Doctor of Philosophy in Computer Science
University of California, Berkeley
Professor Stuart Russell, Chair
Modelling sequential data is important in many areas of science and engineering. Hidden Markov models
(HMMs) and Kalman filter models (KFMs) are popular for this because they are simple and flexible. For
example, HMMs have been used for speech recognition and bio-sequence analysis, and KFMs have been
used for problems ranging from tracking planes and missiles to predicting the economy. However, HMMs
and KFMs are limited in their “expressive power”. Dynamic Bayesian Networks (DBNs) generalize HMMs
by allowing the state space to be represented in factored form, instead of as a single discrete random variable.
DBNs generalize KFMs by allowing arbitrary probability distributions, not just (unimodal) linear-Gaussian.
In this thesis, I will discuss how to represent many different kinds of models as DBNs, how to perform exact
and approximate inference in DBNs, and how to learn DBN models from sequential data.
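To make the factored-state idea concrete, the following minimal sketch (not part of the thesis; the chain-structured factorization and function names are illustrative assumptions) compares the number of free transition parameters in a flat HMM over D binary state variables with those in a DBN where each variable depends on only a couple of variables in the previous time slice.

```python
# Illustrative sketch: transition-parameter counts for a flat HMM versus a
# factored DBN. Assumes D binary state variables; in the DBN, variable X_i[t]
# depends only on X_i[t-1] and X_{i-1}[t-1] (a simple assumed structure).

def flat_hmm_transition_params(D):
    """A flat HMM enumerates all 2^D joint states, so its transition
    matrix has 2^D * (2^D - 1) free parameters."""
    S = 2 ** D
    return S * (S - 1)

def factored_dbn_transition_params(D):
    """Each binary variable has at most two binary parents in the previous
    slice, so its CPT has at most 2^2 rows, each with 1 free parameter."""
    # The first variable has a single parent (itself at t-1); the rest have two.
    return sum(2 ** min(i + 1, 2) for i in range(D))

if __name__ == "__main__":
    for D in (2, 5, 10, 20):
        print(D, flat_hmm_transition_params(D), factored_dbn_transition_params(D))
```

Under these assumptions the flat representation grows exponentially in D while the factored one grows linearly, which is the sense in which DBNs extend the "expressive power" of HMMs.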
In particular, the main novel technical contributions of this thesis are as follows: a way of representing
Hierarchical HMMs as DBNs, which enables inference to be done in O(T) time instead of O(T^3), where
T is the length of the sequence; an exact smoothing algorithm that takes O(log T) space instead of O(T);
a simple way of using the junction tree algorithm for online inference in DBNs; new complexity bounds
on exact online inference in DBNs; a new deterministic approximate inference algorithm called factored
frontier; an analysis of the relationship between the BK algorithm and loopy belief propagation; a way of
applying Rao-Blackwellised particle filtering to DBNs in general, and the SLAM (simultaneous localization
and mapping) problem in particular; a way of extending the structural EM algorithm to DBNs; and a variety
of different applications of DBNs. However, perhaps the main value of the thesis is its catholic presentation
of the field of sequential data modelling.
ACKNOWLEDGMENTS
I would like to thank my advisor, Stuart Russell, for supporting me over the years, and for giving me so
much freedom to explore and discover new areas of probabilistic AI. My other committee members have also
been very supportive. Michael Jordan has long been an inspiration to me. His classes and weekly meetings
have proved to be one of my best learning experiences at Berkeley. Jeff Bilmes proved to be a most thorough
reviewer, as I expected, and has kept me honest about all the details. Peter Bickel brought a useful outsider’s
perspective to the thesis, and encouraged me to make it more accessible to non-computer scientists (although
any failings in this regard are of course my fault).
I would like to thank my many friends and colleagues at Berkeley with whom I have had the pleasure of
working over the years. These include Eyal Amir, David Andre, Serge Belongie, Jeff Bilmes, Nancy Chang,
Nando de Freitas, Nir Friedman, Paul Horton, Srini Narayanan, Andrew Ng, Mark Paskin, Sekhar Tatikonda,
Yair Weiss, Eric Xing, Geoff Zweig, and all the members of the RUGS and IR groups.
I would like to thank Jim Rehg for hiring me as an intern at DEC/Compaq/HP Cambridge Research Lab
in 1997, where my Bayes Net Toolbox (BNT) was born. I would like to thank Gary Bradski for hiring me
as an intern at Intel in 2000 to work on BNT, and for providing me with the opportunity to work with people
spanning three countries formerly known as superpowers — USA, China and Russia. In particular, I would
like to thank Wei Hu and Yimin Zhang, of ICRC, for their help with BNT. I would also like to thank the many
people on the web who have contributed bug fixes to BNT. By chance, I was able to work with Sebastian
Thrun during part of my time with Intel, for which I am very grateful.
I would like to thank my friends in Jennie Nation and beyond for providing a welcome distraction from
school. Finally, I would like to thank my wife Margaret for putting up with my weekends in the office, for
listening to my sagas from Soda land, and for giving me the motivation to finish this thesis.
Contents
1 Introduction 1
1.1 State-space models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.1.1 Representation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.1.2 Inference . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
1.1.3 Learning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
1.2 Hidden Markov Models (HMMs) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
1.2.1 Representation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
1.2.2 Inference . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
1.2.3 Learning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
1.2.4 The problem with HMMs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
1.3 Kalman Filter Models (KFMs) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
1.3.1 Representation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
1.3.2 Inference . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
1.3.3 Learning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
1.3.4 The problem with KFMs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
1.4 Overview of the rest of the thesis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
1.5 A note on software . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
1.6 Declaration of previous work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
2 DBNs: Representation 18
2.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
2.2 DBNs defined . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
2.3 Representing HMMs and their variants as DBNs . . . . . . . . . . . . . . . . . . . . . . . . 20
2.3.1 HMMs with mixture-of-Gaussians output . . . . . . . . . . . . . . . . . . . . . . . 21
2.3.2 HMMs with semi-tied mixtures . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
2.3.3 Auto-regressive HMMs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
2.3.4 Buried Markov Models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24