CS 281r
Planning: without and with Uncertainty
Harvard University
Fall 2006
MW 1 - 2:30, 60 Oxford Street, 330
Teaching Staff
- Professor: Avi Pfeffer
- Office: Maxwell Dworkin 251
- Email: avi@eecs.harvard.edu
- Office hours: Friday 1-2:30
- TF: Chih-han Yu
- Office: Maxwell Dworkin 238
- Email: chihanyu@gmail.com
- Office Hours: TBA
The class web page is at
http://www.eecs.harvard.edu/~avi/CS281r/F06
You may email the teaching staff at
cs281r@eecs.harvard.edu
Course Description
Planning is one of the central challenges of artificial intelligence.
The task is to get from an initial condition to a goal condition via a
sequence of actions. In this course, we will explore a variety of
approaches to planning. We will begin by exploring traditional
planning approaches that do not consider uncertainty about an agent's
knowledge of the world and the effects of its actions. Topics in this
section will include: planning as theorem proving, non-linear
planning, planning as graph search, planning as satisfiability,
hierarchical planning, and planning with time and resource
constraints. The second section of the course will cover planning
under uncertainty. Topics here include: Markov decision processes
(MDPs), factored MDPs, first-order MDPs, partially observable MDPs,
and reinforcement learning. The course will conclude with a brief
look at robot motion planning.
Prerequisite
CS 181 or CS 182 or consent of instructor.
Requirements and Grading
This class will be driven by research papers and will largely be
discussion based.
There are three requirements.
- Read the research papers, and write a *brief* review of each paper that
summarizes the main contributions of the paper in your own words and
describes major criticisms, concerns, or questions you have about the paper.
Discuss the applicability and limitations of the approach described in the
paper. Your review does not have to be overly detailed. The review should be
emailed to the teaching staff (cs281r@eecs.harvard.edu) by the beginning of
class on the day the paper is covered. There will generally be either one
journal paper or two conference papers per class. Readings will be worth
approximately 45% of the final grade.
- Participation in class discussions. This will be worth approximately
10% of the final grade.
- A final project on a topic related to the course. The project may be
theoretical or applied. We recommend that you begin planning the
project early in the semester, but by all means consider topics
covered towards the end of the course. The project will be worth
approximately 45% of the final grade.
Course Plan and Readings (subject to change)
Mon Sep 18: Introduction and historical beginnings: [Newell and Simon 1963]
Wed Sep 20: Planning as theorem proving [McCarthy and Hayes 69]
Mon Sep 25: Solving the frame problem and goal regression [Reiter 91]
Wed Sep 27: Nonlinear planning [McAllester and Rosenblitt 91, Peot and Smith 92]
Wed Oct 4: Graphplan [Blum and Furst 97]
Satplan [Kautz and Selman 92, Ernst et al. 97]
Hierarchical planning [Erol et al. 96, Yang 90]
Planning with time and resource constraints [Wilkins 90, Haslum and Geffner 01]
Simple temporal problems [Dechter et al. 91]
Markov decision processes (MDPs) and structure (3 lectures) [Boutilier et al. 99]
Factored MDPs (2 lectures) [Guestrin et al. 03]
First-order MDPs [Yoon et al. 02, Sanner and Boutilier 06]
Partially observable MDPs [Kaelbling et al. 98a]
Reinforcement learning (RL) [Kaelbling et al. 98b]
RL with function approximation [Boyan and Moore 95, Gordon 95]
Policy search and hierarchical RL [Ng and Jordan 00, Andre and Russell 02]
Applications of RL [Tesauro 95, Ng et al. 04]
Robot motion planning [Latombe 91, chapter 1]
Probabilistic roadmaps [Kavraki et al. 96]
Rapidly-exploring random trees [LaValle and Kuffner 01, Yu et al. 04]
Please note that there will be no class on Oct 2, due to Yom Kippur,
and on Oct 9, due to Columbus Day.
The last few classes will be devoted to project presentations. Papers
not available on the web can be picked up from MaryFran Mitrano,
Maxwell Dworkin 247.
References
AAAI = National Conference on Artificial Intelligence.
IJCAI = International Joint Conference on Artificial
Intelligence.
UAI = Conference on Uncertainty in Artificial Intelligence.
AIJ = Artificial Intelligence (journal).
JAIR = Journal of Artificial Intelligence Research.
A. Newell and H. A. Simon (1963). "GPS: A Program that Simulates
Human Thought", in E. A. Feigenbaum & J. Feldman (eds.), Computers and
Thought, pp. 279-293, R. Oldenbourg KG.
J. McCarthy and P. J. Hayes (1969). "Some Philosophical Problems from
the Standpoint of Artificial Intelligence", Machine Intelligence 4,
pp. 463-502.
R. Reiter (1991). "The Frame Problem in the Situation Calculus: A
Simple Solution (Sometimes) and a Completeness Result for Goal
Regression". In V. Lifschitz (ed.), Artificial Intelligence and
Mathematical Theory of Computation: Papers in Honor of John McCarthy,
pp. 359-370, Academic Press, New York.
D. A. McAllester and D. Rosenblitt (1991). "Systematic nonlinear
planning". AAAI.
M. A. Peot and D. E. Smith (1992). "Conditional Nonlinear Planning".
International Conference on Artificial Intelligence Planning Systems.
A. L. Blum and M. Furst (1997). "Fast Planning Through Planning Graph
Analysis". AIJ, 90(1-2), pp. 281-300.
H. Kautz and B. Selman (1992). "Planning as satisfiability".
European Conference on Artificial Intelligence.
M. D. Ernst, T. D. Millstein and D. S. Weld (1997). "Automatic SAT
Compilation of Planning Problems". IJCAI.
K. Erol, J. Hendler and D. S. Nau (1996). "Complexity Results for HTN
Planning". Annals of Mathematics and Artificial Intelligence, 18(1),
pp. 69-93.
Q. Yang (1990). "Formalizing Planning Knowledge for Hierarchical
Planning". Computational Intelligence 6, pp. 12-24.
D. E. Wilkins (1990). "Can AI Planners Solve Practical Problems?".
Computational Intelligence 6(4), pp. 232-246.
P. Haslum and H. Geffner (2001). "Heuristic Planning with Time and
Resources". IJCAI-01 Workshop on Planning with Resources.
R. Dechter, I. Meiri and J. Pearl (1991). "Temporal Constraint
Networks". AIJ 49, pp. 61-95.
C. Boutilier, T. Dean and S. Hanks (1999). "Decision-Theoretic
Planning: Structural Assumptions and Computational Leverage". JAIR
11, pp. 1-94.
C. Guestrin, D. Koller, R. Parr and S. Venkataraman (2003). "Efficient
Solution Algorithms for Factored MDPs". JAIR 19, pp. 399-468.
S. W. Yoon, A. Fern and R. Givan (2002). "Inductive policy selection
for first-order MDPs". UAI.
S. Sanner and C. Boutilier (2006). "Practical linear value-approximation
techniques for first-order MDPs". UAI.
L. P. Kaelbling, M. L. Littman and A. R. Cassandra (1998a). "Planning and
Acting in Partially Observable Stochastic Domains". AIJ 101,
pp. 99-134.
L. P. Kaelbling, M. L. Littman and A. W. Moore (1998b).
"Reinforcement Learning: A Survey". JAIR 4, pp. 237-285.
J. A. Boyan and A. W. Moore (1995). "Generalization in reinforcement
learning: safely approximating the value function". Advances in
Neural Information Processing Systems (NIPS).
G. J. Gordon (1995). "Stable Function Approximation in Dynamic
Programming". Technical report CMU-CS-95-103, Carnegie Mellon
University, Computer Science Department.
A. Y. Ng and M. Jordan (2000). "PEGASUS: A policy search method for
large MDPs and POMDPs". UAI.
D. Andre and S. J. Russell (2002). "State Abstraction for
Programmable Reinforcement Learning Agents". AAAI.
G. Tesauro (1995). "Temporal Difference Learning and TD-Gammon".
Communications of the ACM, 38(3).
A. Y. Ng, A. Coates, M. Diel, V. Ganapathi, J. Schulte, B. Tse,
E. Berger and E. Liang (2004). "Inverted autonomous helicopter flight via
reinforcement learning". International Symposium on Experimental
Robotics.
J.-C. Latombe (1991). Robot Motion Planning. Kluwer.
L. E. Kavraki, P. Svestka, J.-C. Latombe and M. H. Overmars (1996).
"Probabilistic Roadmaps for Path Planning in High-Dimensional
Configuration Spaces". IEEE Trans. on Robotics and Automation, 12(4),
pp. 566-580.
S. M. LaValle and J. J. Kuffner, Jr. (2001). "Randomized Kinodynamic
Planning". International Journal of Robotics Research, 20(5),
pp. 378-400.
C.-H. Yu, J. Chuang, B. Gerkey, G. J. Gordon and A. Ng (2004). "Open Loop
Plans in Multi-Robot POMDPs". Technical report, Stanford University,
Department of Computer Science.