Course:
Instructors: Jimmy Ba
Lecture hours: Wednesday 3–5, ES B142
Office hours: Jimmy: W 5–6, PT290D; TAs: Th 3–4, PT290C
Teaching assistants: Tingwu Wang, Michael Zhang
Announcements:
- New Oct 30: TA office hours have moved to Thursday, 3–4 PM, in Pratt 290.
- New Oct 30: You are encouraged to add a link to your presentation slides to the seminar Excel sheet.
- Oct 11: The course project guideline is now posted.
- Oct 3: Software resources updated. Enroll on Piazza to find project partners.
- Sept 18: The classroom has changed from BA1240 to ES B142.
Course Overview:
Learning by interaction, or trial and error, is a core aspect of any intelligent system. Reinforcement learning (RL) is a paradigm that aims to develop computational methods allowing intelligent agents to learn by interacting with their environments. In this course, we will cover the basic formulation of the Markov decision process (MDP) and learning algorithms for tabular MDPs. The course will then focus mainly on function approximation methods using deep neural networks. Examples will include game playing and robot locomotion control.
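To give a rough sense of what a tabular MDP looks like in code, here is a minimal sketch of value iteration on a toy problem. Value iteration is a planning method that assumes the transition model is known, so it is only a starting point for the learning algorithms mentioned above. The two-state MDP and the names `P`, `R`, `GAMMA`, and `value_iteration` are assumptions made for this illustration, not course material.

```python
import numpy as np

# Hypothetical 2-state, 2-action MDP, invented for illustration only.
# P[s, a, s2] is the probability of moving from state s to s2 under action a.
P = np.array([
    [[1.0, 0.0], [0.0, 1.0]],  # state 0: action 0 stays, action 1 goes to state 1
    [[1.0, 0.0], [0.0, 1.0]],  # state 1: action 0 goes to state 0, action 1 stays
])
# R[s, a] is the expected immediate reward for taking action a in state s.
R = np.array([
    [0.0, 0.0],  # no reward available in state 0
    [0.0, 1.0],  # reward 1 for staying in state 1
])
GAMMA = 0.9  # discount factor (assumed value)


def value_iteration(P, R, gamma, tol=1e-8, max_iters=10_000):
    """Apply the Bellman optimality backup until the values stop changing."""
    V = np.zeros(P.shape[0])
    for _ in range(max_iters):
        # Q[s, a] = R[s, a] + gamma * sum over s2 of P[s, a, s2] * V[s2]
        Q = R + gamma * (P @ V)
        V_new = Q.max(axis=1)
        converged = np.max(np.abs(V_new - V)) < tol
        V = V_new
        if converged:
            break
    # Greedy policy with respect to the last computed action values.
    return V, Q.argmax(axis=1)


V, policy = value_iteration(P, R, GAMMA)
print("state values:", np.round(V, 3))
print("greedy policy:", policy)
```

In this toy problem the greedy policy moves to state 1 and stays there, because that is the only place reward is collected. With function approximation, the table `V` would be replaced by a parameterized model such as a neural network.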
Calendar:
Resources:
Type | Name | Description |
---|---|---|
RL Code base | OpenAI Baselines | Implementations of common reinforcement learning algorithms. |
RL Code base | Google Dopamine | Research framework for fast prototyping of reinforcement learning algorithms. |
RL Code base | Evolution-strategies-starter | Evolution Strategies as a Scalable Alternative to Reinforcement Learning. |
RL Code base | Pytorch-a2c-ppo-acktr | PyTorch implementation of A2C, PPO and ACKTR. |
RL Code base | Model-Agnostic Meta-Learning | Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks. |
RL Code base | Reptile | Reptile is a meta-learning algorithm that finds a good initialization. |
General Framework | TensorFlow | An open source machine learning framework. |
General Framework | PyTorch | An open source deep learning platform that provides a seamless path from research prototyping to production deployment. |
Environments | OpenAI Gym | Gym is a toolkit for developing and comparing reinforcement learning algorithms. |
Environments | Deepmind Control Suite | A set of Python Reinforcement Learning environments powered by the MuJoCo physics engine. |
Suggested (Free) online computation platform | AWS-EC2 | Amazon Elastic Compute Cloud (EC2) forms a central part of Amazon.com’s cloud-computing platform, Amazon Web Services (AWS), by allowing users to rent virtual computers on which to run their own computer applications. |
Suggested (Free) online computation platform | GCE | Google Compute Engine delivers virtual machines running in Google’s innovative data centers and worldwide fiber network. |
Suggested (Free) online computation platform | Colab | Colaboratory is a free Jupyter notebook environment that requires no setup and runs entirely in the cloud. |