1 Learning Resources
Related blog: http://blog.csdn.net/dark_scope/article/details/8252969
Column: http://blog.csdn.net/column/details/deeprl.html
Reinforcement learning course by David Silver (videos and slides): http://www0.cs.ucl.ac.uk/staff/D.Silver/web/Teaching.html
The best reinforcement learning textbook:
Reinforcement Learning: An Introduction:https://webdocs.cs.ualberta.ca/~sutton/book/the-book.html
Deep learning course (videos, slides, and assignments): https://www.cs.ox.ac.uk/people/nando.defreitas/machinelearning/
Talks on deep reinforcement learning, all given by David Silver:
ICLR 2015 part 1 https://www.youtube.com/watch?v=EX1CIVVkWdE
ICLR 2015 part 2 https://www.youtube.com/watch?v=zXa6UFLQCtg
UAI 2015 https://www.youtube.com/watch?v=qLaDWKd61Ig
RLDM 2015 http://videolectures.net/rldm2015_silver_reinforcement_learning/
Other courses:
Reinforcement Learning
Michael Littman: https://www.udacity.com/course/reinforcement-learning--ud600
AI (covers reinforcement learning, uses the Pacman projects)
Pieter Abbeel:https://www.edx.org/course/artificial-intelligence-uc-berkeleyx-cs188-1x-0#.VKuKQmTF_og
Deep Reinforcement Learning:
Pieter Abbeel http://rll.berkeley.edu/deeprlcourse/
Advanced Robotics:
Pieter Abbeel:http://www.cs.berkeley.edu/~pabbeel/cs287-fa15/
Deep learning related courses:
Convolutional Neural Networks for Visual Recognition: http://cs231n.github.io/
Machine Learning
Andrew Ng: https://www.coursera.org/learn/machine-learning/
Neural Networks for Machine Learning (from 2012)
Hinton:https://www.coursera.org/course/neuralnets
New Robotics Specialization from Penn (opened in 2016): https://www.coursera.org/specializations/robotics
2 Papers
https://github.com/junhyukoh/deep-reinforcement-learning-papers
https://github.com/muupan/deep-reinforcement-learning-papers
Together these two collections cover essentially all of the current deep reinforcement learning papers.
3 Leading Researchers:
DeepMind:http://www.deepmind.com/publications.html
Pieter Abbeel's group: http://www.eecs.berkeley.edu/~pabbeel/
Satinder Singh:http://web.eecs.umich.edu/~baveja/
Progress at CMU: http://www.cs.cmu.edu/~lerrelp/
Preferred Networks (a Japanese startup)
Deep Reinforcement Learning Workshop NIPS 2015 : http://rll.berkeley.edu/deeprlworkshop/
Deep Learning Research Summary: Reinforcement Learning Technology Trends and Analysis (Classic Papers)
I have collected the ICLR 2017 papers related to Deep Reinforcement Learning: 30 in total (some may be missing). Most of them come from DeepMind and OpenAI, which shows that DRL is still largely dominated by these two labs.
2 Analysis of DeepMind Papers
[1] LEARNING TO COMPOSE WORDS INTO SENTENCES WITH REINFORCEMENT LEARNING
[2] LEARNING TO NAVIGATE IN COMPLEX ENVIRONMENTS
[3] LEARNING TO PERFORM PHYSICS EXPERIMENTS VIA DEEP REINFORCEMENT LEARNING
[4] PGQ: COMBINING POLICY GRADIENT AND Q-LEARNING
[5] Q-PROP: SAMPLE-EFFICIENT POLICY GRADIENT WITH AN OFF-POLICY CRITIC
[6] REINFORCEMENT LEARNING WITH UNSUPERVISED AUXILIARY TASKS
[7] SAMPLE EFFICIENT ACTOR-CRITIC WITH EXPERIENCE REPLAY
[8] THE PREDICTRON: END-TO-END LEARNING AND PLANNING
3 Analysis of OpenAI Papers (including Sergey Levine's papers)
[9] #EXPLORATION: A STUDY OF COUNT-BASED EXPLORATION FOR DEEP REINFORCEMENT LEARNING
[10] GENERALIZING SKILLS WITH SEMI-SUPERVISED REINFORCEMENT LEARNING
[11] LEARNING INVARIANT FEATURE SPACES TO TRANSFER SKILLS WITH REINFORCEMENT LEARNING
[12] LEARNING VISUAL SERVOING WITH DEEP FEATURES AND TRUST REGION FITTED Q-ITERATION
[13] MODULAR MULTITASK REINFORCEMENT LEARNING WITH POLICY SKETCHES
[14] STOCHASTIC NEURAL NETWORKS FOR HIERARCHICAL REINFORCEMENT LEARNING
[15] THIRD PERSON IMITATION LEARNING
[16] UNSUPERVISED PERCEPTUAL REWARDS FOR IMITATION LEARNING
[17] EPOPT: LEARNING ROBUST NEURAL NETWORK POLICIES USING MODEL ENSEMBLES
[18] RL2: FAST REINFORCEMENT LEARNING VIA SLOW REINFORCEMENT LEARNING
4 Other Papers
[19] COMBATING DEEP REINFORCEMENT LEARNING’S SISYPHEAN CURSE WITH INTRINSIC FEAR
[20] COMMUNICATING HIERARCHICAL NEURAL CONTROLLERS FOR LEARNING ZERO-SHOT TASK GENERALIZATION
[21] DESIGNING NEURAL NETWORK ARCHITECTURES USING REINFORCEMENT LEARNING
[22] LEARNING TO PLAY IN A DAY: FASTER DEEP REINFORCEMENT LEARNING BY OPTIMALITY TIGHTENING
[23] LEARNING TO REPEAT: FINE GRAINED ACTION REPETITION FOR DEEP REINFORCEMENT LEARNING
[24] MULTI-TASK LEARNING WITH DEEP MODEL BASED REINFORCEMENT LEARNING
[25] NEURAL ARCHITECTURE SEARCH WITH REINFORCEMENT LEARNING
[26] OPTIONS DISCOVERY WITH BUDGETED REINFORCEMENT LEARNING
[27] REINFORCEMENT LEARNING THROUGH ASYNCHRONOUS ADVANTAGE ACTOR-CRITIC ON A GPU
[28] SPATIO-TEMPORAL ABSTRACTIONS IN REINFORCEMENT LEARNING THROUGH NEURAL ENCODING
[29] SURPRISE-BASED INTRINSIC MOTIVATION FOR DEEP REINFORCEMENT LEARNING
[30] TUNING RECURRENT NEURAL NETWORKS WITH REINFORCEMENT LEARNING