gusuperstar

gusuperstar / rl-tutorial-1.ipynb

Created April 8, 2019 06:00 — forked from awjuliani/rl-tutorial-1.ipynb

Reinforcement Learning Tutorial 1 (Two-armed bandit problem)

gusuperstar / SimplePolicy.ipynb

Created April 8, 2019 02:48 — forked from awjuliani/SimplePolicy.ipynb

Policy gradient method for solving n-armed bandit problems.

gusuperstar / pg-pong.py

Created April 2, 2019 13:33 — forked from karpathy/pg-pong.py

Training a Neural Network ATARI Pong agent with Policy Gradients from raw pixels

	""" Trains an agent with (stochastic) Policy Gradients on Pong. Uses OpenAI Gym. """
	import numpy as np
	import pickle # cPickle exists only in Python 2; plain pickle works in both Python 2 and 3
	import gym

	# hyperparameters
	H = 200 # number of hidden layer neurons
	batch_size = 10 # number of episodes between parameter updates
	learning_rate = 1e-4
	gamma = 0.99 # discount factor for reward