Last active
November 13, 2020 21:12
OpenAI gym tutorial
| Getting Set Up: | |
| Follow the instructions at https://gym.openai.com/docs | |
| ``` | |
| git clone https://github.com/openai/gym | |
| cd gym | |
| pip install -e . # minimal install | |
| ``` | |
| Basic Example using CartPole-v0: | |
| Level 1: Getting environment up and running | |
| ``` | |
| import gym | |
| env = gym.make('CartPole-v0') | |
| env.reset() | |
| for _ in range(1000): # run for 1000 steps | |
|     env.render() | |
|     action = env.action_space.sample() # pick a random action | |
|     env.step(action) # take action | |
| ``` | |
| Level 2: Running trials (AKA episodes) | |
| ``` | |
| import gym | |
| env = gym.make('CartPole-v0') | |
| for i_episode in range(20): | |
|     observation = env.reset() # reset for each new trial | |
|     for t in range(100): # run for 100 timesteps or until done, whichever is first | |
|         env.render() | |
|         action = env.action_space.sample() # select a random action (see https://github.com/openai/gym/wiki/CartPole-v0) | |
|         observation, reward, done, info = env.step(action) | |
|         if done: | |
|             print("Episode finished after {} timesteps".format(t+1)) | |
|             break | |
| ``` | |
| Level 3: Non-random actions | |
| ``` | |
| import gym | |
| env = gym.make('CartPole-v0') | |
| highscore = 0 | |
| for i_episode in range(20): # run 20 episodes | |
|     observation = env.reset() | |
|     points = 0 # keep track of the reward each episode | |
|     while True: # run until episode is done | |
|         env.render() | |
|         action = 1 if observation[2] > 0 else 0 # if angle is positive, move right; if angle is negative, move left | |
|         observation, reward, done, info = env.step(action) | |
|         points += reward | |
|         if done: | |
|             if points > highscore: # record high score | |
|                 highscore = points | |
|             break # leave the loop whenever the episode ends, high score or not | |
| ``` |
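The Level 3 logic can also be split into a policy function and an episode runner, which makes it easier to try out without a display. The sketch below is only an illustration: `angle_policy`, `run_episode`, and `StubCartPole` are hypothetical names that are not part of Gym, and the stub does not simulate CartPole physics. It only mimics the `reset()`/`step()` shape used in the examples above (a four-element observation with the pole angle at index 2, and a four-value return from `step`).

```python
import random


def angle_policy(observation):
    """Push right (1) when the pole angle is positive, otherwise push left (0)."""
    pole_angle = observation[2]
    return 1 if pole_angle > 0 else 0


class StubCartPole:
    """Hypothetical stand-in environment, NOT Gym's CartPole.

    It returns random pole angles and ends every episode after a fixed
    number of steps, so the loop below can run without Gym installed.
    """

    def __init__(self, episode_length=10, seed=0):
        self.episode_length = episode_length
        self.rng = random.Random(seed)
        self.t = 0

    def _observation(self):
        # Same layout the tutorial assumes: the pole angle sits at index 2.
        return [0.0, 0.0, self.rng.uniform(-0.2, 0.2), 0.0]

    def reset(self):
        self.t = 0
        return self._observation()

    def step(self, action):
        self.t += 1
        done = self.t >= self.episode_length
        return self._observation(), 1.0, done, {}


def run_episode(env, policy):
    """Run one episode and return the total reward collected."""
    observation = env.reset()
    points = 0.0
    while True:
        action = policy(observation)
        observation, reward, done, info = env.step(action)
        points += reward
        if done:  # stop as soon as the episode ends
            return points


env = StubCartPole(episode_length=10)
highscore = max(run_episode(env, angle_policy) for _ in range(20))
print("High score:", highscore)
```

With a real environment, `run_episode(gym.make('CartPole-v0'), angle_policy)` should behave like the Level 3 loop minus the rendering, assuming the four-value `step` return shown in this tutorial.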
FFS iambrian
Hi, thanks for the tutorial. I believe there is a small mistake: the break in the final example needs one less level of indentation.
    if done:
        if points > highscore: # record high score
            highscore = points
        break
Basic tutorial question:
import gym
env = gym.make('CartPole-v0')
env.reset()
for _ in range(1000): # run for 1000 steps
    env.render()
    action = env.action_space.sample() # pick a random action
    env.step(action) # take action
What am I supposed to do with this? Paste it to command line? Paste it to a file and run it with some command?
@kschultz1986 you should probably learn how to use Python first. you're not going to be able to use Gym if you don't know how to write and run a Python program, which seems to be the case here.
but if you insist...
assuming you've installed python and gym and all the dependencies correctly on your system, you can paste that code into a text file (say, test.py) and then run python test.py
Amazing! Thank you for this tutorial
Thank you for the tutorial!
Hate to be picky, but it is code and there's a typo on line 6 of the code block for Level 1:
    action = env.action_space.sampe()
should be:
    action = env.action_space.sample()
Also, I think the break in Level 3 is indented one level too many since it will not restart the episode when it's "done".