@ishuca
Last active November 13, 2017 23:58
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Simple Reinforcement Learning with Tensorflow Part 4: Deep Q-Networks and Beyond\n",
"\n",
"In this IPython notebook we implement a Deep Q-Network that uses both Double DQN and Dueling DQN. The agent learns to solve a navigation task in a basic grid world."
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"import gym\n",
"import numpy as np\n",
"import random\n",
"import tensorflow as tf\n",
"import matplotlib.pyplot as plt\n",
"import scipy.misc\n",
"import os\n",
"%matplotlib inline"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Load the game environment\n",
"\n",
"The size of the grid world is adjustable. A smaller grid makes the task easier for our DQN agent, while a larger grid makes it more challenging.\n",
"\n",
"The gridworld module can be downloaded from https://github.com/awjuliani/DeepRL-Agents/blob/master/gridworld.py"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAWEAAAFiCAYAAAAna2l5AAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAAPYQAAD2EBqD+naQAAGTxJREFUeJzt3X+U3XV95/HnG5HS4GZyFjSp66/YWLXHLuwMQrMW3Roq\n0l0Velrlis2xHJalNudkZ/cckTUep8nWctIjiT97oHargl4P3bNtgEXTCMWCKeFwh8JBAt1AImLM\nqNiduAkIkvf+8f2OvXMZJnNn7sxncvN8nPM9J/fz/dz7feWbmdd853u/ud/ITCRJZZxQOoAkHc8s\nYUkqyBKWpIIsYUkqyBKWpIIsYUkqyBKWpIIsYUkqyBKWpIIsYUkqaN5KOCL+ICL2RsSTEXFXRLxx\nvrYlSceqeSnhiHgP8HHgo8C/Ae4DtkfEafOxPUk6VsV8fIBPRNwF7MrM9fXjAL4DfDIzN3fMPRU4\nD9gHPNXzMJK08E4GXgVsz8wnppt4Yq+3HBEvBIaAj02MZWZGxNeB1VM85TzgS73OIUmLwMXAl6eb\n0PMSBk4DXgCMdYyPAa+dYv4+gOuvv55rr72WLVu2zEOk2RseHjbTDJhpZsx0dIstD3Sfaffu3bzv\nfe+Dut+mMx8l3K2nAK699loefvhhRkZGfrai0WjQaDRK5QJgYGCAwcHBohk6mWlmzDQziy3TYssD\n02dqNps0m81JY+Pj4xN/POop1vko4R8CzwLLO8aXAwee70lbtmxhZGSEG2+8cR4iSdL8mOpgcXR0\nlKGhoRk9v+dXR2TmM0ALWDMxVr8xtwbY2evtSdKxbL5OR1wNfD4iWsDdwDCwBPj8PG1Pko5J81LC\nmXlDfU3wRqrTEP8AnJeZP5jueaXP/07FTDNjppkx09Ettjwwv5nm5TrhrgJEDAKtVqu16E7GS9Js\ntJ0THsrM0enm+tkRklSQJSxJBVnCklSQJSxJBVnCklSQJSxJBVnCklSQJSxJBVnCklSQJSxJBVnC\nklSQJSxJBVnCklSQJSxJBVnCklSQJSxJBVnCklSQJSxJBVnCklSQJSxJBVnCklSQJSxJBVnCklSQ\nJSxJBVnCklSQJSxJBXVdwhFxTkTcGBHfjYgjEfHOKeZsjIj9EXE4InZExKrexJWk/jKbI+FTgH8A\nPgBk58qIuAJYB1wGnAUcArZHxElzyClJfenEbp+QmV8DvgYQETHFlPXApsy8uZ6zFhgDLgBumH1U\nSeo/PT0nHBErgRXArRNjmXkQ2AWs7uW2JKkf9PqNuRVUpyjGOsbH6nWSpDZdn46YL8PDwwwMDEwa\nazQaNBqNQokk6eiazSbNZnPS2Pj4+IyfH5nPeW9t5k+OOAJckJk31o9XAo8AZ2Tm/W3zbgfuzczh\nKV5jEGi1Wi0GBwdnnUWSFovR0VGGhoYAhjJzdLq5PT0dkZl7gQPAmomxiFgKnA3s7OW2JKkfdH06\nIiJOAVYBE1dGvDoiTgd+lJnfAbYCGyJiD7AP2AQ8DmzrSWJJ6iOzOSd8JvC3VG/AJfDxevwLwCWZ\nuTkilgDXAMuAO4DzM/PpHuRdFKa+Mk/SQprLqdTFZDbXCX+Do5zGyMwRYGR2kSTp+OFnR0hSQZaw\nJBVkCUtSQZawJBVkCUtSQZawJBVkCUtSQZawJBVkCUtSQZawJBVkCUtSQZawJBVkCUtSQZawJBVk\nCUtSQZawJBVkCUtSQZawJBVkCUtSQZawJBVkCUtSQZawJBVkCUtSQZawJBVkCUtSQZawJBXUVQlH\nxJURcXdEHIyIsYj4q4j4pSnmbYyI/RFxOCJ2RMSq3kWWpP7R7ZHwOcCngLOBc4EXAn8TET8/MSEi\nrgDWAZcBZwGHgO0RcVJPEktSHzmxm8mZ+ZvtjyPi/cD3gSHgznp4PbApM2+u56wFxoALgBvmmFeS\n+spczwkvAxL4EUBErARWALdOTMjMg8AuYPUc
tyVJfWfWJRwRAWwF7szMB+vhFVSlPNYxfaxeJ0lq\n09XpiA6fBX4ZeFMvggwPDzMwMDBprNFo0Gg0evHykjQvms0mzWZz0tj4+PjMXyAzu16ATwPfBl7R\nMb4SOAL8647x24Etz/Nag0C2Wq08VlAd7bu4uBRcFrNWqzWRczCP0qddn46IiE8D7wJ+PTMfa1+X\nmXuBA8CatvlLqa6m2NnttiSp33V1OiIiPgs0gHcChyJieb1qPDOfqv+8FdgQEXuAfcAm4HFgW08S\nS1If6fac8OVUh9i3d4z/HvBFgMzcHBFLgGuorp64Azg/M5+eW1RJ6j/dXic8o9MXmTkCjMwijyQd\nV/zsCEkqyBKWpIIsYUkqyBKWpIIsYUkqyBKWpIIsYUkqyBKWpIIsYUkqyBKWpILm8nnCx68sHaAH\nonSAozkWd/Ki36nTOxZ3eR/wSFiSCrKEJakgS1iSCrKEJakgS1iSCrKEJakgS1iSCrKEJakgS1iS\nCrKEJakgS1iSCrKEJakgS1iSCrKEJakgS1iSCuqqhCPi8oi4LyLG62VnRLy9Y87GiNgfEYcjYkdE\nrOptZEnqH90eCX8HuAIYBIaA24BtEfF6gIi4AlgHXAacBRwCtkfEST1LLEl9pKsSzsz/nZlfy8xH\nMnNPZm4A/h/wq/WU9cCmzLw5Mx8A1gIvBS7oaWpJ6hOzPiccESdExEXAEmBnRKwEVgC3TszJzIPA\nLmD1XINKUj/q+h5zEfEG4O+Bk4EfAxdm5sMRsZrqLlVjHU8ZoypnSVKH2dzo8yHgdGAA+G3gixHx\n5rkGGR4eZmBgYNJYo9Gg0WjM9aUlad40m02azeaksfHx8Rk/PzLndovViNgB7AE2A48AZ2Tm/W3r\nbwfuzczh53n+INBqtVoMDg7OKctCiWP9rrpwDNwY+Fi89e+i36nTO8Z2eS7iwKOjowwNDQEMZebo\ndHN7cZ3wCcDPZeZe4ACwZmJFRCwFzgZ29mA7ktR3ujodEREfA74KPAb8C+Bi4C3A2+opW4ENEbEH\n2AdsAh4HtvUoryT1lW7PCb8E+ALwC8A4cD/wtsy8DSAzN0fEEuAaYBlwB3B+Zj7du8iS1D+6KuHM\nvHQGc0aAkVnmkaTjip8dIUkFWcKSVJAlLEkFWcKSVJAlLEkFWcKSVJAlLEkFWcKSVJAlLEkFWcKS\nVJAlLEkFWcKSVJAlLEkFzeb2RjrGb6AAi/8mCpF9sJOPNcfaLl/sX8Qz5JGwJBVkCUtSQZawJBVk\nCUtSQZawJBVkCUtSQZawJBVkCUtSQZawJBVkCUtSQZawJBVkCUtSQXMq4Yj4UEQciYirO8Y3RsT+\niDgcETsiYtXcYkpSf5p1CUfEG4HLgPs6xq8A1tXrzgIOAdsj4qQ55JSkvjSrEo6IFwHXA5cC/7dj\n9XpgU2benJkPAGuBlwIXzCWoJPWj2R4Jfwa4KTNvax+MiJXACuDWibHMPAjsAlbPNqQk9auuP9Q9\nIi4CzgDOnGL1CqqPWh7rGB+r10mS2nRVwhHxMmArcG5mPtPLIMPDwwwMDEwaazQaNBqNXm5Gknqq\n2WzSbDYnjY2Pj8/4+ZE583uERMS7gP8FPMs/3wzlBVRHv88CrwP2AGdk5v1tz7sduDczh6d4zUGg\n1Wq1GBwcnHGWkiKOtfvAPNdivzNMLPaA/egY+7LuprsW2ujoKENDQwBDmTk63dxuzwl/HfgVqtMR\np9fLPVRv0p2emY8CB4A1E0+IiKXA2cDOLrclSX2vq9MRmXkIeLB9LCIOAU9k5u56aCuwISL2APuA\nTcDjwLY5p5WkPtOLuy1P+p0gMzdHxBLgGmAZcAdwfmY+3YNtSVJfmXMJZ+ZbpxgbAUbm+tqS1O/8\n7AhJKsgSlqSCLGFJKsgSlqSCLGFJKsgSlqSCLGFJKsgSlqSCLGFJKsgSlqSCLGFJKsgSlqSCLGFJ\nKsgSlqSC
LGFJKsgSlqSCLGFJKsgSlqSCLGFJKsgSlqSCLGFJKqgXt7zXMShKBziaRR/wubJ0gDk6\nBnd5X/BIWJIKsoQlqSBLWJIKsoQlqaCuSjgiPhoRRzqWBzvmbIyI/RFxOCJ2RMSq3kaWpP4xmyPh\nB4DlwIp6+bWJFRFxBbAOuAw4CzgEbI+Ik+YeVZL6z2wuUftpZv7gedatBzZl5s0AEbEWGAMuAG6Y\nXURJ6l+zORJ+TUR8NyIeiYjrI+LlABGxkurI+NaJiZl5ENgFrO5JWknqM92W8F3A+4HzgMuBlcDf\nRcQpVAWcVEe+7cbqdZKkDl2djsjM7W0PH4iIu4FvA+8GHppLkOHhYQYGBiaNNRoNGo3GXF5WkuZV\ns9mk2WxOGhsfH5/x8yNzbv/Zsi7iHcDngEeAMzLz/rb1twP3Zubw8zx/EGi1Wi0GBwfnlGWhRPgf\nPPVc/rflhTXX7ppPo6OjDA0NAQxl5uh0c+d0nXBEvAhYBezPzL3AAWBN2/qlwNnAzrlsR5L6VVen\nIyLiT4CbqE5B/CvgD4FngK/UU7YCGyJiD7AP2AQ8DmzrUV5J6ivdXqL2MuDLwKnAD4A7gV/NzCcA\nMnNzRCwBrgGWAXcA52fm072LLEn9o9s35o76LllmjgAjs8wjSccVPztCkgqyhCWpIEtYkgqyhCWp\nIEtYkgqyhCWpIEtYkgqyhCWpIEtYkgqyhCWpIEtYkgqyhCWpIEtYkgqyhCWpIEtYkgqyhCWpIEtY\nkgqyhCWpIEtYkgqyhCWpIEtYkgqyhCWpIEtYkgqyhCWpIEtYkgrquoQj4qURcV1E/DAiDkfEfREx\n2DFnY0Tsr9fviIhVvYssSf2jqxKOiGXAN4GfAOcBrwf+K/BPbXOuANYBlwFnAYeA7RFxUo8yS1Lf\nOLHL+R8CHsvMS9vGvt0xZz2wKTNvBoiItcAYcAFww2yDSlI/6vZ0xDuAeyLihogYi4jRiPhZIUfE\nSmAFcOvEWGYeBHYBq3sRWJL6Sbcl/Grg94GHgbcBfwp8MiJ+t16/AkiqI992Y/U6SVKbbk9HnADc\nnZkfqR/fFxFvAC4HrutpMkk6DnRbwt8DdneM7QZ+q/7zASCA5Uw+Gl4O3DvdCw8PDzMwMDBprNFo\n0Gg0uowoSQun2WzSbDYnjY2Pj8/8BTJzxgvwJeAbHWNbgDvbHu8HhtseLwWeBH7neV5zEMhWq5XH\nCqpTLi4uLgWXxazVak3kHMyj9Gq3R8JbgG9GxJVUVzqcDVwK/Me2OVuBDRGxB9gHbAIeB7Z1uS1J\n6ntdlXBm3hMRFwJXAR8B9gLrM/MrbXM2R8QS4BpgGXAHcH5mPt272JLUH7o9EiYzbwFuOcqcEWBk\ndpEk6fjhZ0dIUkGWsCQVZAlLUkGWsCQVZAlLUkGWsCQVZAlLUkGWsCQVZAlLUkGWsCQVZAlLUkGW\nsCQVZAlLUkGWsCQVZAlLUkGWsCQVZAlLUkGWsCQVZAlLUkGWsCQVZAlLUkGWsCQVZAlLUkGWsCQV\nZAlLUkGWsCQV1FUJR8TeiDgyxfKptjkbI2J/RByOiB0Rsar3sSWpP3R7JHwmsKJt+Q0ggRsAIuIK\nYB1wGXAWcAjYHhEn9SqwJPWTE7uZnJlPtD+OiHcAj2TmHfXQemBTZt5cr18LjAEXUBe1JOmfzfqc\ncES8ELgY+PP68Uqqo+NbJ+Zk5kFgF7B6bjElqT/N5Y25C4EB4Av14xVUpybGOuaN1eskSR3mUsKX\nAF/NzAO9CiNJx5uuzglPiIhXAOdSneudcAAIYDmTj4aXA/ce7TWHh4cZGBiYNNZoNGg0GrOJKEkL\notls0mw2J42Nj4/P/AUys+sFGAG+C5zQMb4fGG57vBR4EvidaV5rEMhWq5XHCqrTLi4uLgWXxazV\nak3kHMyj9GnXR8IREcD7gc9n5pGO1VuBDRGxB9gHbAIeB7Z1ux1JOh7M5n
TEucDLgb/oXJGZmyNi\nCXANsAy4Azg/M5+eU0pJ6lNdl3Bm7gBeMM36EarTFZKko/CzIySpIEtYkgqyhCWpoFldJ3y8y+rS\nOkmaM4+EJakgS1iSCrKEJakgS1iSCrKEJakgS1iSCrKEJakgS1iSCrKEJakgS1iSCrKEJakgS1iS\nCrKEJakgS1iSCrKEJakgS1iSCrKEJakgS1iSCrKEJakgS1iSCrKEJakgS1iSCuqqhCPihIjYFBGP\nRsThiNgTERummLcxIvbXc3ZExKreRZak/tHtkfCHgP8EfAB4HfBB4IMRsW5iQkRcAawDLgPOAg4B\n2yPipJ4klqQ+cmKX81cD2zLza/XjxyLivVRlO2E9sCkzbwaIiLXAGHABcMMc80pSX+n2SHgnsCYi\nXgMQEacDbwJuqR+vBFYAt048ITMPAruoClyS1KbbI+GrgKXAQxHxLFWJfzgzv1KvXwEk1ZFvu7F6\nnSSpTbcl/B7gvcBFwIPAGcAnImJ/Zl7X63CS1O+6LeHNwB9n5l/Wj78VEa8CrgSuAw4AASxn8tHw\ncuDe6V54eHiYgYGBSWONRoNGo9FlRElaOM1mk2azOWlsfHx8xs/vtoSXAM92jB2hPrecmXsj4gCw\nBrgfICKWAmcDn5nuhbds2cLg4GCXcSSprKkOFkdHRxkaGprR87st4ZuADRHxOPAtYBAYBj7XNmdr\nPWcPsA/YBDwObOtyW5LU97ot4XVUpfoZ4CXAfuBP6zEAMnNzRCwBrgGWAXcA52fm0z1JLEl9pKsS\nzsxDwH+pl+nmjQAjs04lSccJPztCkgpaVCXc+Q7jYmCmmTHTzJjp6BZbHpjfTJbwUZhpZsw0M2Y6\nusWWB46jEpak440lLEkFWcKSVFC31wnPh5MBdu/ezfj4OKOjo6XzTGKmmTHTzJjp6BZbHug+0+7d\nuyf+ePLR5kZmzjJWb9SfR/yloiEkaX5cnJlfnm7CYijhU4HzqP6L81NFw0hSb5wMvArYnplPTDex\neAlL0vHMN+YkqSBLWJIKsoQlqSBLWJIKsoQlqaBFU8IR8QcRsTcinoyIuyLijQu47XMi4saI+G5E\nHImId04xZ2NE7I+IwxGxIyJWzWOeKyPi7og4GBFjEfFXEfFLhTNdHhH3RcR4veyMiLeXyvM8GT9U\n//tdXSpXRHy0ztC+PFgqT9s2XxoR10XED+vt3hcRgx1zFnI/7Z1iPx2JiE+VyFNv74SI2BQRj9bb\n3BMRG6aY19tcmVl8obqL81PAWuB1VHfl+BFw2gJt/+3ARuBdVPfQe2fH+ivqPP8BeAPw18AjwEnz\nlOcW4HeB1wO/AtxMdR31zxfM9O/r/fSLwCrgvwM/AV5fIs8U+d4IPEp1Q9mrC+6nj1LdX/HFVHef\neQnwL0vlqbe5DNhLdRuyIeCVwLnAyoL76dS2/fMSqvtSPgucU3A//Tfg+/XX+SuA3wIOAuvmcz/N\n+zfHDP/ydwGfaHscVPel+2CBLEemKOH9wHDb46XAk8C7FyjTaXWuX1ssmeptPgH8Xuk8wIuAh4G3\nAn/bUcILmqsu4dFp1i/4fgKuAr5xlDmlv8a3Av9YeD/dBPxZx9j/BL44n7mKn46IiBdS/XS+dWIs\nq7/d14HVpXJNiIiVwAom5zsI7GLh8i0DkuoncPFM9a9tF1HdfXtn6TxU9zy8KTNv68hZKtdr6lNb\nj0TE9RHx8sJ53gHcExE31Ke3RiPi0omVpf/96g64GPjzwnl2Amsi4jV1jtOBN1H9ZjpvuRbDB/ic\nBrwAGOsYHwNeu/BxnmMFVQFOlW/FfG88IoLqKOHOzJw4t1gkU0S8Afh7qv+S+WPgwsx8OCJWl8hT\nZ7oIOAM4c4rVJfbTXcD7qY7Mf4HqXot/V++7Ul9LrwZ+H/g48EfAWcAnI+InmXldwVwTLgQGgC/U\nj0vluYrqyPahiHiW6j2zD2fmV+Yz12
IoYU3vs8AvU/1ELu0h4HSqb5jfBr4YEW8uFSYiXkb1A+rc\nzHymVI52mbm97eEDEXE38G3g3VT7r4QTgLsz8yP14/vqHwqXA9cVytTuEuCrmXmgcI73AO8FLgIe\npPrh/omI2F//sJoXxU9HAD+kOiG/vGN8OVD6HwWqDEGBfBHxaeA3gX+Xmd8rnSkzf5qZj2bmvZn5\nYeA+YH2pPFSnsV4MjEbEMxHxDPAWYH1EPE11hFLk325CZo4D/0j1Zmap/fQ9YHfH2G6qN58omIuI\neAXVm4R/1jZcKs9m4KrM/MvM/FZmfgnYAlw5n7mKl3B9BNOiencU+Nmv4GuoztEUlZl7qXZwe76l\nwNnMY766gN8F/HpmPrYYMk3hBODnCub5OtXVI2dQHaGfDtwDXA+cnpmPFsr1MxHxIqoC3l9wP32T\n557aey3VEXrpr6dLqH5Y3jIxUDDPEqoDwnZHqHty3nItxDufM3hX8t3AYSZfovYE8OIF2v4pVN/A\nZ9Q7/T/Xj19er/9gnecdVN/0fw38H+bv8p3PAv8EnEP1U3ZiObltzkJn+lid55VUl+b8MfBT4K0l\n8kyTs/PqiIXeT38CvLneT/8W2EFVMqeW2k9U58t/QnVE94tUv3L/GLio1H6qtxlUl17+0RTrSuT5\nC+Axqt8+X0l1rvr7wMfmM9eCfXPMYAd8oP4HeZLqzZ8zF3Dbb6nL99mO5X+0zRmhujzlMLAdWDWP\neabK8iywtmPeQmb6HNV1uE9SHQ38zUQBl8gzTc7b2ku4wH5qUl1e+WT9Df1l2q7HLbWf6mK5v97m\nt4BLppizoLmA36i/rqfcToE8pwBXU11Tfagu1z8ETpzPXH6esCQVVPycsCQdzyxhSSrIEpakgixh\nSSrIEpakgixhSSrIEpakgixhSSrIEpakgixhSSrIEpakgv4/qIeTLm5hDpYAAAAASUVORK5CYII=\n",
"text/plain": [
"<matplotlib.figure.Figure at 0x19787f32438>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"from gridworld import gameEnv\n",
"\n",
"env = gameEnv(partial=False,size=5)"
]
},
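{
"cell_type": "markdown",
"metadata": {},
"source": [
"Above is an example of a starting environment in our simple game. The agent controls the blue square and can move up, down, left, or right. The goal is to reach the green square (reward +1) while avoiding the red square (reward -1). The positions of the three blocks are randomized every episode."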
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Implementing the neural network"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"# Network implementation\n",
"class Qnetwork():\n",
"    def __init__(self,h_size):\n",
"        # The network receives a frame from the game as a flattened array,\n",
"        # reshapes it, and processes it through four convolutional layers.\n",
"        # Note: h_size is not used below; the final layer size is hard-coded to 512.\n",
"        \n",
"        # Input placeholder. 21168 = 84*84*3.\n",
"        self.scalarInput = tf.placeholder(shape=[None,21168],dtype=tf.float32)\n",
"        # Reshape back to 84x84x3 for conv2d.\n",
"        self.imageIn = tf.reshape(self.scalarInput,shape=[-1,84,84,3])\n",
"        \n",
"        # The first convolution uses an 8x8 kernel with stride 4 and produces 32 activation maps.\n",
"        # Output size = (image size - filter size) / stride + 1.\n",
"        # The VALID option means no zero padding, so\n",
"        # (84-8)/4 + 1 = 20,\n",
"        # which gives a 20x20x32 activation volume.\n",
"        self.conv1 = tf.contrib.layers.convolution2d( \\\n",
"            inputs=self.imageIn,num_outputs=32,kernel_size=[8,8],stride=[4,4],padding='VALID', biases_initializer=None)\n",
"        # The second convolution uses a 4x4 kernel with stride 2 and produces 64 activation maps.\n",
"        # Output size is 9x9x64.\n",
"        self.conv2 = tf.contrib.layers.convolution2d( \\\n",
"            inputs=self.conv1,num_outputs=64,kernel_size=[4,4],stride=[2,2],padding='VALID', biases_initializer=None)\n",
"        # The third convolution uses a 3x3 kernel with stride 1 and produces 64 activation maps.\n",
"        # Output size is 7x7x64.\n",
"        self.conv3 = tf.contrib.layers.convolution2d( \\\n",
"            inputs=self.conv2,num_outputs=64,kernel_size=[3,3],stride=[1,1],padding='VALID', biases_initializer=None)\n",
"        # The fourth convolution uses a 7x7 kernel with stride 1 and produces 512 activation maps.\n",
"        # Output size is 1x1x512.\n",
"        self.conv4 = tf.contrib.layers.convolution2d( \\\n",
"            inputs=self.conv3,num_outputs=512,kernel_size=[7,7],stride=[1,1],padding='VALID', biases_initializer=None)\n",
"        \n",
"        # Split the output of the last convolutional layer into two halves.\n",
"        # streamAC and streamVC are each 1x1x256.\n",
"        self.streamAC,self.streamVC = tf.split(3,2,self.conv4)\n",
"        # Flatten them. streamA and streamV each have 256 dimensions.\n",
"        self.streamA = tf.contrib.layers.flatten(self.streamAC)\n",
"        self.streamV = tf.contrib.layers.flatten(self.streamVC)\n",
"        # Weights that map the 256 features to A and V, respectively.\n",
"        self.AW = tf.Variable(tf.random_normal([256,env.actions]))\n",
"        self.VW = tf.Variable(tf.random_normal([256,1]))\n",
"        # Compute the advantage and value scores.\n",
"        self.Advantage = tf.matmul(self.streamA,self.AW)\n",
"        self.Value = tf.matmul(self.streamV,self.VW)\n",
"        \n",
"        # Add the value to the advantages after subtracting the mean advantage from them.\n",
"        self.Qout = self.Value + tf.sub(self.Advantage,tf.reduce_mean(self.Advantage,reduction_indices=1,keep_dims=True))\n",
"        # Use this to choose the action.\n",
"        self.predict = tf.argmax(self.Qout,1)\n",
"        \n",
"        # The loss is the mean of the squared differences between the target and predicted Q-values.\n",
"        # Placeholder for the target Q-values\n",
"        self.targetQ = tf.placeholder(shape=[None],dtype=tf.float32)\n",
"        # Placeholder for the actions\n",
"        self.actions = tf.placeholder(shape=[None],dtype=tf.int32)\n",
"        # One-hot encode the actions (tf.one_hot raised a GPU error on my machine, so I found and applied the following workaround).\n",
"        def one_hot_patch(x,depth):\n",
"            sparse_labels=tf.reshape(x,[-1,1])\n",
"            derived_size=tf.shape(sparse_labels)[0]\n",
"            indices=tf.reshape(tf.range(0,derived_size,1),[-1,1])\n",
"            concated=tf.concat(1,[indices,sparse_labels])\n",
"            outshape=tf.concat(0,[tf.reshape(derived_size,[1]),tf.reshape(depth,[1])])\n",
"            return tf.sparse_to_dense(concated, outshape,1.0,0.0)\n",
"        self.actions_onehot = one_hot_patch(self.actions,env.actions)\n",
"        \n",
"        # Select the Q-value of the action that was taken.\n",
"        # We want the action-th entry, but a tensor apparently cannot be used as an index here, so a one-hot mask is used instead (my guess).\n",
"        self.Q = tf.reduce_sum(tf.mul(self.Qout, self.actions_onehot), reduction_indices=1)\n",
"        \n",
"        # Squared error for each sample\n",
"        self.td_error = tf.square(self.targetQ - self.Q)\n",
"        # Loss: mean of the squared errors\n",
"        self.loss = tf.reduce_mean(self.td_error)\n",
"        # Optimizer: Adam\n",
"        self.trainer = tf.train.AdamOptimizer(learning_rate=0.0001)\n",
"        # Update op\n",
"        self.updateModel = self.trainer.minimize(self.loss)"
]
},
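{
"cell_type": "markdown",
"metadata": {},
"source": [
"The shape arithmetic and the dueling aggregation in the cell above can be checked without TensorFlow. The following is a minimal NumPy sketch, not the notebook's implementation: `conv_out_size` and `dueling_q` are hypothetical helper names introduced only for illustration, and the value and advantage arrays are made-up numbers. The layer sizes come out as 20, 9, 7 and 1, matching the comments in the network cell."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"import numpy as np\n",
"\n",
"def conv_out_size(size, kernel, stride):\n",
"    # Output width of a VALID (no padding) convolution.\n",
"    return (size - kernel) // stride + 1\n",
"\n",
"# Trace the 84x84 input through the four layers, given as (kernel, stride) pairs.\n",
"size = 84\n",
"sizes = []\n",
"for kernel, stride in [(8, 4), (4, 2), (3, 1), (7, 1)]:\n",
"    size = conv_out_size(size, kernel, stride)\n",
"    sizes.append(size)\n",
"print(sizes)\n",
"\n",
"def dueling_q(value, advantage):\n",
"    # Q = V + (A - mean(A)), with the mean taken over the action axis.\n",
"    return value + (advantage - advantage.mean(axis=1, keepdims=True))\n",
"\n",
"# Made-up example data: value has shape [batch, 1], advantage has shape [batch, actions].\n",
"value = np.array([[1.0], [2.0]])\n",
"advantage = np.array([[0.0, 1.0, 2.0, 3.0],\n",
"                      [1.0, 1.0, 1.0, 1.0]])\n",
"q = dueling_q(value, advantage)\n",
"\n",
"# Select the Q-value of the chosen action with a one-hot mask.\n",
"actions = np.array([3, 0])\n",
"onehot = np.eye(advantage.shape[1])[actions]\n",
"q_taken = (q * onehot).sum(axis=1)\n",
"print(q)\n",
"print(q_taken)"
]
},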
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Experience Replay\n",
"\n",
"This class stores experiences and draws random samples from them to send to the network."
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"class experience_buffer():\n",
"    def __init__(self, buffer_size = 50000):\n",
"        self.buffer = []\n",
"        self.buffer_size = buffer_size\n",
"    # If adding would exceed the buffer size, drop entries from the front first, then append.\n",
"    def add(self,experience):\n",
"        if len(self.buffer) + len(experience) >= self.buffer_size:\n",
"            self.buffer[0:(len(experience)+len(self.buffer))-self.buffer_size] = []\n",
"        self.buffer.extend(experience)\n",
"    # Draw `size` experiences at random and return them as a [size, 5] array.\n",
"    def sample(self,size):\n",
"        return np.reshape(np.array(random.sample(self.buffer,size)),[size,5])"
]
},
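{
"cell_type": "markdown",
"metadata": {},
"source": [
"For comparison, similar first-in-first-out behaviour can be sketched with `collections.deque`, which discards the oldest items automatically once `maxlen` is reached. This is an alternative under stated assumptions, not the class used in this notebook: `DequeBuffer` is a hypothetical name, and the five-field tuples below are placeholder data."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"import random\n",
"from collections import deque\n",
"\n",
"class DequeBuffer:\n",
"    def __init__(self, buffer_size=50000):\n",
"        # A deque drops items from the left once maxlen is exceeded.\n",
"        self.buffer = deque(maxlen=buffer_size)\n",
"\n",
"    def add(self, experiences):\n",
"        # experiences: an iterable of 5-field tuples.\n",
"        self.buffer.extend(experiences)\n",
"\n",
"    def sample(self, size):\n",
"        # Uniform sampling without replacement.\n",
"        return random.sample(list(self.buffer), size)\n",
"\n",
"# Placeholder experiences; only the three most recent fit in the buffer.\n",
"buf = DequeBuffer(buffer_size=3)\n",
"buf.add([(i, 0, 0.0, i + 1, False) for i in range(5)])\n",
"print(len(buf.buffer))\n",
"print(buf.buffer[0][0])\n",
"batch = buf.sample(2)\n",
"print(len(batch))"
]
},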
{
"cell_type": "markdown",
"metadata": {},
"source": [
"A simple function that reshapes our game frames into flat vectors."
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"def processState(states):\n",
"    return np.reshape(states,[21168])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"These functions update the parameters of the target network toward those of the primary network."
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"def updateTargetGraph(tfVars,tau):\n",
"    # tfVars: the trainable variables\n",
"    # tau: the rate at which the target network moves toward the primary network\n",
"    # Number of trainable variables\n",
"    total_vars = len(tfVars)\n",
"    \n",
"    # List that holds the update ops\n",
"    op_holder = []\n",
"    # The first half of the trainable variables belongs to the primary network, the second half to the target network\n",
"    for idx,var in enumerate(tfVars[0:int(total_vars/2)]):\n",
"        # Multiply the primary network's weight (first half) by tau,\n",
"        # multiply the target network's weight (second half) by 1-tau,\n",
"        # and assign the sum to the target network's variable.\n",
"        op_holder.append(tfVars[int(idx)+int(total_vars/2)].assign((var.value()*tau) \n",
"            + ((1-tau)*tfVars[int(idx)+int(total_vars/2)].value())))\n",
"    return op_holder\n",
"\n",
"def updateTarget(op_holder,sess):\n",
"    for op in op_holder:\n",
"        sess.run(op)"
]
},
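{
"cell_type": "markdown",
"metadata": {},
"source": [
"The update built by `updateTargetGraph` is a soft update: each target weight becomes `tau * primary + (1 - tau) * target`. A minimal NumPy sketch of that rule follows; `soft_update` and the two weight lists are hypothetical stand-ins for the TensorFlow variables. With `tau = 0.001` the target network tracks the primary network slowly: after 1000 updates toward fixed primary weights it has closed roughly 63% of the gap."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"import numpy as np\n",
"\n",
"def soft_update(primary_weights, target_weights, tau):\n",
"    # Move every target array a fraction tau of the way toward its primary counterpart.\n",
"    return [tau * p + (1.0 - tau) * t\n",
"            for p, t in zip(primary_weights, target_weights)]\n",
"\n",
"# Made-up weights: the primary network is held fixed, the target starts at zero.\n",
"primary = [np.ones((2, 2)), np.full(3, 4.0)]\n",
"target = [np.zeros((2, 2)), np.zeros(3)]\n",
"\n",
"tau = 0.001\n",
"for _ in range(1000):\n",
"    target = soft_update(primary, target, tau)\n",
"\n",
"# After n updates the remaining gap has shrunk by a factor of (1 - tau) ** n.\n",
"print(target[0][0, 0], 1.0 - (1.0 - tau) ** 1000)"
]
},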
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Training the network\n",
"Setting the training parameters"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"batch_size = 32 # How many experiences to use for each training step\n",
"update_freq = 4 # How often to perform a training step\n",
"y = .99 # Discount factor on the target Q-values\n",
"startE = 1 # Starting probability of a random action\n",
"endE = 0.1 # Final probability of a random action\n",
"anneling_steps = 10000. # Number of steps over which to reduce startE to endE\n",
"num_episodes = 10000 # Number of episodes to run\n",
"pre_train_steps = 10000 # Number of random-action steps before training begins\n",
"max_epLength = 50 # Maximum episode length (50 steps)\n",
"load_model = False # Whether to load a saved model\n",
"path = \"./dqn\" # Path where the model is saved\n",
"h_size = 512 # Size of the final convolutional layer before it is split into the advantage and value streams\n",
"tau = 0.001 # Rate at which the target network is updated toward the primary network"
]
},
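{
"cell_type": "markdown",
"metadata": {},
"source": [
"The exploration parameters above imply a linear annealing schedule. The training loop is not shown at this point, so the following is only a sketch under that assumption: `epsilon_at` is a hypothetical helper, and it assumes epsilon stays at `startE` during the `pre_train_steps` random steps and then drops by a constant amount per step until it reaches `endE`. This agrees with the epsilon column of the training log in the next cell's output, for example about 0.955 at step 10500."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"startE = 1.0\n",
"endE = 0.1\n",
"anneling_steps = 10000.\n",
"pre_train_steps = 10000\n",
"\n",
"# Amount by which epsilon shrinks on every step after pre-training.\n",
"step_drop = (startE - endE) / anneling_steps\n",
"\n",
"def epsilon_at(total_steps):\n",
"    # Constant during pre-training, then linear decay, clipped at endE.\n",
"    annealed = max(0, total_steps - pre_train_steps)\n",
"    return max(endE, startE - step_drop * annealed)\n",
"\n",
"for step in [0, 10000, 10500, 15000, 20000, 30000]:\n",
"    print(step, round(epsilon_at(step), 3))"
]
},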
{
"cell_type": "code",
"execution_count": 8,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Saved Model\n",
"500 2.4 1\n",
"1000 1.2 1\n",
"1500 1.7 1\n",
"2000 1.7 1\n",
"2500 1.3 1\n",
"3000 2.9 1\n",
"3500 1.5 1\n",
"4000 1.9 1\n",
"4500 2.9 1\n",
"5000 2.8 1\n",
"5500 3.1 1\n",
"6000 2.2 1\n",
"6500 2.1 1\n",
"7000 1.5 1\n",
"7500 1.3 1\n",
"8000 1.7 1\n",
"8500 2.6 1\n",
"9000 1.5 1\n",
"9500 2.9 1\n",
"10000 1.8 1\n",
"10500 1.2 0.9549999999999828\n",
"11000 1.1 0.9099999999999655\n",
"11500 1.6 0.8649999999999483\n",
"12000 1.3 0.819999999999931\n",
"12500 1.0 0.7749999999999138\n",
"13000 1.2 0.7299999999998965\n",
"13500 1.2 0.6849999999998793\n",
"14000 0.8 0.639999999999862\n",
"14500 1.3 0.5949999999998448\n",
"15000 1.5 0.5499999999998275\n",
"15500 1.5 0.5049999999998103\n",
"16000 2.7 0.4599999999998177\n",
"16500 1.3 0.41499999999982823\n",
"17000 1.4 0.36999999999983874\n",
"17500 1.3 0.32499999999984924\n",
"18000 0.7 0.27999999999985975\n",
"18500 0.6 0.23499999999986562\n",
"19000 1.2 0.18999999999986225\n",
"19500 0.9 0.14499999999985888\n",
"20000 0.4 0.09999999999985551\n",
"20500 0.3 0.09999999999985551\n",
"21000 1.1 0.09999999999985551\n",
"21500 1.8 0.09999999999985551\n",
"22000 -0.2 0.09999999999985551\n",
"22500 0.4 0.09999999999985551\n",
"23000 0.9 0.09999999999985551\n",
"23500 -0.4 0.09999999999985551\n",
"24000 1.5 0.09999999999985551\n",
"24500 1.5 0.09999999999985551\n",
"25000 1.2 0.09999999999985551\n",
"25500 1.1 0.09999999999985551\n",
"26000 1.3 0.09999999999985551\n",
"26500 1.0 0.09999999999985551\n",
"27000 0.7 0.09999999999985551\n",
"27500 0.9 0.09999999999985551\n",
"28000 0.5 0.09999999999985551\n",
"28500 1.6 0.09999999999985551\n",
"29000 0.2 0.09999999999985551\n",
"29500 0.9 0.09999999999985551\n",
"30000 1.3 0.09999999999985551\n",
"30500 1.6 0.09999999999985551\n",
"31000 1.8 0.09999999999985551\n",
"31500 0.1 0.09999999999985551\n",
"32000 1.7 0.09999999999985551\n",
"32500 1.6 0.09999999999985551\n",
"33000 2.5 0.09999999999985551\n",
"33500 1.0 0.09999999999985551\n",
"34000 1.9 0.09999999999985551\n",
"34500 1.0 0.09999999999985551\n",
"35000 1.8 0.09999999999985551\n",
"35500 0.4 0.09999999999985551\n",
"36000 1.1 0.09999999999985551\n",
"36500 0.7 0.09999999999985551\n",
"37000 0.6 0.09999999999985551\n",
"37500 1.0 0.09999999999985551\n",
"38000 2.6 0.09999999999985551\n",
"38500 1.9 0.09999999999985551\n",
"39000 1.5 0.09999999999985551\n",
"39500 0.7 0.09999999999985551\n",
"40000 1.4 0.09999999999985551\n",
"40500 0.4 0.09999999999985551\n",
"41000 1.6 0.09999999999985551\n",
"41500 2.3 0.09999999999985551\n",
"42000 1.7 0.09999999999985551\n",
"42500 1.4 0.09999999999985551\n",
"43000 2.2 0.09999999999985551\n",
"43500 2.8 0.09999999999985551\n",
"44000 1.4 0.09999999999985551\n",
"44500 1.9 0.09999999999985551\n",
"45000 1.4 0.09999999999985551\n",
"45500 1.6 0.09999999999985551\n",
"46000 1.5 0.09999999999985551\n",
"46500 2.1 0.09999999999985551\n",
"47000 1.6 0.09999999999985551\n",
"47500 2.3 0.09999999999985551\n",
"48000 0.6 0.09999999999985551\n",
"48500 1.9 0.09999999999985551\n",
"49000 0.9 0.09999999999985551\n",
"49500 1.7 0.09999999999985551\n",
"50000 1.7 0.09999999999985551\n",
"Saved Model\n",
"50500 3.4 0.09999999999985551\n",
"51000 2.3 0.09999999999985551\n",
"51500 1.9 0.09999999999985551\n",
"52000 1.0 0.09999999999985551\n",
"52500 0.6 0.09999999999985551\n",
"53000 1.8 0.09999999999985551\n",
"53500 1.5 0.09999999999985551\n",
"54000 1.6 0.09999999999985551\n",
"54500 1.7 0.09999999999985551\n",
"55000 1.5 0.09999999999985551\n",
"55500 2.5 0.09999999999985551\n",
"56000 2.2 0.09999999999985551\n",
"56500 2.1 0.09999999999985551\n",
"57000 2.2 0.09999999999985551\n",
"57500 2.1 0.09999999999985551\n",
"58000 1.2 0.09999999999985551\n",
"58500 2.0 0.09999999999985551\n",
"59000 3.4 0.09999999999985551\n",
"59500 2.1 0.09999999999985551\n",
"60000 1.7 0.09999999999985551\n",
"60500 1.5 0.09999999999985551\n",
"61000 1.1 0.09999999999985551\n",
"61500 2.5 0.09999999999985551\n",
"62000 1.3 0.09999999999985551\n",
"62500 1.2 0.09999999999985551\n",
"63000 3.9 0.09999999999985551\n",
"63500 3.0 0.09999999999985551\n",
"64000 2.6 0.09999999999985551\n",
"64500 1.9 0.09999999999985551\n",
"65000 2.0 0.09999999999985551\n",
"65500 2.7 0.09999999999985551\n",
"66000 3.6 0.09999999999985551\n",
"66500 3.7 0.09999999999985551\n",
"67000 3.2 0.09999999999985551\n",
"67500 3.9 0.09999999999985551\n",
"68000 4.0 0.09999999999985551\n",
"68500 1.4 0.09999999999985551\n",
"69000 3.9 0.09999999999985551\n",
"69500 2.8 0.09999999999985551\n",
"70000 3.4 0.09999999999985551\n",
"70500 2.4 0.09999999999985551\n",
"71000 2.8 0.09999999999985551\n",
"71500 3.0 0.09999999999985551\n",
"72000 2.9 0.09999999999985551\n",
"72500 1.4 0.09999999999985551\n",
"73000 2.6 0.09999999999985551\n",
"73500 2.9 0.09999999999985551\n",
"74000 2.8 0.09999999999985551\n",
"74500 3.3 0.09999999999985551\n",
"75000 1.4 0.09999999999985551\n",
"75500 2.5 0.09999999999985551\n",
"76000 2.9 0.09999999999985551\n",
"76500 4.8 0.09999999999985551\n",
"77000 4.8 0.09999999999985551\n",
"77500 2.9 0.09999999999985551\n",
"78000 3.1 0.09999999999985551\n",
"78500 3.0 0.09999999999985551\n",
"79000 2.6 0.09999999999985551\n",
"79500 4.4 0.09999999999985551\n",
"80000 2.7 0.09999999999985551\n",
"80500 5.5 0.09999999999985551\n",
"81000 3.8 0.09999999999985551\n",
"81500 5.7 0.09999999999985551\n",
"82000 5.1 0.09999999999985551\n",
"82500 4.3 0.09999999999985551\n",
"83000 3.9 0.09999999999985551\n",
"83500 5.4 0.09999999999985551\n",
"84000 3.8 0.09999999999985551\n",
"84500 6.8 0.09999999999985551\n",
"85000 5.1 0.09999999999985551\n",
"85500 3.8 0.09999999999985551\n",
"86000 7.9 0.09999999999985551\n",
"86500 8.3 0.09999999999985551\n",
"87000 6.3 0.09999999999985551\n",
"87500 5.7 0.09999999999985551\n",
"88000 5.3 0.09999999999985551\n",
"88500 4.5 0.09999999999985551\n",
"89000 6.2 0.09999999999985551\n",
"89500 7.8 0.09999999999985551\n",
"90000 6.7 0.09999999999985551\n",
"90500 10.0 0.09999999999985551\n",
"91000 9.5 0.09999999999985551\n",
"91500 10.6 0.09999999999985551\n",
"92000 10.1 0.09999999999985551\n",
"92500 8.8 0.09999999999985551\n",
"93000 11.0 0.09999999999985551\n",
"93500 7.3 0.09999999999985551\n",
"94000 8.0 0.09999999999985551\n",
"94500 11.7 0.09999999999985551\n",
"95000 12.9 0.09999999999985551\n",
"95500 10.2 0.09999999999985551\n",
"96000 11.4 0.09999999999985551\n",
"96500 11.6 0.09999999999985551\n",
"97000 9.7 0.09999999999985551\n",
"97500 11.6 0.09999999999985551\n",
"98000 16.5 0.09999999999985551\n",
"98500 14.2 0.09999999999985551\n",
"99000 8.8 0.09999999999985551\n",
"99500 15.1 0.09999999999985551\n",
"100000 9.8 0.09999999999985551\n",
"Saved Model\n",
"100500 10.4 0.09999999999985551\n",
"101000 11.3 0.09999999999985551\n",
"101500 10.4 0.09999999999985551\n",
"102000 15.5 0.09999999999985551\n",
"102500 14.8 0.09999999999985551\n",
"103000 14.1 0.09999999999985551\n",
"103500 17.2 0.09999999999985551\n",
"104000 13.2 0.09999999999985551\n",
"104500 11.0 0.09999999999985551\n",
"105000 18.2 0.09999999999985551\n",
"105500 18.0 0.09999999999985551\n",
"106000 15.0 0.09999999999985551\n",
"106500 12.6 0.09999999999985551\n",
"107000 13.8 0.09999999999985551\n",
"107500 16.2 0.09999999999985551\n",
"108000 16.7 0.09999999999985551\n",
"108500 19.1 0.09999999999985551\n",
"109000 16.2 0.09999999999985551\n",
"109500 12.6 0.09999999999985551\n",
"110000 16.3 0.09999999999985551\n",
"110500 18.2 0.09999999999985551\n",
"111000 17.9 0.09999999999985551\n",
"111500 19.4 0.09999999999985551\n",
"112000 14.8 0.09999999999985551\n",
"112500 16.0 0.09999999999985551\n",
"113000 15.3 0.09999999999985551\n",
"113500 14.5 0.09999999999985551\n",
"114000 13.9 0.09999999999985551\n",
"114500 17.6 0.09999999999985551\n",
"115000 18.5 0.09999999999985551\n",
"115500 17.0 0.09999999999985551\n",
"116000 18.3 0.09999999999985551\n",
"116500 13.3 0.09999999999985551\n",
"117000 18.1 0.09999999999985551\n",
"117500 19.2 0.09999999999985551\n",
"118000 17.7 0.09999999999985551\n",
"118500 12.9 0.09999999999985551\n",
"119000 17.6 0.09999999999985551\n",
"119500 16.8 0.09999999999985551\n",
"120000 20.3 0.09999999999985551\n",
"120500 15.0 0.09999999999985551\n",
"121000 17.0 0.09999999999985551\n",
"121500 14.7 0.09999999999985551\n",
"122000 17.4 0.09999999999985551\n",
"122500 16.7 0.09999999999985551\n",
"123000 19.1 0.09999999999985551\n",
"123500 19.4 0.09999999999985551\n",
"124000 19.9 0.09999999999985551\n",
"124500 19.6 0.09999999999985551\n",
"125000 20.8 0.09999999999985551\n",
"125500 21.5 0.09999999999985551\n",
"126000 18.6 0.09999999999985551\n",
"126500 16.6 0.09999999999985551\n",
"127000 19.7 0.09999999999985551\n",
"127500 15.3 0.09999999999985551\n",
"128000 22.0 0.09999999999985551\n",
"128500 17.0 0.09999999999985551\n",
"129000 18.5 0.09999999999985551\n",
"129500 19.0 0.09999999999985551\n",
"130000 19.4 0.09999999999985551\n",
"130500 19.7 0.09999999999985551\n",
"131000 18.4 0.09999999999985551\n",
"131500 21.6 0.09999999999985551\n",
"132000 20.2 0.09999999999985551\n",
"132500 18.4 0.09999999999985551\n",
"133000 19.4 0.09999999999985551\n",
"133500 18.8 0.09999999999985551\n",
"134000 21.4 0.09999999999985551\n",
"134500 20.4 0.09999999999985551\n",
"135000 17.9 0.09999999999985551\n",
"135500 20.3 0.09999999999985551\n",
"136000 18.2 0.09999999999985551\n",
"136500 19.9 0.09999999999985551\n",
"137000 22.0 0.09999999999985551\n",
"137500 18.7 0.09999999999985551\n",
"138000 15.6 0.09999999999985551\n",
"138500 19.0 0.09999999999985551\n",
"139000 18.7 0.09999999999985551\n",
"139500 19.0 0.09999999999985551\n",
"140000 20.1 0.09999999999985551\n",
"140500 19.2 0.09999999999985551\n",
"141000 20.5 0.09999999999985551\n",
"141500 21.5 0.09999999999985551\n",
"142000 21.6 0.09999999999985551\n",
"142500 20.4 0.09999999999985551\n",
"143000 19.8 0.09999999999985551\n",
"143500 22.6 0.09999999999985551\n",
"144000 19.7 0.09999999999985551\n",
"144500 20.8 0.09999999999985551\n",
"145000 19.9 0.09999999999985551\n",
"145500 20.2 0.09999999999985551\n",
"146000 20.9 0.09999999999985551\n",
"146500 22.3 0.09999999999985551\n",
"147000 18.6 0.09999999999985551\n",
"147500 21.3 0.09999999999985551\n",
"148000 19.4 0.09999999999985551\n",
"148500 21.1 0.09999999999985551\n",
"149000 20.0 0.09999999999985551\n",
"149500 21.0 0.09999999999985551\n",
"150000 20.1 0.09999999999985551\n",
"Saved Model\n",
"150500 21.1 0.09999999999985551\n",
"151000 21.4 0.09999999999985551\n",
"151500 19.5 0.09999999999985551\n",
"152000 21.8 0.09999999999985551\n",
"152500 19.2 0.09999999999985551\n",
"153000 18.2 0.09999999999985551\n",
"153500 22.9 0.09999999999985551\n",
"154000 19.8 0.09999999999985551\n",
"154500 22.1 0.09999999999985551\n",
"155000 21.7 0.09999999999985551\n",
"155500 17.1 0.09999999999985551\n",
"156000 21.1 0.09999999999985551\n",
"156500 21.8 0.09999999999985551\n",
"157000 20.8 0.09999999999985551\n",
"157500 20.7 0.09999999999985551\n",
"158000 19.2 0.09999999999985551\n",
"158500 19.6 0.09999999999985551\n",
"159000 21.3 0.09999999999985551\n",
"159500 21.6 0.09999999999985551\n",
"160000 19.6 0.09999999999985551\n",
"160500 19.1 0.09999999999985551\n",
"161000 22.2 0.09999999999985551\n",
"161500 21.6 0.09999999999985551\n",
"162000 18.8 0.09999999999985551\n",
"162500 20.3 0.09999999999985551\n",
"163000 19.6 0.09999999999985551\n",
"163500 22.7 0.09999999999985551\n",
"164000 18.8 0.09999999999985551\n",
"164500 19.5 0.09999999999985551\n",
"165000 20.3 0.09999999999985551\n",
"165500 20.5 0.09999999999985551\n",
"166000 22.8 0.09999999999985551\n",
"166500 21.3 0.09999999999985551\n",
"167000 23.2 0.09999999999985551\n",
"167500 23.5 0.09999999999985551\n",
"168000 19.5 0.09999999999985551\n",
"168500 22.6 0.09999999999985551\n",
"169000 19.4 0.09999999999985551\n",
"169500 18.8 0.09999999999985551\n",
"170000 21.1 0.09999999999985551\n",
"170500 20.8 0.09999999999985551\n",
"171000 20.1 0.09999999999985551\n",
"171500 22.8 0.09999999999985551\n",
"172000 22.0 0.09999999999985551\n",
"172500 20.5 0.09999999999985551\n",
"173000 20.6 0.09999999999985551\n",
"173500 21.0 0.09999999999985551\n",
"174000 23.5 0.09999999999985551\n",
"174500 16.0 0.09999999999985551\n",
"175000 17.1 0.09999999999985551\n",
"175500 22.0 0.09999999999985551\n",
"176000 21.8 0.09999999999985551\n",
"176500 20.2 0.09999999999985551\n",
"177000 23.2 0.09999999999985551\n",
"177500 20.7 0.09999999999985551\n",
"178000 21.4 0.09999999999985551\n",
"178500 23.6 0.09999999999985551\n",
"179000 21.8 0.09999999999985551\n",
"179500 22.9 0.09999999999985551\n",
"180000 21.8 0.09999999999985551\n",
"180500 21.1 0.09999999999985551\n",
"181000 22.3 0.09999999999985551\n",
"181500 22.2 0.09999999999985551\n",
"182000 20.6 0.09999999999985551\n",
"182500 19.6 0.09999999999985551\n",
"183000 22.1 0.09999999999985551\n",
"183500 18.7 0.09999999999985551\n",
"184000 21.0 0.09999999999985551\n",
"184500 19.0 0.09999999999985551\n",
"185000 22.0 0.09999999999985551\n",
"185500 22.8 0.09999999999985551\n",
"186000 22.2 0.09999999999985551\n",
"186500 23.4 0.09999999999985551\n",
"187000 23.0 0.09999999999985551\n",
"187500 21.1 0.09999999999985551\n",
"188000 22.3 0.09999999999985551\n",
"188500 23.4 0.09999999999985551\n",
"189000 19.6 0.09999999999985551\n",
"189500 19.4 0.09999999999985551\n",
"190000 21.6 0.09999999999985551\n",
"190500 20.3 0.09999999999985551\n",
"191000 18.9 0.09999999999985551\n",
"191500 21.8 0.09999999999985551\n",
"192000 21.4 0.09999999999985551\n",
"192500 21.1 0.09999999999985551\n",
"193000 18.5 0.09999999999985551\n",
"193500 23.8 0.09999999999985551\n",
"194000 21.9 0.09999999999985551\n",
"194500 17.4 0.09999999999985551\n",
"195000 20.5 0.09999999999985551\n",
"195500 18.3 0.09999999999985551\n",
"196000 22.6 0.09999999999985551\n",
"196500 22.8 0.09999999999985551\n",
"197000 22.9 0.09999999999985551\n",
"197500 20.5 0.09999999999985551\n",
"198000 18.2 0.09999999999985551\n",
"198500 22.8 0.09999999999985551\n",
"199000 22.4 0.09999999999985551\n",
"199500 19.1 0.09999999999985551\n",
"200000 19.9 0.09999999999985551\n",
"Saved Model\n",
"200500 19.5 0.09999999999985551\n",
"201000 20.3 0.09999999999985551\n",
"201500 19.2 0.09999999999985551\n",
"202000 21.6 0.09999999999985551\n",
"202500 21.5 0.09999999999985551\n",
"203000 21.7 0.09999999999985551\n",
"203500 21.2 0.09999999999985551\n",
"204000 20.1 0.09999999999985551\n",
"204500 22.8 0.09999999999985551\n",
"205000 22.2 0.09999999999985551\n",
"205500 22.1 0.09999999999985551\n",
"206000 21.8 0.09999999999985551\n",
"206500 17.7 0.09999999999985551\n",
"207000 20.5 0.09999999999985551\n",
"207500 23.3 0.09999999999985551\n",
"208000 18.7 0.09999999999985551\n",
"208500 22.3 0.09999999999985551\n",
"209000 22.0 0.09999999999985551\n",
"209500 20.5 0.09999999999985551\n",
"210000 22.7 0.09999999999985551\n",
"210500 24.6 0.09999999999985551\n",
"211000 22.4 0.09999999999985551\n",
"211500 20.4 0.09999999999985551\n",
"212000 22.7 0.09999999999985551\n",
"212500 21.9 0.09999999999985551\n",
"213000 18.2 0.09999999999985551\n",
"213500 19.8 0.09999999999985551\n",
"214000 19.8 0.09999999999985551\n",
"214500 23.7 0.09999999999985551\n",
"215000 21.8 0.09999999999985551\n",
"215500 19.3 0.09999999999985551\n",
"216000 22.3 0.09999999999985551\n",
"216500 22.5 0.09999999999985551\n",
"217000 22.7 0.09999999999985551\n",
"217500 20.1 0.09999999999985551\n",
"218000 22.1 0.09999999999985551\n",
"218500 20.7 0.09999999999985551\n",
"219000 21.8 0.09999999999985551\n",
"219500 21.7 0.09999999999985551\n",
"220000 18.9 0.09999999999985551\n",
"220500 18.6 0.09999999999985551\n",
"221000 22.2 0.09999999999985551\n",
"221500 20.8 0.09999999999985551\n",
"222000 22.0 0.09999999999985551\n",
"222500 21.9 0.09999999999985551\n",
"223000 19.7 0.09999999999985551\n",
"223500 21.5 0.09999999999985551\n",
"224000 20.0 0.09999999999985551\n",
"224500 22.2 0.09999999999985551\n",
"225000 23.8 0.09999999999985551\n",
"225500 21.9 0.09999999999985551\n",
"226000 23.0 0.09999999999985551\n",
"226500 20.5 0.09999999999985551\n",
"227000 21.5 0.09999999999985551\n",
"227500 23.8 0.09999999999985551\n",
"228000 19.6 0.09999999999985551\n",
"228500 21.1 0.09999999999985551\n",
"229000 18.8 0.09999999999985551\n",
"229500 21.6 0.09999999999985551\n",
"230000 22.7 0.09999999999985551\n",
"230500 18.7 0.09999999999985551\n",
"231000 22.9 0.09999999999985551\n",
"231500 21.2 0.09999999999985551\n",
"232000 19.9 0.09999999999985551\n",
"232500 20.3 0.09999999999985551\n",
"233000 23.2 0.09999999999985551\n",
"233500 16.6 0.09999999999985551\n",
"234000 21.5 0.09999999999985551\n",
"234500 21.4 0.09999999999985551\n",
"235000 20.9 0.09999999999985551\n",
"235500 21.8 0.09999999999985551\n",
"236000 20.7 0.09999999999985551\n",
"236500 22.9 0.09999999999985551\n",
"237000 23.1 0.09999999999985551\n",
"237500 21.3 0.09999999999985551\n",
"238000 23.0 0.09999999999985551\n",
"238500 21.7 0.09999999999985551\n",
"239000 22.5 0.09999999999985551\n",
"239500 20.8 0.09999999999985551\n",
"240000 20.6 0.09999999999985551\n",
"240500 25.3 0.09999999999985551\n",
"241000 22.7 0.09999999999985551\n",
"241500 23.5 0.09999999999985551\n",
"242000 20.6 0.09999999999985551\n",
"242500 18.2 0.09999999999985551\n",
"243000 21.2 0.09999999999985551\n",
"243500 22.4 0.09999999999985551\n",
"244000 22.1 0.09999999999985551\n",
"244500 18.2 0.09999999999985551\n",
"245000 20.5 0.09999999999985551\n",
"245500 20.5 0.09999999999985551\n",
"246000 21.5 0.09999999999985551\n",
"246500 22.5 0.09999999999985551\n",
"247000 22.9 0.09999999999985551\n",
"247500 22.6 0.09999999999985551\n",
"248000 19.5 0.09999999999985551\n",
"248500 21.4 0.09999999999985551\n",
"249000 22.5 0.09999999999985551\n",
"249500 22.1 0.09999999999985551\n",
"250000 22.2 0.09999999999985551\n",
"Saved Model\n",
"250500 17.8 0.09999999999985551\n",
"251000 22.2 0.09999999999985551\n",
"251500 22.3 0.09999999999985551\n",
"252000 22.9 0.09999999999985551\n",
"252500 20.9 0.09999999999985551\n",
"253000 20.5 0.09999999999985551\n",
"253500 21.3 0.09999999999985551\n",
"254000 19.0 0.09999999999985551\n",
"254500 20.6 0.09999999999985551\n",
"255000 21.2 0.09999999999985551\n",
"255500 20.9 0.09999999999985551\n",
"256000 23.2 0.09999999999985551\n",
"256500 20.1 0.09999999999985551\n",
"257000 21.5 0.09999999999985551\n",
"257500 22.1 0.09999999999985551\n",
"258000 21.4 0.09999999999985551\n",
"258500 19.8 0.09999999999985551\n",
"259000 22.8 0.09999999999985551\n",
"259500 21.1 0.09999999999985551\n",
"260000 22.1 0.09999999999985551\n",
"260500 20.4 0.09999999999985551\n",
"261000 24.3 0.09999999999985551\n",
"261500 23.3 0.09999999999985551\n",
"262000 22.1 0.09999999999985551\n",
"262500 22.6 0.09999999999985551\n",
"263000 19.6 0.09999999999985551\n",
"263500 21.9 0.09999999999985551\n",
"264000 21.3 0.09999999999985551\n",
"264500 19.2 0.09999999999985551\n",
"265000 19.7 0.09999999999985551\n",
"265500 22.2 0.09999999999985551\n",
"266000 22.0 0.09999999999985551\n",
"266500 20.6 0.09999999999985551\n",
"267000 21.4 0.09999999999985551\n",
"267500 23.3 0.09999999999985551\n",
"268000 21.4 0.09999999999985551\n",
"268500 23.1 0.09999999999985551\n",
"269000 23.7 0.09999999999985551\n",
"269500 21.9 0.09999999999985551\n",
"270000 22.2 0.09999999999985551\n",
"270500 21.8 0.09999999999985551\n",
"271000 22.9 0.09999999999985551\n",
"271500 23.2 0.09999999999985551\n",
"272000 24.2 0.09999999999985551\n",
"272500 22.3 0.09999999999985551\n",
"273000 22.7 0.09999999999985551\n",
"273500 22.8 0.09999999999985551\n",
"274000 22.2 0.09999999999985551\n",
"274500 20.8 0.09999999999985551\n",
"275000 18.6 0.09999999999985551\n",
"275500 20.8 0.09999999999985551\n",
"276000 21.0 0.09999999999985551\n",
"276500 20.4 0.09999999999985551\n",
"277000 23.4 0.09999999999985551\n",
"277500 21.3 0.09999999999985551\n",
"278000 18.5 0.09999999999985551\n",
"278500 20.2 0.09999999999985551\n",
"279000 21.3 0.09999999999985551\n",
"279500 24.1 0.09999999999985551\n",
"280000 21.8 0.09999999999985551\n",
"280500 23.6 0.09999999999985551\n",
"281000 23.7 0.09999999999985551\n",
"281500 20.1 0.09999999999985551\n",
"282000 21.2 0.09999999999985551\n",
"282500 21.5 0.09999999999985551\n",
"283000 22.4 0.09999999999985551\n",
"283500 22.4 0.09999999999985551\n",
"284000 22.9 0.09999999999985551\n",
"284500 20.6 0.09999999999985551\n",
"285000 19.9 0.09999999999985551\n",
"285500 20.0 0.09999999999985551\n",
"286000 22.8 0.09999999999985551\n",
"286500 21.3 0.09999999999985551\n",
"287000 19.8 0.09999999999985551\n",
"287500 23.5 0.09999999999985551\n",
"288000 21.6 0.09999999999985551\n",
"288500 20.4 0.09999999999985551\n",
"289000 21.7 0.09999999999985551\n",
"289500 21.1 0.09999999999985551\n",
"290000 22.5 0.09999999999985551\n",
"290500 25.1 0.09999999999985551\n",
"291000 18.7 0.09999999999985551\n",
"291500 22.6 0.09999999999985551\n",
"292000 22.4 0.09999999999985551\n",
"292500 21.4 0.09999999999985551\n",
"293000 21.6 0.09999999999985551\n",
"293500 22.2 0.09999999999985551\n",
"294000 24.0 0.09999999999985551\n",
"294500 20.3 0.09999999999985551\n",
"295000 20.0 0.09999999999985551\n",
"295500 18.9 0.09999999999985551\n",
"296000 21.3 0.09999999999985551\n",
"296500 22.6 0.09999999999985551\n",
"297000 22.7 0.09999999999985551\n",
"297500 22.3 0.09999999999985551\n",
"298000 22.7 0.09999999999985551\n",
"298500 22.9 0.09999999999985551\n",
"299000 21.5 0.09999999999985551\n",
"299500 22.9 0.09999999999985551\n",
"300000 22.3 0.09999999999985551\n",
"Saved Model\n",
"300500 24.3 0.09999999999985551\n",
"301000 21.5 0.09999999999985551\n",
"301500 23.2 0.09999999999985551\n",
"302000 20.5 0.09999999999985551\n",
"302500 22.8 0.09999999999985551\n",
"303000 22.8 0.09999999999985551\n",
"303500 20.1 0.09999999999985551\n",
"304000 21.8 0.09999999999985551\n",
"304500 21.4 0.09999999999985551\n",
"305000 24.0 0.09999999999985551\n",
"305500 22.1 0.09999999999985551\n",
"306000 22.5 0.09999999999985551\n",
"306500 21.8 0.09999999999985551\n",
"307000 20.4 0.09999999999985551\n",
"307500 22.3 0.09999999999985551\n",
"308000 22.0 0.09999999999985551\n",
"308500 21.0 0.09999999999985551\n",
"309000 23.6 0.09999999999985551\n",
"309500 22.0 0.09999999999985551\n",
"310000 24.0 0.09999999999985551\n",
"310500 19.1 0.09999999999985551\n",
"311000 22.6 0.09999999999985551\n",
"311500 22.6 0.09999999999985551\n",
"312000 22.1 0.09999999999985551\n",
"312500 23.5 0.09999999999985551\n",
"313000 23.4 0.09999999999985551\n",
"313500 22.9 0.09999999999985551\n",
"314000 22.9 0.09999999999985551\n",
"314500 20.3 0.09999999999985551\n",
"315000 21.8 0.09999999999985551\n",
"315500 19.1 0.09999999999985551\n",
"316000 24.2 0.09999999999985551\n",
"316500 21.6 0.09999999999985551\n",
"317000 21.6 0.09999999999985551\n",
"317500 23.3 0.09999999999985551\n",
"318000 20.7 0.09999999999985551\n",
"318500 20.6 0.09999999999985551\n",
"319000 20.9 0.09999999999985551\n",
"319500 21.9 0.09999999999985551\n",
"320000 22.0 0.09999999999985551\n",
"320500 21.8 0.09999999999985551\n",
"321000 20.7 0.09999999999985551\n",
"321500 22.0 0.09999999999985551\n",
"322000 21.7 0.09999999999985551\n",
"322500 15.4 0.09999999999985551\n",
"323000 22.2 0.09999999999985551\n",
"323500 21.9 0.09999999999985551\n",
"324000 22.4 0.09999999999985551\n",
"324500 21.4 0.09999999999985551\n",
"325000 22.2 0.09999999999985551\n",
"325500 21.0 0.09999999999985551\n",
"326000 24.1 0.09999999999985551\n",
"326500 22.4 0.09999999999985551\n",
"327000 21.5 0.09999999999985551\n",
"327500 22.6 0.09999999999985551\n",
"328000 20.3 0.09999999999985551\n",
"328500 23.5 0.09999999999985551\n",
"329000 20.8 0.09999999999985551\n",
"329500 22.5 0.09999999999985551\n",
"330000 21.3 0.09999999999985551\n",
"330500 22.4 0.09999999999985551\n",
"331000 23.3 0.09999999999985551\n",
"331500 24.0 0.09999999999985551\n",
"332000 21.3 0.09999999999985551\n",
"332500 20.0 0.09999999999985551\n",
"333000 22.8 0.09999999999985551\n",
"333500 22.3 0.09999999999985551\n",
"334000 22.0 0.09999999999985551\n",
"334500 22.9 0.09999999999985551\n",
"335000 20.6 0.09999999999985551\n",
"335500 20.6 0.09999999999985551\n",
"336000 19.7 0.09999999999985551\n",
"336500 19.3 0.09999999999985551\n",
"337000 21.4 0.09999999999985551\n",
"337500 19.9 0.09999999999985551\n",
"338000 21.8 0.09999999999985551\n",
"338500 22.1 0.09999999999985551\n",
"339000 22.3 0.09999999999985551\n",
"339500 21.8 0.09999999999985551\n",
"340000 20.6 0.09999999999985551\n",
"340500 23.3 0.09999999999985551\n",
"341000 21.8 0.09999999999985551\n",
"341500 24.0 0.09999999999985551\n",
"342000 22.5 0.09999999999985551\n",
"342500 23.4 0.09999999999985551\n",
"343000 22.3 0.09999999999985551\n",
"343500 23.1 0.09999999999985551\n",
"344000 23.1 0.09999999999985551\n",
"344500 23.5 0.09999999999985551\n",
"345000 21.0 0.09999999999985551\n",
"345500 25.5 0.09999999999985551\n",
"346000 22.1 0.09999999999985551\n",
"346500 21.7 0.09999999999985551\n",
"347000 23.3 0.09999999999985551\n",
"347500 23.7 0.09999999999985551\n",
"348000 20.9 0.09999999999985551\n",
"348500 25.9 0.09999999999985551\n",
"349000 23.7 0.09999999999985551\n",
"349500 22.7 0.09999999999985551\n",
"350000 20.5 0.09999999999985551\n",
"Saved Model\n",
"350500 21.9 0.09999999999985551\n",
"351000 21.9 0.09999999999985551\n",
"351500 20.3 0.09999999999985551\n",
"352000 24.5 0.09999999999985551\n",
"352500 19.5 0.09999999999985551\n",
"353000 23.8 0.09999999999985551\n",
"353500 24.0 0.09999999999985551\n",
"354000 19.6 0.09999999999985551\n",
"354500 22.6 0.09999999999985551\n",
"355000 21.3 0.09999999999985551\n",
"355500 19.9 0.09999999999985551\n",
"356000 22.0 0.09999999999985551\n",
"356500 22.2 0.09999999999985551\n",
"357000 21.5 0.09999999999985551\n",
"357500 20.7 0.09999999999985551\n",
"358000 20.4 0.09999999999985551\n",
"358500 21.5 0.09999999999985551\n",
"359000 19.6 0.09999999999985551\n",
"359500 23.1 0.09999999999985551\n",
"360000 20.7 0.09999999999985551\n",
"360500 21.1 0.09999999999985551\n",
"361000 20.9 0.09999999999985551\n",
"361500 18.7 0.09999999999985551\n",
"362000 22.5 0.09999999999985551\n",
"362500 23.5 0.09999999999985551\n",
"363000 22.7 0.09999999999985551\n",
"363500 22.0 0.09999999999985551\n",
"364000 22.3 0.09999999999985551\n",
"364500 23.5 0.09999999999985551\n",
"365000 20.3 0.09999999999985551\n",
"365500 22.0 0.09999999999985551\n",
"366000 20.1 0.09999999999985551\n",
"366500 23.8 0.09999999999985551\n",
"367000 23.6 0.09999999999985551\n",
"367500 22.5 0.09999999999985551\n",
"368000 22.6 0.09999999999985551\n",
"368500 23.6 0.09999999999985551\n",
"369000 21.4 0.09999999999985551\n",
"369500 21.7 0.09999999999985551\n",
"370000 21.1 0.09999999999985551\n",
"370500 24.3 0.09999999999985551\n",
"371000 24.0 0.09999999999985551\n",
"371500 23.8 0.09999999999985551\n",
"372000 23.5 0.09999999999985551\n",
"372500 24.0 0.09999999999985551\n",
"373000 24.1 0.09999999999985551\n",
"373500 21.9 0.09999999999985551\n",
"374000 22.8 0.09999999999985551\n",
"374500 22.7 0.09999999999985551\n",
"375000 23.5 0.09999999999985551\n",
"375500 19.9 0.09999999999985551\n",
"376000 23.3 0.09999999999985551\n",
"376500 19.7 0.09999999999985551\n",
"377000 20.6 0.09999999999985551\n",
"377500 23.0 0.09999999999985551\n",
"378000 23.1 0.09999999999985551\n",
"378500 23.4 0.09999999999985551\n",
"379000 22.8 0.09999999999985551\n",
"379500 22.9 0.09999999999985551\n",
"380000 19.6 0.09999999999985551\n",
"380500 20.4 0.09999999999985551\n",
"381000 18.6 0.09999999999985551\n",
"381500 23.2 0.09999999999985551\n",
"382000 21.1 0.09999999999985551\n",
"382500 24.0 0.09999999999985551\n",
"383000 22.4 0.09999999999985551\n",
"383500 23.8 0.09999999999985551\n",
"384000 22.6 0.09999999999985551\n",
"384500 20.3 0.09999999999985551\n",
"385000 20.3 0.09999999999985551\n",
"385500 23.7 0.09999999999985551\n",
"386000 24.2 0.09999999999985551\n",
"386500 22.7 0.09999999999985551\n",
"387000 21.1 0.09999999999985551\n",
"387500 21.6 0.09999999999985551\n",
"388000 24.0 0.09999999999985551\n",
"388500 21.9 0.09999999999985551\n",
"389000 21.6 0.09999999999985551\n",
"389500 23.7 0.09999999999985551\n",
"390000 22.9 0.09999999999985551\n",
"390500 22.8 0.09999999999985551\n",
"391000 24.0 0.09999999999985551\n",
"391500 21.7 0.09999999999985551\n",
"392000 22.7 0.09999999999985551\n",
"392500 19.9 0.09999999999985551\n",
"393000 20.2 0.09999999999985551\n",
"393500 21.3 0.09999999999985551\n",
"394000 21.4 0.09999999999985551\n",
"394500 23.2 0.09999999999985551\n",
"395000 24.3 0.09999999999985551\n",
"395500 22.1 0.09999999999985551\n",
"396000 23.5 0.09999999999985551\n",
"396500 23.9 0.09999999999985551\n",
"397000 22.3 0.09999999999985551\n",
"397500 24.5 0.09999999999985551\n",
"398000 22.6 0.09999999999985551\n",
"398500 25.5 0.09999999999985551\n",
"399000 22.9 0.09999999999985551\n",
"399500 25.4 0.09999999999985551\n",
"400000 24.9 0.09999999999985551\n",
"Saved Model\n",
"400500 22.6 0.09999999999985551\n",
"401000 20.1 0.09999999999985551\n",
"401500 23.8 0.09999999999985551\n",
"402000 21.0 0.09999999999985551\n",
"402500 23.0 0.09999999999985551\n",
"403000 21.6 0.09999999999985551\n",
"403500 22.2 0.09999999999985551\n",
"404000 22.8 0.09999999999985551\n",
"404500 21.3 0.09999999999985551\n",
"405000 22.7 0.09999999999985551\n",
"405500 24.0 0.09999999999985551\n",
"406000 20.7 0.09999999999985551\n",
"406500 22.6 0.09999999999985551\n",
"407000 23.3 0.09999999999985551\n",
"407500 24.4 0.09999999999985551\n",
"408000 20.9 0.09999999999985551\n",
"408500 21.4 0.09999999999985551\n",
"409000 22.8 0.09999999999985551\n",
"409500 23.2 0.09999999999985551\n",
"410000 24.0 0.09999999999985551\n",
"410500 22.8 0.09999999999985551\n",
"411000 21.5 0.09999999999985551\n",
"411500 22.4 0.09999999999985551\n",
"412000 24.1 0.09999999999985551\n",
"412500 22.7 0.09999999999985551\n",
"413000 22.1 0.09999999999985551\n",
"413500 20.8 0.09999999999985551\n",
"414000 20.6 0.09999999999985551\n",
"414500 23.2 0.09999999999985551\n",
"415000 23.0 0.09999999999985551\n",
"415500 22.1 0.09999999999985551\n",
"416000 22.9 0.09999999999985551\n",
"416500 22.8 0.09999999999985551\n",
"417000 22.8 0.09999999999985551\n",
"417500 23.3 0.09999999999985551\n",
"418000 21.6 0.09999999999985551\n",
"418500 23.2 0.09999999999985551\n",
"419000 21.8 0.09999999999985551\n",
"419500 21.0 0.09999999999985551\n",
"420000 21.2 0.09999999999985551\n",
"420500 22.4 0.09999999999985551\n",
"421000 20.3 0.09999999999985551\n",
"421500 23.9 0.09999999999985551\n",
"422000 20.6 0.09999999999985551\n",
"422500 23.1 0.09999999999985551\n",
"423000 22.8 0.09999999999985551\n",
"423500 22.0 0.09999999999985551\n",
"424000 24.0 0.09999999999985551\n",
"424500 22.6 0.09999999999985551\n",
"425000 21.2 0.09999999999985551\n",
"425500 22.9 0.09999999999985551\n",
"426000 21.5 0.09999999999985551\n",
"426500 23.3 0.09999999999985551\n",
"427000 21.9 0.09999999999985551\n",
"427500 24.5 0.09999999999985551\n",
"428000 23.2 0.09999999999985551\n",
"428500 23.3 0.09999999999985551\n",
"429000 19.0 0.09999999999985551\n",
"429500 20.8 0.09999999999985551\n",
"430000 22.6 0.09999999999985551\n",
"430500 22.4 0.09999999999985551\n",
"431000 21.6 0.09999999999985551\n",
"431500 21.5 0.09999999999985551\n",
"432000 24.0 0.09999999999985551\n",
"432500 25.3 0.09999999999985551\n",
"433000 24.7 0.09999999999985551\n",
"433500 21.7 0.09999999999985551\n",
"434000 22.9 0.09999999999985551\n",
"434500 22.6 0.09999999999985551\n",
"435000 22.9 0.09999999999985551\n",
"435500 21.3 0.09999999999985551\n",
"436000 23.4 0.09999999999985551\n",
"436500 20.2 0.09999999999985551\n",
"437000 22.8 0.09999999999985551\n",
"437500 21.5 0.09999999999985551\n",
"438000 22.3 0.09999999999985551\n",
"438500 22.5 0.09999999999985551\n",
"439000 20.0 0.09999999999985551\n",
"439500 22.7 0.09999999999985551\n",
"440000 22.0 0.09999999999985551\n",
"440500 20.4 0.09999999999985551\n",
"441000 20.4 0.09999999999985551\n",
"441500 19.0 0.09999999999985551\n",
"442000 20.8 0.09999999999985551\n",
"442500 22.3 0.09999999999985551\n",
"443000 22.7 0.09999999999985551\n",
"443500 20.3 0.09999999999985551\n",
"444000 24.2 0.09999999999985551\n",
"444500 23.0 0.09999999999985551\n",
"445000 22.0 0.09999999999985551\n",
"445500 24.9 0.09999999999985551\n",
"446000 23.5 0.09999999999985551\n",
"446500 21.1 0.09999999999985551\n",
"447000 21.9 0.09999999999985551\n",
"447500 20.6 0.09999999999985551\n",
"448000 22.8 0.09999999999985551\n",
"448500 21.8 0.09999999999985551\n",
"449000 23.0 0.09999999999985551\n",
"449500 22.9 0.09999999999985551\n",
"450000 22.1 0.09999999999985551\n",
"Saved Model\n",
"450500 20.5 0.09999999999985551\n",
"451000 20.5 0.09999999999985551\n",
"451500 23.2 0.09999999999985551\n",
"452000 22.8 0.09999999999985551\n",
"452500 23.7 0.09999999999985551\n",
"453000 22.9 0.09999999999985551\n",
"453500 22.3 0.09999999999985551\n",
"454000 22.8 0.09999999999985551\n",
"454500 21.6 0.09999999999985551\n",
"455000 23.6 0.09999999999985551\n",
"455500 22.8 0.09999999999985551\n",
"456000 21.5 0.09999999999985551\n",
"456500 22.5 0.09999999999985551\n",
"457000 23.9 0.09999999999985551\n",
"457500 22.4 0.09999999999985551\n",
"458000 23.4 0.09999999999985551\n",
"458500 20.9 0.09999999999985551\n",
"459000 23.8 0.09999999999985551\n",
"459500 24.3 0.09999999999985551\n",
"460000 23.1 0.09999999999985551\n",
"460500 20.4 0.09999999999985551\n",
"461000 23.0 0.09999999999985551\n",
"461500 21.1 0.09999999999985551\n",
"462000 19.4 0.09999999999985551\n",
"462500 21.7 0.09999999999985551\n",
"463000 22.9 0.09999999999985551\n",
"463500 20.6 0.09999999999985551\n",
"464000 20.9 0.09999999999985551\n",
"464500 23.4 0.09999999999985551\n",
"465000 24.1 0.09999999999985551\n",
"465500 22.1 0.09999999999985551\n",
"466000 20.7 0.09999999999985551\n",
"466500 22.9 0.09999999999985551\n",
"467000 20.7 0.09999999999985551\n",
"467500 21.7 0.09999999999985551\n",
"468000 22.1 0.09999999999985551\n",
"468500 23.3 0.09999999999985551\n",
"469000 21.1 0.09999999999985551\n",
"469500 22.7 0.09999999999985551\n",
"470000 23.1 0.09999999999985551\n",
"470500 24.0 0.09999999999985551\n",
"471000 24.0 0.09999999999985551\n",
"471500 22.3 0.09999999999985551\n",
"472000 22.2 0.09999999999985551\n",
"472500 23.1 0.09999999999985551\n",
"473000 21.1 0.09999999999985551\n",
"473500 22.4 0.09999999999985551\n",
"474000 22.2 0.09999999999985551\n",
"474500 21.9 0.09999999999985551\n",
"475000 21.2 0.09999999999985551\n",
"475500 20.7 0.09999999999985551\n",
"476000 23.2 0.09999999999985551\n",
"476500 23.0 0.09999999999985551\n",
"477000 20.4 0.09999999999985551\n",
"477500 22.6 0.09999999999985551\n",
"478000 24.5 0.09999999999985551\n",
"478500 22.6 0.09999999999985551\n",
"479000 23.1 0.09999999999985551\n",
"479500 23.1 0.09999999999985551\n",
"480000 21.5 0.09999999999985551\n",
"480500 23.1 0.09999999999985551\n",
"481000 19.6 0.09999999999985551\n",
"481500 23.9 0.09999999999985551\n",
"482000 22.6 0.09999999999985551\n",
"482500 23.3 0.09999999999985551\n",
"483000 20.9 0.09999999999985551\n",
"483500 23.5 0.09999999999985551\n",
"484000 22.6 0.09999999999985551\n",
"484500 24.4 0.09999999999985551\n",
"485000 23.1 0.09999999999985551\n",
"485500 22.1 0.09999999999985551\n",
"486000 23.2 0.09999999999985551\n",
"486500 23.1 0.09999999999985551\n",
"487000 21.5 0.09999999999985551\n",
"487500 22.1 0.09999999999985551\n",
"488000 21.8 0.09999999999985551\n",
"488500 23.0 0.09999999999985551\n",
"489000 22.0 0.09999999999985551\n",
"489500 24.6 0.09999999999985551\n",
"490000 24.2 0.09999999999985551\n",
"490500 23.2 0.09999999999985551\n",
"491000 23.1 0.09999999999985551\n",
"491500 20.2 0.09999999999985551\n",
"492000 21.5 0.09999999999985551\n",
"492500 21.0 0.09999999999985551\n",
"493000 23.0 0.09999999999985551\n",
"493500 21.6 0.09999999999985551\n",
"494000 22.3 0.09999999999985551\n",
"494500 20.8 0.09999999999985551\n",
"495000 23.6 0.09999999999985551\n",
"495500 23.5 0.09999999999985551\n",
"496000 24.9 0.09999999999985551\n",
"496500 22.8 0.09999999999985551\n",
"497000 21.9 0.09999999999985551\n",
"497500 22.5 0.09999999999985551\n",
"498000 21.6 0.09999999999985551\n",
"498500 21.6 0.09999999999985551\n",
"499000 24.6 0.09999999999985551\n",
"499500 22.8 0.09999999999985551\n",
"500000 24.2 0.09999999999985551\n",
"Percent of succesful episodes: 17.7144%\n"
]
}
],
"source": [
"# 그래프를 초기화한다\n",
"tf.reset_default_graph()\n",
"# 주요 신경망을 만든다\n",
"mainQN = Qnetwork(h_size)\n",
"# 타겟 신경망을 만든다.\n",
"targetQN = Qnetwork(h_size)\n",
"\n",
"# 변수들을 초기화한다\n",
"init = tf.global_variables_initializer()\n",
"\n",
"# saver를 만든다\n",
"saver = tf.train.Saver()\n",
"\n",
"# 학습가능한 변수를 꺼낸다\n",
"trainables = tf.trainable_variables()\n",
"\n",
"# 타겟 신경망을 업데이트하기 위한 값을 만든다\n",
"targetOps = updateTargetGraph(trainables,tau)\n",
"\n",
"# 경험을 저장할 장소\n",
"myBuffer = experience_buffer()\n",
"\n",
"# 무작위 행위 확률을 설정한다\n",
"e = startE\n",
"# 점점 줄여나간다.\n",
"stepDrop = (startE - endE)/anneling_steps\n",
"\n",
"# 에피소드별 총 보상과 걸음을 저장할 리스트를 만든다\n",
"jList = []\n",
"rList = []\n",
"total_steps = 0\n",
"\n",
"# 모델을 세이브할 장소를 만든다.\n",
"if not os.path.exists(path):\n",
" os.makedirs(path)\n",
"\n",
"# 텐서플로 세션을 연다\n",
"with tf.Session() as sess:\n",
" # 모델을 불러올지 체크\n",
" if load_model == True:\n",
" print ('Loading Model...')\n",
" # 모델을 불러온다\n",
" ckpt = tf.train.get_checkpoint_state(path)\n",
" saver.restore(sess,ckpt.model_checkpoint_path)\n",
" # 변수를 초기화한다.\n",
" sess.run(init)\n",
" # 주요 신경망과 동일하게 타겟 신경망을 설정한다\n",
" updateTarget(targetOps,sess) \n",
" # 에피소드 시작\n",
" for i in range(num_episodes):\n",
" # 에피소드별 경험 버퍼를 초기화한다\n",
" episodeBuffer = experience_buffer()\n",
" \n",
" # 환경과 처음 상태을 초기화한다\n",
" s = env.reset()\n",
" s = processState(s)\n",
" # 종료 여부\n",
" d = False\n",
" # 보상\n",
" rAll = 0\n",
" # 걸음\n",
" j = 0\n",
" # Q-Network\n",
" # 만약 50 걸음보다 더 간다면 종료한다.\n",
" while j < max_epLength:\n",
" j+=1\n",
" # Q-network 로부터 행동을 greedy 하게 선택하거나 e의 확률로 무작위 행동을 한다\n",
" if np.random.rand(1) < e or total_steps < pre_train_steps:\n",
" a = np.random.randint(0,4)\n",
" else:\n",
" # 신경망을 통해 Q 값을 가져오는 부분\n",
" a = sess.run(mainQN.predict,feed_dict={mainQN.scalarInput:[s]})[0]\n",
" \n",
" # 주어진 행동을 실행하고 다음 상태, 보상, 종료 여부를 가져옴\n",
" s1,r,d = env.step(a)\n",
" # 상태를 다시 21168 차원으로 리사이즈\n",
" s1 = processState(s1)\n",
" # 걸음수를 늘림\n",
" total_steps += 1\n",
" # 버퍼에 현재 상태, 행동, 보상, 다음 상태, 종료 여부를 저장한다\n",
" episodeBuffer.add(np.reshape(np.array([s,a,r,s1,d]),[1,5])) \n",
" \n",
" # 무작위 행동의 수를 넘으면 시작\n",
" if total_steps > pre_train_steps:\n",
" # 무작위 확률 값을 줄인다\n",
" if e > endE:\n",
" e -= stepDrop\n",
" \n",
" # 총 걸음이 업데이트 수로 나누어 떨어지면 시작\n",
" if total_steps % (update_freq) == 0:\n",
" # 경험으로부터 랜덤한 배치를 뽑는다\n",
" trainBatch = myBuffer.sample(batch_size) \n",
" # 아래는 target Q-value를 업데이트하는 Double-DQN을 수행한다\n",
" # 주요 신경망에서 행동을 고른다.\n",
" Q1 = sess.run(mainQN.predict,feed_dict={mainQN.scalarInput:np.vstack(trainBatch[:,3])})\n",
" # 타겟 신경망에서 Q 값들을 얻는다.\n",
" Q2 = sess.run(targetQN.Qout,feed_dict={targetQN.scalarInput:np.vstack(trainBatch[:,3])})\n",
" # 종료 여부에 따라 가짜 라벨을 만들어준다\n",
" end_multiplier = -(trainBatch[:,4] - 1)\n",
" # 타겟 신경망의 Q 값들 중에 주요 신경망에서 고른 행동 번째의 Q 값들을 가져온다.(이부분이 doubleQ)\n",
" doubleQ = Q2[range(batch_size),Q1]\n",
" # 보상에 대한 더블 Q 값을 더해준다. y는 할인 인자\n",
" # targetQ 는 즉각적인 보상 + 다음 상태의 최대 보상(doubleQ)\n",
" targetQ = trainBatch[:,2] + (y*doubleQ * end_multiplier)\n",
" # 우리의 타겟 값들과 함께 신경망을 업데이트해준다.\n",
" # 행동들에 대해서 targetQ 값과의 차이를 통해 손실을 구하고 업데이트\n",
" _ = sess.run(mainQN.updateModel, \\\n",
" feed_dict={mainQN.scalarInput:np.vstack(trainBatch[:,0]),mainQN.targetQ:targetQ, mainQN.actions:trainBatch[:,1]})\n",
" \n",
" # 주요 신경망과 동일하게 타겟 신경망을 설정한다.\n",
" # 물론 주요 신경망의 값은 tau 만큼만이 반영된다.\n",
" updateTarget(targetOps,sess) \n",
" # 총 보상\n",
" rAll += r\n",
" # 상태를 바꾼다.\n",
" s = s1\n",
" \n",
" # 종료가 되면 멈춘다.\n",
" if d == True:\n",
"\n",
" break\n",
" \n",
" # 이 에피소드로부터의 모든 경험을 저장한다\n",
" myBuffer.add(episodeBuffer.buffer)\n",
" # 걸음을 저장한다.\n",
" jList.append(j)\n",
" # 보상을 저장한다\n",
" rList.append(rAll)\n",
" # 주기적으로 모델을 저장한다\n",
" if i % 1000 == 0:\n",
" saver.save(sess,path+'/model-'+str(i)+'.cptk')\n",
" print (\"Saved Model\")\n",
" # 최근 10 에피소드의 평균 보상값을 나타낸다.\n",
" if len(rList) % 10 == 0:\n",
" print (total_steps,np.mean(rList[-10:]), e)\n",
" # 모델을 저장한다.\n",
" saver.save(sess,path+'/model-'+str(i)+'.cptk')\n",
"# 성공확률을 표시\n",
"print (\"Percent of succesful episodes: \" + str(sum(rList)/num_episodes) + \"%\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": []
}
],
"metadata": {
"anaconda-cloud": {},
"kernelspec": {
"display_name": "Python [conda root]",
"language": "python",
"name": "conda-root-py"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.5.2"
}
},
"nbformat": 4,
"nbformat_minor": 1
}