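Takahiro Kubo: Python で学ぶ強化学習 (Reinforcement Learning with Python)

Created: 2025-03-16

Last updated:

Overview

Quoting from the preface ("はじめに"):

(preceding text omitted) This book is written for engineers with some programming experience, with the aim of enabling them to learn reinforcement learning and, beyond that, to "apply it in practice." (following text omitted)

Failure at environment setup

I tried it right away, but failed at the step on p. 5 that creates the environment (a virtual environment) for running the sample code. My machine runs Windows 11.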

> conda create -n rl-book python=3.6
> activate rl-book
(rl-book) > pip install -r requirements.txt
(output omitted)
Collecting qtpy
Downloading QtPy-2.0.1-py3-none-any.whl (65 kB)
|████████████████████████████████| 65 kB 2.0 MB/s
Building wheels for collected packages: gym, termcolor, pywinpty
Building wheel for gym (setup.py) ... done
Created wheel for gym: filename=gym-0.14.0-py3-none-any.whl size=1637523
sha256=57509a54367f123aec8852f27e1a4f9507a9531c52ce0cfbc0dd77a97a57e979
Stored in directory:
c:\users\username\appdata\local\pip\cache\wheels\87\de\93\2eb31d6f3ee17bf493a9511ce20cf48cf629daab859cd9ae3a
Building wheel for termcolor (setup.py) ... done
Created wheel for termcolor: filename=termcolor-1.1.0-py3-none-any.whl size=4848
sha256=bb6c5e3d6e8cc01e7abc8fdb042ed7f06b2b4083a77b8e5ec2b4095e10c89e3e
Stored in directory:
c:\users\username\appdata\local\pip\cache\wheels\93\2a\eb\e58dbcbc963549ee4f065ff80a59f274cc7210b6eab962acdc
Building wheel for pywinpty (PEP 517) ... error
ERROR: Command errored out with exit status 1:
command: 'C:\Users\username\.conda\envs\rl-book\python.exe'
'C:\Users\username\.conda\envs\rl-book\lib\site-packages\pip\_vendor\pep517\in_process\_in_process.py'
build_wheel
'C:\Users\username\AppData\Local\Temp\tmp2vly1r9f'
cwd: C:\Users\username\AppData\Local\Temp\pip-install-fxx12kv3\pywinpty_73d82e8435724ff8af68284fc8f741b2
Complete output (13 lines):
Running `maturin pep517 build-wheel -i C:\Users\username\.conda\envs\rl-book\python.exe --compatibility off`
error: package `windows v0.58.0` cannot be built because it requires rustc 1.70 or newer, while the currently active
rustc version is 1.69.0
Either upgrade to rustc 1.70 or newer, or use
cargo update -p windows@0.58.0 --precise ver
where `ver` is the latest version of `windows` supporting rustc 1.69.0
maturin failed
Caused by: Failed to build a native library through cargo
Caused by: Cargo build finished with "exit code: 101": `cargo rustc --manifest-path Cargo.toml --message-format json
--release --lib --`
Including license file "LICENSE.txt"
Building a mixed python/rust project
Found pyo3 bindings
Found CPython 3.6 at C:\Users\username\.conda\envs\rl-book\python.exe
Error: command ['maturin', 'pep517', 'build-wheel', '-i',
'C:\\Users\\username\\.conda\\envs\\rl-book\\python.exe',
'--compatibility', 'off'] returned non-zero exit status 1
----------------------------------------
ERROR: Failed building wheel for pywinpty
Successfully built gym termcolor
Failed to build pywinpty
ERROR: Could not build wheels for pywinpty which use PEP 517 and cannot be installed directly

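I did not really understand the error, so I decided to discard the environment and set it up again. This time I did not specify a Python version. Discarding the environment also removes pip, so I reinstalled it.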

(rl-book)> conda deactivate
> conda remove -n rl-book --all
> conda create -n rl-book
> conda activate rl-book
(rl-book)> conda install pip
(rl-book)> pip install -r requirements.txt
(output omitted)
Collecting pandas==0.24.2 (from -r requirements.txt (line 4))
Downloading pandas-0.24.2.tar.gz (11.8 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 11.8/11.8 MB 8.1 MB/s eta 0:00:00
Preparing metadata (setup.py) ... error
error: subprocess-exited-with-error

× python setup.py egg_info did not run successfully.
│ exit code: 1
╰─> [26 lines of output]
(remaining output omitted)

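In any case, it clearly did not work. As of April 2025 a revised second edition of this book is available, so it would probably be better to consult that, but I will struggle a little longer.

I discarded the environment once more and recreated rl-book for the third time.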

> conda remove -n rl-book --all

> conda create -n rl-book

In this state, rl-book has Python 3.13.

Now let's look at what requirements.txt contains.

gym==0.14.0
jupyter==1.0.0
numpy==1.16.4
pandas==0.24.2
scipy==1.3.0
scikit-learn==0.21.2
matplotlib==3.1.0
tensorflow==1.14.0
-e git+https://github.com/ntasfi/PyGame-Learning-Environment.git#egg=ple
-e git+https://github.com/lusob/gym-ple.git#egg=gym-ple
h5py==2.9.0
pygame==1.9.6
tqdm==4.32.1

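Of the packages required above, pandas seems to be where the installation failed. Is that because the version is pinned? The log shows pip downloading the source archive (pandas-0.24.2.tar.gz), which suggests that no prebuilt wheel of that version matched this Python and that pip tried to build it from source.

Next, how about installing the packages one at a time with conda?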
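To see at a glance which entries are pinned, a short script can split the requirements.txt quoted above into `name==version` pins and everything else. This is only a sketch: the helper name `parse_pins` is made up for illustration, and the script does not check whether a pinned version is installable on a given Python.

```python
import re

# Contents of requirements.txt as quoted above.
REQUIREMENTS = """\
gym==0.14.0
jupyter==1.0.0
numpy==1.16.4
pandas==0.24.2
scipy==1.3.0
scikit-learn==0.21.2
matplotlib==3.1.0
tensorflow==1.14.0
-e git+https://github.com/ntasfi/PyGame-Learning-Environment.git#egg=ple
-e git+https://github.com/lusob/gym-ple.git#egg=gym-ple
h5py==2.9.0
pygame==1.9.6
tqdm==4.32.1
"""


def parse_pins(text):
    """Hypothetical helper: return (pinned, other).

    pinned maps package name -> exact version for `name==version` lines;
    other collects every remaining non-empty line (e.g. `-e git+...`).
    """
    pinned, other = {}, []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        match = re.fullmatch(r"([A-Za-z0-9_.\-]+)==([^\s;]+)", line)
        if match:
            pinned[match.group(1)] = match.group(2)
        else:
            other.append(line)
    return pinned, other


pinned, other = parse_pins(REQUIREMENTS)
for name, version in pinned.items():
    print(f"{name:15s} pinned to {version}")
print("entries without an == pin:", len(other))
```

Every package except the two git-based entries is pinned to an exact release, so pip has no freedom to pick a newer version that supports the interpreter in the environment.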

(rl-book)> conda install gym
(output omitted)
  LibMambaUnsatisfiableError: Encountered problems while solving:
- package gym-0.21.0-py310hbbfc1a7_1 requires python >=3.10,<3.11.0a0, but none of the providers can be installed

So it seems best to start from a Python 3.10 environment.
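The solver message gives the range `>=3.10,<3.11.0a0` for that gym build. As a rough sketch, the same check can be made from inside an environment by comparing the major and minor version of the running interpreter against those bounds. The function name `in_supported_range` is hypothetical, and treating `<3.11.0a0` as "minor version below 3.11" is a simplification.

```python
import sys


def in_supported_range(version, lower=(3, 10), upper=(3, 11)):
    """Return True if lower <= (major, minor) < upper.

    The default bounds mirror the range quoted in the conda solver message;
    they are an assumption taken from that message, not from gym itself.
    """
    return lower <= tuple(version[:2]) < upper


# Check the interpreter that is running this script.
print("running Python", ".".join(map(str, sys.version_info[:3])))
print("inside 3.10 range:", in_supported_range(sys.version_info))
```

By this check, neither the Python 3.6 environment from the book nor the Python 3.13 environment falls inside the range.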

> conda create -n rl-book python=3.10
> conda activate rl-book
(rl-book)> conda install gym jupyter numpy pandas scipy scikit-learn matplotlib tensorflow h5py tqdm
(omitted)
(rl-book)> pip install pygame
(omitted)
(rl-book)> pip install -e git+https://github.com/ntasfi/PyGame-Learning-Environment.git#egg=ple
Obtaining ple from git+https://github.com/ntasfi/PyGame-Learning-Environment.git#egg=ple
Cloning https://github.com/ntasfi/PyGame-Learning-Environment.git to c:\users\username\src\ple
Running command git clone --filter=blob:none --quiet https://github.com/ntasfi/PyGame-Learning-Environment.git
'C:\Users\username\src\ple'
Resolved https://github.com/ntasfi/PyGame-Learning-Environment.git to commit 3dbe79dc0c35559bb441b9359948aabf9bb3d331
Preparing metadata (setup.py) ... done
Requirement already satisfied: numpy in c:\users\username\.conda\envs\rl-book\lib\site-packages (from ple) (1.26.4)
Requirement already satisfied: Pillow in c:\users\username\.conda\envs\rl-book\lib\site-packages (from ple) (11.1.0)
Installing collected packages: ple
DEPRECATION: Legacy editable install of ple from git+https://github.com/ntasfi/PyGame-Learning-Environment.git#egg=ple
(setup.py develop) is deprecated. pip 25.1 will enforce this behaviour change. A possible replacement is to add a
pyproject.toml or enable --use-pep517, and use setuptools >= 64. If the resulting installation is not behaving as
expected, try using --config-settings editable_mode=compat. Please consult the setuptools documentation for more
information. Discussion can be found at https://github.com/pypa/pip/issues/11457
Running setup.py develop for ple
Successfully installed ple
(rl-book)> pip install -e git+https://github.com/lusob/gym-ple.git#egg=gym-ple
Obtaining gym-ple from git+https://github.com/lusob/gym-ple.git#egg=gym-ple
Cloning https://github.com/lusob/gym-ple.git to c:\users\username\src\gym-ple
Running command git clone --filter=blob:none --quiet https://github.com/lusob/gym-ple.git 'C:\Users\username\src\gym-ple'
Resolved https://github.com/lusob/gym-ple.git to commit 7cedbf4e31be86f5ca2aae5c0dfd9d38825af64e
Preparing metadata (setup.py) ... done
Installing collected packages: gym-ple
DEPRECATION: Legacy editable install of gym-ple from git+https://github.com/lusob/gym-ple.git#egg=gym-ple (setup.py
develop) is deprecated. pip 25.1 will enforce this behaviour change. A possible replacement is to add a pyproject.toml
or enable --use-pep517, and use setuptools >= 64. If the resulting installation is not behaving as expected, try using
--config-settings editable_mode=compat. Please consult the setuptools documentation for more information. Discussion can
be found at https://github.com/pypa/pip/issues/11457
Running setup.py develop for gym-ple
Successfully installed gym-ple

Now, will it work?

(rl-book)> python welcome.py
couldn't import doomish
Couldn't import doom
C:\Users\username\.conda\envs\rl-book\lib\site-packages\gym\utils\passive_env_checker.py:195: UserWarning: WARN: The result
returned by `env.reset()` was not a tuple of the form `(obs, info)`, where `obs` is a observation and `info` is a
dictionary containing additional information. Actual type: `<class 'NoneType'>`
  logger.warn(
2025-04-26 22:44:38.241946: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized
with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical
operations: AVX2
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2025-04-26 22:44:38.245940: I tensorflow/core/common_runtime/process_util.cc:146] Creating new thread pool with
default inter op setting: 2. Tune using inter_op_parallelism_threads for best performance.
Traceback (most recent call last):
  File "C:\Users\username\Documents\baby-steps-of-rl-ja\welcome.py", line 36, in <module>
    welcome()
  File "C:\Users\username\Documents\baby-steps-of-rl-ja\welcome.py", line 18, in welcome
    brain.add(K.layers.Dense(num_action, input_shape=[np.prod(s.shape)],
AttributeError: 'NoneType' object has no attribute 'shape'

It does not look like it.
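The warning in the log says that `env.reset()` returned `None` rather than an `(obs, info)` tuple, and the traceback fails on `s.shape`. Assuming that `s` in welcome.py holds the result of `env.reset()` (the script itself is not shown here), a small guard could at least make the failure explicit. The helper below is a hypothetical sketch, not code from the book, and uses plain lists in place of real observations.

```python
def first_observation(reset_result):
    """Hypothetical helper: extract the observation from an env.reset() result.

    Accepts either a bare observation or an (obs, info) tuple whose second
    element is a dict, and fails loudly if the environment returned None.
    """
    if reset_result is None:
        raise RuntimeError(
            "env.reset() returned None; the environment wrapper may not be "
            "compatible with the installed gym version"
        )
    if (
        isinstance(reset_result, tuple)
        and len(reset_result) == 2
        and isinstance(reset_result[1], dict)
    ):
        return reset_result[0]  # (obs, info) style
    return reset_result  # bare observation style


# Stand-in values, not real environment output.
print(first_observation([0.0, 1.0]))
print(first_observation(([0.0, 1.0], {})))
```

This does not fix the incompatibility. It only replaces the `AttributeError` with a message that points at `env.reset()`.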

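Bibliographic information

Title	Python で学ぶ強化学習
Author	Takahiro Kubo (久保隆宏)
Publication date	February 25, 2019 (4th printing)
Publisher	Kodansha (講談社)
List price	2,800 yen (excluding tax)
Size	A5 (303 pp., 21 cm)
ISBN	978-4-06-514298-1
Notes	Borrowed from the Kawaguchi City Library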


MARUYAMA Satosi