Blog Archives

TensorFlow: How to Speed Up Eager Mode

20/2/2019

Out of the blue: do you use TensorFlow's Eager Mode (Eager Execution)? When I first tried it I thought it was convenient, but I gave it up because it was slow.

Here is a way to make that sluggish Eager Mode (Define by Run) faster than Graph Mode (Define and Run).

The article I referred to is "Tensorflow Eager vs PyTorch (Reinforcement Learning Edition)". Thank you, jack_ama <(_ _)>

It's easy! As you can see in nn99_eager.py below, all you have to do is add the single line @tf.contrib.eager.defun.
The results are as follows. (Re-measured with TensorFlow 1.13.1 on 3/3/2019.)

Processing time comparison (TensorFlow 1.13.1)
Mode	Processing time
Graph Mode	21.501 sec
Eager Mode (without @tf.contrib.eager.defun)	85.694 sec
Eager Mode (with @tf.contrib.eager.defun)	17.486 sec

Remarkably, Eager Mode (with @tf.contrib.eager.defun) came out about 1.2 times faster than Graph Mode.
The Graph Mode version used for comparison is nn99_graph.py from "TensorFlow: Learning Multiplication with a Neural Network".
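
As a quick check of the "about 1.2 times" figure, the ratios can be recomputed from the three timings in the table above. This is only arithmetic on the measured values; the dictionary keys are labels made up for this sketch.

```python
# Timings copied from the comparison table (TensorFlow 1.13.1), in seconds.
timings = {
    "graph": 21.501,
    "eager_without_defun": 85.694,
    "eager_with_defun": 17.486,
}

# Speed-up of the defun-decorated Eager Mode run relative to the other two runs.
vs_graph = timings["graph"] / timings["eager_with_defun"]
vs_plain_eager = timings["eager_without_defun"] / timings["eager_with_defun"]

print("vs Graph Mode:       {:.2f}x".format(vs_graph))
print("vs Eager (no defun): {:.2f}x".format(vs_plain_eager))
```

The first ratio is a little over 1.2, and the second shows that the decorator makes Eager Mode itself nearly five times faster.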

The program I wrote this time is shown below.

OS: Ubuntu 18.04
TensorFlow 1.13.1 (Eager Mode is faster with TensorFlow 1.12.0)

nn99_eager.py (with @tf.contrib.eager.defun)


import numpy as np
import tensorflow as tf
import time

# Switch TensorFlow 1.x to Eager Mode; this must run before any other TensorFlow call.
tf.enable_eager_execution()

class Model:
  units1 = 2
  units2 = 7
  units3 = 1
  epochs = 50001

  def __init__(self):
    self.W1 = tf.Variable(tf.random_uniform([self.units1, self.units2], -1.0, 1.0, tf.float32))
    self.b1 = tf.Variable(tf.random_uniform([self.units2], -1.0, 1.0, tf.float32))
    self.W2 = tf.Variable(tf.random_uniform([self.units2, self.units3], -1.0, 1.0, tf.float32))
    self.b2 = tf.Variable(tf.random_uniform([self.units3], -1.0, 1.0, tf.float32))
    #self.optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.001)
    self.optimizer = tf.train.AdamOptimizer(learning_rate=0.001, beta1=0.9, beta2=0.999, epsilon=1.0e-8)

  def model(self, x):
    return tf.matmul(self.f(tf.matmul(x, self.W1) + self.b1), self.W2) + self.b2

  def f(self, x):
    return tf.tanh(x)

  def loss(self, x, y):
    return tf.reduce_mean(tf.square(self.model(x) - y))

  def grad(self, x, y):
    # Record the forward pass on the tape so the loss can be differentiated.
    with tf.GradientTape() as tape:
      loss = self.loss(x, y)
    params = [self.W1, self.b1, self.W2, self.b2]
    return tape.gradient(loss, params)

  # This decorator is the one added line: it compiles optimize() into a graph function.
  @tf.contrib.eager.defun
  def optimize(self, x, y):
    grads = self.grad(x, y)
    params = [self.W1, self.b1, self.W2, self.b2]
    self.optimizer.apply_gradients(zip(grads, params))

  def train(self, x, y):
    start = time.time()
    for step in range(self.epochs):
      self.optimize(x, y)
      if step % 1000 == 0:
        loss = self.loss(x, y)
        print("Step={:5d}, Loss={:8.5f}".format(step, loss))
    stop = time.time()
    print("{:.3f}sec".format(stop - start))

  def predict(self, x):
    return self.model(x)[0].numpy()

model = Model()

x_train = tf.convert_to_tensor(np.array([[x1, x2] for x1 in range(1, 10, 1) for x2 in range(1, 10, 1)]), tf.float32)
y_train = tf.convert_to_tensor(np.array([[x1 * x2] for x1 in range(1, 10, 1) for x2 in range(1, 10, 1)]), tf.float32)

model.train(x_train, y_train)

for i in range(1, 21, 1):
  print("{:2d} * {:2d} -> {:7.3f}".format(i, i, model.predict([[float(i), float(i)]])[0]))

Goodbye, Graph Mode (^o^)/~

I love Eager Mode ♪ (Rei Kikukawa)

Finally, some bad news. Apparently @tf.contrib.eager.defun will not be supported in TensorFlow 2.0.
What are we supposed to do...

TensorFlow: Learning Multiplication with a Neural Network

18/2/2019

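This time, we train a 3-layer neural network on the multiplication table (products of the integers 1 to 9) so that it can multiply real numbers.
In other words, the goal is to build the function

    F(x1, x2) = x1 * x2

with a neural network.
The program is written with TensorFlow, and the hyperbolic tangent function tanh is used as the activation function.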

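    units1: number of units in the input layer (number of input variables)
    units2: number of units in the hidden layer
    units3: number of units in the output layer (number of output variables)
    epochs: number of gradient descent iterations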

For the model function of the 3-layer neural network and for W1, b1, W2, and b2, see "Summary of Formulas Used in Neural Networks".
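
To make the setup concrete without TensorFlow, here is a minimal pure-Python sketch of the training data and of the model function y = tanh(x W1 + b1) W2 + b2, as it appears in the program below. The helper names (make_training_data, init_params, forward, mse) are assumptions of this sketch and are not part of the original program.

```python
import math
import random

UNITS1, UNITS2, UNITS3 = 2, 7, 1  # same layer sizes as in the program

def make_training_data():
    """All 81 pairs (x1, x2) with x1, x2 in 1..9, and their products."""
    xs = [[float(x1), float(x2)] for x1 in range(1, 10) for x2 in range(1, 10)]
    ys = [[x[0] * x[1]] for x in xs]
    return xs, ys

def init_params(rng):
    """Uniform random initial values in [-1, 1], as in the program."""
    W1 = [[rng.uniform(-1.0, 1.0) for _ in range(UNITS2)] for _ in range(UNITS1)]
    b1 = [rng.uniform(-1.0, 1.0) for _ in range(UNITS2)]
    W2 = [[rng.uniform(-1.0, 1.0) for _ in range(UNITS3)] for _ in range(UNITS2)]
    b2 = [rng.uniform(-1.0, 1.0) for _ in range(UNITS3)]
    return W1, b1, W2, b2

def forward(x, params):
    """Model output for a single input row x = [x1, x2]."""
    W1, b1, W2, b2 = params
    hidden = [math.tanh(sum(x[i] * W1[i][j] for i in range(UNITS1)) + b1[j])
              for j in range(UNITS2)]
    return [sum(hidden[j] * W2[j][k] for j in range(UNITS2)) + b2[k]
            for k in range(UNITS3)]

def mse(xs, ys, params):
    """Mean squared error over the whole data set (the loss being minimized)."""
    errors = [(forward(x, params)[k] - y[k]) ** 2
              for x, y in zip(xs, ys) for k in range(UNITS3)]
    return sum(errors) / len(errors)

xs, ys = make_training_data()
params = init_params(random.Random(0))
print(len(xs), "samples; loss before training =", mse(xs, ys, params))
```

Training then means adjusting W1, b1, W2, and b2 so that this loss becomes small; the TensorFlow program leaves that part to its optimizer.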

The program I wrote this time is shown below.

OS: Ubuntu 18.04
TensorFlow 1.12.0

nn99_graph.py


import numpy as np
import tensorflow as tf
import time

class Model:
  units1 = 2
  units2 = 7
  units3 = 1
  epochs = 50001

  def __init__(self):
    self.W1 = tf.Variable(tf.random_uniform([self.units1, self.units2], -1.0, 1.0, tf.float32))
    self.b1 = tf.Variable(tf.random_uniform([self.units2], -1.0, 1.0, tf.float32))
    self.W2 = tf.Variable(tf.random_uniform([self.units2, self.units3], -1.0, 1.0, tf.float32))
    self.b2 = tf.Variable(tf.random_uniform([self.units3], -1.0, 1.0, tf.float32))
    self.xd = tf.placeholder(tf.float32, [None, self.units1])
    self.yd = tf.placeholder(tf.float32, [None, self.units3])
    #self.optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.001)
    self.optimizer = tf.train.AdamOptimizer(learning_rate=0.001, beta1=0.9, beta2=0.999, epsilon=1.0e-8)
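    # Build the graph once. Note that self.loss now holds the loss tensor,
    # replacing the loss() method on this instance.
    self.loss = self.loss()
    self.optimizer_op = self.optimize(self.loss)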

  def model(self, x):
    return tf.matmul(self.f(tf.matmul(x, self.W1) + self.b1), self.W2) + self.b2

  def f(self, x):
    return tf.tanh(x)

  def loss(self):
    return tf.reduce_mean(tf.square(self.model(self.xd) - self.yd))

  def optimize(self, loss):
    return self.optimizer.minimize(loss)

  def train(self, session, x, y):
    start = time.time()
    for step in range(self.epochs):
      session.run(self.optimizer_op, feed_dict={self.xd:x, self.yd:y})
      if step % 1000 == 0:
        loss = session.run(self.loss, feed_dict={self.xd:x, self.yd:y})
        print("Step=%5d, Loss=%8.5f" % (step, loss))
    stop = time.time()
    print("%.3f sec" % (stop - start))

  def predict(self, session, x):
    return session.run(self.model(x)[0])

model = Model()

with tf.Session() as sess:
  sess.run(tf.global_variables_initializer())

  x_train = np.array([[x1, x2] for x1 in range(1, 10, 1) for x2 in range(1, 10, 1)])
  y_train = np.array([[x1 * x2] for x1 in range(1, 10, 1) for x2 in range(1, 10, 1)])

  model.train(sess, x_train, y_train)

  for i in range(1, 21, 1):
    print("%2d * %2d -> %7.3f" % (i, i, model.predict(sess, [[float(i), float(i)]])[0]))

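Note: Here the initial parameter values were drawn at random from the range -1 to +1, but you should choose the random range with the range of your input and output data in mind.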

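Note: Adam Optimizer, which is less prone to getting stuck in local optima, is used as the optimizer. It can still end up in a local solution, so train several times from different initial values and pick the run with the smallest loss.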
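
The restart advice can be sketched as a small loop. To keep it self-contained, this example swaps the neural network for a toy one-parameter loss with two minima and swaps Adam for plain gradient descent. The loss, train_once, and all of its parameters are assumptions of the sketch, standing in for a full training run.

```python
import random

def loss(w):
    # Toy non-convex loss (an assumption of this sketch): a local minimum
    # near w = 1.13 and a lower, global minimum near w = -1.30.
    return w ** 4 - 3.0 * w ** 2 + w

def grad(w):
    # Derivative of the toy loss.
    return 4.0 * w ** 3 - 6.0 * w + 1.0

def train_once(w0, learning_rate=0.01, steps=2000):
    """Plain gradient descent from the initial value w0; returns (loss, w)."""
    w = w0
    for _ in range(steps):
        w -= learning_rate * grad(w)
    return loss(w), w

rng = random.Random(0)
# Train several times from different random initial values ...
results = [train_once(rng.uniform(-2.0, 2.0)) for _ in range(5)]
# ... and keep the run whose final loss is smallest.
best_loss, best_w = min(results)

for run_loss, run_w in results:
    print("w = {:7.4f}, loss = {:8.4f}".format(run_w, run_loss))
print("best: w = {:.4f}, loss = {:.4f}".format(best_w, best_loss))
```

Some runs settle in the shallower minimum, which is why a single run is not enough to judge the result.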

The results are shown below. Of course, it can also multiply real numbers, such as 2.3 × 5.8.


 1 *  1 ->   0.983
 2 *  2 ->   3.973
 3 *  3 ->   8.937
 4 *  4 ->  15.956
 5 *  5 ->  24.924
 6 *  6 ->  35.976
 7 *  7 ->  48.929
 8 *  8 ->  63.953
 9 *  9 ->  81.006
10 * 10 ->  90.502
11 * 11 ->  91.155
12 * 12 ->  89.974
13 * 13 ->  88.384
14 * 14 ->  86.649
15 * 15 ->  84.887
16 * 16 ->  83.185
17 * 17 ->  81.614
18 * 18 ->  80.221
19 * 19 ->  79.030
20 * 20 ->  78.042

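Within the range of the training data, 1 to 9, the values are fairly good, but outside that range they are far off. This is because the activation function is the hyperbolic tangent, which approaches a constant as its input goes to ±∞. A sigmoid activation likewise gives large errors when extrapolating, so take care.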
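
The saturation argument can be checked numerically. Every hidden unit outputs a value between -1 and +1, so the network output can never exceed |b2| plus the sum of |W2| in magnitude, however large the inputs become. The weights below are hypothetical values chosen for illustration, not the trained parameters.

```python
import math

# tanh flattens out quickly: beyond a few units it is essentially +1 or -1.
for z in (0.5, 2.0, 5.0, 20.0):
    print("tanh({:4.1f}) = {:.6f}".format(z, math.tanh(z)))

def output_bound(W2_column, b2):
    """Largest |y| that sum_j tanh(...) * W2[j] + b2 can ever reach."""
    return abs(b2) + sum(abs(w) for w in W2_column)

# Hypothetical output-layer weights for a 7-unit hidden layer (illustration only).
W2_column = [30.0, -25.0, 10.0, 8.0, -5.0, 4.0, 3.0]
b2 = 6.0

print("the output magnitude can never exceed", output_bound(W2_column, b2))
print("but the target for 20 * 20 is", 20 * 20)
```

With a bounded output, the network cannot follow a product that keeps growing, which matches the flat predictions above for inputs of 10 and larger.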
