
Eager Execution: Gradient Computation

I'm wondering why this very simple gradient computation is not working correctly. It is actually generating a [None, None] vector, which is obviously not the desired output.
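(The code in the question was cut off. Based on the two issues the answer below identifies, it presumably looked something like the following reconstruction; this is a guess at the original snippet, not a verbatim copy:)

import tensorflow as tf
tf.enable_eager_execution()

a = tf.constant(0.)
with tf.GradientTape() as tape:
    b = 2 * a
c = a + b  # computed outside the tape context, so not recorded
da, db = tape.gradient(c, [a, b])
print(da, db)  # prints: None None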

Solution 1:

There are two minor issues with the code snippet you posted:

  1. The a + b computation is happening outside the context of the tape, so it is not being recorded. Note that GradientTape can only differentiate computation that is recorded. Computing a + b inside the tape context will fix that.

  2. Source tensors need to be "watched". There are two ways to signal to the tape that a tensor should be watched: (a) explicitly invoking tape.watch, or (b) using a tf.Variable (all variables are watched automatically); see the documentation. The sketch after this list illustrates option (b).
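For option (b), here is a minimal sketch (using the same TF 1.x eager setup as the code below): because tf.Variable objects are watched automatically, no tape.watch call is needed:

import tensorflow as tf
tf.enable_eager_execution()

a = tf.Variable(0.)  # variables are watched by the tape automatically
with tf.GradientTape() as tape:
    b = 2 * a
    c = a + b
da = tape.gradient(c, a)
print(da)  # dc/da = 3.0, since c = a + 2a = 3a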

Long story short, two trivial modifications to your snippet do the trick:

import tensorflow as tf
tf.enable_eager_execution()

a = tf.constant(0.)
with tf.GradientTape() as tape:
    tape.watch(a)  # a is a constant, so it must be watched explicitly
    b = 2 * a      # recorded: computed inside the tape context
    c = a + b      # recorded: moved inside the tape context
da, db = tape.gradient(c, [a, b])
print(da)  # dc/da = 3.0
print(db)  # dc/db = 1.0
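Why 3.0 and 1.0: since b = 2a, we have c = a + b = 3a, so dc/da = 3 and dc/db = 1. As a side note, in TensorFlow 2.x eager execution is enabled by default, so the tf.enable_eager_execution() call is unnecessary there (it lives under tf.compat.v1); the tape usage itself is unchanged.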

Hope that helps.
