Reproducibility in Keras

I bet I am not the only one who has tried to figure this out! Of course I googled around, but I really could not find a way to make a Keras model produce the same results every time I execute the notebook cell that compiles and fits the model. I read and applied the recipe in the official Keras documentation, but my code still returned different results. The solution came when I read the details in the TensorFlow documentation for tf.random.set_seed. It turns out that to make a Keras model's compile and fit methods reproducible, we need to wrap our model in a function, leveraging what the TensorFlow documentation says about functions:

Note that tf.function acts like a re-run of a program in this case. When the global seed is set but operation seeds are not set, the sequence of random numbers are the same for each tf.function

The solution is to reset the seed in the cell that calls the function that instantiates the model, as described in the example below!
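To see that behavior in isolation, here is a tiny sketch (my own illustration of the tf.random.set_seed docs, not code from the recipe below): with only the global seed set, every random op advances the sequence, and re-setting the global seed rewinds it to the start.

import tensorflow as tf

tf.random.set_seed(1234)
print(tf.random.uniform([1]))  # first value in the sequence
print(tf.random.uniform([1]))  # a different value: the sequence advances

tf.random.set_seed(1234)       # re-setting the global seed rewinds the sequence
print(tf.random.uniform([1]))  # same as the first value above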

Let’s start with seeding all the generators at the beginning of a notebook…

# for reproducibility use this block at the beginning of any notebook!!
# following https://keras.io/getting_started/faq/#how-can-i-obtain-reproducible-results-using-keras-during-development
# VERY IMPORTANT: these 2 lines must be before importing TensorFlow
# There are 4 different random number generators that need to be "seeded"
import os
os.environ['PYTHONHASHSEED'] = '0'

import numpy as np
import tensorflow as tf
import random as rn
# The below is necessary for starting Numpy generated random numbers
# in a well-defined initial state.
np.random.seed(123)

# The below is necessary for starting core Python generated random numbers
# in a well-defined state.
rn.seed(123)

# The below set_seed() will make random number generation
# in the TensorFlow backend have a well-defined initial state.
# For further details, see:
# https://www.tensorflow.org/api_docs/python/tf/random/set_seed
tf.random.set_seed(1234)
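As a quick sanity check (my own addition, not part of the Keras FAQ recipe), you can draw one value from each of the Python, NumPy, and TensorFlow generators; restarting the kernel and re-running the block above should reproduce exactly the same values.

# hypothetical sanity check: these should print identical values on every
# fresh run of the notebook after the seeding block above
print(rn.random())                     # core Python generator
print(np.random.rand())                # NumPy generator
print(tf.random.uniform([1]).numpy())  # TensorFlow generator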

Now let's create a simple Keras Sequential model and some data.

from tensorflow import keras
from tensorflow.keras import layers

# create a simple regression model
model = keras.Sequential()
model.add(layers.Dense(2, activation="relu"))
model.add(layers.Dense(3, activation="relu"))
model.add(layers.Dense(1, activation="sigmoid"))
model.build(input_shape=(None, 3))  # batch dimension left unspecified

# some data
x = tf.ones((3, 3))
y = np.array([3,2,1])

To monitor how the seeds impact the weights, I reused code I found in an excellent repo from the NVIDIA folks.

# this function is used to track the value of the weights
# captured from https://github.com/NVIDIA/framework-determinism
def summarize_keras_trainable_variables(model, message):
  s = sum(map(lambda x: x.sum(), model.get_weights()))
  print("summary of trainable variables %s: %.13f" % (message, s))
  return s

Now, if we monitor the weights during compilation and fitting, we see the following results:

tf.random.set_seed(1234)
adm = keras.optimizers.Adam(learning_rate=0.001, beta_1=0.9, beta_2=0.999, epsilon=1e-08)
model.compile(loss='mse',
               optimizer=adm,
               metrics=['accuracy'])
summarize_keras_trainable_variables(model, "before training")
history = model.fit(x,y, epochs=10, batch_size=1,
                     validation_split=0.1, verbose=0)
summarize_keras_trainable_variables(model, "after training")

summary of trainable variables before training: -1.3297259211540
summary of trainable variables after training: -1.2190857985988

If we run the same cell again, we get the following results:

summary of trainable variables before training: -1.2190857985988
summary of trainable variables after training: -1.0630745515227

Notice how the last value of the first run became the first value of the second run. This is expected, as described in the TensorFlow documentation. Obviously we do not want this, so let's wrap the model in a function and see what happens!

# wrap the model in a function!
def MLP():
    # create a simple regression model
    model = keras.Sequential()
    model.add(layers.Dense(2, activation="relu"))
    model.add(layers.Dense(3, activation="relu"))
    model.add(layers.Dense(1, activation="sigmoid"))
    model.build(input_shape=(None, 3))  # batch dimension left unspecified
    return model

and run a new block of code that includes the instantiation of the model through the function. Because tf.random.set_seed resets the random sequence and MLP() re-creates the model, the initial weights are drawn from the same point in the sequence on every run:

# compile and fit the model and observe the weights
tf.random.set_seed(1234)
model = MLP()
adm = keras.optimizers.Adam(learning_rate=0.001, beta_1=0.9, beta_2=0.999, epsilon=1e-08)
model.compile(loss='mse',
               optimizer=adm,
               metrics=['accuracy'])
summarize_keras_trainable_variables(model, "before training")
history = model.fit(x,y, epochs=10, batch_size=1,
                     validation_split=0.1, verbose=0)
summarize_keras_trainable_variables(model, "after training")

This time the results are:

summary of trainable variables before training: -1.3297259211540
summary of trainable variables after training: -1.2190857985988

and if we run the code again, we should get the same values!!!

summary of trainable variables before training: -1.3297259211540
summary of trainable variables after training: -1.2190857985988

This is what I was looking for! In a nutshell, the recipe consists of the following steps (recapped in the sketch after the list):
1) Use the code suggested by the Keras documentation at the beginning of the notebook.
2) Wrap the model in a function.
3) Reset the TF seed before instantiating, compiling, and fitting the model.
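
Putting it all together, here is a minimal sketch of what the training cell looks like once the recipe is applied (it assumes the seeding block, the MLP() wrapper, and the x and y data defined above):

tf.random.set_seed(1234)   # step 3: reset the TF seed first
model = MLP()              # step 2: the model comes from the wrapper function
model.compile(loss='mse', optimizer=keras.optimizers.Adam(learning_rate=0.001))
model.fit(x, y, epochs=10, batch_size=1, verbose=0)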

I hope this helps you make your TF models reproducible. Have a great day!!