Keras - Models


A model in Keras groups layers into a single object that can be trained, evaluated, and used for prediction.

A complete example that goes through creating, training, and evaluating a Keras model:

from sklearn.datasets import load_iris
from sklearn.preprocessing import LabelBinarizer
from sklearn.utils import shuffle
import keras

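# Load dataset: 150 samples, 4 features, 3 classes
iris = load_iris()
data = iris.data
# One-hot encode the integer labels to match categorical_crossentropy
enc = LabelBinarizer()
target = enc.fit_transform(iris.target)
# Shuffle so the validation split (taken from the end) contains all classes
X, y = shuffle(data, target, random_state=0)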

# Make model
inputs = keras.Input(shape=(4,))
x = keras.layers.Dense(5, activation='relu')(inputs)
outputs = keras.layers.Dense(3, activation='softmax')(x)
model = keras.Model(inputs=inputs, outputs=outputs)

# Compile model
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

# Train
model.fit(x=X, y=y, batch_size=8, epochs=150, validation_split=0.3)

# Evaluate (on the training data here, so the scores are optimistic)
loss, accuracy = model.evaluate(X, y)

# Predict
predictions = model.predict(X)  # softmax gives us a probability of each category
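
Each row returned by predict holds one probability per class, so the predicted class is the index of the largest value. The sketch below shows that step in plain Python. The probability rows are made-up placeholders rather than real model output, and argmax is a helper written only for this illustration.

```python
def argmax(row):
    """Return the index of the largest value in a sequence."""
    return max(range(len(row)), key=lambda i: row[i])

# Made-up softmax rows (NOT real model output); each row sums to 1
example_predictions = [
    [0.90, 0.07, 0.03],
    [0.10, 0.30, 0.60],
    [0.20, 0.70, 0.10],
]

# Iris class names in label order 0, 1, 2
class_names = ["setosa", "versicolor", "virginica"]

predicted_indices = [argmax(row) for row in example_predictions]
predicted_names = [class_names[i] for i in predicted_indices]
print(predicted_indices)  # [0, 2, 1]
print(predicted_names)    # ['setosa', 'virginica', 'versicolor']
```

With the real NumPy output, predictions.argmax(axis=1) does the same thing, and enc.inverse_transform(predictions) should map each row back to its original label.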

Model

The Model class requires two things: the inputs to the model and the outputs of the model. There is also an optional name parameter.

There are two ways to instantiate a Model class:

  • Functional API
  • By subclassing the Model class

Functional API method:

import tensorflow as tf
inputs = tf.keras.Input(shape=(3,))
x = tf.keras.layers.Dense(4, activation=tf.nn.relu)(inputs)
outputs = tf.keras.layers.Dense(5, activation=tf.nn.softmax)(x)
model = tf.keras.Model(inputs=inputs, outputs=outputs)

Save And Load

Save tf.keras

Saving a whole model (architecture, weights, and compile state) takes a single call.

model.save('my_model')

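With tf.keras in TensorFlow 2.x releases built on Keras 2, this creates a directory called my_model (the TensorFlow SavedModel format) containing assets, saved_model.pb, and variables. This only works with tf.keras, not native keras; see below if using native keras. Releases built on Keras 3 instead expect the path to end in .keras or .h5.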

To load it again:

model = tf.keras.models.load_model('my_model')

Save keras

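Saving the model as a single HDF5 file is also an option. However, some items are not saved: the code for custom layers, and external losses and metrics (those added via add_loss() and add_metric()). Custom classes have to be supplied again at load time, for example through the custom_objects argument of load_model, which can be quite annoying if you get a model from someone else.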

H5:

model.save('model.h5')
model = tf.keras.models.load_model('model.h5')
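
Other Save/Load Functions

  • model.get_weights(): returns the weights as a list of NumPy arrays
  • model.set_weights(weights): sets the weights from such a list
  • model.save_weights('file_path.h5'): saves only the weights
  • model.load_weights('file_path.h5'): loads weights into a model with the same architecture
  • config = model.to_json(): serializes the architecture only (no weights) to a JSON string
  • model = tf.keras.models.model_from_json(config): rebuilds an uncompiled model from that JSON string
  • new_model = tf.keras.models.clone_model(model): creates a new model with the same architecture and newly initialized weights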

Sequential

The Sequential class allows you to add layers sequentially to a model.

model = tf.keras.Sequential()
model.add(tf.keras.Input(shape=(16,)))  # Add input layer that accepts a feature vector of length 16
model.add(tf.keras.layers.Dense(6))  # Adds a layer containing 6 neurons

Summary

Model.summary() prints a summary of the model, listing each layer with its output shape and parameter count. For the Functional API model above:

Layer (type)                 Output Shape              Param #
=================================================================
input_1 (InputLayer)         [(None, 3)]               0
_________________________________________________________________
dense (Dense)                (None, 4)                 16
_________________________________________________________________
dense_1 (Dense)              (None, 5)                 25
=================================================================
Total params: 41
Trainable params: 41
Non-trainable params: 0
_________________________________________________________________
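
The Param # column follows from the layer sizes: a Dense layer has one weight per input/unit pair plus one bias per unit. A short check of the numbers in the summary above, where dense_param_count is a helper defined only for this illustration:

```python
def dense_param_count(n_inputs, n_units, use_bias=True):
    """Weights (n_inputs * n_units) plus one bias per unit."""
    return n_inputs * n_units + (n_units if use_bias else 0)

dense_params = dense_param_count(3, 4)    # input width 3 -> 4 units
dense_1_params = dense_param_count(4, 5)  # 4 units -> 5 units
total_params = dense_params + dense_1_params
print(dense_params, dense_1_params, total_params)  # 16 25 41
```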

Training

Keras training APIs involve compiling, fitting, evaluating, and predicting using a model.

Compile

Configures the model for training by attaching the optimizer, loss function, and metrics.

Model.compile(
    optimizer="rmsprop",
    loss=None,
    metrics=None,
    loss_weights=None,
    weighted_metrics=None,
    run_eagerly=None,
    steps_per_execution=None,
    **kwargs
)

  • Optimizer: The optimization algorithm, given as a string name or an optimizer instance. The default is "rmsprop"; Adam is a popular choice.
  • Loss: The neural network will try to minimize this value via the optimization algorithm. There are a large number of loss functions. In Keras you can pass a string instead of the loss class by snake-casing its name, for example 'categorical_crossentropy' for CategoricalCrossentropy. Most notably:
    • Classification (Categorical) Data:
      • BinaryCrossentropy: Used when there are only two possible labels (0 and 1).
      • CategoricalCrossentropy: Used when there are two or more possible classes. Expects labels to be one-hot encoded.
      • SparseCategoricalCrossentropy: Sibling to CategoricalCrossentropy. Expects labels as integers instead of one-hot vectors. The integers are distinct classes; numerically close integers are not treated as similar classes.
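    • Regression (Continuous) Data:
      • MeanSquaredError: Mean of the squared differences between labels and predictions.
      • MeanAbsoluteError: Mean of the absolute differences between labels and predictions.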
  • Metrics: List of metrics to report during training and evaluation. ['accuracy'] is a common choice for classification.
  • Loss Weights: If the model has several outputs and a list of losses is given as the loss function, you can specify how heavily each loss is weighted. For example, [10, 1] would weight the first loss 10 times more heavily than the second. The value minimized is then the weighted sum of the individual losses.
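
To make the difference between the two categorical losses concrete, here is a plain-Python sketch of the per-sample cross-entropy for one-hot and integer labels, followed by the weighted sum that loss_weights describes. The helper names, the probabilities, and the two loss values passed to weighted_total are invented for illustration. This mirrors the formulas only and is not how Keras implements them internally.

```python
import math

def categorical_crossentropy(one_hot, probs):
    """Cross-entropy for a one-hot label: -sum(y_i * log(p_i))."""
    return -sum(y * math.log(p) for y, p in zip(one_hot, probs) if y > 0)

def sparse_categorical_crossentropy(label, probs):
    """Same loss when the label is an integer class index."""
    return -math.log(probs[label])

def weighted_total(losses, loss_weights):
    """Combine several loss values as a weighted sum."""
    return sum(w * l for w, l in zip(loss_weights, losses))

probs = [0.1, 0.7, 0.2]  # made-up softmax output for one sample

loss_one_hot = categorical_crossentropy([0, 1, 0], probs)
loss_sparse = sparse_categorical_crossentropy(1, probs)
# Both encodings describe class 1, so the two losses match
print(loss_one_hot, loss_sparse)

# loss_weights=[10, 1]: the first loss counts ten times as much
total = weighted_total([0.5, 2.0], [10, 1])
print(total)  # 7.0
```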

Fit

Used to train a model.

Model.fit(
    x=None,
    y=None,
    batch_size=None,
    epochs=1,
    verbose=1,
    callbacks=None,
    validation_split=0.0,
    validation_data=None,
    shuffle=True,
    class_weight=None,
    sample_weight=None,
    initial_epoch=0,
    steps_per_epoch=None,
    validation_steps=None,
    validation_batch_size=None,
    validation_freq=1,
    max_queue_size=10,
    workers=1,
    use_multiprocessing=False,
)

  • Verbose: 0 = silent, 1 = progress bar per epoch, 2 = one line of output per epoch.
  • Callbacks: List of callbacks. My documentation on callbacks can be found here.
  • Validation Split: Float between 0 and 1. Fraction of the training data to hold out for validation. The held-out samples are taken from the end of x and y, BEFORE shuffling.
  • Validation Data: Data to use for validation, in (x_val, y_val) format. Use either this or validation_split; if both are given, validation_data takes precedence.
  • Class Weight: Dictionary mapping class indices (integers) to a weight (float). Useful for unbalanced data (where there are more samples of one class than another).
  • Sample Weight: Weights individual samples differently in the loss. A 1D NumPy array with one weight per sample is expected.
  • Validation Frequency: How often, in epochs, to run validation. For example, 2 runs validation every second epoch.
  • Generator Specific Arguments:
    • Steps Per Epoch: Number of batches that make up one epoch. Typically needed for generators, which may yield batches indefinitely. Same idea for validation_steps.
    • Max Queue Size: Maximum size of the generator queue. Defaults to 10.
    • Workers: Maximum number of workers used for generators. Defaults to 1.
    • Use Multiprocessing: Use process-based threading for generators.
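
Because the validation split is taken from the end of the data before any shuffling, the order of the input matters. A rough sketch of that slicing in plain Python follows. split_for_validation is a hypothetical helper, and its rounding may differ slightly from what Keras does internally.

```python
def split_for_validation(samples, validation_split):
    """Hold out the LAST fraction of samples, without shuffling."""
    n_val = round(len(samples) * validation_split)
    split_at = len(samples) - n_val
    return samples[:split_at], samples[split_at:]

# Labels sorted by class, like the raw Iris targets (50 of each class)
sorted_labels = [0] * 50 + [1] * 50 + [2] * 50
train, val = split_for_validation(sorted_labels, 0.3)

print(len(train), len(val))  # 105 45
print(sorted(set(val)))      # [2]: the validation set only sees class 2
```

This is why the complete example at the top shuffles X and y before calling fit.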

Returns a History object. Its history attribute is a dictionary of the loss and metric values recorded for each epoch, which can be used for plotting.

Evaluate

Like fit, but without the training: computes the loss and metric values for the model on the given data.

Model.evaluate(
    x=None,
    y=None,
    batch_size=None,
    verbose=1,
    sample_weight=None,
    steps=None,
    callbacks=None,
    max_queue_size=10,
    workers=1,
    use_multiprocessing=False,
    return_dict=False,
)

  • Return Dict: If True, return the loss and metric results as a dictionary keyed by metric name. If False, a list (or a single value when there are no metrics) is returned.
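
The two return shapes differ only in how the values are labeled. A small sketch with placeholder numbers (not real results), assuming a model compiled with a loss and the accuracy metric; the metric names used here are an assumption for the illustration:

```python
# Placeholder values standing in for what evaluate might return
metric_names = ["loss", "accuracy"]  # assumed names for this sketch
as_list = [0.25, 0.9]                # shape of the return_dict=False result

# return_dict=True pairs each value with its metric name
as_dict = dict(zip(metric_names, as_list))
print(as_dict)  # {'loss': 0.25, 'accuracy': 0.9}

# The list form relies on position instead
loss, accuracy = as_list
```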

Predict

Generates predictions for the input samples. A batch is expected, so even a single sample needs a leading batch dimension.

Model.predict(
    x,
    batch_size=None,
    verbose=0,
    steps=None,
    callbacks=None,
    max_queue_size=10,
    workers=1,
    use_multiprocessing=False,
)

A NumPy array of predictions is returned.
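
Since predict expects a batch, a single sample has to be wrapped so that it gains a leading batch dimension. A plain-Python sketch of the shapes involved; shape_of is a helper written only for this example:

```python
def shape_of(nested):
    """Shape of a non-empty rectangular nested list, e.g. [[1, 2]] -> (1, 2)."""
    shape = []
    while isinstance(nested, list):
        shape.append(len(nested))
        nested = nested[0]
    return tuple(shape)

sample = [5.1, 3.5, 1.4, 0.2]  # one sample with 4 features
batch = [sample]               # a batch containing that one sample

print(shape_of(sample))  # (4,)
print(shape_of(batch))   # (1, 4)
```

With NumPy arrays, np.expand_dims(sample, axis=0) adds the same dimension.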