Basics of TensorFlow

TensorFlow has gained a lot of attention in the last few years as a framework for deep learning models. It provides simple APIs for creating neural networks that can be scaled effectively and hosted on different platforms. The name comes from the tensor, a mathematical term defined as:

In mathematics, a tensor is an algebraic object that describes a multilinear relationship between sets of algebraic objects related to a vector space.

Wikipedia

In everyday terms, a tensor is a multidimensional array. TensorFlow, in turn, is a Python framework with a comprehensive set of tools and libraries for creating machine learning projects. The two are related because most of the input, intermediate, and output data in TensorFlow takes the form of tensors. In this blog, we will learn the basic operations of TensorFlow and build a simple model for linear regression.

TensorFlow constants and variables

# Import TensorFlow under the conventional alias tf
import tensorflow as tf
# Declare a constant with tf.constant
sample_constant=tf.constant(20)
print(sample_constant.dtype)
# Operations can now be performed on this constant using TensorFlow
# Similarly, we can create variables
A1=tf.Variable([1,2,3,4])
print(A1)
# The value of the variable can be printed as a NumPy array
print(A1.numpy())
# A NumPy array can in turn be used to initialize a new variable
B1=A1.numpy()
B1=tf.Variable(B1)
print(B1)

TensorFlow uses its own types for constants and variables, with its own functions to declare, define, and initialize them. The main reason is that neural network operations rely heavily on vectorization and broadcasting, which standard Python constants and variables do not support. Constants and variables are declared using the constant and Variable functions in the TensorFlow package. In the code snapshot above, TensorFlow is imported as tf and a constant and two variables are declared.
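To make the difference between the two concrete, here is a minimal sketch (not part of the original gist) showing that a variable can be updated in place while a constant keeps its value, and how a scalar is broadcast over a tensor. The names c, v, and doubled are chosen only for illustration.

```python
import tensorflow as tf

# A constant is immutable: its value is fixed at creation time
# and it offers no assign method.
c = tf.constant([1, 2, 3, 4])

# A variable holds state that can be updated in place, which is why
# trainable weights are stored as variables.
v = tf.Variable([1, 2, 3, 4])
v.assign([10, 20, 30, 40])    # overwrite all values
v.assign_add([1, 1, 1, 1])    # in-place element-wise addition

# Broadcasting: the scalar 2 is applied to every element of c.
doubled = c * 2

print(v.numpy())
print(doubled.numpy())
```

Because updates happen in place, the same variable object can be reused across training steps instead of being recreated each time.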

Zeros and Ones

While working with neural networks, we often need matrices whose elements are all initialized to zero or one. TensorFlow has zeros and ones functions that create them in a single line of code. In the code snapshot below, tf.ones is used to create a 3 by 2 matrix of ones with a data type of int32. A matrix of zeros can be created in the same way. If we wish to create a matrix of ones or zeros whose shape matches an already available tensor, we can use the ones_like function provided in the TensorFlow package. In the code below, A1 and A23 are two constant tensors, and the ones tensors B1 and B23 are created using the shapes of A1 and A23. A similar operation can be performed using zeros_like.

# Ones function
A1=tf.ones([3,2], tf.int32)
print(A1.numpy())
# Zeros function (float32 by default)
B1=tf.zeros([3,2])
print(B1)
A1=tf.constant([1,2,3,4])
A23=tf.constant([[1,2,3], [4,5,6]])
# Create ones tensors with the same shape and dtype as A1 and A23
B1=tf.ones_like(A1)
B23=tf.ones_like(A23)

The ones and zeros functions are useful when creating weight and bias tensors for forward propagation in neural networks. Apart from ones and zeros, we can also initialize a tensor of any shape with any value using the tf.fill method.

#fill method
c33=tf.fill([3,3], 7)
print(c33.numpy())
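The zeros_like function and the weight and bias use case mentioned above are not shown in the original snippets, so here is a small sketch of both. The layer sizes n_inputs and n_outputs are hypothetical values picked only for the example.

```python
import tensorflow as tf

A23 = tf.constant([[1, 2, 3], [4, 5, 6]])
# zeros_like copies the shape (2, 3) and the dtype (int32) of A23.
Z23 = tf.zeros_like(A23)

# Hypothetical layer sizes, chosen only for illustration.
n_inputs, n_outputs = 3, 2
weights = tf.Variable(tf.ones([n_inputs, n_outputs]))   # float32 by default
bias = tf.Variable(tf.zeros([n_outputs]))

print(Z23.numpy())
print(weights.shape, bias.shape)
```

In practice weights are usually initialized randomly rather than with ones; the constant values here only keep the example easy to read.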

Operations

Implementing a neural network in TensorFlow requires various operations between constants and variables. In this section, some of the common operations are discussed briefly with code.

Add

Adding tensors is a simple operation that a programmer needs to perform again and again. Addition in TensorFlow is performed element-wise and can be done using the following code.

#create two tensors
a1=tf.fill([3,3], 7)
a2=tf.fill([3,3], 3)
#Add the two tensors
a3=tf.add(a1,a2)
#print final tensor
print(a3.numpy())

tf.add adds the corresponding elements of both tensors and stores the results in a3. Seven plus three gives ten, so the print command shows a 3 by 3 tensor in which every element is ten.
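As a small extension of the addition example, the sketch below shows two things the original code does not: the overloaded + operator, and adding a tensor of a different but broadcast-compatible shape. The names row, same_shape_sum, and broadcast_sum are illustrative.

```python
import tensorflow as tf

a1 = tf.fill([3, 3], 7)
a2 = tf.fill([3, 3], 3)

# The + operator is overloaded for tensors and behaves like tf.add.
same_shape_sum = a1 + a2

# Broadcasting: the vector of shape (3,) is added to every row of a1.
row = tf.constant([1, 2, 3])
broadcast_sum = tf.add(a1, row)

print(same_shape_sum.numpy())
print(broadcast_sum.numpy())
```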

Multiply and Matrix Multiplication

Multiplication of tensors can be performed in two ways: element-wise multiplication, which works just like the add method, and matrix multiplication. Both are easy to perform in TensorFlow; see the code below.

# Ones function
A1=tf.ones([3,2], tf.int32)
# Create a ones tensor shaped like A1 and perform element-wise multiplication
B1=tf.ones_like(A1)
C1=tf.multiply(A1,B1)
# Matrix multiplication
# Create the feature values
feat_value=tf.constant([[1,12],[2,13],[3,14]])
''' feat_value is tf.Tensor(
[[ 1 12]
[ 2 13]
[ 3 14]], shape=(3, 2), dtype=int32)'''
parameters=tf.constant([[100],[200]])
''' parameters is tf.Tensor(
[[100]
[200]], shape=(2, 1), dtype=int32)'''
# Create predictions by matrix multiplication
pred=tf.matmul(feat_value, parameters)
print(pred)

The shapes of the tensors need to be kept in mind when performing multiplication. For element-wise multiplication, the shapes of both tensors should be the same (or compatible under broadcasting). For matrix multiplication, the number of columns of the first tensor should equal the number of rows of the second tensor.
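The shape rules above can be checked directly. The following sketch reuses the feature and parameter values from the multiplication example and, as an added illustration, deliberately swaps the operands to trigger a shape mismatch. The variable mismatch_raised is only there to record the outcome.

```python
import tensorflow as tf

features = tf.constant([[1, 12], [2, 13], [3, 14]])   # shape (3, 2)
parameters = tf.constant([[100], [200]])               # shape (2, 1)

# (3, 2) x (2, 1): the inner dimensions agree, so the result has shape (3, 1).
pred = tf.matmul(features, parameters)

# Element-wise product of two (3, 2) tensors keeps the shape (3, 2).
squared = tf.multiply(features, features)

# Swapping the operands gives (2, 1) x (3, 2): inner dimensions 1 and 3 differ.
try:
    tf.matmul(parameters, features)
    mismatch_raised = False
except (tf.errors.InvalidArgumentError, ValueError):
    mismatch_raised = True

print(pred.shape, squared.shape, mismatch_raised)
```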

Reduce Functions

When implementing a neural network, we may need to reduce a tensor to a single value or to fewer dimensions. The reduced value then summarizes the complete tensor for the next layers or for the output of the neural network.

# We can reduce a tensor to a single value
feat_value=tf.constant([[1,12],[2,13],[3,14]])
pred=tf.reduce_sum(feat_value)
print(pred)
''' OUTPUT
tf.Tensor(45, shape=(), dtype=int32)
'''
# We can also reduce along any single dimension
print(tf.reduce_sum(feat_value, 0))
# Reducing along dimension 0 sums over the rows, giving one total per column
''' OUTPUT
tf.Tensor([ 6 39], shape=(2,), dtype=int32)
'''
print(tf.reduce_sum(feat_value, 1))
# Reducing along dimension 1 sums over the columns, giving one total per row
''' OUTPUT
tf.Tensor([13 15 17], shape=(3,), dtype=int32)
'''

The code above uses reduce_sum as an example, but there are many other reduce functions, which are listed in the table below.

Function                   Implementation
reduce_all()               Computes the logical AND across dimensions of a tensor.
reduce_any()               Computes the logical OR across dimensions of a tensor.
reduce_euclidean_norm()    Computes the Euclidean norm of elements across dimensions of a tensor.
reduce_max()               Finds the maximum across dimensions of a tensor.
reduce_min()               Finds the minimum across dimensions of a tensor.
reduce_mean()              Computes the mean of elements across dimensions of a tensor.
reduce_prod()              Computes the product of elements across dimensions of a tensor.
reduce_std()               Computes the standard deviation of elements across dimensions of a tensor.
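A few of the functions from the table can be tried on the same feature values used earlier. This is only a sketch: the values are converted to float so that reduce_mean does not truncate, and the result names are illustrative.

```python
import tensorflow as tf

# Float dtype so that reduce_mean does not truncate to an integer.
feat_value = tf.constant([[1., 12.], [2., 13.], [3., 14.]])

overall_mean = tf.reduce_mean(feat_value)           # mean of all six elements
column_max = tf.reduce_max(feat_value, axis=0)      # maximum of each column
row_prod = tf.reduce_prod(feat_value, axis=1)       # product along each row
any_above_13 = tf.reduce_any(feat_value > 13.0)     # logical OR over all elements
all_positive = tf.reduce_all(feat_value > 0.0)      # logical AND over all elements

print(overall_mean.numpy(), column_max.numpy(), row_prod.numpy())
print(any_above_13.numpy(), all_positive.numpy())
```

As with reduce_sum, omitting the axis argument reduces over every dimension and returns a scalar tensor.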

Gradient

When updating the weights of a neural network, we need to find the minimum, maximum, or optimal value of the loss or of other functions used. The gradient helps us identify it. Where the gradient is zero, the function is at a stationary point, which is a candidate for a minimum or maximum. Where the gradient is greater than zero, the function increases with the variable, so decreasing the variable lowers the value further. Where the gradient is smaller than zero, increasing the variable lowers the value. In simple language, the gradient gives the rate of change of one variable with respect to a change in another variable.

# Define x
x = tf.Variable(6.0)
# Define y within an instance of GradientTape
with tf.GradientTape() as gt:
    # Trainable variables are watched automatically; the explicit watch is kept for clarity
    gt.watch(x)
    y = tf.multiply(x, x)
# Evaluate the gradient of y at x = 6
g = gt.gradient(y, x)
print(g.numpy())
''' OUTPUT
12.0
'''

In the above code, we have used a context manager with GradientTape to watch a variable and record the operations applied to it. The tape tracks how y is computed from x and then returns the gradient dy/dx, which is 2x = 12 at x = 6. For more details please visit GradientTape page.
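The same idea extends to several variables at once, which is closer to what happens when training a network. The sketch below is an added illustration with made-up values for w, b, and x; it asks the tape for the gradient of a loss with respect to both a weight and a bias.

```python
import tensorflow as tf

w = tf.Variable(3.0)
b = tf.Variable(1.0)
x = tf.constant(2.0)

with tf.GradientTape() as tape:
    y = w * x + b           # y = 3*2 + 1 = 7
    loss = tf.square(y)     # loss = y^2 = 49

# By the chain rule: d(loss)/dw = 2*y*x = 28 and d(loss)/db = 2*y = 14.
dw, db = tape.gradient(loss, [w, b])
print(dw.numpy(), db.numpy())
```

Passing a list of variables returns a list of gradients in the same order, which is the form that weight updates typically consume.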

Reshape

Neural networks are widely used for image processing. An image is stored as a matrix of numeric values ranging from 0 to 255. A fully connected neural network, however, accepts only a linear input, so we need to reshape the input image into a one-dimensional tensor before it can be fed to the network. The reshape function is useful for this. In the code below, we create a simple grayscale image using random numbers and then convert it into a linear tensor. The image below the code shows a pictorial representation of the idea for a 2*2 image.

# We need to reshape a picture so that it can be fed to the neural network
# Let's say we have a 28*28 grayscale image
# For integer dtypes maxval is exclusive, so 256 allows pixel values from 0 to 255
image=tf.random.uniform([28,28], maxval=256, dtype='int32')
image_reshape=tf.reshape(image, [28*28,1])
# This is required when we input an image to the neural network
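For the 2*2 case, the reshaping can be followed by hand. This is a small sketch with made-up pixel values; the names tiny_image, flat, column, and restored are illustrative.

```python
import tensorflow as tf

tiny_image = tf.constant([[10, 20], [30, 40]])    # shape (2, 2)

# -1 lets TensorFlow infer the length, giving a flat tensor of shape (4,).
flat = tf.reshape(tiny_image, [-1])
# An explicit column layout, matching the [28*28, 1] pattern used above.
column = tf.reshape(tiny_image, [4, 1])
# Reshaping back recovers the original image, since element order is preserved.
restored = tf.reshape(flat, [2, 2])

print(flat.numpy())
print(column.shape, restored.numpy().tolist())
```

Reshape only changes how the elements are indexed; the values are read row by row and their order never changes.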

Random Number Generations

A common method of initializing a weight or bias tensor is to draw its values from a random distribution. TensorFlow provides various random number generators that can be used directly to generate a tensor with values from a random distribution. In the code for the reshape function, we created an image using a uniform random distribution with maxval chosen to keep pixel values within the 0 to 255 range. Similarly, we can generate random numbers using categorical, fixed_unigram_candidate_sampler, gamma, poisson, normal, stateless_binomial, and others. Please visit this page for all random number generator functions.
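As a brief sketch of random initialization, the code below draws a weight tensor from a normal distribution and an integer tensor from a uniform one. The seed, shapes, and standard deviation are arbitrary choices for the example, not recommendations.

```python
import tensorflow as tf

tf.random.set_seed(42)   # arbitrary seed, set only to make runs repeatable

# Weights drawn from a normal distribution with a small standard deviation.
weights = tf.Variable(tf.random.normal([3, 2], mean=0.0, stddev=0.1))

# Integer samples: minval is inclusive and maxval is exclusive, so values are 0..255.
pixels = tf.random.uniform([4, 4], minval=0, maxval=256, dtype=tf.int32)

print(weights.numpy())
print(pixels.numpy())
```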

Losses in Neural Network

The loss measures how far the predictions of a neural network are from the targets for any prediction task. It also guides the training process: all weight updates are based on the loss value computed between the predicted output and the target output. Losses are broadly divided by problem type into classification and regression. Classification problems mostly use BinaryCrossentropy, CategoricalCrossentropy, and CosineSimilarity. Regression problems, on the other hand, use MeanAbsoluteError, MeanSquaredError, and Huber. For more detail, you can visit the documentation page of TensorFlow for the loss here.
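To see what these loss classes compute, here is a minimal sketch that evaluates three regression losses and one classification loss on small hand-picked values. The targets and predictions are invented for the example.

```python
import tensorflow as tf

# Regression: the errors are -0.5, 0.0 and 1.0.
y_true = tf.constant([1.0, 2.0, 3.0])
y_pred = tf.constant([1.5, 2.0, 2.0])

mse = tf.keras.losses.MeanSquaredError()(y_true, y_pred)    # (0.25 + 0 + 1) / 3
mae = tf.keras.losses.MeanAbsoluteError()(y_true, y_pred)   # (0.5 + 0 + 1) / 3
huber = tf.keras.losses.Huber(delta=1.0)(y_true, y_pred)    # quadratic while |error| <= delta

# Binary classification: labels against predicted probabilities.
labels = tf.constant([1.0, 0.0])
probs = tf.constant([0.9, 0.1])
bce = tf.keras.losses.BinaryCrossentropy()(labels, probs)   # mean of -log(0.9) over both samples

print(mse.numpy(), mae.numpy(), huber.numpy(), bce.numpy())
```

Each loss object returns a scalar tensor, which is the value an optimizer tries to reduce.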

Optimizers

As the name suggests, these methods optimize the weight and bias tensors of a neural network to minimize the loss we learned about in the previous section. More detailed comparisons and the working of optimizers can be read in this article on Medium. The basic optimizer is gradient descent, which updates the weights and biases based on the gradient of the loss using the equation below.

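W = W - α ∂L/∂W

Here α is the learning rate and ∂L/∂W is the gradient of the loss L with respect to the weights W.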

However, one of the most commonly used optimizers is Adam. The TensorFlow documentation provides details about every optimizer on this page.
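The gradient descent update rule above can be written out by hand with GradientTape. This is a sketch of plain gradient descent, not of Adam, applied to a toy loss W^2 whose minimum is at zero. The learning rate and step count are arbitrary illustrative values.

```python
import tensorflow as tf

W = tf.Variable(6.0)
alpha = 0.1   # learning rate, an arbitrary illustrative value

for step in range(50):
    with tf.GradientTape() as tape:
        loss = tf.square(W)           # toy loss with its minimum at W = 0
    grad = tape.gradient(loss, W)     # gradient of the loss: 2 * W
    W.assign_sub(alpha * grad)        # W = W - alpha * gradient

print(W.numpy())
```

Each step multiplies W by 0.8 here, so the value shrinks steadily toward the minimum. Built-in optimizers wrap this same loop and add refinements such as adaptive step sizes.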

Linear Regression using TensorFlow

Now let's use the above knowledge to create a simple model that trains the intercept and slope variables of a linear regression; see the code below. The auto-mpg dataset was downloaded from Kaggle, and two columns were extracted from it as NumPy arrays (mpg_1 and horsepower). The intercept and slope of the linear regression were created as TensorFlow variables, both with an initial value of 0.2. A linear_regression function is defined, which returns the prediction based on the linear regression equation. A loss function is defined, which calculates the loss using mse and returns it. The Adam optimizer is used to minimize the loss by adjusting the values of the intercept and slope over 1000 iterations. The loss and the final values of the intercept and slope are printed; see the output image below the code. We can see that the loss decreases gradually.

import numpy as np
import tensorflow as tf
# mpg is assumed to be a pandas DataFrame already loaded from the auto-mpg dataset
# Extract mpg and horsepower from the dataset
mpg_1=np.array(mpg['mpg'], np.float32)
horsepower=np.array(mpg['horsepower'], np.float32)
# Define the intercept and slope
# The second positional argument of tf.Variable is trainable, so dtype is passed by keyword
intercept=tf.Variable(0.2, dtype=tf.float32)
slope=tf.Variable(0.2, dtype=tf.float32)
# Create a linear regression using y=mx+b
def linear_regression(intercept, slope, features=horsepower):
    return slope*features+intercept
# Create a loss function
def loss_function(intercept, slope, target=mpg_1, features=horsepower):
    # Create predictions
    pred=linear_regression(intercept, slope, features)
    loss= tf.keras.losses.mse(target, pred)
    return loss
# Create an instance of the optimizer
opt=tf.keras.optimizers.Adam()
# Minimize the loss over the epochs
# If opt.minimize is unavailable in your version, use tf.GradientTape with opt.apply_gradients
epochs=1000
for i in range(epochs):
    opt.minimize(lambda: loss_function(intercept, slope), var_list=[intercept, slope])
    print(np.array(loss_function(intercept, slope)))
print(np.array(intercept), np.array(slope))
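Since the code above depends on the auto-mpg data, here is a self-contained sketch of the same idea that runs without any download. It makes two assumptions that differ from the original: the data is synthetic (generated from y = 3x + 2), and plain gradient descent with a hand-picked learning rate replaces the Adam optimizer.

```python
import numpy as np
import tensorflow as tf

# Synthetic data standing in for the dataset: y = 3x + 2 exactly.
features = np.linspace(0.0, 1.0, 50).astype(np.float32)
target = 3.0 * features + 2.0

intercept = tf.Variable(0.2, dtype=tf.float32)
slope = tf.Variable(0.2, dtype=tf.float32)
learning_rate = 0.2   # hand-picked for this toy problem

for epoch in range(1000):
    with tf.GradientTape() as tape:
        pred = slope * features + intercept
        loss = tf.reduce_mean(tf.square(target - pred))   # mean squared error
    grads = tape.gradient(loss, [intercept, slope])
    intercept.assign_sub(learning_rate * grads[0])
    slope.assign_sub(learning_rate * grads[1])

print(loss.numpy(), intercept.numpy(), slope.numpy())
```

Because the synthetic data has no noise, the fitted intercept and slope should end up close to the generating values of 2 and 3. With real data such as horsepower, scaling the feature first usually makes a fixed learning rate much easier to choose.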

Conclusion

In this blog, we have discussed some of the basics of TensorFlow that we need in order to build neural network models. At the end, we created a simple linear regression model to find the optimal values of the intercept and slope. The intention of this blog was to cover the basics we need to learn before using high-level APIs to develop neural networks for complex applications.
