How to create deep and dynamic custom TensorFlow models? (Part-1)

So, I was recently implementing a custom architecture that involved a lot of layers. It was tedious, and I eventually got bored of keeping track of my layers and their parameters. Anyone who likes to code dislikes hard-coding things, and when I first started learning TensorFlow it took me some time to understand how to create a model dynamically from the input given to the code.

Now, this is by no means perfect code that will work for every type of input, but the technique is flexible and easy to modify. I am going to divide this into 3 parts for ease of understanding. The parts are split across two blog posts so that people who are familiar with the config file concept can skip to Part-2 (although I still recommend reading the second subsection, which describes the config file structure).

  1. Understanding config file (Part-1)
  2. Preparing the config file (Part-1)
  3. Creating the model (Part-2)

Understanding config file

As a developer, I am sure you have heard of the concept of a config file. If not, let me explain it with an example.

In simple words, a config file is where you store the initial parameters or settings that the code needs. Here is an example: a client asks you to develop code that performs data manipulation operations on a SQL DB using Python. They also require that the same code be usable on their own company SQL DB, to which they cannot give you access.

You are pretty smart, so you create a class that takes the database connection information as parameters during initialization. Along with this, you also create a UI so that the client can use it with ease. You present the solution. The client is happy with the functionality but still struggles to use the code: they do not like that the connection settings have to be entered every time they want to perform an operation on the DB. To make it easier for the client, you add a connection settings screen to your UI that stores these settings in a config file, so the code can read them from there on later runs.

Now that we are clear on the purpose of a config file, let’s think about how we can use one to develop our model. The main components of a model are its layers, so we need some information about the layers. Each type of layer has different parameters, so the config file needs to be flexible with respect to parameter names. Anything else? For now I don’t think so, so let’s dive in and figure out a structure for our config file.
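To make the idea concrete, here is a minimal sketch of saving and loading such connection settings with a JSON config file. The file name, the key names (`host`, `port`, `database`, `user`) and their values are all assumptions made up for illustration; a real project would use whatever its database driver expects.

```python
import json
import os
import tempfile

# Hypothetical connection settings; key names and values are assumptions.
DEFAULT_SETTINGS = {
    "host": "localhost",
    "port": 5432,
    "database": "sales",
    "user": "report_user",
}


def save_settings(path, settings):
    """Write the connection settings to a JSON config file."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(settings, handle, indent=2)


def load_settings(path):
    """Read the connection settings back from the JSON config file."""
    with open(path, "r", encoding="utf-8") as handle:
        return json.load(handle)


# Example usage: the UI would call save_settings once,
# and the code would call load_settings on every later run.
config_path = os.path.join(tempfile.mkdtemp(), "db_config.json")
save_settings(config_path, DEFAULT_SETTINGS)
settings = load_settings(config_path)
```

The client only enters the settings once; after that the code reads them from the file instead of asking again.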

Preparing the config file

Armed with the information above, let’s decide on a structure for our config file. We will have multiple layers in our model, so there needs to be some sort of list/array. We also need to store a bunch of information about each layer, so a Python dictionary per layer might be helpful.

How about something like this:

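# Image dimensions used by the input layer below.
# NOTE: example values only (assumed); set them to match your dataset.
IMG_WIDTH = 32
IMG_HEIGHT = 32
CHANNELS = 3

# Architecture Parameters
arch_parameters = [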
    {
        "layer_type" : "input",
        "layer_namescope" : "model_ip",
        "input_width" : IMG_WIDTH,
        "input_height" : IMG_HEIGHT,
        "input_channels" : CHANNELS
    },
    {
        "layer_type" : "conv",
        "layer_namescope" : "model_block_1_conv3_1",
        "layer_num_filters" : 32,
        "layer_kernel_size" : 3,
        "layer_activation" : "relu",
        "layer_strides" : 1,
        "layer_padding" : "same"
    },
    {
        "layer_type" : "conv",
        "layer_namescope" : "model_block_1_conv3_2",
        "layer_num_filters" : 32,
        "layer_kernel_size" : 3,
        "layer_activation" : "relu",
        "layer_strides" : 1,
        "layer_padding" : "same"
    },
    {
        "layer_type" : "maxpool",
        "layer_pool_size" : (2, 2),
        "layer_namescope" : "model_block_1_max_pool_1",
        "layer_padding" : "valid"
    },
    {
        "layer_type" : "BatchNorm",
        "layer_namescope" : "model_block_1_BatchNorm_1"
    },
    {
        "layer_type" : "conv",
        "layer_namescope" : "model_block_1_conv1_1",
        "layer_num_filters" : 32,
        "layer_kernel_size" : 1,
        "layer_activation" : None,
        "layer_strides" : 1,
        "layer_padding" : "valid"
    },
    {
        "layer_type" : "relu",
        "layer_namescope" : "model_block_1_relu_1"
    },
    # End of Block 1 
    {
        "layer_type" : "dense",
        "layer_units" : 10,
        "layer_namescope" : "model_dense_final",
        "layer_activation" : "softplus"
    }
]

Let’s examine the above config file. Each dictionary in the list has an attribute called “layer_type”. This attribute determines which parameters need to be added for that layer. So for “layer_type”: “conv” we have the parameters needed by the Conv2D layer. Similarly, for “layer_type”: “dense” the parameters are the ones the Dense layer needs.
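Since “layer_type” decides which keys a dictionary should carry, a small check can catch a forgotten parameter before any model is built. The sketch below is not part of the original code; the `REQUIRED_KEYS` table and the `validate_layer` helper are hypothetical names, and the key sets are simply read off the config shown above.

```python
# Keys every layer entry carries in the config above, except the input layer's own fields.
COMMON_KEYS = {"layer_type", "layer_namescope"}

# Extra keys expected per layer type, read off the example config.
REQUIRED_KEYS = {
    "input": {"input_width", "input_height", "input_channels"},
    "conv": {
        "layer_num_filters",
        "layer_kernel_size",
        "layer_activation",
        "layer_strides",
        "layer_padding",
    },
    "maxpool": {"layer_pool_size", "layer_padding"},
    "BatchNorm": set(),
    "relu": set(),
    "dense": {"layer_units", "layer_activation"},
}


def validate_layer(layer):
    """Return a sorted list of the keys missing from one layer entry."""
    layer_type = layer.get("layer_type")
    if layer_type not in REQUIRED_KEYS:
        raise ValueError(f"Unknown layer_type: {layer_type!r}")
    required = COMMON_KEYS | REQUIRED_KEYS[layer_type]
    return sorted(required - set(layer))


# Example: a dense entry that forgot its activation.
missing = validate_layer(
    {"layer_type": "dense", "layer_namescope": "model_dense_final", "layer_units": 10}
)
```

An empty list means the entry has everything its layer type asks for.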

The basic idea is to have a main attribute, in our case “layer_type”, and then other attributes that depend on that main attribute. With this structure, many other layer types can be added to the config file in the same way.
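For instance, extending the structure with new layer types could look like the sketch below. The “dropout” and “flatten” types and the `layer_rate` key are assumptions for illustration only; they do not appear in the original config, and the code that reads the config would need to know about them.

```python
# A shortened stand-in for the architecture list (one entry only).
base_arch = [
    {
        "layer_type": "conv",
        "layer_namescope": "model_block_1_conv3_1",
        "layer_num_filters": 32,
        "layer_kernel_size": 3,
        "layer_activation": "relu",
        "layer_strides": 1,
        "layer_padding": "same",
    },
]

# Hypothetical new entries: same main attribute, different dependent attributes.
extra_layers = [
    {
        "layer_type": "dropout",
        "layer_namescope": "model_block_1_dropout_1",
        "layer_rate": 0.25,  # assumed key name
    },
    {
        "layer_type": "flatten",
        "layer_namescope": "model_flatten_1",
    },
]


def layer_types(arch):
    """List the layer_type of every entry, in order."""
    return [layer["layer_type"] for layer in arch]


extended_arch = base_arch + extra_layers
```

Only the dictionaries change; the list-of-dictionaries shape stays the same no matter how many layer types are added.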

This config file can now be used as an input to our Model class in TensorFlow. Every time we need to change the model, we can make the change here in the config file and the code should be able to handle it on its own. This keeps the architecture readable and makes it easy to spot mistakes in our layer parameters, especially in deep networks.

Kudos! We have completed the first part of the tutorial. Part-2 will have the code that uses this config file, along with the GitHub link to the code so that you can try it yourself.

I know the tutorial is incomplete, but I hope it has helped in some way. If you like the tutorial, please like and share it with your friends, acquaintances, or anyone you think needs to know this. If you have any questions, put them in the comments section. If you think there is a mistake, let me know and I will look into it. If you have a better way to code this, please put a link to it in the comments section or reach out to me via email; it will be helpful for all of us.

Wishing you all happy coding! See you in the next tutorial. Ciao 🙂
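The actual model code is in Part-2, so the following is only a rough sketch of one way a config entry could be turned into layer arguments, not the implementation used there. It maps the config keys onto the keyword names that the Keras `Conv2D`, `MaxPooling2D` and `Dense` layers accept, without importing TensorFlow. The `KEY_MAP` table and the `to_keras_kwargs` helper are hypothetical names.

```python
# Config key -> keyword argument of the matching Keras layer.
KEY_MAP = {
    "conv": {
        "layer_num_filters": "filters",
        "layer_kernel_size": "kernel_size",
        "layer_strides": "strides",
        "layer_padding": "padding",
        "layer_activation": "activation",
    },
    "maxpool": {
        "layer_pool_size": "pool_size",
        "layer_padding": "padding",
    },
    "dense": {
        "layer_units": "units",
        "layer_activation": "activation",
    },
}

# Keras class each layer_type corresponds to (names only, nothing is imported).
CLASS_NAME = {"conv": "Conv2D", "maxpool": "MaxPooling2D", "dense": "Dense"}


def to_keras_kwargs(layer):
    """Translate one config entry into (class name, keyword arguments)."""
    layer_type = layer["layer_type"]
    if layer_type not in KEY_MAP:
        raise ValueError(f"No mapping defined for layer_type {layer_type!r}")
    key_map = KEY_MAP[layer_type]
    kwargs = {key_map[key]: value for key, value in layer.items() if key in key_map}
    kwargs["name"] = layer["layer_namescope"]
    return CLASS_NAME[layer_type], kwargs


# Example: the final dense layer from the config.
class_name, kwargs = to_keras_kwargs(
    {
        "layer_type": "dense",
        "layer_units": 10,
        "layer_namescope": "model_dense_final",
        "layer_activation": "softplus",
    }
)
```

With TensorFlow installed, something like `getattr(tf.keras.layers, class_name)(**kwargs)` could then build the layer, so changing the model would only mean editing the config.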
