I built the same model with TensorFlow and PyTorch | Which Framework is better?

Video Statistics and Information

Captions
Which deep learning framework is the best? TensorFlow is the most used framework in production; PyTorch is the most popular framework among researchers. Comparing Google Trends, they are close together: TensorFlow came out earlier and was more popular the whole time, but PyTorch recently took the lead for the very first time. Based on GitHub stars, TensorFlow is still much more popular, with three times as many stars as PyTorch. In production systems TensorFlow is also still much more widely used and provides a great ecosystem, for example with TensorFlow Lite for mobile and edge devices or TensorFlow.js for web applications. However, PyTorch is trying to catch up in production, and the new PyTorch Live library, for example, looks very promising for mobile deployment.

So which one should you use? As always, the answer is not simple and depends on your requirements. Both frameworks have their pros and cons, and both are great. In this video I created the same convolutional neural net with both frameworks. In the process you'll learn how to use each framework, what the API looks like, and hopefully get a better feel for the trade-offs between them, so that in the end you can make the best choice for your next project.

The neural network we're going to build is a simple convolutional neural net to classify images. It consists of multiple convolutional layers followed by linear classification layers, with ReLU activation functions in between. We load a dataset, build and train the model, which should have the exact same architecture in both frameworks, and then we evaluate the model on a test set. This is what a typical deep learning application looks like. If you want to learn more, I have free courses for both TensorFlow and PyTorch on my channel that bring you from beginner to advanced. But now, without further ado, let's get started.

TensorFlow 1 was released in 2015 by Google and quickly became a big success. In 2019, TensorFlow 2 was released, which brought a major rework of the API and made it much more beginner friendly. TensorFlow 2 now offers two different kinds of APIs: the Sequential API, suited for beginners, and the subclassing API for experts. The Sequential API is much more popular, and this is what we're going to use in this video. It's based on the Keras API, another deep learning library that has been fully integrated into TensorFlow 2. It provides a high-level API very similar to the scikit-learn library and abstracts away a lot of the difficult stuff, so let's see how to use it.

I use a Google Colab here, so we don't have to worry about installation and we get a free GPU as well. The first thing to do is to change the runtime type and select GPU. Since we use a GPU, TensorFlow automatically performs all relevant operations on the GPU, and we don't have to worry about this at all in our code. Later in PyTorch, on the other hand, you will see that we do have to take care of this ourselves. One thing to note is that some parts of the video are shown at increased speed, so they don't represent the actual training time, but speed is roughly on the same level for both frameworks.

So let's import all the modules we need: here we import tensorflow and a few modules from tensorflow.keras, and we also use matplotlib to plot some images. As dataset we use one of the built-in datasets, the CIFAR-10 dataset. It consists of colored images of 10 different classes, and loading it automatically gives us training and testing images together with the corresponding labels. The only thing we do with it is normalize it to have values between 0 and 1. Then we plot the images, which can simply be done with matplotlib, and here we can see a grid of 25 different images.
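A minimal sketch of this loading and plotting step might look like the following (the exact plotting details in the video may differ slightly):

```python
import tensorflow as tf
from tensorflow.keras import datasets, layers, models
import matplotlib.pyplot as plt

# Load CIFAR-10: 50,000 training and 10,000 test images (32x32 pixels, 3 color channels)
(train_images, train_labels), (test_images, test_labels) = datasets.cifar10.load_data()

# Normalize pixel values from [0, 255] to [0, 1]
train_images, test_images = train_images / 255.0, test_images / 255.0

# Plot a 5x5 grid of the first 25 training images
plt.figure(figsize=(10, 10))
for i in range(25):
    plt.subplot(5, 5, i + 1)
    plt.xticks([])
    plt.yticks([])
    plt.imshow(train_images[i])
plt.show()
```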
Now let's build the model. This is the architecture we want: three convolutional layers with max pooling in between, followed by two linear layers for classification at the end, and of course we also want activation functions in between. As mentioned before, we use the Sequential API, where we can add all the layers we want. We start by adding a convolutional layer, where we define the output size and the kernel size; for the first layer we also specify the input shape, and we can simply specify an activation function by using the corresponding name as a string, in this case 'relu'. We add the next layer in the same way, this time a max pooling layer, and then two more convolutional layers and one more max pooling layer. After all the convolutions we add a flattening layer and then a dense layer, again with a ReLU activation function; a dense layer is simply a fully connected linear layer. At the very end we add another dense layer with 10 outputs for our 10 classes. Note that here we don't use an activation function and instead return the raw values after the final layer.

Before we can use the model, we have to call model.compile. This needs an optimizer and a loss function. For the optimizer we can again simply use a string, in this case 'adam' for the Adam optimizer. For the loss we create an object of the categorical cross-entropy loss, and notice that we use from_logits=True because we don't have an activation function in the last layer. For the optimizer we could also create an actual object to have more control over the parameters, but we keep it as simple as possible. We can also pass in a list of metrics that should be tracked during training, like here the accuracy.

Now training is done in one line by calling model.fit with the training images and labels. We can specify the number of epochs, and by specifying a validation split, TensorFlow automatically splits the training images into training and validation sets for us, so we get an evaluation on the validation data during training. I can already tell you that the training won't be as simple as that with PyTorch later. Now we can inspect the loss and accuracy for both the training and validation sets; let's keep those numbers in mind and compare them with PyTorch later. The fit method also gives us a history object back, which we can use, for example, to plot the accuracy over all epochs. This is another neat feature in TensorFlow that tracks these metrics automatically. Evaluating the model on the test data is another simple one-liner: we call model.evaluate with the test images and labels and can then print the test loss and test accuracy. And that's it.
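Put together, and continuing from the loading snippet above, the whole TensorFlow side might look roughly like this. The layer sizes are an assumption (they follow the standard Keras CIFAR-10 tutorial and may differ from the video), and since the CIFAR-10 labels are integer class indices, the sparse variant of the categorical cross-entropy loss fits here:

```python
# Build the model with the Sequential API
model = models.Sequential([
    layers.Conv2D(32, (3, 3), activation='relu', input_shape=(32, 32, 3)),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation='relu'),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation='relu'),
    layers.Flatten(),
    layers.Dense(64, activation='relu'),
    layers.Dense(10),  # raw logits, no activation in the last layer
])

# Compile with optimizer, loss, and the metrics to track
model.compile(
    optimizer='adam',
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=['accuracy'],
)

# Train; validation_split carves a validation set out of the training data automatically
history = model.fit(train_images, train_labels, epochs=10, validation_split=0.1)

# Evaluate on the test set
test_loss, test_acc = model.evaluate(test_images, test_labels, verbose=2)
print(test_loss, test_acc)
```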
Before we jump to the PyTorch code, I'd like to give a quick shout-out to today's sponsor, Intel. Intel is helping developers like me build, deploy, and scale AI solutions with a vast amount of hardware and software tools. For example, their Intel AI software portfolio lets you build upon the existing AI software ecosystem, and you can get 10x or even 100x optimizations for popular frameworks like PyTorch or TensorFlow. I also love their oneAPI AI Analytics Toolkit for end-to-end performance on AI workloads, and the OpenVINO toolkit to deploy high-performance inference applications from devices to the cloud. Their goal is to make it as seamless as possible for everyone, so check out their toolkits and resources with the link in the description. Thanks again so much to Intel for sponsoring this video.

PyTorch was first released in 2016 by Facebook's AI research lab and quickly became very popular among scientists and researchers. It requires you to write more code, which you will notice in a moment. However, its API is very well designed, with a lot of people saying it feels more Pythonic compared to TensorFlow. Once you know your way around it, it allows you to easily modify the code on a lower level and gives you a lot of flexibility while still not feeling too complicated.

PyTorch also comes pre-installed in Colab, so we can import it right away. We're going to use a few modules from torch, torchvision, and torch.nn. In this case we have to take care of managing operations on the correct device ourselves, so we check if we have a GPU available; since the GPU is turned on in this Colab, the device name is 'cuda' here. I define the batch size as a hyperparameter; 32 was also the default value under the hood in TensorFlow. To load the dataset, we define a transforms object: this transforms the images to a PyTorch tensor and also applies normalization. As training dataset we can again use CIFAR-10 from the built-in datasets module. Then we create a DataLoader object; this is the heart of the PyTorch data loading utility and provides optimized loading and iteration over the dataset. We do the same for the test data loader; the only difference here is that the train and shuffle parameters are set to False. Like before, I have some helper code to plot an image grid with matplotlib; to get access to the data, we create an iterator over the loader and grab the first batch.

Now let's create the model. Every model in PyTorch is implemented as a class that inherits from nn.Module, and it needs to implement the __init__ and the forward function. In the __init__ function we simply create all the layers we want: all the convolutional layers, the linear layers, and the max pooling layer. One thing to note compared to TensorFlow is that here we also need to specify the correct input shapes for each layer, so you first have to check how each layer affects the tensor size. The forward function then defines what code is executed in the forward pass; here we basically call all the layers one after another. We also apply the ReLU activation function where needed by calling F.relu, and we use torch.flatten before the linear layers. This should give the same model architecture as before. We need more code, but we also get more flexibility with this object-oriented approach; for example, if we want to log information or dump data during the forward pass, we could very easily add that code here.
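A sketch of this setup, assuming the conv-pool-conv-pool-conv architecture described above (the layer sizes mirror my TensorFlow sketch and are an assumption; the normalization constants are a common choice and may differ from the video):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
import torchvision
import torchvision.transforms as transforms

# In PyTorch we manage the device ourselves
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

batch_size = 32

# Convert images to tensors and normalize each channel
transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)),
])

train_dataset = torchvision.datasets.CIFAR10(root='./data', train=True,
                                             download=True, transform=transform)
test_dataset = torchvision.datasets.CIFAR10(root='./data', train=False,
                                            download=True, transform=transform)

train_loader = torch.utils.data.DataLoader(train_dataset, batch_size=batch_size, shuffle=True)
test_loader = torch.utils.data.DataLoader(test_dataset, batch_size=batch_size, shuffle=False)

class ConvNet(nn.Module):
    def __init__(self):
        super().__init__()
        # Shapes must be tracked manually: 3x32x32 input
        self.conv1 = nn.Conv2d(3, 32, 3)
        self.pool = nn.MaxPool2d(2, 2)
        self.conv2 = nn.Conv2d(32, 64, 3)
        self.conv3 = nn.Conv2d(64, 64, 3)
        self.fc1 = nn.Linear(64 * 4 * 4, 64)
        self.fc2 = nn.Linear(64, 10)  # raw logits, like the TensorFlow model

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))  # -> 32x15x15
        x = self.pool(F.relu(self.conv2(x)))  # -> 64x6x6
        x = F.relu(self.conv3(x))             # -> 64x4x4
        x = torch.flatten(x, 1)               # flatten all dims except batch
        x = F.relu(self.fc1(x))
        x = self.fc2(x)
        return x
```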
After creating the class we then create a model object, and since we use a GPU, we also have to push it to the GPU by calling .to(device). We also need a loss and an optimizer like before; here we cannot simply use a string for the Adam optimizer and need to create an actual object from the corresponding API class.

And now let's do the training. Remember how this was a one-liner with model.fit in TensorFlow? Well, here we have to put in some more work and write the training loop ourselves, but again, this gives you more flexibility if you want to add custom behavior. For a basic PyTorch training loop we need the following: I first store the number of iterations per epoch, and then we have an outer loop over the number of epochs (we use 10 like before). We keep track of a running loss, which will give us the average loss for each epoch, and then we have an inner loop over the train loader, which iterates over all the batches of the training dataset and returns the images and the labels. Again we have to push them to the GPU here, so that both the model and the data are on the same device. Then we call the model and pass the inputs to it; this executes the forward function under the hood and thus the forward pass. With the outputs and the original labels we then calculate the loss, and then we have to do the backward pass and call optimizer.step: loss.backward performs the backpropagation, and optimizer.step performs the update step for the weights. We also have to empty the optimizer gradients before each iteration. I won't go into more detail here, but basically these are all the steps needed for a training loop. Then we can update the running loss and, after each epoch, print the average epoch loss to see some information during training. Again, this needs more code, but the steps should be clear, and they allow for much more customization.

Before we go to the evaluation, I want to mention that TensorFlow of course also allows customization on a lower level; remember the subclassing API I mentioned in the beginning? That is basically what you have to use if you want a similar level of customization in TensorFlow. I also need to mention that with TensorFlow we got automatic training and validation splitting during training, just with the validation_split argument. Here we would have to implement this ourselves by using a third validation data loader and applying it in the training loop. I did not do this here, so the training was done on the full training dataset. This is the one thing that is different, could skew the results slightly, and should be pointed out.

And now the last step: evaluation. For this we set the model to eval mode and use 'with torch.no_grad()'; this turns off gradient tracking, since we no longer need it here. Then we do a similar loop as before, but now over the test loader. Again we push the tensors to the correct device and do the forward pass by calling the model, and then we get the actual class labels by calling torch.max. We then count the number of correct predictions, and in the end we can calculate and print the final test accuracy.
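A minimal sketch of these two loops, continuing from the snippet above and following the steps just described (the learning rate is Adam's default, and the exact logging in the video may differ):

```python
model = ConvNet().to(device)       # push the model to the GPU

criterion = nn.CrossEntropyLoss()  # expects raw logits
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)

num_epochs = 10
n_total_steps = len(train_loader)  # iterations per epoch

for epoch in range(num_epochs):
    running_loss = 0.0
    for images, labels in train_loader:
        # Model and data must live on the same device
        images, labels = images.to(device), labels.to(device)

        outputs = model(images)             # forward pass (calls forward() under the hood)
        loss = criterion(outputs, labels)

        optimizer.zero_grad()               # empty gradients before each step
        loss.backward()                     # backpropagation
        optimizer.step()                    # weight update

        running_loss += loss.item()
    print(f'epoch {epoch + 1}, loss: {running_loss / n_total_steps:.3f}')

# Evaluation: eval mode and no gradient tracking
model.eval()
with torch.no_grad():
    n_correct, n_samples = 0, 0
    for images, labels in test_loader:
        images, labels = images.to(device), labels.to(device)
        outputs = model(images)
        _, predicted = torch.max(outputs, 1)  # class with the highest logit
        n_samples += labels.size(0)
        n_correct += (predicted == labels).sum().item()
print(f'test accuracy: {n_correct / n_samples:.3f}')
```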
Let's look at the training loss and the test accuracy for both frameworks side by side. PyTorch's test accuracy was 0.72 and TensorFlow's was 0.69; the training loss in PyTorch was 0.57 and in TensorFlow 0.61. Both times PyTorch is slightly better, but again, the results could be slightly skewed since I did not use a validation split in PyTorch. Overall I'd say that performance-wise both frameworks are on the same level.

Alright, that's it! I hope this gave you a good overview of both frameworks. Let me know in the comments which one is your favorite. Again, I have courses for both libraries on my channel, and I put the links in the description below. If you enjoyed the video, please leave me a like, and I hope to see you in the next video. Bye!
Info
Channel: Patrick Loeber
Views: 86,139
Keywords: Python
Id: ay1E1f8VqP8
Length: 13min 33sec (813 seconds)
Published: Sun Apr 24 2022