Coursera Deep Learning Course 2

Training/Dev/Test Set
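What are the training set, dev set, and test set?
The training set is used to fit the model, the dev set (also called the validation set) is used to compare models and tune hyperparameters, and the test set gives an unbiased estimate of how the final model performs.
In the traditional approach, or when the dataset is small, we can use a 60-20-20 split into training set, dev set, and test set.
With big data, it is fine for the dev set and test set to be much less than 10 or 20 percent of the data. Even a 98-1-1 split is fine: with a million examples, 1 percent is still 10,000 examples.
One rule of thumb: the test set and dev set should come from the same distribution.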
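A minimal sketch of such a split, assuming the whole dataset fits in a Python list. The function name `split_dataset` and its parameters are made up for illustration. Shuffling once and then slicing means the dev and test sets are drawn from the same pool, which is one simple way to keep them on the same distribution.

```python
import random

def split_dataset(examples, dev_frac=0.01, test_frac=0.01, seed=0):
    """Shuffle once, then carve the dev and test sets out of the same pool."""
    items = list(examples)
    random.Random(seed).shuffle(items)
    n_dev = round(len(items) * dev_frac)
    n_test = round(len(items) * test_frac)
    dev = items[:n_dev]
    test = items[n_dev:n_dev + n_test]
    train = items[n_dev + n_test:]  # everything left over is training data
    return train, dev, test

# 98-1-1 split of 100,000 dummy examples
train, dev, test = split_dataset(range(100_000))
print(len(train), len(dev), len(test))  # 98000 1000 1000
```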

Bias and Variance
High bias means a high error rate on the training set. It is usually due to underfitting. To fix it we can change the neural network architecture, for example use a bigger network, or train for more iterations. High variance means the error rate on the dev set is much higher than on the training set. It is usually due to overfitting the training data. It can be reduced by getting more data and by regularization.
The bias-variance trade-off means that reducing one often increases the other, so the goal is to lower each without raising the other. Regularization is used to reduce variance. It may increase bias a little, but not by much if we have a big enough network.
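The diagnosis above can be sketched as a small helper that compares the training and dev error rates. The name `diagnose`, the `base_error` argument (the best error we could reasonably expect), and the 2 percent tolerance are assumptions chosen for illustration, not values from the course.

```python
def diagnose(train_error, dev_error, base_error=0.0, tol=0.02):
    """Label a model from its training and dev error rates (fractions in [0, 1])."""
    high_bias = (train_error - base_error) > tol      # fits the training set poorly
    high_variance = (dev_error - train_error) > tol   # large train-to-dev gap
    if high_bias and high_variance:
        return "high bias and high variance"
    if high_bias:
        return "high bias"
    if high_variance:
        return "high variance"
    return "low bias and low variance"

print(diagnose(0.01, 0.11))   # high variance
print(diagnose(0.15, 0.16))   # high bias
print(diagnose(0.15, 0.30))   # high bias and high variance
print(diagnose(0.005, 0.01))  # low bias and low variance
```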

L2 Regularization - For variance problem
It is used to avoid overfitting in the network.
In a neural network, the L2 penalty on a weight matrix is its squared Frobenius norm, that is, the sum of the squares of all its entries.
It is also known as 'weight decay', because the extra gradient term multiplies each weight by a factor slightly less than 1 on every update, so the weights shrink a little at each step.
The penalty is scaled by an extra term, the regularization parameter 'lambda'. So when tuning hyperparameters we should tune this one as well.
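For reference, the regularized cost and the resulting update can be written as follows. The notation is an assumption based on the usual conventions: m training examples, L layers, learning rate alpha, and dW_backprop for the gradient of the unregularized cost.

```latex
\begin{align*}
J(W, b) &= \frac{1}{m}\sum_{i=1}^{m} \mathcal{L}\big(\hat{y}^{(i)}, y^{(i)}\big)
          + \frac{\lambda}{2m}\sum_{l=1}^{L} \lVert W^{[l]} \rVert_F^2 \\
\lVert W^{[l]} \rVert_F^2 &= \sum_{i}\sum_{j} \big(w_{ij}^{[l]}\big)^2 \\
dW^{[l]} &= dW^{[l]}_{\text{backprop}} + \frac{\lambda}{m} W^{[l]} \\
W^{[l]} &:= W^{[l]} - \alpha\, dW^{[l]}
         = \Big(1 - \frac{\alpha\lambda}{m}\Big) W^{[l]} - \alpha\, dW^{[l]}_{\text{backprop}}
\end{align*}
```

The factor (1 - alpha * lambda / m) in the last line is slightly less than 1, which is where the name 'weight decay' comes from.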

Why L2 regularization, and how it helps:
* It penalizes the neural network for having large weights.
* The main idea is that a bigger, more flexible network is more likely to overfit the data.
* Hence L2 regularization pushes the weights toward zero, or in clearer terms it reduces the effect of many of the weights, so the network behaves like a smaller, simpler one.
* L2 regularization is a very powerful technique and it is used in most deep learning work.
* When we plot the cost against the number of gradient descent iterations, the cost should decrease monotonically, provided the plotted cost includes the regularization term.
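A minimal sketch of how the penalty enters one gradient descent step, using plain Python lists for a single weight matrix. The function names and the example numbers are made up for illustration. With the backprop gradient set to zero, the update does nothing but shrink every weight by the same factor, which isolates the effect of the L2 term.

```python
def l2_penalty(weights, lam, m):
    """(lam / (2 * m)) * sum of squared weights, for one weight matrix."""
    return lam / (2 * m) * sum(w * w for row in weights for w in row)

def l2_update(weights, grads, lam, m, lr):
    """One gradient step with the L2 term (lam / m) * w added to each gradient."""
    return [
        [w - lr * (g + lam / m * w) for w, g in zip(w_row, g_row)]
        for w_row, g_row in zip(weights, grads)
    ]

W = [[1.0, -2.0], [0.5, 4.0]]
zero_grads = [[0.0, 0.0], [0.0, 0.0]]

# With zero backprop gradients every weight is scaled by 1 - lr * lam / m = 0.995.
W_next = l2_update(W, zero_grads, lam=0.5, m=10, lr=0.1)
print(W_next)
print(l2_penalty(W, lam=0.5, m=10), l2_penalty(W_next, lam=0.5, m=10))
```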

Drop out regularization
This is another very powerful regularization method. We can implement dropout regularization in different ways. One of them is inverted dropout. In this method each hidden unit is kept with some probability (keep_prob) and otherwise removed from the network together with its connections, and the remaining activations are divided by keep_prob so that their expected value stays the same.
In practice, a different random set of nodes is set to zero for each training example and on each pass through the training set. That is called dropout.
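A minimal sketch of inverted dropout applied to the activations of one layer, written with the standard library only. The function name `inverted_dropout` and the use of a flat list of activations are simplifications chosen for illustration. The division by `keep_prob` is the "inverted" part: it keeps the expected activation unchanged, so nothing needs to be rescaled at test time.

```python
import random

def inverted_dropout(activations, keep_prob, rng):
    """Zero each unit with probability 1 - keep_prob, then rescale the survivors."""
    out = []
    for a in activations:
        if rng.random() < keep_prob:
            out.append(a / keep_prob)  # scale up so the expected value is unchanged
        else:
            out.append(0.0)            # this unit is dropped for this pass
    return out

rng = random.Random(0)
a = [1.0] * 10_000
dropped = inverted_dropout(a, keep_prob=0.8, rng=rng)
print(sum(dropped) / len(dropped))  # close to 1.0, the mean before dropout
```

Calling the function again with the same `rng` draws a new mask, so a different set of units is zeroed on each pass.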

Drop out
In dropout, no extra term is added to the cost function, although the keep probability is still a hyperparameter to tune. We are just eliminating random nodes. The main use of dropout is in computer vision, because in computer vision there is usually not enough data. So researchers expect overfitting and add dropout layers almost by default.

Data augmentation
If the neural network is overfitting, one way to reduce this is to add more data. But in computer vision, for example, the amount of available data is often small, so we can apply operations such as horizontal flipping to the existing images to enlarge the training set. This is called data augmentation.
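A minimal sketch of augmentation by horizontal flipping, assuming each image is stored as a list of pixel rows. The helper names are made up for illustration; a real pipeline would more likely use an image library. The label is copied unchanged, which assumes a mirrored image still belongs to the same class.

```python
def flip_horizontal(image):
    """Mirror an image stored as a list of pixel rows."""
    return [row[::-1] for row in image]

def augment_with_flips(images, labels):
    """Return the original examples plus one mirrored copy of each."""
    aug_images = list(images) + [flip_horizontal(img) for img in images]
    aug_labels = list(labels) + list(labels)
    return aug_images, aug_labels

image = [[1, 2, 3],
         [4, 5, 6]]
images, labels = augment_with_flips([image], ["cat"])
print(len(images), labels)  # 2 ['cat', 'cat']
print(images[1])            # [[3, 2, 1], [6, 5, 4]]
```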

Early stopping
Early stopping means stopping the training of the neural network early so that the weights of the network stay small. Since we initialize the weights to small values close to zero, after a small number of steps they are still small, and they grow as training continues. So if we stop training at that point, typically where the dev set error is lowest, the effect is similar to L2 regularization and it helps to reduce overfitting. But this is not an ideal approach since it breaks the orthogonalization principle, that is, separate tools for separate tasks: early stopping couples minimizing the cost with avoiding overfitting. In the course Andrew Ng prefers L2 regularization, although searching for a good lambda is a costly procedure.
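A minimal sketch of the stopping rule, assuming the dev set error has been recorded after every epoch. The function name and the `patience` parameter (how many epochs without improvement we tolerate) are assumptions chosen for illustration, not something the course prescribes.

```python
def early_stopping_epoch(dev_errors, patience=2):
    """Return the index of the epoch whose weights should be kept.

    Scans the dev errors in order and stops once the error has failed to
    improve for `patience` consecutive epochs.
    """
    best_epoch, best_error, waited = 0, float("inf"), 0
    for epoch, err in enumerate(dev_errors):
        if err < best_error:
            best_epoch, best_error, waited = epoch, err, 0
        else:
            waited += 1
            if waited >= patience:
                break  # stop training here and keep the best weights seen so far
    return best_epoch

dev_errors = [0.30, 0.22, 0.18, 0.17, 0.19, 0.21, 0.16]
print(early_stopping_epoch(dev_errors))  # 3
```

Note that training stops before the last value (0.16) is ever seen, which illustrates the cost of the method: stopping early can miss a better solution that longer training would have found.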
