## NPTEL An Introduction to Artificial Intelligence Week 12 Assignment Answers 2024

1. Choose the CORRECT statement(s) –

- According to the Universality Theorem, we need at least 2 hidden layers to represent any continuous function to arbitrary closeness.
- With large training datasets, SGD converges faster than Batch Gradient Descent.
- tanh is simply a scaled linear transformation of the sigmoid function.
- A 2-layer perceptron with a step activation function can represent XOR.

Answer :-
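Two of the options above can be checked numerically. The sketch below is only an illustration: the hidden-unit thresholds (0.5 and 1.5) and the helper names `step` and `xor_two_layer` are assumptions chosen for this example and do not come from the course material. It builds XOR from an OR unit and an AND unit feeding one step output unit, and compares tanh(x) against 2·sigmoid(2x) − 1, the standard identity relating the two functions.

```python
import math

def sigmoid(x):
    """Logistic sigmoid."""
    return 1.0 / (1.0 + math.exp(-x))

def step(value, threshold):
    """Step activation: fire (1) when the weighted sum reaches the threshold."""
    return 1 if value >= threshold else 0

def xor_two_layer(x, y):
    """XOR as a 2-layer step network (weights and thresholds are hypothetical)."""
    h_or = step(x + y, 0.5)    # hidden unit 1 behaves like OR
    h_and = step(x + y, 1.5)   # hidden unit 2 behaves like AND
    return step(h_or - h_and, 0.5)  # output: OR and not AND

# Truth table of the hand-built network
xor_table = {(x, y): xor_two_layer(x, y) for x in (0, 1) for y in (0, 1)}

# Largest gap between tanh(x) and 2*sigmoid(2x) - 1 on a few sample points
sample_points = [-3.0, -1.0, 0.0, 0.5, 2.0]
max_tanh_gap = max(
    abs(math.tanh(x) - (2.0 * sigmoid(2.0 * x) - 1.0)) for x in sample_points
)

print(xor_table)
print(max_tanh_gap)
```

Note that the identity rescales the input as well as the output of the sigmoid, which is worth keeping in mind when reading the wording of the tanh option.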

2. What is the value of ∂L/∂y?

Answer :-

3. What is the value of ∂L/∂w1?

Answer :-

4. Now, suppose that we have a 2×3 matrix M, and we use 2×2 average pooling with stride 1 to get x = [x1, x2]. What is ∂L/∂M12? (M12 is the element of M in the 1st row and 2nd column.) Assume w1 = 0.5 and w2 = 1.

Answer :-
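The pooling step in this question can be sketched on its own. The network that maps x to the loss L is not reproduced on this page, so the code below covers only the pooling part: it computes x1 and x2 for a hypothetical matrix and the partial derivatives ∂x1/∂M12 and ∂x2/∂M12. The matrix values and the function names are assumptions made for illustration.

```python
def avg_pool_2x2_stride1(M):
    """2x2 average pooling with stride 1 over a list-of-lists matrix."""
    rows, cols = len(M), len(M[0])
    return [
        [
            (M[i][j] + M[i][j + 1] + M[i + 1][j] + M[i + 1][j + 1]) / 4.0
            for j in range(cols - 1)
        ]
        for i in range(rows - 1)
    ]

def pooling_grad_wrt_entry(rows, cols, r, c):
    """d(out[i][j]) / d(M[r][c]): 1/4 if the window at (i, j) covers (r, c), else 0."""
    return [
        [
            0.25 if (i <= r <= i + 1 and j <= c <= j + 1) else 0.0
            for j in range(cols - 1)
        ]
        for i in range(rows - 1)
    ]

M = [[1.0, 2.0, 3.0],
     [4.0, 5.0, 6.0]]  # hypothetical values, not from the assignment

x = avg_pool_2x2_stride1(M)                   # one row holding [x1, x2]
dx_dM12 = pooling_grad_wrt_entry(2, 3, 0, 1)  # M12 is row 0, column 1 (0-indexed)

print(x)
print(dx_dM12)
```

Because M12 lies inside both pooling windows, the chain rule gives ∂L/∂M12 = (1/4)·∂L/∂x1 + (1/4)·∂L/∂x2; the two remaining factors depend on the network defined for the earlier questions.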

5. Suppose we use a neural network to classify images of handwritten digits (0-9). Which of the following activation functions is most suitable at the output layer?

- Softmax
- ReLU
- Tanh
- Sigmoid

Answer :-
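For reference, here is a minimal sketch of how a softmax layer turns ten raw scores into a probability distribution over the digits 0-9. The logit values are made up for illustration, and subtracting the maximum before exponentiating is a common numerical-stability trick rather than something stated in the question.

```python
import math

def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    m = max(logits)  # shift by the max to avoid overflow in exp
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical scores for the digits 0..9
logits = [0.5, 2.0, -1.0, 0.0, 3.5, 1.0, -0.5, 0.2, 1.5, -2.0]

probs = softmax(logits)
predicted_digit = max(range(10), key=lambda k: probs[k])

print(probs)
print(predicted_digit)
```

Unlike per-unit sigmoid, tanh, or ReLU outputs, the softmax outputs are non-negative and sum to 1, so they can be read directly as class probabilities for a single-label, ten-class problem.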

6. Consider the following perceptron unit which takes two boolean (0/1) variables x and y as input and computes the boolean variable z according to the following function. Here w1 is the weight corresponding to x and w2 is the weight corresponding to y, T is the threshold of the perceptron.

Which of the following statement(s) is/are TRUE?

- If w1 = 1, w2 = 1 and T = 1.5, then z = x ∧ y
- If w1 = 1, w2 = 1 and T = 0.5, then z = x ∨ y
- If w1 = 0, w2 = -1 and T = 0.5, then z = ¬y
- If w1 = -1, w2 = 1 and T = 0.5, then z = x → y

Answer :-
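Each option can be tested by enumerating the four input pairs. The perceptron's function is not reproduced on this page, so the sketch below assumes the usual firing rule, z = 1 when w1·x + w2·y ≥ T and z = 0 otherwise; if the original figure defines the unit differently, the results change accordingly. The names `perceptron_table`, `cases`, and `matches` are illustrative only.

```python
from itertools import product

def perceptron_table(w1, w2, T):
    """Truth table of a threshold unit, ASSUMING z = 1 iff w1*x + w2*y >= T."""
    return {
        (x, y): int(w1 * x + w2 * y >= T)
        for x, y in product((0, 1), repeat=2)
    }

def target_table(fn):
    """Truth table of a boolean function over 0/1 inputs."""
    return {(x, y): int(fn(x, y)) for x, y in product((0, 1), repeat=2)}

# (label, (w1, w2, T), claimed boolean function) for the four options
cases = [
    ("x AND y", (1, 1, 1.5), lambda x, y: x and y),
    ("x OR y", (1, 1, 0.5), lambda x, y: x or y),
    ("NOT y", (0, -1, 0.5), lambda x, y: 1 - y),
    ("x IMPLIES y", (-1, 1, 0.5), lambda x, y: (not x) or y),
]

# True where the perceptron reproduces the claimed function on all inputs
matches = {
    label: perceptron_table(*params) == target_table(fn)
    for label, params, fn in cases
}

print(matches)
```

Under that assumed rule, the first two weight settings reproduce AND and OR, while the last two do not match ¬y and x → y. Since every weighted sum here is an integer and every threshold is a half-integer, using a strict `>` instead of `>=` would not change the outcome.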

7. By what factor does the size of the input decrease (i.e., what is the value of hbd/h'b'd', where the input is of size h × b × d and the output is of size h' × b' × d') after passing through a 3×3 MaxPool layer with stride 3? Size of the input is defined as height × breadth × depth.

(Round your answer off to the closest integer)

Answer :-
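The size ratio can be worked out from the usual output-length formula for pooling, floor((n − k)/s) + 1. The sketch below assumes no padding, pooling applied independently to each channel (so depth is unchanged), and a hypothetical 27 × 27 × 16 input whose height and breadth are divisible by 3; none of these specifics come from the question.

```python
def pooled_length(n, kernel=3, stride=3):
    """Output length along one spatial axis, assuming no padding."""
    return (n - kernel) // stride + 1

def size_reduction(h, b, d, kernel=3, stride=3):
    """Ratio (h*b*d) / (h'*b'*d') for a pooling layer applied per channel."""
    h_out = pooled_length(h, kernel, stride)
    b_out = pooled_length(b, kernel, stride)
    d_out = d  # pooling does not mix channels, so depth is unchanged
    return (h * b * d) / (h_out * b_out * d_out)

# Hypothetical input size: 27 x 27 x 16
ratio = size_reduction(27, 27, 16)

print(pooled_length(27))
print(ratio)
```

For inputs whose height or breadth is not a multiple of 3, the leftover border is dropped under this formula, so the ratio is only approximately the same, which is presumably why the question asks for rounding.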

8. Which of the following statement(s) is/are TRUE about DQNs for Atari games?

- They learn to play the games by watching videos of expert human players
- The DQN only has access to the video frames on the game screen and the reward function. It cannot directly observe the mechanics of the game simulator.
- They learn to play the games by random exploration
- The DQN only considers one video frame of the game at a time when playing the game

Answer :-

9. Which of the following is/are TRUE regarding Neural Networks?

- Deep Neural Networks do not require human experts to do feature engineering and can learn features on their own using training data.
- Deep Neural Networks work well in settings where training data is scarce.
- For the same number of parameters, in practice, neural networks that are tall + thin do better than networks that are fat + short.
- There is a certain class of continuous functions that can only be computed by tall + thin neural networks and not by fat + short ones, no matter how many neurons are available.

Answer :-

10. Which of the following is/are TRUE?

- Deep Neural Networks can learn to enhance the bias present in their training data
- Deep Neural Networks are inspired by human brains and hence learn human interpretable functions
- Deep Neural Networks are susceptible to adversarial attacks
- Deep Neural Networks are robust to all kinds of noises in input instances

Answer :-