feed forward neural network fails to classify due to dimensionality of biases

Question

asked Jan 31, 2022 in Education by JackTerrance

I'm making a basic feedforward neural network to solve the XOR gate problem. Standard settings: input layer + hidden layer + output layer, a constant learning rate of 0.01, and 500 epochs. Sigmoid activation all the way. Stochastic/gradient descent for backpropagation. The hidden layer has 2 neurons.

The input and output data:

    input = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
    output = [[0.0], [1.0], [1.0], [0.0]]

Now here's the problem: I know the bias is a (column) vector, and you complete one cycle (forward + back) on a single sample. The predictions after training look like this:

    ( 0.4954120458511844 )
    ( 0.5081637529087711 )
    ( 0.5153967874989785 )
    ( 0.5653967874989785 )

Compare that to when I set the bias as a matrix (number of rows is input.rows) and instead train on the full sample data per cycle. Then the predictions are:

    ( 0.18379659987542804 )
    ( 0.8220424701617579 )
    ( 0.8217815808742437 )
    ( 0.18653256456589742 )

Which are the correct ones? I can post the full code here, but I am certain the problem comes from the biases; I just don't know why.
EDIT: As I said in the comments, the cause may be in the backpropagation part (stochastic gradient descent). Here's the full code (yes, it's in Swift, don't ask why); I am using the Surge matrix library. It's LONG THOUGH:

    import Surge

    // XOR TABLE DATA
    let inputDataAsArray: [[Double]] = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
    let outputDataAsArray: [[Double]] = [[0.0], [1.0], [1.0], [0.0]]

    let inputData: Matrix<Double> = Matrix<Double>(inputDataAsArray)
    let outputData: Matrix<Double> = Matrix<Double>(outputDataAsArray)

    var inputData_samples : Array<Matrix<Double>> = Array()
    var outputData_samples : Array<Matrix<Double>> = Array()

    for i in 0..<inputDataAsArray.count {
        inputData_samples.append(Matrix<Double>([inputDataAsArray[i]]))
        outputData_samples.append(Matrix<Double>([outputDataAsArray[i]]))
    }

    let size = inputData.rows
    let neurons = 2 // NUMBER OF NEURONS IN HIDDEN LAYER

    var weights0 : Matrix<Double> = random(rows: inputData.columns, columns: neurons)
    var biases0 : Matrix<Double> = Matrix<Double>(rows: 1, columns: neurons, repeatedValue: 0.0)
    var weights1 : Matrix<Double> = random(rows: neurons, columns: outputData.columns)
    var biases1 : Matrix<Double> = Matrix<Double>(rows: 1, columns: outputData.columns, repeatedValue: 0.0)

    print("Running...")

    let alpha = 0.01
    let loops = size * 500
    var sampleIndex = 0

    for i in 0..<loops {
        let j = Int.random(in: ClosedRange<Int>(uncheckedBounds: (lower: 0, upper: size - 1)))
        let a0 = inputData_samples[j]
        let output = outputData_samples[j]

        let z1: Matrix<Double> = a0 * weights0 + biases0
        let a1: Matrix<Double> = sigmoidMatrix(x: z1)

        // LAYER 2
        let z2 : Matrix<Double> = a1 * weights1 + biases1
        let a2 : Matrix<Double> = sigmoidMatrix(x: z2)

        // let cost = cross_entropy(size: size, a: a2, y: output)

        // BACKPROPAGATION

        // LAYER 2
        var dz2 : Matrix<Double> = subtractMatrix(x: a2, y: output)
        let dw2 : Matrix<Double> = divideMatrix(x: transpose(a1) * dz2 , y: size)
        let db2 : Matrix<Double> = divideMatrix(x: dz2, y: size)

        // LAYER 1
        dz2 = dz2 * transpose(weights1)
        let dz1 : Matrix<Double> = sub(y: 1.0, x: a0) * transpose(a0) * dz2
        // multiply(x: part1, y: sub(y: 1.0, x: part2))
        let dw1 : Matrix<Double> = divideMatrix(x: transpose(a0) * dz1 , y: size)
        let db1 : Matrix<Double> = divideMatrix(x: dz1, y: size)

        weights0 = subtractMatrix(x: weights0, y: mul(alpha, x: dw1))
        biases0 = subtractMatrix(x: biases0, y: mul(alpha, x: db1))
        weights1 = subtractMatrix(x: weights1, y: mul(alpha, x: dw2))
        biases1 = subtractMatrix(x: biases1, y: mul(alpha, x: db2))
    }

    for sample in inputData_samples{
        let z1: Matrix<Double> = sample * weights0 + biases0
        let a1: Matrix<Double> = sigmoidMatrix(x: z1)
        let z2 : Matrix<Double> = a1 * weights1 + biases1
        let a2 : Matrix<Double> = sigmoidMatrix(x: z2)
        print(a2.description)
    }

1 Answer

answered Jan 31, 2022 by JackTerrance

Best answer

AND and OR are linearly separable, but XOR's outputs are not. Therefore, a single layer of weights is not enough: the network needs a hidden layer between the input and the output to solve it. It turns out that each node in the hidden layer can represent one of the simpler, linearly separable logical operations (AND, OR, NAND), and the output layer acts as another logical operation fed by the outputs of the previous layer.

To understand what logic the network uses to come up with its results, we need to analyze its weights (and biases). In TFLearn, we do that with model.get_weights(layer.W) to get the weights and model.get_weights(layer.b) to get the biases.

You can refer to this link for the code: https://towardsdatascience.com/tflearn-soving-xor-with-a-2x2x1-feed-forward-neural-network-6c07d88689ed
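
The decomposition into simpler logical operations can be sketched with a hand-wired 2-2-1 network. This is only an illustration under stated assumptions, not the trained network from the question or from the linked article: the weights and biases below are picked by hand (hypothetical values) so that the first hidden unit behaves like OR, the second like NAND, and the output unit like AND. It uses plain Swift arrays instead of Surge, and the name xorNetwork is made up for this example. Note that each layer's bias has one entry per neuron, regardless of how many samples are fed through.

```swift
import Foundation

// Logistic sigmoid, the same activation the question uses.
func sigmoid(_ z: Double) -> Double {
    return 1.0 / (1.0 + exp(-z))
}

// Hand-picked (NOT trained) parameters for a 2-2-1 network.
// Hidden unit 0 approximates OR, hidden unit 1 approximates NAND.
let hiddenWeights: [[Double]] = [[20.0, 20.0], [-20.0, -20.0]]
let hiddenBiases: [Double] = [-10.0, 30.0]   // one bias per hidden neuron

// The output unit approximates AND of the two hidden activations.
let outputWeights: [Double] = [20.0, 20.0]
let outputBias: Double = -30.0               // one bias for the single output neuron

// Forward pass for a single two-element input sample.
func xorNetwork(_ x: [Double]) -> Double {
    var hidden: [Double] = []
    for (w, b) in zip(hiddenWeights, hiddenBiases) {
        hidden.append(sigmoid(w[0] * x[0] + w[1] * x[1] + b))
    }
    let z = outputWeights[0] * hidden[0] + outputWeights[1] * hidden[1] + outputBias
    return sigmoid(z)
}

let inputs: [[Double]] = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
let predictions = inputs.map { xorNetwork($0) }

// Print each input next to the network's output for it.
for (x, p) in zip(inputs, predictions) {
    print(x, p)
}
```

With these values the outputs land close to 0, 1, 1, 0 for the four XOR inputs, since XOR(a, b) = AND(OR(a, b), NAND(a, b)). A trained network may settle on a different but equivalent combination of operations.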
