How is the convolution operation carried out when multiple channels are present at the input layer (e.g. RGB)?

After doing some reading on the architecture and implementation of a CNN, I understand that each neuron in a feature map references NxM pixels of an image, as defined by the kernel size. Each pixel is then multiplied by the corresponding weight in the feature map's learned NxM weight set (the kernel/filter), the products are summed, and the sum is passed into an activation function. For a simple greyscale image, I imagine the operation would follow this pseudo-code:

    for i in range(0, image_width - kernel_width + 1):
        for j in range(0, image_height - kernel_height + 1):
            sum = 0.0  # reset the accumulator for each output position
            for x in range(0, kernel_width):
                for y in range(0, kernel_height):
                    sum += kernel[x,y] * image[i+x,j+y]
            feature_map[i,j] = act_func(sum)

However, I don't understand how to extend this model to handle multiple channels. Are three separate weight sets required per feature map, shared between each color?

Referencing this tutorial's 'Shared Weights' section: http://deeplearning.net/tutorial/lenet.html

Each neuron in a feature map references layer m-1, with colors being referenced from separate neurons. I don't understand the relationship they are expressing here. Are the neurons kernels or pixels, and why do they reference separate parts of the image?

Based on my example, it would seem that a single neuron's kernel is exclusive to a particular region in an image. Why have they split the RGB component over several regions?

1 Answer

Best answer
To answer your first question: when the input has multiple channels, each output feature map has one 2D kernel per input channel (plane). You perform each 2D convolution (2D input, 2D kernel) separately and sum the contributions across channels. The activation function is applied to that sum, not to each channel on its own, and the result is the output feature map.

On your second question: yes, three separate weight sets are required per feature map, one per color. The weights are not shared between colors. If you consider a given output feature map, you have 3 x 2D kernels (i.e. one kernel per input channel), and what is shared is each 2D kernel across every spatial position of its own input channel (R, G, or B here).

So the whole convolutional layer is a 4D tensor (number of input planes x number of output planes x kernel width x kernel height).

Why have they split the RGB component over several regions? The R, G, and B components are treated as separate input planes so that each plane can have its own weights.
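The per-channel sum described in this answer can be sketched by extending the question's pseudo-code with one extra loop over channels. This is a minimal pure-Python illustration, not a reference implementation: the function names `conv2d_multichannel` and `conv_layer`, the `bias` term, the default `tanh` activation, and the index order `weights[output][input][x][y]` are all assumptions chosen for the example.

```python
import math


def conv2d_multichannel(image, kernels, bias=0.0, act_func=math.tanh):
    """Compute ONE output feature map from a multi-channel image.

    image:   image[c][i][j], one 2D plane per input channel (e.g. R, G, B)
    kernels: kernels[c][x][y], one 2D kernel per input channel
    """
    channels = len(image)
    rows, cols = len(image[0]), len(image[0][0])
    k_rows, k_cols = len(kernels[0]), len(kernels[0][0])
    out_rows = rows - k_rows + 1
    out_cols = cols - k_cols + 1

    feature_map = [[0.0] * out_cols for _ in range(out_rows)]
    for i in range(out_rows):
        for j in range(out_cols):
            total = bias  # assumed: one bias per output feature map
            # Sum the contribution of every channel into the same accumulator.
            for c in range(channels):
                for x in range(k_rows):
                    for y in range(k_cols):
                        total += kernels[c][x][y] * image[c][i + x][j + y]
            # The activation is applied once, after all channels are summed.
            feature_map[i][j] = act_func(total)
    return feature_map


def conv_layer(image, weights, biases, act_func=math.tanh):
    """Apply a whole layer: weights[o][c][x][y] is the 4D weight tensor."""
    return [
        conv2d_multichannel(image, weights[o], biases[o], act_func)
        for o in range(len(weights))
    ]


# Usage: a 3x3 "RGB" image whose planes are filled with 1, 2 and 3,
# convolved with 2x2 kernels of ones and an identity activation.
image = [[[float(v)] * 3 for _ in range(3)] for v in (1, 2, 3)]
kernels = [[[1.0, 1.0], [1.0, 1.0]] for _ in range(3)]
fm = conv2d_multichannel(image, kernels, act_func=lambda s: s)
layer_out = conv_layer(image, [kernels, kernels], [0.0, 1.0], act_func=lambda s: s)
```

Each output value in `fm` is 4*1 + 4*2 + 4*3 = 24, which shows the three channel contributions landing in a single feature map. Real libraries differ in how they order the four tensor dimensions, so the `weights[o][c][x][y]` layout here is only one possible convention.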
