An efficient way to calculate the loss function batch-wise?

Question

asked Jan 29, 2022 in Education by JackTerrance

I am using autoencoders to do anomaly detection. I have finished training my model, and now I want to calculate the reconstruction loss for each entry in the dataset so that I can flag data points with a high reconstruction loss as anomalies.

This is my current code to calculate the reconstruction loss:

```python
for i in range(0, encoded_dataset.shape[0], batch_size):
    y_true = tf.convert_to_tensor(encoded_dataset[i:i+batch_size].values, np.float32)
    y_pred = tf.convert_to_tensor(ae1.predict(encoded_dataset[i:i+batch_size].values), np.float32)
    # Append the batch losses (numpy array) to the list
    reconstruction_loss_transaction.append(K.eval(loss_function(y_true, y_pred)))
```

But this is really slow. By my estimate, it will take 5 hours to go through the dataset, whereas training one epoch takes approximately 55 minutes, so I feel that prediction should not take 5 hours. I suspect that converting to tensors is the bottleneck, but I can't find a better way to do it. I've tried changing the batch size, but it does not make much of a difference. I have to keep the tensor conversion because K.eval throws an error if I pass the arrays directly.

encoded_dataset is a variable that holds the entire dataset in main memory as a data frame. I am using an Azure VM instance.

K.eval(loss_function(y_true, y_pred)) finds the loss for each row of the batch. y_true has shape (batch_size, 2000), and so does y_pred. K.eval(loss_function(y_true, y_pred)) gives me an output of shape (batch_size, 1), evaluating binary cross-entropy on each row of y_true and y_pred.

1 Answer

answered Jan 29, 2022 by JackTerrance

Best answer

The answer depends on how the loss function is implemented. Both ae1.predict and K.eval(loss_function(y_true, y_pred)) are valid TensorFlow operations, so the results are correct either way; what differs is whether the loss is reduced to a single value or kept as one value per sample.

During training you can either average the loss over the batch before taking the gradient, or take the gradient with respect to a vector of per-sample losses. In the second case TensorFlow's gradient operation aggregates the per-sample contributions for you. It sums them, so the result differs from the averaged version only by a constant factor.

If the Keras loss you are using has a reduce_mean built in, you can define your own loss that omits it. For example, with squared error, replacing 'mean_squared_error' with lambda y_true, y_pred: tf.square(y_pred - y_true) yields the element-wise squared error instead of the mean squared error. This changes the scale of the gradient but not its direction.
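For the scoring step in the question, one way to keep a per-row loss without calling K.eval in a loop is to compute the binary cross-entropy directly on NumPy arrays. The following is only a minimal sketch under stated assumptions: the helper names per_row_bce and reconstruction_losses are hypothetical, predict_fn stands in for something like ae1.predict, and the clipping constant 1e-7 is assumed to match the Keras default epsilon. Whether this removes the bottleneck on a given setup has not been measured here.

```python
import numpy as np


def per_row_bce(y_true, y_pred, eps=1e-7):
    """Binary cross-entropy averaged over the feature axis: one value per row.

    eps is an assumed clipping constant that keeps log() finite.
    """
    y_pred = np.clip(y_pred, eps, 1.0 - eps)
    elementwise = -(y_true * np.log(y_pred) + (1.0 - y_true) * np.log(1.0 - y_pred))
    return elementwise.mean(axis=1)


def reconstruction_losses(predict_fn, data, batch_size=4096):
    """Score every row of `data` in chunks and return a 1-D array of losses.

    predict_fn is a hypothetical callable mapping an array of shape
    (n, features) to reconstructions of the same shape.
    """
    losses = np.empty(len(data), dtype=np.float32)
    for start in range(0, len(data), batch_size):
        batch = data[start:start + batch_size]
        # No graph ops are created here, so the cost per chunk stays constant.
        losses[start:start + len(batch)] = per_row_bce(batch, predict_fn(batch))
    return losses
```

With the names from the question, a call might look like reconstruction_losses(ae1.predict, encoded_dataset.values.astype(np.float32)), after which the rows with the largest losses can be flagged as anomalies. The design choice is to leave only the forward pass to the model and do the cheap element-wise arithmetic in NumPy, so that no new TensorFlow ops are built on each iteration.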
