I have trouble understanding the difference (if there is one) between roc_auc_score() and auc() in scikit-learn. I'm trying to predict a binary output with imbalanced classes (around 1.5% for Y=1).

Classifier:

    model_logit = LogisticRegression(class_weight='auto')
    model_logit.fit(X_train_ridge, Y_train)

ROC curve:

    false_positive_rate, true_positive_rate, thresholds = roc_curve(Y_test, clf.predict_proba(xtest)[:,1])

AUCs:

    auc(false_positive_rate, true_positive_rate)
    Out[490]: 0.82338034042531527

and

    roc_auc_score(Y_test, clf.predict(xtest))
    Out[493]: 0.75944737191205602

Can somebody explain this difference? I thought both were just calculating the area under the ROC curve. It might be because of the imbalanced dataset, but I could not figure out why. Thanks!

1 Answer

Best answer
The AUC (Area Under the Curve) of the ROC (Receiver Operating Characteristic) curve is a common way to check or visualize the performance of a binary classifier. AUC is not always the area under an ROC curve, though. It is the area under some curve in general, so it is a broader notion than AUROC. With imbalanced classes, it may be better to compute the AUC of a precision-recall curve.

See the scikit-learn source for roc_auc_score (from an older release; the reorder argument to auc() has since been removed):

    def roc_auc_score(y_true, y_score, average="macro", sample_weight=None):

        def _binary_roc_auc_score(y_true, y_score, sample_weight=None):
            fpr, tpr, tresholds = roc_curve(y_true, y_score,
                                            sample_weight=sample_weight)
            return auc(fpr, tpr, reorder=True)

        return _average_binary_score(
            _binary_roc_auc_score, y_true, y_score, average,
            sample_weight=sample_weight)

As you can see, it first computes an ROC curve and then calls auc() to get the area. So both functions return the same number when they receive the same scores.

The difference in your results comes from the inputs: you pass predict_proba() scores to roc_curve() and auc(), but predict() labels to roc_auc_score(). Hard 0/1 labels collapse the ROC curve to a single threshold, so the area usually differs from the one computed on probabilities. If you use predict() in both places, the outputs are always the same. For example:

    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import roc_curve, auc, roc_auc_score

    # class_weight='auto' was replaced by 'balanced' in newer scikit-learn
    est = LogisticRegression(class_weight='balanced')
    X = np.random.rand(10, 2)
    y = np.random.randint(2, size=10)
    est.fit(X, y)

    # Both calls see the same hard labels, so both print the same value
    false_positive_rate, true_positive_rate, thresholds = roc_curve(y, est.predict(X))
    print(auc(false_positive_rate, true_positive_rate))
    print(roc_auc_score(y, est.predict(X)))

If you change the above to the following, which mirrors your code, you'll sometimes get different outputs:

    # Curve built from probabilities, score computed from hard labels: may differ
    false_positive_rate, true_positive_rate, thresholds = roc_curve(y, est.predict_proba(X)[:,1])
    print(auc(false_positive_rate, true_positive_rate))
    print(roc_auc_score(y, est.predict(X)))

To get matching values on the full ROC curve, pass predict_proba(X)[:,1] to roc_auc_score() as well.
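As a rough check of the point about inputs, here is a minimal sketch that computes the area both ways from probabilities and from hard labels. The synthetic data from make_classification, the helper name compare_auc, and the 95/5 class split are assumptions made for illustration; they are not taken from the question. It assumes a recent scikit-learn (0.20 or later, for balanced_accuracy_score).

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import auc, balanced_accuracy_score, roc_auc_score, roc_curve


def compare_auc(seed=0):
    """Return ROC areas computed from probabilities and from hard labels.

    Hypothetical helper: the data is synthetic and imbalanced (about 5% positives).
    """
    X, y = make_classification(
        n_samples=2000, n_features=10, weights=[0.95, 0.05], random_state=seed
    )
    est = LogisticRegression(class_weight="balanced", max_iter=1000).fit(X, y)

    proba = est.predict_proba(X)[:, 1]  # continuous scores for class 1
    labels = est.predict(X)             # hard 0/1 predictions

    def curve_area(scores):
        # Explicit two-step route: build the ROC curve, then integrate it.
        fpr, tpr, _ = roc_curve(y, scores)
        return auc(fpr, tpr)

    return {
        "auc_from_proba": curve_area(proba),
        "roc_auc_score_proba": roc_auc_score(y, proba),
        "auc_from_labels": curve_area(labels),
        "roc_auc_score_labels": roc_auc_score(y, labels),
        "balanced_accuracy": balanced_accuracy_score(y, labels),
    }


if __name__ == "__main__":
    for name, value in compare_auc().items():
        print(f"{name}: {value:.6f}")
```

The two probability-based values should agree with each other, and the two label-based values should agree with each other. The label-based area also matches the balanced accuracy, because a ROC curve built from 0/1 predictions has only one interior point, so its area is (TPR + TNR) / 2.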
