Alternative title: Benchmarking
Overview by class of model
Classification
- Accuracy
- Precision
- Recall (Sensitivity)
- F1 Score
Regression
- Mean Squared Error (MSE)
- Root Mean Squared Error (RMSE)
- Mean Absolute Error (MAE)
Clustering
Time-series
- Mean Absolute Percentage Error (MAPE)
- Mean Absolute Scaled Error (MASE)
- Symmetric Mean Absolute Percentage Error (SMAPE)
NLP
- Perplexity
Generative model
- Negative log-likelihood
Accuracy
Accuracy measures the proportion of correct predictions made by the model. It is calculated as the sum of true positives (TP) and true negatives (TN) divided by the total number of samples:
Accuracy = (TP + TN) / (TP + TN + FP + FN)
where TP is the number of true positives, TN is the number of true negatives, FP is the number of false positives, and FN is the number of false negatives.
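As a minimal sketch of the formula above, the function below computes accuracy from the four counts. The function name `accuracy` and the example counts are illustrative assumptions, not part of any particular library.

```python
def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Return (TP + TN) / (TP + TN + FP + FN)."""
    total = tp + tn + fp + fn
    if total == 0:
        # With no samples the ratio is undefined, so fail loudly.
        raise ValueError("accuracy is undefined when there are no samples")
    return (tp + tn) / total


# Hypothetical counts: 90 correct predictions out of 100 samples.
example_accuracy = accuracy(tp=40, tn=50, fp=5, fn=5)
print(example_accuracy)  # 0.9
```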
Precision
Precision measures the proportion of true positives among all positive predictions made by the model. It is calculated as the number of true positives (TP) divided by the sum of true positives (TP) and false positives (FP):
Precision = TP / (TP + FP)
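The same formula can be sketched in a few lines of code. The function name `precision` is an illustrative assumption, and returning `None` when the model makes no positive predictions is one possible convention among several, since the ratio is undefined in that case.

```python
from typing import Optional


def precision(tp: int, fp: int) -> Optional[float]:
    """Return TP / (TP + FP), or None if there are no positive predictions."""
    predicted_positive = tp + fp
    if predicted_positive == 0:
        # Assumed convention: signal "undefined" instead of dividing by zero.
        return None
    return tp / predicted_positive


# Hypothetical counts: 50 positive predictions, 40 of them correct.
example_precision = precision(tp=40, fp=10)
print(example_precision)  # 0.8
```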
Negatives and positives
| | Actual Negative (0) | Actual Positive (1) |
|---|---|---|
| Predicted Negative (0) | True negative | False negative |
| Predicted Positive (1) | False positive | True positive |
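To show how the four cells of the table are filled in, here is a rough sketch that counts them from paired lists of actual and predicted labels, using the 0/1 encoding from the table. The function name `confusion_counts` and the example labels are hypothetical.

```python
def confusion_counts(y_true, y_pred):
    """Count TP, TN, FP and FN for binary labels (0 = negative, 1 = positive)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    counts = {"TP": 0, "TN": 0, "FP": 0, "FN": 0}
    for actual, predicted in zip(y_true, y_pred):
        if predicted == 1:
            # Predicted positive: correct only if the actual label is positive.
            counts["TP" if actual == 1 else "FP"] += 1
        else:
            # Predicted negative: correct only if the actual label is negative.
            counts["TN" if actual == 0 else "FN"] += 1
    return counts


# Hypothetical labels for eight samples.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 0, 1, 0]
print(confusion_counts(y_true, y_pred))
# {'TP': 3, 'TN': 3, 'FP': 1, 'FN': 1}
```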
False Negative Rate: FN / (TP + FN) - a woman is pregnant, but the doctor says she isn’t.
False Positive Rate: FP / (FP + TN) - the doctor says a woman is pregnant, but she isn’t.
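Both rates can be sketched directly from the counts. Note that each denominator is a sum (all actual positives for the false negative rate, all actual negatives for the false positive rate), so the parentheses matter. The function names and example counts below are illustrative assumptions.

```python
def false_negative_rate(fn: int, tp: int) -> float:
    """Return FN / (TP + FN): the share of actual positives that were missed."""
    actual_positive = tp + fn
    if actual_positive == 0:
        raise ValueError("FNR is undefined when there are no actual positives")
    return fn / actual_positive


def false_positive_rate(fp: int, tn: int) -> float:
    """Return FP / (FP + TN): the share of actual negatives flagged as positive."""
    actual_negative = fp + tn
    if actual_negative == 0:
        raise ValueError("FPR is undefined when there are no actual negatives")
    return fp / actual_negative


# Hypothetical counts: 50 actual positives and 50 actual negatives.
print(false_negative_rate(fn=10, tp=40))  # 0.2
print(false_positive_rate(fp=5, tn=45))   # 0.1
```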