Alternative title: Benchmarking

Overview by class of model

Classification

Accuracy
Precision
Recall (Sensitivity)
F1 Score

Regression

Mean Squared Error (MSE)
Root Mean Squared Error (RMSE)
Mean Absolute Error (MAE)

Clustering

Time-series

Mean Absolute Percentage Error (MAPE)
Mean Absolute Scaled Error (MASE)
Symmetric Mean Absolute Percentage Error (SMAPE)

NLP

Perplexity

Generative model

Negative log-likelihood

Accuracy

Accuracy measures the proportion of all predictions that the model gets right. It is calculated as the sum of true positives (TP) and true negatives (TN) divided by the total number of samples:

Accuracy = (TP + TN) / (TP + TN + FP + FN)

where TP is the number of true positives, TN is the number of true negatives, FP is the number of false positives, and FN is the number of false negatives.
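
As a minimal sketch of the accuracy formula, the function below takes the four confusion-matrix counts and returns accuracy. The function name `accuracy` and the example counts are assumptions made for illustration only; they are not from any particular library or dataset.

```python
def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Return (TP + TN) / (TP + TN + FP + FN)."""
    total = tp + tn + fp + fn
    if total == 0:
        # With no samples the ratio is undefined, so fail loudly.
        raise ValueError("accuracy is undefined when there are no samples")
    return (tp + tn) / total


# Hypothetical counts, chosen only to illustrate the arithmetic.
example_accuracy = accuracy(tp=40, tn=50, fp=5, fn=5)
print(example_accuracy)  # 0.9
```

Here 90 of the 100 samples are classified correctly, so accuracy is 0.9.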

Precision

Precision measures the proportion of true positives among all positive predictions made by the model. It is calculated as the number of true positives (TP) divided by the sum of true positives (TP) and false positives (FP):

Precision = TP / (TP + FP)
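
The precision formula can be sketched the same way. The function name `precision` and the counts are again hypothetical. Raising an error when the model makes no positive predictions is one possible convention, not the only one.

```python
def precision(tp: int, fp: int) -> float:
    """Return TP / (TP + FP)."""
    predicted_positive = tp + fp
    if predicted_positive == 0:
        # No positive predictions were made, so the ratio is undefined.
        raise ValueError("precision is undefined when nothing is predicted positive")
    return tp / predicted_positive


# Hypothetical counts: 50 positive predictions, 40 of them correct.
example_precision = precision(tp=40, fp=10)
print(example_precision)  # 0.8
```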

Negatives and positives

	Actual Negative (0)	Actual Positive (1)
Predicted Negative (0)	True negative	False negative
Predicted Positive (1)	False positive	True positive

False Negative Rate: FN / (TP + FN). Example of a false negative: a woman is pregnant, but the doctor says she isn’t.

False Positive Rate: FP / (FP + TN). Example of a false positive: the doctor says a woman is pregnant, but she isn’t.
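
To connect the table to the two rates, here is a rough sketch that tallies the four cells from lists of actual and predicted labels (0 = negative, 1 = positive) and then computes the false negative rate FN / (TP + FN) and the false positive rate FP / (FP + TN). The helper names and the label lists are made up for illustration.

```python
from collections import Counter


def confusion_counts(actual, predicted):
    """Tally TP, TN, FP and FN from parallel lists of 0/1 labels."""
    pairs = Counter(zip(actual, predicted))  # keys are (actual, predicted)
    return {
        "TP": pairs[(1, 1)],
        "TN": pairs[(0, 0)],
        "FP": pairs[(0, 1)],  # actual negative, predicted positive
        "FN": pairs[(1, 0)],  # actual positive, predicted negative
    }


def false_negative_rate(tp: int, fn: int) -> float:
    """Share of actual positives that the model missed."""
    return fn / (tp + fn)


def false_positive_rate(fp: int, tn: int) -> float:
    """Share of actual negatives that the model flagged as positive."""
    return fp / (fp + tn)


# Hypothetical labels: four actual positives followed by four actual negatives.
actual = [1, 1, 1, 1, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 1, 0, 0, 0]

counts = confusion_counts(actual, predicted)
fnr = false_negative_rate(counts["TP"], counts["FN"])
fpr = false_positive_rate(counts["FP"], counts["TN"])
print(counts)  # {'TP': 3, 'TN': 3, 'FP': 1, 'FN': 1}
print(fnr, fpr)  # 0.25 0.25
```

Both rate functions assume their denominator is non-zero, meaning the data contains at least one actual positive and one actual negative.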
