Introduction to Non-parametric Statistical Significance Tests in Python
In applied machine learning, we often need to determine whether two data samples have the same or different distributions.
We can answer this question using statistical significance tests, which quantify how likely the observed samples would be if they had been drawn from the same distribution.
If the data does not have the familiar Gaussian distribution, we must resort to nonparametric versions of the significance tests. These tests operate in a similar manner,
but are distribution-free, requiring that real-valued data first be transformed into rank data before the test can be performed.
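To make the rank transform concrete, here is a minimal sketch using SciPy's rankdata() function. The sample values are made up for illustration; in practice the test functions in scipy.stats perform this ranking internally, so you rarely need to do it by hand.

```python
# Illustrative only: convert real-valued observations into ranks.
import numpy as np
from scipy.stats import rankdata

# Hypothetical observations (note the tied value 1.5)
values = np.array([3.2, 1.5, 4.8, 1.5, 2.9])

# By default, tied values each receive the average of the ranks they span
ranks = rankdata(values)
print(ranks)  # [4.  1.5 5.  1.5 3. ]
```

The smallest value gets rank 1 and the largest gets rank 5; the two tied observations share ranks 1 and 2, so each is assigned 1.5.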
In this tutorial, you will discover nonparametric statistical tests that you can use to determine if data samples were drawn from populations with the same or different distributions.
After completing this tutorial, you will know:
The Mann-Whitney U test for comparing independent data samples: the nonparametric version of the Student t-test.
The Wilcoxon signed-rank test for comparing paired data samples: the nonparametric version of the paired Student t-test.
The Kruskal-Wallis H and Friedman tests for comparing more than two data samples: the nonparametric versions of the ANOVA and repeated measures ANOVA tests, respectively.
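As a quick preview, the sketch below shows how each of these tests might be called through scipy.stats. The three samples are synthetic, non-Gaussian data invented for this example (the variable names, seed, and offsets are assumptions), and the paired tests are applied to them only to show the shape of the calls, not because the data is genuinely paired.

```python
# Preview sketch: calling the four nonparametric tests in SciPy.
import numpy as np
from scipy.stats import friedmanchisquare, kruskal, mannwhitneyu, wilcoxon

# Synthetic uniform (non-Gaussian) samples with slightly shifted locations
rng = np.random.default_rng(1)
sample1 = 50 + 10 * rng.random(100)
sample2 = 51 + 10 * rng.random(100)
sample3 = 52 + 10 * rng.random(100)

results = {
    # Two independent samples
    "Mann-Whitney U": mannwhitneyu(sample1, sample2, alternative="two-sided"),
    # Two paired samples (must have equal length)
    "Wilcoxon signed-rank": wilcoxon(sample1, sample2),
    # Two or more independent samples
    "Kruskal-Wallis H": kruskal(sample1, sample2, sample3),
    # Three or more paired samples (must have equal length)
    "Friedman": friedmanchisquare(sample1, sample2, sample3),
}

for name, res in results.items():
    print(f"{name}: statistic={res.statistic:.3f}, p={res.pvalue:.3f}")
```

Each call returns a test statistic and a p-value. A common convention is to reject the null hypothesis of identical distributions when the p-value falls below a chosen significance level such as 0.05.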