Kolmogorov–Smirnov Test with Python

May 26, 2018


Previously we looked at tests that can be used only with normally distributed data. But what if we can't assume the data is normal? In this article, we will cover one of the most popular nonparametric tests: the Kolmogorov-Smirnov test (K-S test).

Statistic

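First, let’s introduce the statistic of the K-S test:

$$D_n = \sup_x |F_n(x) - F(x)|$$

Here F_n is the empirical distribution function of the sample and F is the distribution function of the population. In other words, we go through each point of the empirical distribution function of our sample and calculate the absolute difference between it and the corresponding value of the population distribution function. The maximum of those differences is the value of the statistic.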

Now let’s run a simulation. Suppose there is a gamma-distributed population and we have a sample from it. What is the K-S statistic?
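Here is a minimal sketch of such a simulation using only the standard library. It is a stand-in, not the original notebook: the shape of 2, the scale of 1, the seed, and the helper names `gamma2_cdf` and `ks_statistic` are all my assumptions, and the sample size of 80 is borrowed from the critical value used later in the article. Because the empirical distribution function jumps at every sample point, the sketch compares the population CDF with the value on both sides of each jump.

```python
import math
import random


def gamma2_cdf(x):
    """CDF of a Gamma(shape=2, scale=1) population (assumed for this sketch)."""
    if x <= 0:
        return 0.0
    return 1.0 - math.exp(-x) * (1.0 + x)


def ks_statistic(sample, cdf):
    """Largest absolute gap between the empirical CDF and the given CDF."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        # The empirical CDF jumps from i/n to (i+1)/n at x,
        # so the gap has to be checked on both sides of the jump.
        d = max(d, (i + 1) / n - f, f - i / n)
    return d


random.seed(0)  # arbitrary seed, only for reproducibility
sample = [random.gammavariate(2.0, 1.0) for _ in range(80)]
d_n = ks_statistic(sample, gamma2_cdf)
print(f"K-S statistic: {d_n:.4f}")
```

If SciPy is available, `scipy.stats.kstest(sample, "gamma", args=(2,))` should report the same statistic together with a p-value.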

(Embedded notebook: k_s_statistic.ipynb)

Kolmogorov distribution function

As you may remember from the previous article, we use the appropriate distribution to get a critical value, and then we can draw a conclusion. The Kolmogorov distribution is a tricky one, but anyway :) It is the distribution of the random variable

$$K = \sup_{t \in [0, 1]} |B(t)|$$

where B(t) is the Brownian bridge.
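To get a feel for this distribution, its CDF can be evaluated with the well-known alternating series P(K ≤ x) = 1 − 2 Σ (−1)^(k−1) exp(−2k²x²), summed over k = 1, 2, 3, and so on. The sketch below is my own illustration rather than the original notebook; the function name `kolmogorov_cdf` and the stopping tolerance are assumptions.

```python
import math


def kolmogorov_cdf(x, tol=1e-16, max_terms=10_000):
    """P(K <= x) for the Kolmogorov distribution via its alternating series."""
    if x <= 0:
        return 0.0
    total = 0.0
    for k in range(1, max_terms + 1):
        term = math.exp(-2.0 * k * k * x * x)
        # Odd k adds the term, even k subtracts it.
        total += term if k % 2 == 1 else -term
        if term < tol:
            break
    # Clamp to [0, 1] to absorb rounding noise for very small x.
    return min(1.0, max(0.0, 1.0 - 2.0 * total))


for x in (0.5, 1.0, 1.358, 2.0):
    print(f"P(K <= {x}) = {kolmogorov_cdf(x):.4f}")
```

The value at x = 1.358 comes out very close to 0.95, which is why 1.358 shows up as the large-sample critical value for a significance level of 0.05.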

(Embedded notebook: kolmogorov.ipynb)

The tricky part is getting the critical value for a specified significance level from this distribution. Therefore, I will take the critical value for a significance level of 0.05 and a sample size of 80 from a table; it is equal to 0.152. The null hypothesis is rejected if the statistic exceeds this value:

$$D_n > 0.152$$
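As a rough cross-check of the table value, the large-sample critical value can be approximated by keeping only the first term of the Kolmogorov series, which gives sqrt(−ln(α/2) / 2) / sqrt(n). This is an approximation I am adding for illustration, not the table lookup used above, and the names `ks_critical_value` and `reject_null` are hypothetical. The null hypothesis is rejected when the statistic is larger than the critical value.

```python
import math


def ks_critical_value(alpha, n):
    """Approximate critical value from the first term of the Kolmogorov series."""
    return math.sqrt(-0.5 * math.log(alpha / 2.0)) / math.sqrt(n)


def reject_null(d_n, alpha, n):
    """Reject the null hypothesis when the statistic exceeds the critical value."""
    return d_n > ks_critical_value(alpha, n)


crit = ks_critical_value(0.05, 80)
print(round(crit, 3))  # 0.152, matching the table value

# Made-up statistic values, only to show the decision rule.
print(reject_null(0.20, 0.05, 80))  # True: 0.20 is above the critical value
print(reject_null(0.10, 0.05, 80))  # False: 0.10 is below the critical value
```

The approximation works best for larger samples; for small samples, exact tables are the safer choice.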

Radzion.com
