testing: KS-Test with discrete distributions

I want to run a Kolmogorov-Smirnov test to check whether my sample comes from a discrete uniform distribution. More specifically, I use the KS test in the context of Benford's Law, which implies that the third or fourth digits of numbers should (approximately) follow a discrete uniform distribution.

Basically, my sample looks like this:

x = c(0,0,1,1,2,2,3,3,4,4,5,5,6,6,7,7,8,8,8,9,9)

I've noticed the disc_ks_test function of the KSgeneral package, which runs the KS test for discrete distributions. I've also noticed that the more common ks.test function, in the version from the dgof package, can now test discrete distributions as well (based on the paper by Arnold/Emerson). My problem is that the two tests give different test statistics (and p-values), and I am not sure which one is correct. In this example:

ks.test(x,ecdf(0:9))
D = 0.1381, p-Value = 0.8181

and

disc_ks_test(x,ecdf(0:9))
D = 0.038095, p-Value = 0.9996

So, I calculated the test by hand in Excel and figured out where the functions differ:

Calculations of D in Excel

I am pretty sure that for continuous distributions the test statistic D is the supremum (or maximum) of the last two columns of the Excel spreadsheet (that's what ks.test is doing). disc_ks_test just takes the maximum of |F0 - Fn| as the test statistic, but its results for real data are much more consistent with the results of other tests (chi-square). Now I wonder which R function is correct, or whether there is a theoretical explanation for why the KS test statistic is calculated differently when testing against discrete distributions.
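
For reference, both reported values of D can be reproduced in base R without either package. This is only a minimal sketch of the two calculations as I understand them from my spreadsheet, not the packages' actual code; the names D_cont and D_disc are my own labels.

```r
# Sample from above and the hypothesised CDF (discrete uniform on 0..9)
x  <- c(0,0,1,1,2,2,3,3,4,4,5,5,6,6,7,7,8,8,8,9,9)
F0 <- ecdf(0:9)
n  <- length(x)

# (a) Continuous-case shortcut: compare F0 at each sorted observation
#     with the empirical CDF just after (i/n) and just before ((i-1)/n) it
Fx     <- F0(sort(x))
D_cont <- max(c((1:n) / n - Fx, Fx - (0:(n - 1)) / n))

# (b) Maximum of |Fn - F0| evaluated at the support points 0..9
Fn     <- ecdf(x)
D_disc <- max(abs(Fn(0:9) - F0(0:9)))

round(c(D_cont = D_cont, D_disc = D_disc), 6)
```

Variant (a) gives about 0.1381 and variant (b) about 0.0381, matching the two outputs above. With tied observations, (a) compares F0 at a tied value with the empirical CDF before the whole block of ties has been counted (for example 0.9 at the value 8 against 16/21), which seems to be where the larger value comes from.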

Tuesday, 18 February 2020