Write a function that lets the user input a vector of numerical values with no missing values ("the data") and a vector of 1's and 2's representing the two groups to be compared ("the treatments"). The number of 1's and 2's does not need to be equal. You may assume for now that treatment 2 has a higher mean than treatment 1.
The function will create the randomization distribution of differences and plot it in a histogram. It will use that distribution to calculate the p-value: the probability that a difference as large as the observed one (or larger) could have occurred by chance alone. It will print the observed difference and the p-value, both rounded to 4 digits, using the text:
"The observed difference is xxxx and the p-value is xxxx"
Using the two vectors below, I have worked out how to get the differences, but I do not know how to put this into a function and implement a randomization test.
dat <- c(1,4,2,5,2,4,8,6,9,7)
trt <- c(1,1,1,1,1,2,2,2,2,2)
How to find the observed difference: obsdiff <- mean(dat[trt == 2]) - mean(dat[trt == 1])
How to 'shuffle the treatments': trtsh <- sample(trt, size = length(trt))
How to find a difference simulated under the null hypothesis, i.e., difference in means for shuffled treatment 2 minus treatment 1: simdiff <- mean(dat[trtsh == 2]) - mean(dat[trtsh == 1])
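One way the three steps above could be wrapped into a function is sketched below. The function name `rand_test` and the arguments `nsim`, `seed`, and `tol` are my own assumptions, not part of the assignment; the small tolerance is only a guard against floating-point ties when comparing simulated and observed differences.

```r
# Hypothetical helper: rand_test, nsim, seed and tol are illustrative names.
rand_test <- function(dat, trt, nsim = 10000, seed = NULL, tol = 1e-8) {
  # Basic input checks: numeric data, no missing values, labels are 1 or 2.
  stopifnot(is.numeric(dat), !anyNA(dat),
            length(dat) == length(trt), all(trt %in% c(1, 2)))
  if (!is.null(seed)) set.seed(seed)

  # Observed difference: treatment 2 mean minus treatment 1 mean.
  obsdiff <- mean(dat[trt == 2]) - mean(dat[trt == 1])

  # Shuffle the treatment labels nsim times and recompute the difference.
  simdiffs <- replicate(nsim, {
    trtsh <- sample(trt)
    mean(dat[trtsh == 2]) - mean(dat[trtsh == 1])
  })

  # Histogram of the randomization distribution, observed value marked.
  hist(simdiffs, main = "Randomization distribution of differences",
       xlab = "Difference in means (treatment 2 - treatment 1)")
  abline(v = obsdiff, lty = 2)

  # One-sided p-value: share of simulated differences at least as large
  # as the observed one (assumes treatment 2 has the higher mean).
  pval <- mean(simdiffs >= obsdiff - tol)

  cat("The observed difference is", round(obsdiff, 4),
      "and the p-value is", round(pval, 4), "\n")

  invisible(list(obsdiff = obsdiff, pvalue = pval, simdiffs = simdiffs))
}

dat <- c(1, 4, 2, 5, 2, 4, 8, 6, 9, 7)
trt <- c(1, 1, 1, 1, 1, 2, 2, 2, 2, 2)
res <- rand_test(dat, trt, nsim = 10000, seed = 1)
```

Because the labels are shuffled at random, the printed p-value will vary slightly from run to run unless a seed is set; a larger `nsim` reduces that variation.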
The p-value using these vectors should be approximately 0.011 (it will vary slightly between runs because the shuffles are random).
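As a sanity check on that target value, the data set is small enough to enumerate every possible assignment of five observations to treatment 2 instead of sampling. This is a sketch of an exact check, not something the assignment asks for; the variable names `idx`, `alldiffs`, and `exact_p` are my own.

```r
dat <- c(1, 4, 2, 5, 2, 4, 8, 6, 9, 7)
trt <- c(1, 1, 1, 1, 1, 2, 2, 2, 2, 2)

obsdiff <- mean(dat[trt == 2]) - mean(dat[trt == 1])
n2 <- sum(trt == 2)

# Each column of idx is one possible set of positions for treatment 2.
idx <- combn(length(dat), n2)

# Difference in means (treatment 2 minus treatment 1) for every assignment.
alldiffs <- apply(idx, 2, function(i) mean(dat[i]) - mean(dat[-i]))

# Share of assignments with a difference at least as large as observed;
# the small tolerance guards against floating-point ties.
exact_p <- mean(alldiffs >= obsdiff - 1e-8)

ncol(idx)          # 252 possible assignments
round(exact_p, 4)  # 0.0119, i.e. 3 of the 252 assignments
```

A simulated p-value near 0.011 or 0.012 is therefore consistent with the exact value of 3/252.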