Thursday, 2 March 2017

Statistical testing for unique combinations from a data frame

For the example data frame below, I want to perform statistical tests (e.g. a t-test) for every unique DRUG - ADR combination. For this I need the following:

1) a vector of X for each unique DRUG - ADR combination

2) If my DRUG - ADR combination of interest is D1 - A1, I want to test its vector of X (here 34) against the vectors for:

  • D1 with all ADRs other than A1 (in the example D1 - A2, X = 37)
  • A1 with all DRUGs other than D1 (in the example D4 - A1, X = 65)

This procedure should loop through all records in the data frame and should disregard the ID variable, since one ID can have several DRUG - ADR combinations. Obviously, my real dataset is much larger, and the resulting vectors of X will contain more than one value.

dat <- data.frame(ID=c(1,2,3,4,4,4,5,6,6,7),
                  DRUG=c("D1","D2","D2","D3","D3","D3","D5","D1","D4","D2"),
                  ADR=c("A1","A2","A3","A6","A7","A8","A4","A2","A1","A5"),
                  X=c(34,76,34,45,2,41,56,37,65,12))


   ID DRUG ADR  X
1   1   D1  A1 34
2   2   D2  A2 76
3   3   D2  A3 34
4   4   D3  A6 45
5   4   D3  A7  2
6   4   D3  A8 41
7   5   D5  A4 56
8   6   D1  A2 37
9   6   D4  A1 65
10  7   D2  A5 12
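
To make the request concrete, here is a minimal base-R sketch of the procedure described above. It is only one possible reading of it, not a definitive solution. The names `safe_p`, `combos`, and `results` are hypothetical and introduced purely for illustration. The sketch assumes Welch's t-test via `t.test()` and returns `NA` whenever either vector has fewer than two values, which is the case for every combination in this toy example.

```r
# Example data from the question (repeated so the block runs on its own)
dat <- data.frame(ID   = c(1,2,3,4,4,4,5,6,6,7),
                  DRUG = c("D1","D2","D2","D3","D3","D3","D5","D1","D4","D2"),
                  ADR  = c("A1","A2","A3","A6","A7","A8","A4","A2","A1","A5"),
                  X    = c(34,76,34,45,2,41,56,37,65,12),
                  stringsAsFactors = FALSE)

# Hypothetical helper: t-test p-value, or NA when a test cannot be computed
# (fewer than two values in a group, or constant data)
safe_p <- function(a, b) {
  if (length(a) < 2 || length(b) < 2) return(NA_real_)
  tryCatch(t.test(a, b)$p.value, error = function(e) NA_real_)
}

# One row per unique DRUG - ADR combination; ID is deliberately ignored
combos <- unique(dat[, c("DRUG", "ADR")])

results <- do.call(rbind, lapply(seq_len(nrow(combos)), function(i) {
  d <- combos$DRUG[i]
  a <- combos$ADR[i]
  x_target     <- dat$X[dat$DRUG == d & dat$ADR == a]  # combination of interest
  x_other_adr  <- dat$X[dat$DRUG == d & dat$ADR != a]  # same DRUG, other ADRs
  x_other_drug <- dat$X[dat$DRUG != d & dat$ADR == a]  # same ADR, other DRUGs
  data.frame(DRUG = d, ADR = a,
             n_target        = length(x_target),
             n_other_adr     = length(x_other_adr),
             n_other_drug    = length(x_other_drug),
             p_vs_other_adr  = safe_p(x_target, x_other_adr),
             p_vs_other_drug = safe_p(x_target, x_other_drug),
             stringsAsFactors = FALSE)
}))

print(results)
```

On the real, larger dataset the two `p_*` columns would hold p-values wherever both vectors contain at least two non-constant values. Since this runs one test per combination and comparison, some multiple-testing correction such as `p.adjust()` is probably worth considering.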

Looking forward to your suggestions!
