Friday, 31 July 2020

Generating test data in R

I am trying to generate this table as one of the inputs to a test.

        id                 diff          d
 1:      1                    2 2020-07-31
 2:      1                    1 2020-08-01
 3:      1                    1 2020-08-02
 4:      1                    1 2020-08-03
 5:      1                    1 2020-08-04
 6:      2                    2 2020-07-31
 7:      2                    1 2020-08-01
 8:      2                    1 2020-08-02
 9:      2                    1 2020-08-03
10:      2                    1 2020-08-04
11:      3                    2 2020-07-31
12:      3                    1 2020-08-01
13:      3                    1 2020-08-02
14:      3                    1 2020-08-03
15:      3                    1 2020-08-04
16:      4                    2 2020-07-31
17:      4                    1 2020-08-01
18:      4                    1 2020-08-02
19:      4                    1 2020-08-03
20:      4                    1 2020-08-04
21:      5                    2 2020-07-31
22:      5                    1 2020-08-01
23:      5                    1 2020-08-02
24:      5                    1 2020-08-03
25:      5                    1 2020-08-04
        id                 diff          d

This is how I did it:

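library(data.table)
# One row per id, then expand each (id, diff) group to five consecutive dates
input1 = data.table(id=as.character(1:5), diff=1)
input1 = input1[,.(d=seq(as.Date('2020-07-31'), by='days', length.out = 5)),.(id, diff)]
# Hardcoded override for Friday 31 July 2020
input1[d == '2020-07-31']$diff = 2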

diff is basically the number of days to the next weekday. E.g. 31 July 2020 is a Friday, so diff is 2, which is the gap to the next weekday, Monday. For all the other dates it is 1.

  • Is there a more idiomatic R way of doing this?

I personally don't like that I had to generate the date sequence separately for each id, or that I had to hardcode the diff for 31 July in the input. Is there a more generic way of doing this without the hardcoding?
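For reference, here is a minimal sketch of one possible direction, not a definitive answer. It uses data.table's CJ() to cross join the ids with a single date vector, so the sequence is built only once, and derives diff from the day of the week instead of overriding one date. The rule "Friday gets 2, every other day gets 1" is an assumption read off the table above, and the intermediate name dates is only for illustration.

```r
library(data.table)

# Build the date sequence once instead of once per id.
dates <- seq(as.Date("2020-07-31"), by = "day", length.out = 5)

# Cross join: every id paired with every date (result is sorted by id, then d).
input1 <- CJ(id = as.character(1:5), d = dates)

# Assumed rule, inferred from the table: Fridays get 2, all other days get 1.
# data.table::wday() numbers days from Sunday = 1, so Friday is 6.
input1[, diff := fifelse(wday(d) == 6L, 2, 1)]

# Match the column order of the expected table.
setcolorder(input1, c("id", "diff", "d"))
```

If the real rule for diff differs (for example, if Saturdays should also get a larger value), only the fifelse() line would need to change.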
