I need help with the following: I have a CSV file of 7.5 GB with 185 million lines.
So far, I've done the following:
library(caTools)
library(data.table)
library(dplyr)
dados_treino <- fread('train.csv')
vetor_TF <- sample.split(dados_treino, SplitRatio = 0.70)
At this point, RStudio returns the error:
Can not allocate vector size 7.5 GB
The intent is to split the object into training and test data.
I would like help with: 1) getting a sampling command to work on data of this size (it can come from a package other than caTools); 2) applying the resulting vector to build the two data sets.
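To make the goal concrete, here is a minimal sketch of the kind of split I have in mind, run on a small made-up table rather than the real file. The names dados_demo, idx_treino, treino and teste are placeholders I invented for illustration, and I have not confirmed that this approach fits in memory for the full 185 million rows.

```r
library(data.table)

# Hypothetical small table standing in for train.csv
set.seed(42)
dados_demo <- data.table(id = 1:1000, x = rnorm(1000))

# Draw 70% of the row numbers; this is an integer vector of indices,
# not a copy of the data itself
n <- nrow(dados_demo)
idx_treino <- sample.int(n, size = floor(0.70 * n))

# Apply the index vector to build the two sets
treino <- dados_demo[idx_treino]   # sampled rows
teste  <- dados_demo[-idx_treino]  # all remaining rows
```

The idea is that sampling row numbers with base R's sample.int only needs one integer per selected row, whereas the subsetting step still creates copies of the selected rows.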
Follow the link to the data: download data
I am using a computer with 16 GB of RAM and an Intel i7 processor.