Wednesday, August 12, 2020

Efficient algorithm for large pairwise comparisons

I have a question about handling very large vectors in R. Suppose I have a dataset with a continuous variable of interest and a factor variable that assigns each observation to a category (e.g. cities, counties, neighbourhoods). I would like to examine the distribution of the differences in the continuous variable between every pair of factor levels and test whether those differences are statistically significant. If the number of categories is large, the number of pairwise comparisons to test becomes very large as well. For instance, with 500 categories there are choose(500, 2) = 124,750 pairs. Is there a way to run such an analysis without R running out of memory? Or what is the most efficient way to do it?
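To make the setup concrete, here is a minimal sketch of one way such an analysis could be organised. It is not a definitive answer. The data frame `dat`, its columns `group` and `value`, the choice of a Welch t-test, and the Holm correction are all assumptions made for illustration. The idea is to split the data by level once, enumerate the unordered pairs with `combn()`, and store only a couple of numbers per pair.

```r
set.seed(1)                                  # reproducible toy data (hypothetical)
k <- 20                                      # number of categories; 500 works the same way
dat <- data.frame(
  group = factor(rep(seq_len(k), each = 30)),
  value = rnorm(k * 30)
)

# Split the continuous variable by level once, so each test
# only touches two short vectors.
by_group <- split(dat$value, dat$group)

# All unordered pairs of levels: a 2 x choose(k, 2) character matrix.
pairs   <- combn(names(by_group), 2)
n_pairs <- ncol(pairs)

# One column of summaries per pair; vapply allocates the result up front.
res <- vapply(seq_len(n_pairs), function(i) {
  a <- by_group[[pairs[1, i]]]
  b <- by_group[[pairs[2, i]]]
  c(diff_means = mean(a) - mean(b),
    p_value    = t.test(a, b)$p.value)       # Welch two-sample t-test (assumed choice)
}, c(diff_means = 0, p_value = 0))

out <- data.frame(
  level_a    = pairs[1, ],
  level_b    = pairs[2, ],
  diff_means = res[1, ],                     # row 1: difference in means
  p_value    = res[2, ]                      # row 2: raw p-value
)

# Adjust for the large number of tests (Holm is an assumed choice).
out$p_adj <- p.adjust(out$p_value, method = "holm")

choose(500, 2)                               # 124750 pairs for 500 levels
```

With this layout the result has one row per pair, so 500 levels would give a data frame of 124,750 rows, which should fit comfortably in memory. Run time is then driven mainly by the cost of the individual tests.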

Thanks in advance.
