I'm looking for a large collection of source code that I could use for authorship attribution with Latent Dirichlet Allocation. I'm not sure how much code I need.
Ideally, the code would be relatively pure author-wise, with a reasonable amount of code for each author, i.e., large blocks of code each written by a single person.
Thank you.