Sunday, 17 April 2016

Collection of code for source code authorship identification

I'm looking for a large collection of code which I could use for authorship attribution using Latent Dirichlet Allocation. I'm not really sure how much code I need.

It would be great if the code were relatively pure author-wise, with a reasonable amount of code for each author, i.e., large blocks of code written by a single person.
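To make "pure author-wise" a bit more concrete, here is a rough sketch of how a candidate file could be screened, assuming the code lives in a git repository and that per-line blame is an acceptable proxy for authorship. The function names and the 0.9 threshold are my own placeholders, not an established criterion.

```python
from collections import Counter
from typing import Dict


def author_shares(porcelain: str) -> Dict[str, float]:
    """Fraction of lines per author, given `git blame --line-porcelain` output."""
    # In line-porcelain output every blamed line carries an "author <name>" header.
    # Headers such as "author-mail" do not match "author " (note the space),
    # and source lines are tab-prefixed, so they are not counted by mistake.
    counts = Counter(
        line[len("author "):]
        for line in porcelain.splitlines()
        if line.startswith("author ")
    )
    total = sum(counts.values())
    return {name: n / total for name, n in counts.items()} if total else {}


def is_pure(porcelain: str, threshold: float = 0.9) -> bool:
    """True if one author accounts for at least `threshold` of the lines.

    The default threshold is an arbitrary assumption for illustration.
    """
    shares = author_shares(porcelain)
    return bool(shares) and max(shares.values()) >= threshold
```

The input would come from something like `git blame --line-porcelain path/to/file`, and files that pass the check could then be grouped by their dominant author. Blame only records who last touched each line, so this is a coarse filter at best.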

Thank you.
