This dataset was extracted from a set of metadata files harvested from the DataCite metadata store (http://search.datacite.org/ui) during December 2015. Only metadata records for items with a resourceType of dataset were collected, for a total of 1,647,949 records.
This dataset contains four files:
1) readme.txt: A readme file.
2) language-results.csv: A CSV file containing three columns: DOI, DOI prefix, and language text contents.
3) language-counts.csv: A CSV file containing counts for unique language text content values.
4) language-grouped-counts.txt: A text file containing the results of manually grouping these language codes.
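
As an illustration of how the per-value counts in language-counts.csv relate to language-results.csv, the following is a minimal Python sketch, not the script actually used to build this dataset. It assumes the three columns appear in the order listed above (DOI, DOI prefix, language text contents), that the file is UTF-8 encoded, and that it may begin with a header row. The function names count_languages and count_languages_in_file and the has_header parameter are hypothetical and introduced only for this example.

```python
import csv
from collections import Counter


def count_languages(rows):
    """Count how often each language text value occurs.

    Assumes each row is a sequence of (DOI, DOI prefix, language text);
    this column order is taken from the file description, not verified.
    """
    counts = Counter()
    for row in rows:
        if len(row) < 3:
            continue  # skip blank or malformed rows
        # Strip surrounding whitespace only; case is left unchanged so
        # that values such as "en" and "EN" stay distinct.
        counts[row[2].strip()] += 1
    return counts


def count_languages_in_file(path, has_header=True):
    """Read a CSV file and count its language text values.

    has_header is an assumption: set it to False if the file
    turns out to have no header row.
    """
    with open(path, newline="", encoding="utf-8") as f:
        reader = csv.reader(f)
        if has_header:
            next(reader, None)  # discard the header row, if any
        return count_languages(reader)
```

Calling count_languages_in_file("language-results.csv") would return a Counter keyed by language text value. Values are deliberately not normalized, so variants of the same language remain separate entries, which is the kind of variation the manual grouping in language-grouped-counts.txt addresses.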