diff --git a/resources/arxmliv-embeddings-082017.md b/resources/arxmliv-embeddings-082017.md
index d40197ccb8dc5f3e97c1966b5a1f1ab46f800b6f..d769253e05e11200d75e5c4cd9b55d8cf90f86d4 100644
--- a/resources/arxmliv-embeddings-082017.md
+++ b/resources/arxmliv-embeddings-082017.md
@@ -19,11 +19,11 @@ articles, the right of distribution was only given (or assumed) to arXiv itself.
 
 ### Contents
 - A 5 billion token model for the arXMLiv 08.2017 dataset
-  - `glove.arxmliv.5B.300d.zip` and `vocab.arxmliv.zip`
-- 300 dimensional GloVe word embeddings for the arXMLiv 08.2017 dataset
   - `token_model.zip`
-- subset word embeddings
-  - `glove.subset.zip`
+- 300 dimensional GloVe word embeddings for the arXMLiv 08.2017 dataset
+  - `glove.arxmliv.5B.300d.zip` and `vocab.arxmliv.zip`
+- 300d GloVe word embeddings for individual subsets
+  - `glove.subsets.zip`
 - the main arXMLiv dataset is available separately [here](/resources/arxmliv-dataset-082017/)
 
 #### Token Model Statistics