Updated 2019 embeddings to handle apostrophe tokenization better
+97 −77
@@ -48,7 +48,7 @@ articles, the right of distribution was only given (or assumed) to arXiv itself.
@@ -59,7 +59,7 @@ Please cite the main dataset when using the word embeddings, as they are generat
@@ -71,7 +71,7 @@ Please cite the main dataset when using the word embeddings, as they are generat
@@ -80,6 +80,7 @@ Please cite the main dataset when using the word embeddings, as they are generat
@@ -91,8 +92,8 @@ Please cite the main dataset when using the word embeddings, as they are generat
@@ -110,23 +111,23 @@ python2 eval/python/word_analogy.py --vocab_file vocab.arxmliv.txt --vectors_fil
@@ -139,81 +140,92 @@ python2 eval/python/distance.py --vocab_file vocab.arxmliv.txt --vectors_file gl
@@ -221,80 +233,88 @@ python2 eval/python/distance.py --vocab_file vocab.arxmliv.txt --vectors_file gl
\ No newline at end of file