Dataset
Early Slavic word embeddings [Data set]
- Documentation:
- Word embeddings trained on the lemmatised TOROT Treebank, using Word2Vec and the following parameters:
  sg = True
  min_count = <1,3,5>
  window = <3,5>
  vector_size = <100,200,300>
  epochs = 5
  One model was trained for each combination of the parameters enclosed in angled brackets (< >). The release contains both the full models (.model) and the plain vector files (_vectors.txt). The models are named according to the parameters they were trained with. Note that these are the result of very preliminary experiments and no systematic evaluation of their quality was carried out, so use with caution.
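The parameter grid described above can be enumerated with a short sketch. This is illustrative only and not taken from the release: the names `FIXED`, `VARIED` and `parameter_grid` are hypothetical, and the record does not state how the file names are derived from the parameters, so no file names are constructed here.

```python
from itertools import product

# Settings held constant across all models, as listed in the description.
FIXED = {"sg": True, "epochs": 5}

# Settings given in angle brackets; one model per combination of these.
VARIED = {
    "min_count": [1, 3, 5],
    "window": [3, 5],
    "vector_size": [100, 200, 300],
}


def parameter_grid(fixed, varied):
    """Return one keyword-argument dict per combination of the varied settings."""
    names = list(varied)
    return [
        {**fixed, **dict(zip(names, values))}
        for values in product(*(varied[name] for name in names))
    ]


grid = parameter_grid(FIXED, VARIED)
print(len(grid))  # 3 * 2 * 3 = 18 combinations
for params in grid[:3]:
    print(params)
```

The parameter names match the keyword arguments of gensim 4.x's `Word2Vec` class, so each dictionary could plausibly be passed as `Word2Vec(sentences, **params)`. That the models were trained with gensim is an assumption here; the record only says "Word2Vec".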
Access Document
- Files:
- (Version of record, zip, 438.6MB, Terms of use)
- Publisher copy:
- 10.5281/zenodo.8414137
- Publication website:
- https://doi.org/10.5281/zenodo.8414137
Authors/Creators
- Funder:
- Economic and Social Research Council
- Funder identifier:
- https://ror.org/03n0ht308
- Grant:
- 2266900
- Publisher:
- Zenodo
- Publication date:
- 2023
- Digital storage location:
- https://doi.org/10.5281/zenodo.8414137
- Language:
- English
- Pubs id:
- 2389295
- Local pid:
- pubs:2389295
- Deposit date:
- 2026-03-14
Terms of use
- Copyright holder:
- Nilo Pedrazzini
- Copyright date:
- 2023
- Licence:
- CC Attribution (CC BY)