AI Collection

Dataset

Early Slavic word embeddings [Data set]

Documentation:: Word embeddings trained on the lemmatised TOROT Treebank, using Word2Vec with the following parameters:

sg = True
min_count = <1,3,5>
window = <3,5>
vector_size = <100,200,300>
epochs = 5

One model was trained for each combination of the parameters enclosed in angle brackets (< >). The release contains both the full models (.model) and the plain vector files (_vectors.txt). The models are named according to the parameters they were trained with. Note that these are the result of very preliminary experiments and no systematic evaluation of their quality was carried out, so use with caution.
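As a rough illustration of what "one model per combination" means, the sketch below enumerates the parameter grid described above. The parameter names and values come from the record; the helper `parameter_grid` and the constants `FIXED` and `VARIED` are hypothetical names introduced here. The record does not state which Word2Vec implementation was used or how the parameters map onto file names, so neither is assumed.

```python
from itertools import product

# Settings held constant across all models, as listed in the record.
FIXED = {"sg": True, "epochs": 5}

# Settings that were varied (the values in angle brackets in the record).
VARIED = {
    "min_count": [1, 3, 5],
    "window": [3, 5],
    "vector_size": [100, 200, 300],
}


def parameter_grid(fixed, varied):
    """Return one parameter dict per combination of the varied settings.

    Hypothetical helper for illustration; not part of the released data.
    """
    names = list(varied)
    return [
        {**fixed, **dict(zip(names, values))}
        for values in product(*(varied[name] for name in names))
    ]


grid = parameter_grid(FIXED, VARIED)
print(len(grid))  # 3 * 2 * 3 = 18 combinations
for params in grid[:3]:
    print(params)
```

The parameter names happen to match the keyword arguments of gensim's `Word2Vec` class (version 4 and later), so each dictionary could plausibly be passed as `Word2Vec(sentences, **params)`. That is an assumption about the tooling, not something the record confirms.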

Cite

Cite this record

APA Style

Pedrazzini, N. (2023). Early Slavic word embeddings [Data set]. Zenodo.

MLA Style

Pedrazzini, N. Early Slavic Word Embeddings [Data Set]. Zenodo, 2023.

Chicago Style

Pedrazzini, N. 2023. Early Slavic Word Embeddings [Data Set]. Zenodo.

Access Document

Files:: 8414137.zip

(Version of record, zip, 438.6 MB)

Publisher copy:: 10.5281/zenodo.8414137

Publication website:: https://doi.org/10.5281/zenodo.8414137

Authors/Creators

Pedrazzini, N

Institution:: University of Oxford
Division:: HUMS
Department:: Linguistics Philology & Phonetics
Oxford college:: St Hugh's College
Role:: Creator
ORCID:: 0000-0003-3757-2961

Funder:: Economic and Social Research Council

Funder identifier:: https://ror.org/03n0ht308
Grant:: 2266900

Publisher:: Zenodo
Publication date:: 2023
Digital storage location:: https://doi.org/10.5281/zenodo.8414137
DOI:: 10.5281/zenodo.8414137

Language:: English
Keywords:: Slavic languages; linguistics; history; computer science; natural language processing; philosophy
Subjects:: old church slavonic; early slavic; language model; word2vec; word embeddings; historical corpora
Pubs id:: 2389295
Local pid:: pubs:2389295
Deposit date:: 2026-03-14
ARK identifier:: ark:/29072/ora_ebcc67bcbaf941c3bf3dfdc74b74005f

Terms of use

Copyright holder:: Nilo Pedrazzini
Copyright date:: 2023

Licence:: CC Attribution (CC BY)
