Resources

Some resources for computational work on Russian morphology and phonology

A great deal of Russian-language material has been digitized and put online, which makes corpus work on the language comparatively easy.

Probably the most useful starter resource for someone interested in Russian morphophonology is Zaliznjak’s classic 1977 dictionary (Grammatičeskij slovar’ russkogo jazyka [A grammatical dictionary of the Russian language], Moscow: Russkij Jazyk). It is available online in a number of formats:

  • A reverse list of forms. Sort of like a rhyming dictionary.
  • A full list of Zaliznjak’s Paradigms (TXT in RAR file): Close to 90,000 inflected forms of Russian words, with stress marked. Automatically generated from Zaliznjak’s dictionary by Andrei Usachev. Contains a few errors (some ungrammatical short forms of adjectives are given, and paradigm gaps are sometimes filled incorrectly) but otherwise quite useful.
  • Online database version: Enter a word, and this database returns all of the grammatical codes of the 1977 original, including stress type, declension class, etc.
  • Downloadable version: This is a Windows .exe file, but you can extract from it a .dbf file that contains all of the information in the online version. The .dbf file can then be imported into R on any operating system.
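
The reverse list mentioned above orders forms by their endings rather than their beginnings, which is why it behaves like a rhyming dictionary. A minimal Python sketch of that ordering follows. It assumes the forms are already loaded as plain strings without stress marks; the function name reverse_sort and the sample words are illustrative and do not come from any of the files listed here. Sorting uses raw Unicode code points, so ё will sort after я unless it is normalized first.

```python
def reverse_sort(forms):
    """Sort word forms by their spelling read right to left."""
    # The sort key is the reversed string, so words that share an
    # ending end up next to each other in the output.
    return sorted(forms, key=lambda form: form[::-1])


# Hypothetical sample data, for illustration only.
forms = ["стол", "вода", "угол", "книга", "нога", "гол"]

for form in reverse_sort(forms):
    # Right-align the output so the shared endings line up visually.
    print(form.rjust(10))
```

Right-aligning the printed forms makes the shared endings line up in a column, which is the usual layout of a printed reverse dictionary.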

Frequency and corpus searches

  • Ruscorpora: A searchable web corpus of Russian texts and speech.
  • Serge Sharoff’s frequency and lemma lists: These include things like bi-, tri- and tetragram lists (orthographic strings of various lengths, ordered by frequency of occurrence). You probably cannot use these unless you can read Russian or are at least comfortable with Cyrillic.
  • Frequency Dictionaries (Russian-language page / English translation): Some small frequency dictionaries of Russian, including a lemma frequency list for the 5000 most frequent words and some information about average word length and so on.
  • Yandex. The dominant Russian search engine. By default, it searches for Russian words in all case forms, so you get estimated lemma counts.
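
For readers unfamiliar with bi-, tri- and tetragram lists, here is a minimal Python sketch of how such orthographic n-gram counts can be computed from a word list. It is only an illustration: the function name char_ngrams and the three sample words are hypothetical, and the counts are taken over word types with no weighting by corpus frequency, which may differ from how the published lists were built.

```python
from collections import Counter


def char_ngrams(words, n):
    """Count every orthographic substring of length n in a list of words."""
    counts = Counter()
    for word in words:
        # Slide a window of width n across the word; words shorter
        # than n contribute nothing.
        for i in range(len(word) - n + 1):
            counts[word[i:i + n]] += 1
    return counts


# Hypothetical sample data, for illustration only.
words = ["молоко", "голова", "город"]
bigrams = char_ngrams(words, 2)

# List the bigrams from most to least frequent.
for gram, count in bigrams.most_common():
    print(gram, count)
```

Calling the same function with n set to 3 or 4 gives trigram and tetragram counts.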

Miscellaneous dictionaries

  • Academic Dictionaries: Online dictionaries including Ozhegov, Efremova, Vasmer’s etymological dictionary, Dahl, and many, many others (did you know philatelists had their own dictionary?). Comprehensive, accurate, and UTF-8 encoded.
  • Downloadable Dictionaries: A collection of links to downloadable dictionaries. Not all are accurate or complete (for example, Zaliznjak’s dictionary does not appear in its complete form, but you can reconstruct the information from the parts that are there).
  • Lyokhin & Petrov’s Dictionary of Loanwords: Searchable online version.
  • Rosenthal’s Spelling Reference: Everything you need to know about the quirks of Russian orthography and punctuation. Includes some useful discussions of vowel and consonant alternations that are represented in the orthography but not in pronunciation.
  • Akhmanova’s Dictionary of Linguistic Terminology: In Russian.

Linguistic document preparation

LyX and LaTeX

Bibliography and texmf files

  • My texmf directory with all the LaTeX packages and styles I use for linguistics work.
  • My .bib file (zipped): Mostly about phonology, Russian, and morphology; 4800+ entries.

Working with the International Phonetic Alphabet

  • IPA_SIL: An IPA keyboard layout for Mac OS that I find to be the most intuitive and user-friendly. The .zip file includes documentation. This layout was originally distributed by sil.org but seems to have been retired.
  • Using IPA fonts and keyboard layouts: Meant to be pragmatic, not comprehensive. If you want to learn all about Unicode or legacy fonts, your best bet is to search the internet superhighway on your own.

Trees

Praat

An introduction to Praat’s basic functions, for the Sound & Language course.

Misc