by Serge Winitzki

Motivation

This work is an effort of Alexy Khrabrov and Serge Winitzki to standardize the rendering of Russian text in Latin letters.

The idea is to provide enough alternative ways of rendering the Russian letters so that people can use different letter combinations according to their liking. However, our major concerns are consistency of coding and the ability to unambiguously and automatically render latinized Russian texts into various "native" forms (using any existing coding schemes such as KOI-8, PC or Macintosh) and vice versa.

At present, people use various latinized phonetic representations of Russian, and these representations disagree with each other. This makes converting such phonetically represented Russian texts into a native coding rather tedious. A standard coding scheme will especially facilitate electronic communication in the Russian language between those limited to text-mode terminals and keyboards and those using a "native" Russian encoding and a graphical representation ("Russian fonts").

Overview

The Russkaya Latinica coding scheme was designed with you, the user, in mind. What seems most intuitive to you is most probably implemented in our scheme. In fact, we are sure that when you typed Russian text using Latin letters you used almost exactly the same coding as the one we propose. The words which are difficult to transliterate are the ones Russkaya Latinica helps you with. Russkaya Latinica isn't just another set of difficult rules to memorize. It's a flexible standard accommodating a wide range of intuitive perceptions, and you definitely won't have to drastically change your typing. At the same time, it's rigorous in the sense that every Russian word can be represented exactly and unambiguously.

To make it all work for people, I created a software package to convert texts from Russkaya Latinica to native encodings and back, as well as between various existing native schemes; currently, we support KOI-8, KOI-7, DOS866 ("alternative"), DOS1252 (MS-Windows native), and Macintosh encodings.

Getting started

The standard we propose has only a few new rules you have to remember.
  • The letter "tshcha" can be represented by a q, a w, or the combination sj (which by itself does not occur in Russian; for reasons described below, there is an option to switch off translation of sj to "tshcha"). We hope you find at least one of those alternatives intuitive enough. If you are used to typing w for v, the translator program should have an option to accommodate this alternative; however, we suggest sticking to v.
  • The letter "oborotnoe E" is e' or e`, as in: e'tot. Also, a perhaps more intuitive symbol @ may be used for both uppercase and lowercase e', as in: @KRAN NE RABOTAET.
  • The hard sign is ~ and the soft sign is ' (or `). For example: ob~ezdit'.
  • You can use the symbols ~, ' and ` for the hard and the soft sign in both uppercase and lowercase words, since they are "malleable" (see below) and will become uppercase in all-uppercase words. For example, in the text "V~EZD TOL'KO DLYA PERSONALA", the hard and the soft signs will be translated to uppercase.

    That's it! Now you know enough to use Russkaya Latinica. Most probably, the phonetic code that feels right to you is compatible with our rules.
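As a rough illustration, the rules above can be sketched as a greedy longest-match lookup in Python. This is only a sketch under stated assumptions: the table is partial (the letters discussed in this section plus a few plain ones read off the example words), it is lowercase-only, and the names TABLE, to_cyrillic and allow_sj are hypothetical, not taken from the actual converter package.

```python
# Partial, lowercase-only table (an assumption for illustration).
# It is NOT the full Russkaya Latinica table.
TABLE = {
    # the "new rules" from the list above
    "q": "щ", "w": "щ", "sj": "щ",
    "e'": "э", "e`": "э", "@": "э",
    "~": "ъ", "'": "ь", "`": "ь",
    # a few ordinary letters, read off the example words in this text
    "zh": "ж", "a": "а", "b": "б", "v": "в", "d": "д", "e": "е",
    "z": "з", "i": "и", "j": "й", "k": "к", "l": "л", "m": "м",
    "n": "н", "o": "о", "r": "р", "s": "с", "t": "т",
}


def to_cyrillic(text, allow_sj=True):
    """Translate lowercase latinized text, trying longer keys first."""
    table = dict(TABLE)
    if not allow_sj:
        # the text mentions an option to switch off sj -> "tshcha"
        del table["sj"]
    keys = sorted(table, key=len, reverse=True)
    out = []
    i = 0
    while i < len(text):
        for key in keys:
            if text.startswith(key, i):
                out.append(table[key])
                i += len(key)
                break
        else:
            # unknown character: pass it through unchanged
            out.append(text[i])
            i += 1
    return "".join(out)


if __name__ == "__main__":
    for word in ["e'tot", "@tot", "ob~ezdit'", "zhenwina", "vesji"]:
        print(word, "->", to_cyrillic(word))
```

Trying two-character keys before one-character keys is what lets e' win over a plain e followed by a soft sign.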

    Examples of usage

    Let us show a few examples to illustrate the proposed scheme.

    Bystraja Ryzhaja Lisa Prygnula Cherez Lenivuju Sobaku.
    (translation of "A Quick Brown Fox Jumped Over the Lazy Dog")

    [Russian text]
    Odin brityj anglichanin finiki zheval kak morkov'.

    [Russian text]
    Kazhdyj ohotnik zhelajet znat', gde sidit fazan.

    [Russian text]
    Zaqiqajuqajasya zhenwina zhevala szhizhennyj zhen'shen'.


    [Russian text]

    V'juwijsya plyuw ne zakryval vida s verandy na roskoshnyj plyazh,
    raskinuvshijsya na beregu zaliva. Vglyadyvajas' v ob~ektiv binoklya,
    maj'or Wukin skazal:
    - Chto-to malovato chajek segodnya. Mozhet, e'to iz-za holodov?
    - Kakie tam ewyo chajki! - ne otryvajas' ot zhurnala, prognusavil
    Jeryomenko. - Nam by e'togo nashego negodyaja za zhabry sxvatit'...


    Here is an example of a phonetic coding used by Alex Kaplan (INFO-RUSS). (Note that the text below is almost perfectly compatible with Russkaya Latinica, even though its author has never heard of it. The only problem is with the letter "tshcha".)

    Direktrisa Federal'noj migracionnoj sluzhby g-zha T. Regent izdala prikaz, zapreschajuschij predostavljat' status bezhencev (tochnee, vynuzhdennyh pereselencev) "licam chechenskoj nacional'nosti". Korrespondentu "Moskovskih novostej" ona zajavila, chto prikaz prinjat "pod davleniem sverhu".
    Zakony i konvencii, imejuschie silu zakona, kotorye etim narusheny, mozhno perechisljat' dolgo, i etot perechen' nachinaetsja s Konstitucii RF, zapreschajuschej dazhe "trebovat' ot grazhdan opredelenija ili ukazanija svoej nacional'noj [etnicheskoj] prinadlezhnosti". Zabavno, chto eta stat'ja mozhet byt' priostanovlena pri rezhime ChP (v otlichie ot takih dejstvitel'no fundamental'nyh svobod, kak pravo chastnoj sobstvennosti). Vprochem, ChP ne vvodilos'.

    We hope that you will like Russkaya Latinica enough to start using it. The more people stick to the standard, the easier it will become to communicate.

    Some advanced features

    Malleability

    The symbols for e' and the soft and hard signs (@, ' and ~) are lowercase by default, but they are transformed to uppercase when surrounded by uppercase letters. The precise definition of what "surrounded" means is found here. Malleability makes it easier to write all-uppercase captions.

    Caveat: The symbols @, ' and ~, when used alone, stand for lowercase letter e' and the lowercase soft and hard signs. If a stand-alone uppercase "E'" is needed, one cannot use a stand-alone "@" because that would be lowercase.

    The standalone uppercase soft and hard signs are never used in a Russian text. Should one need to use them, there is a special way to do this. All malleable symbols become uppercase when preceded by ^, and become lowercase when preceded by _.

    The malleability is akin to the property of two-letter combinations to become uppercase if the first letter is uppercase. For example, both SH and Sh mean the same uppercase letter "sha".
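The case rule for malleable symbols might be sketched as follows. Since the precise definition of "surrounded" is not given in this section, the sketch assumes one plausible reading: a symbol becomes uppercase when it has at least one adjacent letter and every adjacent letter is uppercase. The function name resolve_malleable is hypothetical, and the sketch handles only the stand-alone symbols @ ~ ' ` and the ^ / _ prefixes; digraphs such as e' are assumed to be processed elsewhere.

```python
# symbol -> (lowercase letter, uppercase letter)
MALLEABLE = {
    "@": ("э", "Э"),
    "~": ("ъ", "Ъ"),
    "'": ("ь", "Ь"),
    "`": ("ь", "Ь"),
}


def resolve_malleable(text):
    """Replace malleable symbols only; other characters are left as typed."""
    out = []
    i = 0
    n = len(text)
    while i < n:
        ch = text[i]
        # ^ forces uppercase and _ forces lowercase on the next symbol
        if ch in "^_" and i + 1 < n and text[i + 1] in MALLEABLE:
            lower, upper = MALLEABLE[text[i + 1]]
            out.append(upper if ch == "^" else lower)
            i += 2
            continue
        if ch in MALLEABLE:
            lower, upper = MALLEABLE[ch]
            # ASSUMPTION: "surrounded" = all adjacent letters are uppercase
            neighbours = []
            if i > 0 and text[i - 1].isalpha():
                neighbours.append(text[i - 1])
            if i + 1 < n and text[i + 1].isalpha():
                neighbours.append(text[i + 1])
            is_upper = bool(neighbours) and all(c.isupper() for c in neighbours)
            out.append(upper if is_upper else lower)
        else:
            out.append(ch)
        i += 1
    return "".join(out)


if __name__ == "__main__":
    print(resolve_malleable("V~EZD TOL'KO"))
    print(resolve_malleable("ob~ezdit'"))
```

Under this reading a stand-alone @ stays lowercase, matching the caveat above, while ^@ gives the uppercase letter.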

    Escape character

    The backslash character \ is used as an escape character that prevents other symbols from sticking together. For instance, if one writes *major, the letter j will stick to o to form the letter yo. To circumvent this, you can use "j'" instead of "j": maj'or. However, a more straightforward approach is to write maj\or. Generally, a backslash will produce no output but will break digraphs. It also escapes itself, i.e. \\ produces a single backslash.

    Another use of the backslash is to prevent quotes (` and ') from being translated as the soft sign (use \` and \' for quotes; at the beginning of a word, the quote characters ` and ' don't have to be escaped, since no Russian word begins with a soft sign). In fact, all malleable characters (@ ~ ' `) can be escaped with a backslash, which prevents their translation.
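The escaping behaviour described above could be modelled roughly like this. It is a minimal sketch with a tiny, hypothetical table (just enough for the words used here); the names DIGRAPHS, SINGLE and translate are illustrative assumptions, and the backslash-space toggle described in the next section is deliberately not handled.

```python
# Tiny illustrative tables (assumptions), lowercase only.
DIGRAPHS = {"jo": "ё", "j'": "й"}
SINGLE = {"m": "м", "a": "а", "j": "й", "o": "о", "r": "р", "t": "т",
          "'": "ь", "`": "ь"}
# characters that a preceding backslash protects from translation
ESCAPABLE = "@~'`\\"


def translate(text):
    out = []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch == "\\":
            nxt = text[i + 1:i + 2]
            if nxt and nxt in ESCAPABLE:
                # \' \` \@ \~ and \\ output the second character literally
                out.append(nxt)
                i += 2
            else:
                # bare backslash: no output, but the digraph is broken
                i += 1
            continue
        pair = text[i:i + 2]
        if pair in DIGRAPHS:
            out.append(DIGRAPHS[pair])
            i += 2
        elif ch in SINGLE:
            out.append(SINGLE[ch])
            i += 1
        else:
            out.append(ch)
            i += 1
    return "".join(out)


if __name__ == "__main__":
    for word in ["major", "maj\\or", "maj'or", "mat'", "mat\\'"]:
        print(word, "->", translate(word))
```

Because the backslash is consumed before the digraph lookup, j and o in maj\or are never seen as a pair.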

    How to combine Russian and English text

    It is important to be able to include some real Latin letters or a whole section in another language in a Russian text. This is accomplished yet again by the backslash. A backslash followed by one space works as a Russian/English toggle. For example:
    - E'to-to kak raz \ fine, \ - skazal ya. - E'to \ OK with me.\
    Note that the backslash-space combination produces no output, not even a space, so you should add the spaces if necessary.

    After a backslash-space combination and until the next one, no translation whatsoever is performed on the text. This may be useful for entering English, TeX commands or other text that uses backslashes and special characters. If you need to enter native coded Russian TeX commands... well, you probably don't need to use Russkaya Latinica for that purpose.
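One way to picture the toggle is as a pre-pass that cuts the input into Russian and verbatim chunks before any translation happens. The sketch below assumes exactly that design; the function name split_by_toggle is hypothetical, and treating \\ as an escaped backslash only in Russian mode is an assumption based on the rule that verbatim text is not interpreted at all.

```python
def split_by_toggle(text):
    """Return a list of (is_russian, chunk) pairs.

    The backslash-space toggle itself is dropped and produces no output.
    """
    segments = []
    russian = True          # text starts in Russian mode
    buf = []
    i = 0
    while i < len(text):
        if russian and text.startswith("\\\\", i):
            # ASSUMPTION: an escaped backslash is kept for the translator
            # and never starts a toggle
            buf.append("\\\\")
            i += 2
        elif text.startswith("\\ ", i):
            if buf:
                segments.append((russian, "".join(buf)))
            buf = []
            russian = not russian
            i += 2
        else:
            buf.append(text[i])
            i += 1
    if buf:
        segments.append((russian, "".join(buf)))
    return segments


if __name__ == "__main__":
    sample = "- E'to-to kak raz \\ fine, \\ - skazal ya."
    for is_russian, chunk in split_by_toggle(sample):
        print("RU" if is_russian else "--", repr(chunk))
```

Only the chunks flagged as Russian would then be handed to the translator; the others are copied through unchanged.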

    What to pay attention to

    When using an intuitive phonetic scheme of their own, people usually don't pay attention to unambiguous representation of letters. Common examples include: using "ts" instead of a single letter "ce"; "ye" for a single letter "e" (for example, *Yeltsin); "y" for "i kratkoe"; "yo" for both "i kratkoe + o" as well as for a single letter "yo" (as in *major); "sch" for "tshcha" (as in *veschi); and "g" or "j" for "zh". While phonetically obvious, such coding cannot be unambiguously translated into a native Russian coding without tedious editing by hand. A user would probably feel unsure about transliterating these letters anyway, and we suggest using the Russkaya Latinica standard in these difficult cases.
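A converter cannot resolve these ambiguities on its own, but a small checker can at least point them out for manual review. The sketch below is an assumption, not a feature of the actual package: the name find_suspects is hypothetical, and it looks only for three of the sequences named above ("sch", "ts", "ye"). A hit is merely a candidate, since the same letters can also occur legitimately.

```python
import re

# Sequences from the paragraph above that often signal informal coding.
SUSPECT = re.compile(r"sch|ts|ye", re.IGNORECASE)


def find_suspects(text):
    """Return (position, sequence) pairs for a human to review."""
    return [(m.start(), m.group().lower()) for m in SUSPECT.finditer(text)]


if __name__ == "__main__":
    for word in ["veschi", "Yeltsin", "zhenwina"]:
        print(word, find_suspects(word))
```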

    Here are some suggestions on how to handle common difficulties of spelling.

