- Adding rendered images for words on Memrise (Python script)
- Rendering text to PNG image files (Python script)
- Devanāgarī resources
- Tocharian resources
This is a collection of resources I have created or compiled for several image-based Memrise courses that teach scripts such as Devanāgarī.
Adding rendered images for words on Memrise (Python script)
This is a script with a graphical user interface (GUI) that lets you open the editing page of a Memrise course in a browser window and render the words of the course's levels, adding the rendered images to the words in a (pre-existing) image column. The rendering engine, either Pango or XeLaTeX, can be selected through the GUI. The script only processes those course levels that are open/unfolded in the browser window.
usage: render_words_on_memrise.py
Besides Python 3, the script relies on third-party software to access Memrise through a browser window and to render the images. The following programs therefore need to be available on your system and callable by the script, either by having their executables in the script's working directory or by having them on the operating system's PATH.
- For access to Memrise through a browser window, either of the following programs needs to be available:
- ChromeDriver for use with the Google Chrome browser.
- geckodriver for use with the Firefox browser.
- For rendering an item to a PNG image file that is subsequently uploaded to Memrise, the following programs need to be available:
- ImageMagick.
- Additionally, in case XeLaTeX is chosen as the rendering engine, a TeX distribution and Ghostscript are required.
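The prerequisites above can be checked before running either script. The following is only a sketch, not part of the scripts themselves: the executable names (`chromedriver`, `geckodriver`, `magick`, `xelatex`, `gs`) and the function names are assumptions and may differ on your platform or installation.

```python
import shutil

# Executable names are assumptions; they may differ per platform or install.
BROWSER_DRIVERS = ["chromedriver", "geckodriver"]
RENDER_TOOLS = ["magick"]
XELATEX_TOOLS = ["xelatex", "gs"]


def find_tools(names):
    """Map each executable name to its resolved path, or None if not found."""
    return {name: shutil.which(name) for name in names}


def missing_prerequisites(engine, which=find_tools):
    """Return a list of human-readable problems for the chosen engine."""
    problems = []
    # Either browser driver is enough for access to Memrise.
    if not any(which(BROWSER_DRIVERS).values()):
        problems.append("no browser driver found (chromedriver or geckodriver)")
    # ImageMagick is always needed; XeLaTeX mode adds TeX and Ghostscript.
    required = RENDER_TOOLS + (XELATEX_TOOLS if engine == "xelatex" else [])
    for name, path in which(required).items():
        if path is None:
            problems.append(f"{name} not found")
    return problems
```

Calling `missing_prerequisites("xelatex")` returns an empty list when everything is found. Note that `shutil.which` essentially searches the PATH, so executables that only sit in the working directory may be reported as missing by this sketch.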
Rendering text to PNG image files (Python script)
This is a script that generates PNG image files for a set of characters, as specified in a JSON file. It calls ImageMagick for creating the PNGs and uses either Pango or XeLaTeX to render the characters.
usage: render_strings_to_png [-h] --engine {pango,xelatex} JsonSpecificationFile
The same prerequisites as for the previous script apply.
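For readers who want to see how such a command line could be declared, here is a minimal `argparse` sketch that yields the same usage pattern. It is an illustration under assumptions, not the script's actual code; the help texts and the function name `build_parser` are invented.

```python
import argparse


def build_parser():
    """Build a parser matching the usage line: a required --engine and one file."""
    parser = argparse.ArgumentParser(prog="render_strings_to_png")
    parser.add_argument(
        "--engine",
        choices=["pango", "xelatex"],
        required=True,
        help="rendering engine to use",
    )
    parser.add_argument(
        "JsonSpecificationFile",
        help="path to the JSON specification file",
    )
    return parser
```

With this parser, `build_parser().parse_args(["--engine", "pango", "devanagari.json"])` gives an object with the attributes `engine` and `JsonSpecificationFile`; the file name `devanagari.json` is only a placeholder.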
The JSON specification has two main attributes: settings and subsets. They are further specified below. The top-level structure of the JSON file is therefore:
{
"settings": {...},
"subsets": {...}
}
The settings attribute contains specific settings for the data set:
- The name for this character set (which will be used to create an output directory)
- The default font (optional) to be used for rendering
- A skeleton for the Pango argument string to be fed to ImageMagick (in the case that the Pango rendering mode is selected at the command line).
- A skeleton for the XeLaTeX code to be compiled to an intermediate PDF (in the case that the XeLaTeX rendering mode is selected at the command line).
For the latter two, the character string to be rendered will be inserted into the skeleton string at the point marked by {0} and the font name will be inserted at the point marked by {1}.
{
"name": "Devanāgarī",
"defaultFont": "Siddhanta",
"pango": "pango:<markup><span font_family=\"{1}\" size=\"192000\"> {0} </span></markup>",
"xelatex": "\\documentclass{{minimal}}\n\\usepackage{{fontspec}}\n\\usepackage{{xcolor}}\n\\setmainfont[Script=Devanagari]{{{1}}}\n\\begin{{document}}\n{0}\n\\end{{document}}"
}
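To illustrate how the {0} and {1} markers could be filled in, here is a small sketch using Python's `str.format`. That the script uses `str.format` is an assumption, suggested by the doubled braces (`{{`, `}}`) in the XeLaTeX skeleton, which is how literal braces are written for that method. The function name `fill_skeleton` is hypothetical, and the XeLaTeX skeleton is shortened here.

```python
settings = {
    "defaultFont": "Siddhanta",
    "pango": "pango:<markup><span font_family=\"{1}\" size=\"192000\"> {0} </span></markup>",
    "xelatex": (
        "\\documentclass{{minimal}}\n"
        "\\usepackage{{fontspec}}\n"
        "\\setmainfont[Script=Devanagari]{{{1}}}\n"
        "\\begin{{document}}\n{0}\n\\end{{document}}"
    ),
}


def fill_skeleton(settings, engine, text, font=None):
    """Insert text at {0} and the font name at {1} in the engine's skeleton."""
    return settings[engine].format(text, font or settings["defaultFont"])


pango_arg = fill_skeleton(settings, "pango", "अ")
tex_source = fill_skeleton(settings, "xelatex", "अ", font="Siddhanta2")
print(pango_arg)
print(tex_source)
```

In the filled-in XeLaTeX source, `{{{1}}}` becomes `{Siddhanta2}`: the outer doubled braces turn into literal braces and the inner `{1}` is replaced by the font name.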
The subsets attribute contains subsets of character data, each of which is rendered to its own subfolder. Each subset is an array of character string objects.
Each character string object is identified by a name and can have multiple renditions, i.e. an array of rendering specifications (see more below). It can have an optional attribute alts to record any alternative names in an array of strings (for Memrise), but the script at present does nothing with this information.
{
"vowels":
[
{
"name": "a",
"renditions": [
{ "utf8": "अ" },
{ "utf8": "अ", "font": "Siddhanta2" }
]
},
{
"name": "ā",
"renditions": [
{ "utf8": "आ" },
{ "utf8": "आ", "font": "Siddhanta2" }
]
}
],
"consonants": [...]
}
As can be seen above, the rendering specification in its most basic form contains only a UTF-8 string. A font can be specified, but if it is absent, the defaultFont specified in the settings is used as a fallback.
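The structure and the font fallback described above can be sketched as follows. This is not the script's code: the function name `iter_renditions` and the shape of the yielded tuples are assumptions made for illustration.

```python
import json

spec_text = """
{
  "settings": {"name": "Devanāgarī", "defaultFont": "Siddhanta"},
  "subsets": {
    "vowels": [
      {
        "name": "a",
        "renditions": [
          {"utf8": "अ"},
          {"utf8": "अ", "font": "Siddhanta2"}
        ]
      }
    ]
  }
}
"""


def iter_renditions(spec):
    """Yield (subset, name, index, font, rendition) for every rendition."""
    default_font = spec["settings"].get("defaultFont")
    for subset_name, items in spec["subsets"].items():
        for item in items:
            for index, rendition in enumerate(item["renditions"]):
                # Fall back on the default font when no font is given.
                font = rendition.get("font", default_font)
                yield subset_name, item["name"], index, font, rendition


spec = json.loads(spec_text)
rows = list(iter_renditions(spec))
for subset_name, name, index, font, rendition in rows:
    print(subset_name, name, index, font, rendition["utf8"])
```

The first rendition of "a" has no font of its own and therefore resolves to Siddhanta, while the second one keeps its explicit Siddhanta2.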
Instead of a UTF-8 string, the specification can also contain an explicit pango or xelatex code string, which is then used in the respective mode. In Pango rendering mode, the attributes pango-flip and pango-flop are also available; if set to true, they are passed as flags to the rendering call and flip the image in the vertical or horizontal direction, respectively.
{
"name": "a-mirrored",
"renditions":
[
{
"utf8": "अ",
"pango-flop": true,
"xelatex": "\\reflectbox{अ}"
}
]
},
{
"name": "a-red",
"renditions":
[
{
"pango": "<span color=\"red\">अ</span>",
"xelatex": "{\\color{red}अ}"
}
]
}
Note that, depending on the rendering mode, at least one of the attributes utf8, pango or xelatex should be present.
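The selection rules above could be expressed roughly as in the sketch below. The function names are hypothetical, and mapping pango-flip and pango-flop onto ImageMagick's `-flip` and `-flop` options is an assumption about how the flags are passed on.

```python
def rendition_source(rendition, engine):
    """Return the engine-specific code string if present, else the utf8 string."""
    if engine in rendition:
        return rendition[engine]
    if "utf8" in rendition:
        return rendition["utf8"]
    raise ValueError(f"rendition has neither '{engine}' nor 'utf8': {rendition!r}")


def pango_flags(rendition):
    """Translate pango-flip / pango-flop into -flip / -flop (assumed mapping)."""
    flags = []
    if rendition.get("pango-flip"):
        flags.append("-flip")  # mirror in the vertical direction
    if rendition.get("pango-flop"):
        flags.append("-flop")  # mirror in the horizontal direction
    return flags


mirrored = {"utf8": "अ", "pango-flop": True, "xelatex": "\\reflectbox{अ}"}
print(rendition_source(mirrored, "pango"), pango_flags(mirrored))
print(rendition_source(mirrored, "xelatex"))
```

For the "a-mirrored" example, Pango mode falls back on the utf8 string plus a flop flag, whereas XeLaTeX mode uses the explicit `\reflectbox` code. A rendition that offers neither the engine's attribute nor utf8 raises an error in this sketch.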
Devanāgarī resources
The associated Memrise course is Sanskrit devanāgarī.
- This PDF contains a digitized version of the first chapter of Jan Gonda's A Concise Elementary Grammar of the Sanskrit Language (2nd edition, 2006, ISBN-13 978-08173-5261-5), in which the devanāgarī script is introduced. Unfortunately, even the most recent printed edition of this book is only a facsimile of the 1966 original, so some of the devanāgarī characters are either hard to read or awkwardly typeset (see also the original on Google Books). I therefore found it reasonable to reproduce that part using modern technology (such as Unicode fonts) so that it can be read properly.
- The fonts used for generating the images for Memrise (as well as in the document mentioned above) are mainly those of the Siddhanta font family, created by Mihail Bayaryn (Міхаіл Баярын) and available from his site under a Creative Commons license.
- Honourable mention for the program Itranslator 2003, which is a great help for transliterating ASCII text in ITRANS notation to devanāgarī text. Its font Sanskrit 2003 helped to create an earlier version of the course.
Tocharian resources
The associated Memrise course is Tocharian Brahmi script.
- The course image is a scan of a Tocharian manuscript, taken from CEToM THT 94.
- One set of images for the characters comes from the Tocharisches Elementarbuch (Band 1) by Wolfgang Krause and Werner Thomas (1960), page 41.
To make these images bigger while smoothing the edges slightly, one could use ImageMagick:
magick fileIn.png -bordercolor none -border 10 -antialias -resize 250% -gaussian-blur 4x2 -trim fileOut.png
- Another set of images is taken from Lee Wilson's proposal to encode the Tocharian script in Unicode, which seems to have been approved. Once included in Unicode, the Tocharian script should receive a proper font in Google's Noto project.
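To apply the ImageMagick command quoted for the Tocharisches Elementarbuch images to a whole folder, a small Python wrapper might look like the sketch below. The function names and the folder layout are assumptions; the options are copied from that command, and the `magick` executable must be available.

```python
import subprocess
from pathlib import Path


def enlarge_command(file_in, file_out):
    """Build the argument list for the magick call quoted in the list above."""
    return [
        "magick", str(file_in),
        "-bordercolor", "none", "-border", "10",
        "-antialias", "-resize", "250%",
        "-gaussian-blur", "4x2", "-trim",
        str(file_out),
    ]


def enlarge_all(source_dir, target_dir):
    """Run the command for every PNG in source_dir, writing to target_dir."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    for png in sorted(Path(source_dir).glob("*.png")):
        subprocess.run(enlarge_command(png, target / png.name), check=True)
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.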