Automatic translation of i18n messages for your products with AI.
Large language models are particularly good at languages. Let's use them to quickly make our products fully localized for international audiences.
- Supports multiple target languages, as long as the LLM supports them
- Idempotency:
  - Only keys missing from the target language are translated; otherwise nothing is done.
  - Key order is preserved from the input. This is important for version control.
- Models: OpenAI GPT. Others coming soon.
- Batch support to fit into LLM context size
- Support other file formats: properties, gettext
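The idempotency behaviour described in the list above can be sketched as follows. This is a minimal illustration, not Vertens's actual implementation: `merge_translations` and the `translate` callback are hypothetical names, and keeping keys that exist only in the target file is an assumption.

```python
from typing import Callable, Dict, List


def merge_translations(
    source: Dict[str, str],
    target: Dict[str, str],
    translate: Callable[[List[str]], List[str]],
) -> Dict[str, str]:
    """Fill in keys missing from `target`, keeping the key order of `source`."""
    missing = [key for key in source if key not in target]
    # Skip the (costly) translation call entirely when nothing is missing.
    new_values = translate([source[key] for key in missing]) if missing else []
    translated = dict(zip(missing, new_values))

    merged: Dict[str, str] = {}
    # Rebuild the result in source order so that diffs stay small.
    for key in source:
        merged[key] = target[key] if key in target else translated[key]
    # Assumption: keys that exist only in the target are kept, after the source keys.
    for key, value in target.items():
        merged.setdefault(key, value)
    return merged
```

Running the function a second time on its own output makes no translation call and returns the same dictionary, which is what makes repeated runs safe.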
WIP
Before running vertens, you need to have an OpenAI API key set in the OPENAI_API_KEY environment variable.
Given an input file in this format:
{
  "key1": "Message 1",
  "key2": "Message 2"
}
The following command translates input.json into French (fr) and German (de). The result for each language is written to its own file in the output directory.
vertens --language fr --language de <path/to/input.json> <path/to/output_directory>
To see the other parameters, use the help option:
vertens --help
i18next is a popular i18n solution, especially for ReactJS.
Vertens currently supports only JSON files with top-level keys, such as:
{ "key1": "Message 1", "key2": "Message 2" }
but not deeply nested keys, such as:
{ "key1" : { "key11": "" } }
You can write a script to convert between these two formats so that nested files can be used with Vertens.
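One possible shape for such a conversion script is sketched below. The `flatten` and `unflatten` helpers are hypothetical and not part of Vertens; the sketch assumes that no key contains the separator (a dot here) and that every leaf value is a string. Empty nested objects are dropped by `flatten`.

```python
from typing import Any, Dict


def flatten(nested: Dict[str, Any], sep: str = ".", prefix: str = "") -> Dict[str, Any]:
    """Turn {"a": {"b": "x"}} into {"a.b": "x"}, preserving key order."""
    flat: Dict[str, Any] = {}
    for key, value in nested.items():
        path = f"{prefix}{sep}{key}" if prefix else key
        if isinstance(value, dict):
            # Recurse into nested objects, extending the key path.
            flat.update(flatten(value, sep, path))
        else:
            flat[path] = value
    return flat


def unflatten(flat: Dict[str, Any], sep: str = ".") -> Dict[str, Any]:
    """Inverse of flatten: turn {"a.b": "x"} back into {"a": {"b": "x"}}."""
    nested: Dict[str, Any] = {}
    for path, value in flat.items():
        *parents, leaf = path.split(sep)
        node = nested
        for part in parents:
            # Create intermediate objects on demand.
            node = node.setdefault(part, {})
        node[leaf] = value
    return nested
```

A typical workflow would flatten the nested source file, run vertens on the flat file, and then unflatten each translated file. When writing JSON back to disk, `json.dump(..., ensure_ascii=False, indent=2)` keeps accented characters readable.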
A file in the top-level keys format can be translated with:
vertens --language fr ./lang-en.json ./lang-fr.json
You can also specify a placeholder value. A message whose key is already present in the target language file is still translated if its value equals the placeholder.
vertens --language fr --placeholder __STRING_NOT_TRANSLATED__ ./lang-en.json ./lang-fr.json
This is typically useful if you are using a tool like i18next-scanner. It scans for messages to translate and for messages to remove because they are no longer used.
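The placeholder rule can be pictured as a small selection function. This is a sketch under assumptions rather than Vertens's real code: `keys_to_translate` is a hypothetical name, and the exact comparison Vertens performs (here, strict string equality) is not confirmed by this document.

```python
from typing import Dict, List, Optional


def keys_to_translate(
    source: Dict[str, str],
    target: Dict[str, str],
    placeholder: Optional[str] = None,
) -> List[str]:
    """Return the source keys that still need a translation, in source order."""
    selected = []
    for key in source:
        if key not in target:
            # Missing keys are always translated.
            selected.append(key)
        elif placeholder is not None and target[key] == placeholder:
            # Present, but still holding the placeholder value.
            selected.append(key)
    return selected
```

For example, with a target file containing `"key2": "__STRING_NOT_TRANSLATED__"` and the same string passed as the placeholder, `key2` would be selected even though it is already present.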
To translate into several languages, each with its own target file, a loop in Bash would serve this purpose:
#!/usr/bin/env bash
# Translate lang-en.json into each target language, one file per language.
for lang in fr de
do
  vertens --language "$lang" ./lang-en.json "./lang-$lang.json"
done
If you have a huge translation file and want to test vertens without translating all of it, you can use the --sample-size parameter to translate only a small portion of the file.