The Machine Readable Glossary Generator (MRG) tool is part of the Terminology Engine version 2 (TEv2) Toolbox developed by eSSIF-Lab and governed by the Trust Over IP (ToIP) Concepts and Terminology Working Group (CTWG). A detailed description of the tool, its purpose and related concepts can be found at the MR Glossary Generation page, and its structure is described at this page.
This README assumes the reader is familiar with these concepts. It focuses on how someone can download, install, and use the MRG.
The MRG generator helps terminology creators build a Machine Readable Glossary from the curated texts of a particular scope, plus a selection of terms curated in other scopes. See the TEv2 architecture for its position in the toolbox.
The MRG generator is NOT an authoring tool. Terms are authored and curated in the curated texts, which the MRG generator takes as input for creating an MRG.
An MRG that is created with this tool is typically used as the foundation to create and format additional content, e.g. human-readable glossaries, term resolution links or widgets in documents and websites, etc.
The MRG generator will be used by terminology creators and curators to generate an MRG. It can also be used in a CI/CD pipeline to automatically generate an MRG as part of a GitHub Action or similar.
For MRG generation to work, the following artefacts need to be present:
- The Scope Administration File (SAF);
- Access to (already existing) MRGs insofar as they contain terms that are to be included in the MRG that the generator creates;
- The curated texts that document the terms (or other artefacts) that are to be included in the MRG that the generator creates.
The MRG generator will run on a curator's machine in its own Docker container. It will connect to one or more GitHub repositories where the curated files reside, and to the repositories where MRGs reside from which terms need to be imported.
The Scope Administration File (SAF) of the primary (or local) scope repository contains the instructions concerning how to create the MRG (e.g. which versions to use, which terms to include, etc.). The MRG generator follows these instructions to build an MRG from the local terms and any terms from remote scopes (i.e. other repositories) that the SAF specifies.
Once run, it will generate the MRG in a directory that the user selects on their local machine. This will usually be the glossaries directory in the local clone of the scope that the curator is currently editing (i.e. the local GitHub directory), e.g. /Users/foo/tev2/glossaries or C:/Users/foo/work/tev2/glossaries.
Full details of terminology construction can be found at the following page
As of October 2022, the specification of the tool, term construction and other key concepts are still under construction. Changes to them might change the implementation, in which case these instructions might also need to change.
There are some things you need to do to prepare yourself for generating MRGs:
- Ensure that the generator can access the various GitHub repositories that it needs;
- Ensure that you can run Docker containers;
- Ensure that you have the most recent version of the MRG generator tool as a Docker image.
You need to work with GitHub, as terminologies are developed and shared (curated) there. Also, the MRG tool uses the GitHub APIs to fetch terminology artefacts. So you will need:
- a GitHub account, so you can get access to the various repositories;
- a GitHub personal access token, which ensures you can benefit from the higher GitHub API rate limits (anonymous access is limited to 60 requests per hour, which would make generation impossible for scopes with more files than that).
If you don't have one, you can sign up for a GitHub account. You can use the simplest (free) kind. You will need to supply your username to the MRG generator so it can use this account to access the GitHub API in your name.
The simplest way for you to get a personal access token on GitHub is by using this direct link. You will need to supply the token to the MRG generator so it can access the GitHub API in your name.
You can generate many such tokens, but you only need one for the generator. When creating or refreshing the token for the MRG generator, choose the following access settings:
You may want to save the token for later (re)use. However, you can also always generate/refresh the token if that is needed (GitHub will notify you a few days before it expires).
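To confirm that a token is accepted and actually raises the API quota, you can query GitHub's rate-limit endpoint with and without it. The following is a minimal sketch, not part of the MRG generator: the function name check_rate_limit is hypothetical, and it assumes curl is installed.

```shell
# Sketch: ask GitHub which API quota applies to the given credentials.
# check_rate_limit is a hypothetical helper, not part of the MRG generator.
check_rate_limit() {
  token="${1:-}"
  if [ -n "$token" ]; then
    # Authenticated request: should report the higher hourly limit.
    curl -s -H "Authorization: Bearer ${token}" https://api.github.com/rate_limit
  else
    # Anonymous request: reports the low unauthenticated limit.
    curl -s https://api.github.com/rate_limit
  fi
}
```

Calling check_rate_limit "$gh_token" prints a JSON document; the limit and remaining fields of the core resource show the hourly quota and how much of it is left.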
In order to locally run the MRG generator, you need to be able to run Docker containers. Thus you need to install Docker Desktop on your local machine. Make sure you have a relatively recent version - older versions may not work the way we describe things here.
You should find Docker Desktop in Applications (Mac) or the Start Menu (Windows), depending on how you installed the software. It might take a minute or two to start; once the whale icon turns green, it has started and is ready to use.
In order to locally run the MRG generator, you need the most recent Docker image that contains the MRG generator, which you can then run in a container. First, browse the Trust Over IP MRG Packages to find the latest version of the CTWG MRG Generator.
Copy the docker pull command on this page to download the correct version (the version number may differ from what is shown in the above figure).
Paste this command into a Terminal window (Mac) or a command prompt (Windows):
docker pull ghcr.io/trustoverip/ctwg-mrg-gen:latest
This will download a new image to your Docker Desktop as below.
An MRG is generated within the context of a scope-directory that resides in a GitHub repository. The scope-directory is the directory that contains the Scope Administration File (SAF) and the curated texts. If terms are being imported from other scope directories (in the same, or other repositories), then these external scopes will have been defined in the SAF and the appropriate versions selected. Further explanations can be found here.
Generating an MRG consists of:
- Starting the MRG generator in a Docker container;
- Starting your web browser and instructing the MRG generator to create an MRG;
- Obtaining/viewing the MRG output.
When things go wrong, you can check the various logs.
You must have completed the prerequisites: Docker Desktop is running and the MRG generator Docker image has been downloaded (instructions are above). Then complete the following steps to start the MRG generator in its Docker container. Once started, it runs as a web service that you can call multiple times, e.g. to generate multiple MRGs. The steps are as follows:
- Hover over the Docker image in Docker Desktop and click the Run button on the right-hand side. A smaller window will appear. Don't click Run yet, but instead select Optional Settings.
- Now another window will appear that contains fields you need to fill in:
  - under 'Optional settings', you type a container name of your choice, e.g. ctwg-mrg.
  - under 'Ports', you type the port number at which you can access the tool on localhost, e.g. 8083. This means that you can later browse to localhost:8083/ctwg/mrg to make the tool run.
  - under 'Volumes', there are rows that consist of two fields, the left one specifying a directory on your local machine, and the right one specifying a directory on the (virtual) machine in the Docker container. The idea is that when the MRG generator writes the MRG in the directory of the Docker container, it will be automatically transferred to the local directory, so it becomes available for you to do with as you like. So here is how you fill in the fields:
    - the left field ('Host path') specifies a directory on your local machine, e.g. C:\git\my-repodir\glossaries
    - the right field ('Container path') MUST contain the text /glossaries, as that is the path in the container where the MRG generator will put the generated MRG.
  - under 'Environment variables', you see two rows with fields Variable and Value.
    - in the first row (with Variable=gh_user), you enter your GitHub username (e.g. RieksJ, or sih) in the Value field.
    - in the second row (with Variable=gh_token), you enter your GitHub access token (something like ghp_v3fSgDIjlsXYZncjEzDQ1bLnwdl2YJOaF) in the Value field (see the section Enable GitHub Access above on how to get such a token if you need one).
- Click Run.
This will start up a Docker container. When you click Containers in Docker Desktop, you should see something like:
Depending on how much of the required software needs to be, or has already been downloaded, and also depending on the speed of your Internet connection, it may take anything from 15 seconds to a minute for the generator to be ready. It is important to check the generator is ready before accessing it.
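If you prefer a terminal over the Docker Desktop dialog, the settings described above correspond roughly to a single docker run command. The sketch below is an untested equivalent under stated assumptions, not a documented interface of this tool: the function name start_mrg_generator is hypothetical, and it assumes that the service inside the container listens on port 8083 (this README only states the host port). Passing -e gh_user -e gh_token without values makes Docker copy those variables from your shell environment, which keeps the token out of the command line.

```shell
# Sketch: command-line equivalent of the Docker Desktop settings.
# Assumption: the service inside the container listens on port 8083.
# start_mrg_generator is a hypothetical helper name.
start_mrg_generator() {
  host_dir="$1"          # local directory that receives the generated MRG
  port="${2:-8083}"      # port on localhost used to reach the tool
  docker run -d --name ctwg-mrg \
    -p "${port}:8083" \
    -v "${host_dir}:/glossaries" \
    -e gh_user -e gh_token \
    ghcr.io/trustoverip/ctwg-mrg-gen:latest
}
```

A typical call, after exporting gh_user and gh_token in the same shell, would be start_mrg_generator "$PWD/glossaries". The function works in a POSIX shell on Mac or Linux; on Windows it needs a shell such as Git Bash.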
3.1.1 Starting the MRG generator in a Docker container running local code for local development {#3.1.1}
Build the docker image locally:
docker build -t ctwg-mgr-local -f Dockerfile .
- Hover over the Docker image in Docker Desktop and click the Run button on the right-hand side. A smaller window will appear. Don't click Run yet, but instead select Optional Settings.
- Now another window will appear that contains fields you need to fill in. Enter the following details:
  - Container name: ctwg-mgr-local
  - Ports: 8083
  - Volumes:
    - First
      - Host path: C:\git\my-repodir\glossaries
      - Container path: /glossaries
    - Second
      - Host path: C:\git\my-repodir\ctwg-mrg-gen (or wherever you have cloned the ctwg-mrg-gen repo)
      - Container path: /app
  - Environment variables:
    - First
      - Variable: gh_user
      - Value: RieksJ (replace with your own GitHub username)
    - Second
      - Variable: gh_token
      - Value: <Your_GitHub_Token>
- Click the Run button.
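The local-development settings can also be expressed on the command line. This is a sketch under assumptions rather than a documented workflow: the function name start_mrg_local is hypothetical, the container port is assumed to be 8083, and the Dockerfile is assumed to sit at the root of the cloned ctwg-mrg-gen repository. The second volume mounts that clone at /app, as in the list above.

```shell
# Sketch: build the local image, then run it with the clone mounted at /app.
# Assumptions: container port 8083; Dockerfile at the root of the clone.
# start_mrg_local is a hypothetical helper name.
start_mrg_local() {
  repo_dir="$1"          # local clone of the ctwg-mrg-gen repository
  glossaries_dir="$2"    # local directory that receives the generated MRG
  docker build -t ctwg-mgr-local -f "${repo_dir}/Dockerfile" "$repo_dir" &&
  docker run -d --name ctwg-mgr-local \
    -p 8083:8083 \
    -v "${glossaries_dir}:/glossaries" \
    -v "${repo_dir}:/app" \
    -e gh_user -e gh_token \
    ctwg-mgr-local
}
```

As with the regular container, gh_user and gh_token are taken from the exported variables of the calling shell.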
This has been tested using Chrome, but should work with most modern browsers.
- Navigate to http://localhost:8083/ctwg/mrg in your browser. Note that it's not just localhost; you need to specify the complete path.
There are three fields to fill out:
- Scope directory location: This is the URL at which the scope directory (scopedir) is located; it is typically a directory in a (remote!) GitHub repository. This directory must contain the SAF of the scope you want to generate an MRG for. It must also contain the so-called curatedir that contains the curated texts (terms). It would typically be something like https://github.com/essif-lab/framework/tree/master/docs/tev2.
- Scope Administration File (SAF): This is the filename (not the location) of the SAF that is located in the scopedir. It MUST be called saf.yaml (as shown in the diagram, see also https://essif-lab.github.io/framework/docs/tev2/spec-files/saf).
- Scope version tag: This is the tag (name) of the glossary that should be generated. It must have been defined in the SAF. Typical values could be latest or v3.1.
Once these are filled out, click the Generate button.
Generation of an MRG takes a bit of time, but not all that long. If it takes too long, you can watch the progress in the Docker Desktop log (see [next step]{#3.4}).
When generation is complete, your browser will show the file that has been generated:
The same file will have been written to the local directory that you specified in the Host path field (in the step where you started the MRG generator in a container):
The latter can be committed with Git and then pushed to the remote repository.
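Committing and pushing the generated file can be done with plain Git commands. The sketch below assumes that the MRG was written to the glossaries directory inside your local clone of the scope repository; the function name publish_mrg and the default commit message are hypothetical.

```shell
# Sketch: stage, commit and push the generated MRG from the local clone.
# Assumption: the MRG lives in the glossaries directory of that clone.
# publish_mrg is a hypothetical helper name.
publish_mrg() {
  repo_dir="$1"                      # local clone of the scope repository
  message="${2:-Regenerate MRG}"     # commit message
  git -C "$repo_dir" add glossaries &&
  git -C "$repo_dir" commit -m "$message" &&
  git -C "$repo_dir" push
}
```

For example, publish_mrg /Users/foo/tev2 "Update MRG" stages everything under glossaries, commits it, and pushes to the remote that the current branch tracks.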
Some useful logging is output to a console and this can be viewed in Docker Desktop.
- From the Containers screen in Docker Desktop, click on the three vertical dots on the right-hand side by the CTWG container. It has a tooltip saying Show container actions.
- Click View Details.
This shows the log output from the running container. You might see output still being produced, but when you get a screen similar to the one below that contains Started MRGWebApp ... then the container is up and running.
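The same readiness check can be done from a terminal by searching the container log for the Started MRGWebApp line. This is a minimal sketch: the function name mrg_is_started is hypothetical, and the default container name ctwg-mrg is only the example name used earlier in this README.

```shell
# Sketch: succeed once the container log contains the startup message.
# mrg_is_started is a hypothetical helper name.
mrg_is_started() {
  name="${1:-ctwg-mrg}"    # container name chosen when starting the tool
  # docker logs writes to both stdout and stderr, so merge them before grep.
  docker logs "$name" 2>&1 | grep -q "Started MRGWebApp"
}
```

A simple way to wait for the generator is: until mrg_is_started; do sleep 2; done. To follow the log continuously instead, docker logs -f ctwg-mrg streams new output as it appears.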
After the generation of an MRG is complete (which may take a while), your log would look something like this: