
Knowledge Collaboratory


Services to query the Nanopublications network using Translator standards to retrieve the Knowledge Collaboratory graph, a collection of drug indications annotated with preferred identifiers (usually from MONDO, CHEBI, DrugBank, etc.).

A website to enable users to easily annotate, publish, and browse biomedical claims, going from natural language to a structured format recognized by the Translator. To publish new claims, users log in with their ORCID and submit the sentence they want to annotate, optionally providing an additional link to specify the provenance of the statement.

The model for annotating drug indication claims built for the LitCoin competition is used to automatically extract potential entities and relations. The SRI NameResolution API is then used to retrieve standard identifiers for each entity, and the BioLink model is used to define the relations between entities.

The extracted entities and relations are then displayed to users on the website, and users can edit the automatically generated claim to better reflect the intended statement before publishing it to the Nanopublication network.
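To make the publishing flow concrete, here is a small illustrative sketch of how an extracted claim could be represented as a subject-predicate-object statement with CURIE-style identifiers and a BioLink-style predicate. The field names and data model are assumptions for illustration, not the Collaboratory's actual schema.

```python
# Illustrative sketch (NOT the Collaboratory's actual data model):
# an extracted drug indication claim as a subject-predicate-object triple,
# using CURIE identifiers and a BioLink-style predicate.

claim = {
    "subject": {"id": "CHEBI:6801", "label": "metformin"},          # drug (CHEBI CURIE)
    "predicate": "biolink:treats",                                  # BioLink-style relation
    "object": {"id": "MONDO:0005148", "label": "type 2 diabetes"},  # disease (MONDO CURIE)
    "provenance": "https://example.org/source-article",             # hypothetical statement source
}

def is_complete(c: dict) -> bool:
    """Check that a claim carries the three core parts before publishing."""
    return all(k in c for k in ("subject", "predicate", "object"))

print(is_complete(claim))  # True
```

On the website, users would review and adjust such a structure (e.g. swap a mismatched identifier) before the claim is published as a nanopublication.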

Backend built with FastAPI and RDFLib.

Frontend built with NextJS, ReactJS, and Material UI.

🌐 Public deployments

πŸ“₯️ Requirements

πŸš€ Production deployment

Check out the docker-compose.prod.yml file for more details about the deployment.

  1. Create a .env file with your production settings:

```bash
ORCID_CLIENT_ID=APP-XXX
ORCID_CLIENT_SECRET=XXX
OPENAI_APIKEY=sk-XXX
FRONTEND_URL=https://collaboratory.semanticscience.org
```

  2. Deploy the app with the production config:

```bash
docker-compose -f docker-compose.yml -f docker-compose.prod.yml up -d
```

🐳 Local development

Requirements: Python >=3.9 with hatch, and Node.js >=16 with yarn.

⚠️ Create a .env file with your development settings in the backend folder of this repository:

```bash
ORCID_CLIENT_ID=APP-XXX
ORCID_CLIENT_SECRET=XXXX
OPENAI_APIKEY=sk-XXX
FRONTEND_URL=http://localhost:4000
DATA_PATH=./data
```
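The .env file is a set of plain KEY=VALUE lines. As a minimal sketch of how such a file can be read (the backend itself likely loads settings through its own configuration machinery, so this is illustrative only):

```python
# Minimal sketch of parsing a KEY=VALUE .env file.
# Illustrative only -- not the backend's actual settings loader.

def parse_env(text: str) -> dict:
    env = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        key, _, value = line.partition("=")
        env[key.strip()] = value.strip()
    return env

sample = """\
ORCID_CLIENT_ID=APP-XXX
FRONTEND_URL=http://localhost:4000
DATA_PATH=./data
"""
print(parse_env(sample)["FRONTEND_URL"])  # http://localhost:4000
```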

Start the backend

Start the FastAPI Python backend from the backend/ folder:

```bash
cd backend
hatch run dev
```

Start the frontend

Then install dependencies and start the React frontend built with NextJS from the frontend/ folder:

```bash
cd frontend
yarn
yarn dev
```

Now you can open your browser and interact with these URLs:
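A quick way to verify the dev servers are responding is a small reachability check. This is a sketch under assumptions: the frontend serves on port 4000 per FRONTEND_URL above, and the exact backend port depends on your setup.

```python
# Quick reachability check for a local dev server (illustrative sketch;
# the actual ports depend on your configuration -- the frontend here is
# assumed to be on http://localhost:4000 per FRONTEND_URL above).
import urllib.error
import urllib.request

def is_up(url: str, timeout: float = 2.0) -> bool:
    """Return True if the URL answers with any HTTP response."""
    try:
        urllib.request.urlopen(url, timeout=timeout)
        return True
    except urllib.error.HTTPError:
        return True   # server answered, just with an error status code
    except (urllib.error.URLError, OSError):
        return False  # connection refused, timeout, DNS failure, etc.

print(is_up("http://localhost:4000"))
```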

βœ… Tests

Two sets of tests are available: integration tests to check local changes, and production tests to check the API deployed in production.

You can run the integration tests locally:

```bash
hatch run test tests/integration -s
```

And the tests against the APIs deployed in production:

```bash
hatch run test:ids -s
hatch run test:itrb-ci -s
hatch run test:itrb-test -s
hatch run test:itrb-prod -s
```

πŸ”§ Maintenance

⏫ Upgrade TRAPI version

Get the latest TRAPI YAML: https://github.com/NCATSTranslator/ReasonerAPI/blob/master/TranslatorReasonerAPI.yaml

For the OpenAPI specification, change TRAPI_VERSION_TEST in backend/app/config.py.

For the reasoner_validator tests:

  1. Change TRAPI_VERSION_TEST in backend/app/config.py

  2. In pyproject.toml, upgrade the version of reasoner-validator

  3. Run the tests:

```bash
hatch run test tests/integration -s
```
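As a sketch of what the change in backend/app/config.py might look like (the variable name comes from the steps above, but the version value here is a hypothetical placeholder; use the actual TRAPI release you are targeting):

```python
# Hypothetical excerpt of backend/app/config.py after a TRAPI upgrade.
# The version string below is a placeholder, not a recommendation.
TRAPI_VERSION_TEST = "1.4.0"  # bump when a new TRAPI version is released

def version_tuple(version: str) -> tuple:
    """Parse '1.4.0' into (1, 4, 0) for ordered comparisons."""
    return tuple(int(part) for part in version.split("."))

# Sanity check: the new version should be newer than the one it replaces.
assert version_tuple(TRAPI_VERSION_TEST) > version_tuple("1.3.0")
```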

🐳 Docker Compose files and env vars

There is a main docker-compose.yml file with all the configurations that apply to the whole stack; it is used automatically by docker-compose.

There is also a docker-compose.override.yml with overrides for development, for example to mount the source code as a volume. It is applied automatically by docker-compose on top of docker-compose.yml.

These Docker Compose files use the .env file containing configurations to be injected as environment variables in the containers.

They also use additional configurations taken from environment variables set in the scripts before calling the docker-compose command.

The setup is designed to support several "stages" (development, building, testing, and deployment) and to allow deployments to different environments, such as staging and production (more environments can be added easily).

The files are designed to minimize repetition of code and configuration, so that anything that needs to change only has to change in a minimal number of places. That is why the files use environment variables that get auto-expanded: for example, to use a different domain you can call the docker-compose command with a different DOMAIN environment variable, instead of changing the domain in several places inside the Docker Compose files.

Likewise, to add another deployment environment, say preprod, you only have to change environment variables and can keep using the same Docker Compose files.
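The ${VAR} expansion docker-compose performs on its files can be illustrated with Python's string.Template, which uses the same ${...} syntax. The compose fragment and the preprod domain below are hypothetical examples.

```python
# Sketch of the ${VAR} expansion docker-compose applies to its files,
# demonstrated with string.Template (same ${...} syntax).
from string import Template

# Hypothetical compose fragment using a DOMAIN variable.
compose_snippet = Template(
    "services:\n"
    "  frontend:\n"
    "    environment:\n"
    "      - FRONTEND_URL=https://${DOMAIN}\n"
)

# Deploying to another environment only means changing the variable:
print(compose_snippet.substitute(DOMAIN="collaboratory.semanticscience.org"))
print(compose_snippet.substitute(DOMAIN="preprod.example.org"))  # hypothetical preprod domain
```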

πŸ”— Links

Livestream logs:

Project bootstrapped with https://github.com/tiangolo/full-stack-fastapi-postgresql

πŸ—ƒοΈ Current resources in the Collaboratory