DCAT-AP Dataset Relationship Indexer: indexes linked data and relationships between datasets.
Features:
- index a distribution or a SPARQL endpoint
- extract and index distributions from a DCAT catalog
- extract a DCAT catalog from a SPARQL endpoint and index its distributions
- generate a dataset profile
- show related datasets, based mainly on the DataCube and SKOS vocabularies
- index sameAs identities and related concepts
To run the DCAT-DRY service only:
docker build . -t dcat-dry
docker run -p 80:8000 --name dcat-dry dcat-dry
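The docker run command above publishes the container's port 8000 on host port 80. As a quick liveness check you can hit that port; whether the root path returns anything meaningful is not documented here, so treat this only as a connectivity probe:
curl -i http://localhost/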
For the full environment, use docker-compose:
docker-compose up --build
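Once the stack is built and started, docker-compose can be used to check the state of the containers and follow their logs; the individual service names depend on the docker-compose.yml in the repository, so none are assumed here:
docker-compose ps
docker-compose logs -f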
CPython 3.8+ is supported.
Install a Redis server first. In the following example we will assume it runs on localhost, port 6379, and that DB 0 is used.
Set up a PostgreSQL server as well. In the following example we will assume it runs on localhost, port 5432, the database is postgres, and the user/password is postgres:example.
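If you do not want to install Redis and PostgreSQL locally, one way to get instances matching the assumptions above is to run the official Docker images. This is only a sketch; the container names are arbitrary:
# Redis on localhost:6379 (the application uses DB 0 and DB 1)
docker run -d --name dry-redis -p 6379:6379 redis
# PostgreSQL on localhost:5432, database postgres, user/password postgres:example
docker run -d --name dry-postgres -p 5432:5432 -e POSTGRES_PASSWORD=example postgres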
You will also need some system libraries installed: libxml2-dev, libxslt-dev, libleveldb-dev, libsqlite3-dev, and sqlite3.
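On a Debian or Ubuntu based system these can be installed with apt; package names may differ slightly on other distributions:
sudo apt-get update
sudo apt-get install -y libxml2-dev libxslt-dev libleveldb-dev libsqlite3-dev sqlite3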
Run the following commands to bootstrap your environment:
git clone https://github.com/eghuro/dcat-dry
cd dcat-dry
poetry install --with robots,gevent --without dev
# Start redis and postgres servers
# Export environment variables
export REDIS_CELERY=redis://localhost:6379/1
export REDIS=redis://localhost:6379/0
export DB=postgresql+psycopg2://postgres:example@localhost:5432/postgres
# Setup the database
alembic upgrade head
# Run concurrently
celery -A tsa.celery worker -l debug -Q high_priority,default,query,low_priority -c 4
gunicorn -w 4 -b 0.0.0.0:8000 --log-level debug app:app
nice -n 10 celery -l info -A tsa.celery beat
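As a quick sanity check once the worker and gunicorn are running, the Celery workers should answer a ping and the HTTP port should accept connections (what the root path returns is not documented here, so the curl call is only a liveness probe):
celery -A tsa.celery inspect ping
curl -i http://localhost:8000/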
In general, before running shell commands, set the FLASK_APP and FLASK_DEBUG environment variables:
export FLASK_APP=autoapp.py
export FLASK_DEBUG=1
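With FLASK_APP set, the Flask CLI commands used below (flask shell, flask test, flask batch) will find the application. For local development you can also start the built-in Flask development server instead of gunicorn; this is only a convenience, and it is assumed the Redis and database variables from the bootstrap section are still exported:
flask run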
To deploy:
export FLASK_DEBUG=0
# Follow the commands above to bootstrap the environment
In your production environment, make sure the FLASK_DEBUG environment variable is unset or set to 0, so that ProdConfig is used.
To open the interactive shell, run:
flask shell
By default, you will have access to the flask app.
To run all tests, run:
flask test
# Prepare couchdb
curl -X PUT http://admin:password@127.0.0.1:5984/_users
curl -X PUT http://admin:password@127.0.0.1:5984/_replicator
curl -X PUT http://admin:password@127.0.0.1:5984/_global_changes
# Migrate database
alembic upgrade head
To start a batch scan, run:
flask batch -g /tmp/graphs.txt -s http://10.114.0.2:8890/sparql
Get a full result:
/api/v1/query/analysis
Query a dataset:
/api/v1/query/dataset?iri=http://abc
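For example, with the service running locally on port 8000 (adjust the host and port to your deployment; the IRI below is just the placeholder from above):
# Get the full result
curl http://localhost:8000/api/v1/query/analysis
# Query a single dataset by IRI
curl -G http://localhost:8000/api/v1/query/dataset --data-urlencode "iri=http://abc"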