
LDBCouncil Social Network Benchmark data generator for GRAKN.AI


sklarman/ldbc-snb

 
 


LDBC-SNB Data Generator

The LDBC-SNB Data Generator (DATAGEN) is responsible for providing the data sets used by all the LDBC benchmarks. It is designed to produce directed labeled graphs that mimic the characteristics of graphs found in real data. A detailed description of the schema produced by datagen, as well as the format of the output files, can be found in the latest version of the official LDBC SNB specification document.

ldbc_snb_datagen is part of the LDBC project (http://www.ldbc.eu/). It is licensed under GPLv3; see LICENSE.txt for details.

Datasets

Publicly available datasets can be found at the LDBC-SNB Amazon Bucket. These are the official SNB datasets, generated using version 0.2.6. They are available in the three officially supported serializers: CSV, CSVMergeForeign and TTL. The bucket is configured in "Requester Pays" mode, so you need a properly configured AWS client to access them.
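As a rough sketch of what "Requester Pays" access looks like with the AWS CLI: the helpers below pass `--request-payer requester` to `aws s3 ls` and `aws s3 cp`. The bucket name is not stated in this README, so `SNB_BUCKET` is a placeholder you must replace, and the function names are hypothetical. The sketch assumes the AWS CLI is installed and already has credentials (for example via `aws configure`).

```shell
# Placeholder: the real bucket name is not given in this README.
SNB_BUCKET="${SNB_BUCKET:-example-ldbc-snb-bucket}"

# List the bucket contents; the requester (you) is billed for the request.
list_snb_datasets() {
    aws s3 ls "s3://$SNB_BUCKET/" --request-payer requester
}

# Download one object into the current directory.
# $1 is an object key, e.g. one of the paths printed by list_snb_datasets.
fetch_snb_dataset() {
    aws s3 cp "s3://$SNB_BUCKET/$1" . --request-payer requester
}
```

Nothing is transferred until one of the functions is called, so the block can be sourced safely before the bucket name is filled in.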

Community provided tools

Grakn test setup

To run the SNB generator you need the following prerequisites:

  • 7zip,
  • Grakn,

and the PATH environment variable must contain the bin directory of your Grakn distribution. Then start the Grakn engine and load the SNB data for the small graph by running one of the Grakn loading scripts:

./runGraknREST.sh

./runGraknMigrator.sh localhost:4567 SNB

./runGraknGraql.sh
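Before running any of the scripts above, the PATH requirement can be satisfied along these lines. This is a minimal sketch: the install location `$HOME/grakn` and the `GRAKN_HOME` variable are assumptions, and the 7-Zip check only looks for the two common command names `7z` and `7za`.

```shell
# Assumption: Grakn was unpacked to $HOME/grakn; adjust to your installation.
GRAKN_HOME="${GRAKN_HOME:-$HOME/grakn}"

# Put the Grakn bin directory on PATH for this shell session.
export PATH="$GRAKN_HOME/bin:$PATH"

# Warn if neither common 7-Zip command name is available.
if ! command -v 7z >/dev/null 2>&1 && ! command -v 7za >/dev/null 2>&1; then
    echo "warning: 7-Zip (7z or 7za) not found on PATH" >&2
fi
```

The export only affects the current shell; add it to your shell profile if you want it to persist.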

Grakn REST loader

This script runs the SNB data generator with serialisers that send the insert queries directly to the Grakn engine REST API. To load into a remote engine instance, set these parameters in the params.ini file:

  • grakn.engine.uri
  • grakn.engine.keyspace
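To confirm which engine and keyspace a run will target, the values can be read back from params.ini before starting the loader. The helper below is a sketch that assumes params.ini stores one `key:value` (or `key=value`) pair per line; `get_param` is a hypothetical name, not part of this repository.

```shell
# Print the value of a key from a key:value (or key=value) parameter file.
# $1 is the key, $2 is the file. If the key appears twice, the last one wins.
get_param() {
    key="$1"
    file="$2"
    # Escape dots so they match literally in the sed pattern.
    pattern=$(printf '%s' "$key" | sed 's/\./\\./g')
    sed -n "s/^[[:space:]]*${pattern}[[:space:]]*[:=][[:space:]]*//p" "$file" | tail -n 1
}
```

For example, `get_param grakn.engine.uri params.ini` prints the configured engine address, or nothing if the key is not set.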

Grakn Migrator loader

This script runs the SNB generator to create CSV files, which are then imported using the migrator. The migrator script takes two arguments: the address of the engine instance and the keyspace into which the data is loaded.

The runGraknMigrator.sh script also accepts two optional arguments that increase the system load:

./runGraknMigrator.sh localhost:4567 SNB numberActiveTasks batchSize

These two options are passed directly to the migration client by the script.
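A call with all four arguments might look like the sketch below. The values 16 and 25 are purely illustrative, not recommended settings, and the variable and function names are hypothetical. The function only prints the command so it can be inspected first; the real invocation is left commented out.

```shell
# Illustrative values only; tune them for your engine and hardware.
ENGINE_URI="localhost:4567"
KEYSPACE="SNB"
ACTIVE_TASKS=16   # numberActiveTasks
BATCH_SIZE=25     # batchSize

# Print the command line that would be run.
migrator_cmd() {
    printf './runGraknMigrator.sh %s %s %s %s\n' \
        "$ENGINE_URI" "$KEYSPACE" "$ACTIVE_TASKS" "$BATCH_SIZE"
}

migrator_cmd

# Uncomment to actually start the migration:
# ./runGraknMigrator.sh "$ENGINE_URI" "$KEYSPACE" "$ACTIVE_TASKS" "$BATCH_SIZE"
```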

Grakn Graql loader

This script runs the SNB data generator with serialisers that execute match and insert queries directly. You can use a different keyspace by changing this parameter in the params.ini file:

  • grakn.engine.keyspace

NB: this script is only capable of loading data from a single machine.
