mimic-fhir

  • A version of MIMIC-IV-on-FHIR (original repo here). The scripts and packages in this repository generate the MIMIC-IV FHIR tables in PostgreSQL, validate them with HAPI FHIR, and export them to NDJSON.
  • Note that there are separate instructions for loading MIMIC-IV and MIMIC-IV-ED into your local Postgres, at the MIMIC-IV guide and the MIMIC-IV-ED guide. I've included everything here, but I want to give credit where it is due. (Note: if you follow those other instructions, use the same database name across both guides, i.e. mimiciv.)

Prerequisites

# update
sudo apt update

# install git and wget
sudo apt install git wget

# clone repo
git clone https://github.com/fhir-fli/mimic-fhir.git && cd mimic-fhir/mimic-code

PostgreSQL

# install
sudo apt install postgresql postgresql-contrib

# get into postgres
sudo -i -u postgres
postgres@desktop:~$ psql
# The database user should match the username of the account you're using on this computer (replace 'grey' below with yours)
# replace '${PASSWORD}' with your actual password, but leave the single quotes around it
postgres=# CREATE USER grey CREATEDB password '${PASSWORD}';

postgres=# exit
postgres@desktop:~$ exit
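  • Before moving on, it can be worth confirming the new role works; a minimal check, assuming (as above) the role name matches your OS username:
# should print the role you just created
psql -d postgres -c 'SELECT current_user;'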

Download the data and structure it in PostgreSQL

  • In the commands below, <USERNAME> is your PhysioNet username
  • It's a fair amount of data and it can take some time; 30-40 minutes is not unusual
  • If that doesn't work, go to the bottom of these project pages and copy the download commands from there:
  • mimiciv
  • mimic-iv-ed
wget -r -N -c -np --user <USERNAME> --ask-password https://physionet.org/files/mimiciv/2.2/
wget -r -N -c -np --user <USERNAME> --ask-password https://physionet.org/files/mimic-iv-ed/2.2/

# move the actual data files
mv physionet.org/files/mimiciv mimiciv 
mv physionet.org/files/mimic-iv-ed mimicived

# delete the rest
rm -r physionet.org/
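# Optional: PhysioNet downloads usually ship a SHA256SUMS.txt alongside the data files.
# If it came down with the recursive wget above, you can verify the files before loading
# (this is just a sketch; skip it if the checksum file isn't there)
( cd mimiciv/2.2 && sha256sum -c SHA256SUMS.txt )
( cd mimicived/2.2 && sha256sum -c SHA256SUMS.txt )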

# creates the database itself
createdb mimiciv
psql -d mimiciv -f mimic-iv/buildmimic/postgres/create.sql

# take note of the mimiciv version you're on and change the directory accordingly; this one takes a while
psql -d mimiciv -v ON_ERROR_STOP=1 -v mimic_data_dir=mimiciv/2.2 -f mimic-iv/buildmimic/postgres/load_gz.sql

# The first time you do this, the scripts delete ("drop" in SQL parlance) objects before creating them to remove old versions. This produces warnings that you can safely ignore

# I get a number of Notices about constraints not existing for this one
psql -d mimiciv -v ON_ERROR_STOP=1 -v mimic_data_dir=mimiciv/2.2 -f mimic-iv/buildmimic/postgres/constraint.sql

# Also notices about indexes not existing
psql -d mimiciv -v ON_ERROR_STOP=1 -v mimic_data_dir=mimiciv/2.2 -f mimic-iv/buildmimic/postgres/index.sql

# We're basically just going to repeat with the mimic ED data
psql -d mimiciv -f mimic-iv-ed/buildmimic/postgres/create.sql
psql -d mimiciv -v ON_ERROR_STOP=1 -v mimic_data_dir=mimicived/2.2/ed -f mimic-iv-ed/buildmimic/postgres/load_gz.sql

# In the mimic-iv-ed directory, constraint.sql has the schema listed as mimic_ed instead of mimiciv_ed, which is the schema used in the other files. In this repo I've changed it, but if you go with the original repo you'll probably have to change it yourself
psql -d mimiciv -v ON_ERROR_STOP=1 -v mimic_data_dir=mimicived/2.2/ed -f mimic-iv-ed/buildmimic/postgres/constraint.sql

# Same notices about indexes not existing
psql -d mimiciv -v ON_ERROR_STOP=1 -v mimic_data_dir=mimicived/2.2/ed -f mimic-iv-ed/buildmimic/postgres/index.sql

# validate that the setup is correct
psql -d mimiciv -f mimic-iv-ed/buildmimic/postgres/validate.sql
psql -d mimiciv -f mimic-iv/buildmimic/postgres/validate.sql
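  • Beyond the validate scripts, a couple of quick row counts can confirm the load looks sane; a minimal sketch, assuming the default schema names created by the build scripts (mimiciv_hosp, mimiciv_icu, mimiciv_ed):
psql -d mimiciv -c "SELECT count(*) FROM mimiciv_hosp.patients;"
psql -d mimiciv -c "SELECT count(*) FROM mimiciv_icu.icustays;"
psql -d mimiciv -c "SELECT count(*) FROM mimiciv_ed.edstays;"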

Conversion

  • Generate the FHIR tables by running create_fhir_tables.sql found in the folder mimic-fhir/sql
  • IMPORTANT:
    • this takes a long time and requires a lot of space. I kept running out of space when I tried to do it at first.
    • Realistically, you probably need 2 TB of FREE space on the device you're using (a quick check is sketched after this list)
    • This is a lengthy process. Just so you can know what you should expect, I ran this on a machine with:
      • AMD® Ryzen 9 3900xt 12-core processor × 24
      • 64 GB RAM
    • It took ~12 hours
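  • Before kicking this off, it's worth checking how much room you actually have; a quick sketch that checks free space where the Postgres data usually lives on Ubuntu, plus the current size of the mimiciv database:
# free space on the filesystem holding the Postgres data directory (the path may differ on your install)
df -h /var/lib/postgresql
# current size of the mimiciv database
psql -d mimiciv -c "SELECT pg_size_pretty(pg_database_size('mimiciv'));"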
cd ../sql
psql -d mimiciv -f create_fhir_tables.sql
psql -d mimiciv -f validate_fhir_tables.sql
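  • Once it finishes, you can confirm the generated tables exist; a minimal check, assuming the scripts create them in a schema named mimic_fhir (adjust the schema and table names if yours differ):
# list the generated FHIR tables and spot-check one of them
psql -d mimiciv -c "\dt mimic_fhir.*"
psql -d mimiciv -c "SELECT count(*) FROM mimic_fhir.patient;"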

HAPI FHIR for use in validation/export

  • The first step in validation/export is getting the FHIR server running. In our case we will use HAPI FHIR.
  • The nice folks at kind-lab made a fork of the HAPI JPA starter server
cd ../.. && git clone https://github.com/kind-lab/hapi-fhir-jpaserver-starter.git

createdb hapi_r4
  • There is already a .env file in the mimic-fhir directory (a sketch of its contents appears at the end of this section).
  • Change SQLUSER and SQLPASS. Those should be the same as you set them at the beginning of this process.
  • Choose the paths you're going to use for MIMIC_JSON_PATH, FHIR_BUNDLE_ERROR_PATH, and MIMIC_FHIR_LOG_PATH
  • You'll need to make sure Java and Maven are installed for this next section
  • The application.yaml file in the hapi-fhir-jpaserver-starter project also needs the database username and password near the top changed to the same ones you have been using
  • then run:
cd hapi-fhir-jpaserver-starter
mvn jetty:run
  • The initial startup of HAPI FHIR will take around 10-15 minutes; subsequent startups will be faster
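  • For reference, the .env entries mentioned above end up looking roughly like the sketch below (values are placeholders; keep whatever else the provided file defines). Once mvn jetty:run settles down, you can also poke the server to check it's up; 8080 and /fhir are the usual hapi-fhir-jpaserver-starter defaults, so adjust if you've changed them:
# sketch of the .env entries referenced in this README (placeholder values)
SQLUSER=grey
SQLPASS=your_postgres_password
MIMIC_JSON_PATH=/path/to/mimic/json
FHIR_BUNDLE_ERROR_PATH=/path/to/bundle/errors
MIMIC_FHIR_LOG_PATH=/path/to/logs
MIMIC_TERMINOLOGY_PATH=/path/to/mimic-profiles/input/resources

# once the server is running, this should return a CapabilityStatement
curl http://localhost:8080/fhir/metadata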

PY_MIMIC_FHIR

  • Configure py_mimic_fhir package for use
  • Post terminology to HAPI-FHIR using py_mimic_fhir
git clone https://github.com/kind-lab/mimic-profiles.git
  • Ensure the environment variable MIMIC_TERMINOLOGY_PATH is set and pointing to the latest terminology files in mimic-profiles/input/resources
cd ../mimic-fhir
export $(grep -v '^#' .env | xargs)
pip install -e .
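  • A quick way to confirm the variables actually made it into the environment (the names are the ones used in this README):
env | grep -E 'SQLUSER|MIMIC_|FHIR_'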
  • There are some packages that seem to be required before you can run the terminology post command:
pip install google-cloud
pip install google-cloud-pubsub
pip install psycopg2-binary
pip install pandas-gbq
pip install fhir
pip install fhir-resources
  • Run the terminology post command in py_mimic_fhir:
python py_mimic_fhir terminology --post
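  • If you want to confirm the terminology landed on the server, a standard FHIR count search works; a sketch against the default local endpoint used above:
# total CodeSystems and ValueSets on the server after the post
curl 'http://localhost:8080/fhir/CodeSystem?_summary=count'
curl 'http://localhost:8080/fhir/ValueSet?_summary=count'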
  • Load reference data bundles into HAPI-FHIR using py_mimic_fhir
  • Initialize data on the HAPI-FHIR server, so patient bundles can reference the data resources
  • The data tables for medication and organization only need to be loaded into your HAPI-FHIR server once. To ensure these resources are loaded, the first time you run mimic-fhir you must run:
python3 py_mimic_fhir validate --init
  • Validate mimic-fhir against mimic-profiles IG
  • After the initialization above has been run once, you can proceed to this step and validate some resources! In your terminal (with all the env variables set) run:
python3 py_mimic_fhir validate --num_patients 5
  • Any failed bundles will be written to your log folder specified in .env
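  • If anything fails, a quick way to peek at the newest file in that folder (exact file names depend on the package, so this is just a sketch):
ls -lt "$FHIR_BUNDLE_ERROR_PATH" | head
tail -n 20 "$(ls -t "$FHIR_BUNDLE_ERROR_PATH"/* | head -n 1)"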

  • Export mimic-fhir to ndjson

    • Using the py_mimic_fhir package you can export all the resources on the server to ndjson
python3 py_mimic_fhir export --export_limit 100
  • export_limit reduces how much is written out to file by limiting how many binaries are written; each binary holds roughly 1,000 resources, so a limit of 1 would output about 1,000 resources as NDJSON
  • The output NDJSON will be written to the MIMIC_JSON_PATH folder specified in the .env
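  • To get a feel for what came out, you can count the exported resources per file; a sketch, assuming the export writes one resource per line into *.ndjson files under MIMIC_JSON_PATH (the jq peek is optional):
# one JSON resource per line, so line counts are resource counts
wc -l "$MIMIC_JSON_PATH"/*.ndjson
# peek at the resourceType of the first resource in each file (requires jq)
for f in "$MIMIC_JSON_PATH"/*.ndjson; do head -n 1 "$f" | jq -r .resourceType; done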
