Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update support for local TES executor with GA4GH TES Plugin #4608

Merged

Conversation

lbeckman314
Copy link
Contributor

@lbeckman314 lbeckman314 commented Dec 21, 2023

Overview

This PR includes updates to the GA4GH TES plugin to allow successfully communication between it and a local TES executor, specifically by passing the contents of the working directory's bin directory as TES inputs.

Nextflow's default behavior is to add the contents of the bin directory to a process and automatically update the $PATH variable to allow workflows to run with dependencies:

Nextflow will automatically add the directory bin into the PATH environmental variable. So therefore any executable in the bin folder of a Nextflow pipeline can be called without the need to reference the full path.

Related Issues/PRs

Tests

Tested using the ./gradlew :plugins:nf-ga4gh:test command in the IntelliJ terminal:

./gradlew :plugins:nf-ga4gh:test

BUILD SUCCESSFUL in 4s
29 actionable tasks: 2 executed, 27 up-to-date
Gradle Test Results

Funnel

An updated Funnel release candidate is being created that includes updated support for this feature. While that's being created its possible to pull in the updates and compile Funnel locally by running the following:

git clone https://github.com/ohsu-comp-bio/funnel -b feature/nextflow-tests

cd funnel

make

./funnel server --Server.HTTPPort 8000 --LocalStorage.AllowedDirs $HOME run

Funnel's documentation page is also being updated and will reflect the latest steps for interoperability with Nextflow and Azure.

Nextflow

To install the changes in this PR to a local instance of Nextflow run the following:

git clone https://github.com/nextflow-io/nextflow -b tes-update-1.1

cd nextflow

make compile

This will create a new launch.sh file that can be used in the nf-canary tests below.

Add the following to your nextflow.config in order to use the GA4GH TES plugin:

plugins {
  id 'nf-ga4gh'
}

process.executor = 'tes'
tes.endpoint = 'http://localhost:8000'

nf-canary

nf-canary is a "A minimal Nextflow workflow for testing infrastructure" that can be used to check compliance with Nextflow workflows.

To use it to test the changes included in this PR run the following:

git clone https://github.com/seqeralabs/nf-canary

cd nf-canary

alias nextflow=~/nextflow/launch.sh  # <--- Change this line to match your local nextflow directory

nextflow run main.nf -c nextflow.config

A successful run will show all tests passed (including one test meant to ignore a simulated test "failure"):

N E X T F L O W  ~  version 23.08.0-edge
Launching `main.nf` [pedantic_archimedes] DSL2 - revision: da205cac0c
Uploading local `bin` scripts folder to $HOME/nf-canary/work/tmp/0d/7632584dabcba30575be6513304d96/bin
executor >  tes [http://localhost:8000] (12)
[e3/ae33dc] process > NF_CANARY:TEST_SUCCESS            [100%] 1 of 1 ✔
[a8/2b4e53] process > NF_CANARY:TEST_CREATE_FILE        [100%] 1 of 1 ✔
[fe/920b2d] process > NF_CANARY:TEST_CREATE_FOLDER      [100%] 1 of 1 ✔
[0c/3104c6] process > NF_CANARY:TEST_INPUT (1)          [100%] 1 of 1 ✔
[83/333e1c] process > NF_CANARY:TEST_BIN_SCRIPT         [100%] 1 of 1 ✔
[-        ] process > NF_CANARY:TEST_STAGE_REMOTE       -
[e6/1152f4] process > NF_CANARY:TEST_PASS_FILE          [100%] 1 of 1 ✔
[92/7f8c7b] process > NF_CANARY:TEST_PASS_FOLDER        [100%] 1 of 1 ✔
[81/c3ba84] process > NF_CANARY:TEST_PUBLISH_FILE       [100%] 1 of 1 ✔
[cd/ce8fb7] process > NF_CANARY:TEST_PUBLISH_FOLDER     [100%] 1 of 1 ✔
[8a/c31c60] process > NF_CANARY:TEST_IGNORED_FAIL       [100%] 1 of 1, failed: 1 ✔
[fb/ae45eb] process > NF_CANARY:TEST_MV_FILE            [100%] 1 of 1 ✔
[65/39eb16] process > NF_CANARY:TEST_MV_FOLDER_CONTENTS [100%] 1 of 1 ✔
[8a/c31c60] NOTE: Process `NF_CANARY:TEST_IGNORED_FAIL` terminated for an unknown reason -- Likely it has been terminated by the external system -- Error is ignored

@lbeckman314 lbeckman314 changed the title Add initial support for bin in working directory Update support for local TES executor with GA4GH TES Plugin Dec 21, 2023
@@ -113,7 +113,8 @@ class TesFileCopyStrategy implements ScriptFileCopyStrategy {
copy.remove('PATH')
// when a remote bin directory is provide managed it properly
if( remoteBinDir ) {
result << "cp -r ${remoteBinDir}/* \$PWD/nextflow-bin/\n"
result << "mkdir -p \$PWD/nextflow-bin/\n"
result << "cp -r \$PWD/bin/* \$PWD/nextflow-bin/\n"
Copy link
Contributor Author

@lbeckman314 lbeckman314 Dec 21, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the probably biggest change and should likely be revised as it changes existing behavior of the bin $PATH by moving the bin filepath from ${remoteBinDir}/dir to the working directory of the process.

result.path = fileName ? "$WORK_DIR/bin/$fileName" : "$WORK_DIR/bin/${realPath.getName()}"
}
else {
result.path = fileName ? "$WORK_DIR/$fileName" : "$WORK_DIR/${realPath.getName()}"
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need to confirm backwards compatibility with this (see Nextflow's test suite).

Path remoteBinDir = executor.getRemoteBinDir()
Files.newDirectoryStream(remoteBinDir).each { path ->
body.addInputsItem(inItem(path, null ,true))
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is where each file in bin is added as a TES input.

@bentsherman
Copy link
Member

Thanks for contributing Liam. I see the main change is to upload the bin directory scripts to each task directory instead of the main work directory. This is not ideal because then the scripts could be uploaded many times. The TES executor is already uploading the bin directory to the work directory, so is it not possible to provide that path as input files to the TES task? Does Funnal require all of the input files to be in the same directory?

@lbeckman314
Copy link
Contributor Author

lbeckman314 commented Feb 1, 2024

Hi Ben! Ah, that should be possible to provide the path of the bin directory as TES input files to Funnel. We will take another pass at Funnel and update this PR to revert that behavior by today.

@@ -190,6 +191,9 @@ class TesTaskHandler extends TaskHandler {
body.addInputsItem(inItem(scriptFile))
body.addInputsItem(inItem(wrapperFile))

Path remoteBinDir = executor.getRemoteBinDir()
body.addInputsItem(inItem(remoteBinDir, remoteBinDir.toString(), true))

Copy link
Contributor Author

@lbeckman314 lbeckman314 Feb 2, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated here to only pass the bin directory as a TES Input (passes the directory itself as opposed to individual files, need to confirm that this is compatible with the TES Schema).

Copy link
Contributor Author

@lbeckman314 lbeckman314 Feb 2, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Confirmed with @kellrott that passing a directory as a TES input is compatible with the schema (as long as all items in that directory are recursively added as TES inputs by the executor).

def result = new TesInput()
result.url = realPath.toUriString()
result.path = fileName ? "$WORK_DIR/$fileName" : "$WORK_DIR/${realPath.getName()}"
log.trace "[TES] Adding INPUT file: $result"
result.path = isBin ? realPath : result.path
Copy link
Contributor Author

@lbeckman314 lbeckman314 Feb 2, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will result in the bin directory being mounted in the container at the same path as Nextflow's main working directory.

All other "non-bin" TES inputs will be mounted at /work by default in the container.

@@ -113,6 +113,7 @@ class TesFileCopyStrategy implements ScriptFileCopyStrategy {
copy.remove('PATH')
// when a remote bin directory is provide managed it properly
if( remoteBinDir ) {
result << "mkdir \$PWD/nextflow-bin/\n"
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This line resolves an error we were encountering when attempting to run the .command.run file within the container:

root@<container id>:/work $ ./.command.run
cp: cannot create regular file '/work/nextflow-bin/': Not a directory

@bentsherman
Copy link
Member

@lbeckman314 have you tested the current changes with Funnel? If everything is working then I'll merge it into my PR

@lbeckman314
Copy link
Contributor Author

lbeckman314 commented Feb 9, 2024

Hi @bentsherman! Yes the changes are tested against nf-canary with Funnel both locally and in Github Actions.

No other tests beyond the workflows in nf-canary have been tested yet, though we can add any other workflows that you recommend!


Local Test ✅

Local Test Results

Nextflow Test Results

Local Test Script

# Change TEST_DIR to the desired local test directory
export TEST_DIR=$HOME/nextflow-tests 
mkdir -p $TEST_DIR
cd $TEST_DIR

# Install Funnel
brew install funnel@0.11
funnel server run --LocalStorage.AllowedDirs $TEST_DIR > funnel.log 2>&1 &

# Install Nextflow
git clone https://github.com/ohsu-comp-bio/nextflow -b feature/test-1.1-bin
cd nextflow
make compile
cd -

# Install nf-canary 
git clone https://github.com/ohsu-comp-bio/nf-canary
cd nf-canary

# Add TES Plugin to Nextflow config
cat <<EOF >> nextflow.config
  plugins {
    id 'nf-ga4gh'
  }
  process.executor = 'tes'
  tes.endpoint = 'http://localhost:8000'
EOF

# Run nf-canary
../nextflow/launch.sh run main.nf 2>&1 | tee nextflow-test-results.txt

# Stop Funnel
pkill funnel
cd -

Environment

Github Actions Test ✅

Funnel Nextflow Github Actions

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants