Commit

Update TES executor to TES API v1.1 (#4195) [ci fast]
Signed-off-by: Ben Sherman <bentshermann@gmail.com>
Signed-off-by: Liam Beckman <lbeckman314@gmail.com>
Signed-off-by: Venkat Malladi <vmalladi@microsoft.com>
Co-authored-by: Liam Beckman <lbeckman314@gmail.com>
Co-authored-by: Venkat Malladi <vmalladi@microsoft.com>
Co-authored-by: Paolo Di Tommaso <paolo.ditommaso@gmail.com>
4 people authored May 10, 2024
1 parent b9bf641 commit 7b32c2d
Showing 43 changed files with 2,070 additions and 1,039 deletions.
65 changes: 35 additions & 30 deletions docs/executor.md
@@ -121,56 +121,61 @@ By default, Flux will send all output to the `.command.log` file. To send this o
:::{warning} *Experimental: may change in a future release.*
:::

:::{versionchanged} 23.07.0-edge
Support for automatic upload of the `bin` directory was added.
:::

:::{versionchanged} 24.04.0
Support for process output directories and output globs was added.
:::

The [Task Execution Schema](https://github.com/ga4gh/task-execution-schemas) (TES) project by the [GA4GH](https://www.ga4gh.org) standardization initiative is an effort to define a standardized schema and API for describing batch execution tasks in a portable manner.

Nextflow supports the TES API via the `tes` executor, which allows the submission of workflow tasks to a remote execution backend exposing a TES API endpoint.

To use this feature, define the following variables in the workflow launching environment:
The pipeline processes must specify the Docker image to use by defining the `container` directive, either in the pipeline script or the `nextflow.config` file. Additionally, the pipeline work directory must be accessible to the TES backend.

```bash
export NXF_MODE=ga4gh
export NXF_EXECUTOR=tes
export NXF_EXECUTOR_TES_ENDPOINT='http://back.end.com'
To enable this executor, add the following settings to your Nextflow configuration:

```groovy
plugins {
id 'nf-ga4gh'
}
process.executor = 'tes'
tes.endpoint = '<endpoint>'
```

It is important that the endpoint is specified without the trailing slash; otherwise, the resulting URLs will not be normalized and the requests to TES will fail.
The default endpoint is `http://localhost:8000`. It is important that the endpoint is specified without the trailing slash; otherwise, the resulting URLs will not be normalized and the requests to TES will fail.
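
The trailing-slash requirement can be guarded against before the value reaches the configuration. The following is a minimal sketch, assuming the endpoint is prepared in a shell wrapper; `normalize_endpoint` is a hypothetical helper and is not part of Nextflow or the `nf-ga4gh` plugin:

```bash
# Hypothetical helper (not part of Nextflow): strip any trailing slashes
# from an endpoint URL so that it matches the form the docs ask for.
normalize_endpoint() {
  local url="$1"
  # Remove one trailing slash at a time until none remain.
  while [ "${url%/}" != "$url" ]; do
    url="${url%/}"
  done
  printf '%s\n' "$url"
}

normalize_endpoint 'http://localhost:8000/'
```

The normalized value could then be written to `tes.endpoint` in the configuration.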

You will then be able to run your workflow over TES using the usual Nextflow command line. Be sure to specify the Docker image to use, i.e.:
The TES API supports multiple forms of authentication:

```bash
nextflow run rnaseq-nf -with-docker alpine
```
```groovy
// basic
tes.basicUsername = '<username>'
tes.basicPassword = '<password>'
:::{note}
If the variable `NXF_EXECUTOR_TES_ENDPOINT` is omitted, the default endpoint is `http://localhost:8000`.
:::
// API key
tes.apiKeyParamMode = '<mode>' // 'query' or 'header'
tes.apiKeyParamName = '<param-name>'
tes.apiKey = '<key>'
// OAuth
tes.oauthToken = '<token>'
```
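
For orientation, the sketch below shows how a username and password are encoded under the standard HTTP Basic scheme (RFC 7617). It assumes that the `basic` settings above map to that scheme, which this diff does not confirm; `basic_auth_header` is a hypothetical helper used for illustration only:

```bash
# Hypothetical helper: build an HTTP Basic "Authorization" header
# (RFC 7617) from a username and password.
basic_auth_header() {
  local credentials
  # Base64-encode "username:password"; strip newlines some base64 tools add.
  credentials=$(printf '%s:%s' "$1" "$2" | base64 | tr -d '\n')
  printf 'Authorization: Basic %s\n' "$credentials"
}

basic_auth_header 'user' 'pass'
```

A header built this way can be useful when checking a TES endpoint's credentials by hand, independently of Nextflow.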

:::{tip}
You can use a local [Funnel](https://ohsu-comp-bio.github.io/funnel/) server using the following launch command line:
You can deploy a local [Funnel](https://ohsu-comp-bio.github.io/funnel/) server using the following command:

```bash
./funnel server --Server.HTTPPort 8000 --LocalStorage.AllowedDirs $HOME run
```

(tested with version 0.8.0 on macOS)
:::

:::{warning}
Make sure the TES backend can access the Nextflow work directory when data is exchanged using a local or shared file system.
:::{note}
While the TES API is designed to abstract workflow managers from direct storage access, Nextflow still needs to access the shared work directory used by your TES endpoint. For example, if your TES endpoint is located in Azure and uses Azure Blob storage to store the work directory, you still need to provide the necessary Azure credentials for Nextflow to access the Blob storage.
:::

### Known Limitations

- Automatic deployment of workflow scripts in the `bin` folder is not supported.

:::{versionchanged} 23.07.0-edge
Automatic upload of the `bin` directory is now supported.
:::

- Process output directories are not supported. For details see [#76](https://github.com/ga4gh/task-execution-schemas/issues/76).

- Glob patterns in process output declarations are not supported. For details see [#77](https://github.com/ga4gh/task-execution-schemas/issues/77).

(google-batch-executor)=

## Google Cloud Batch
22 changes: 22 additions & 0 deletions plugins/nf-ga4gh/README.md
@@ -0,0 +1,22 @@
# GA4GH plugin for Nextflow

This plugin implements the support for GA4GH APIs for Nextflow. Currently only supports the [Task Execution Service (TES) API](https://github.com/ga4gh/task-execution-schemas).

## SDK Generation

[Swagger Codegen](https://github.com/swagger-api/swagger-codegen) was used to generate the Java SDK for TES based on the [TES OpenAPI specification](https://github.com/ga4gh/task-execution-schemas/blob/develop/openapi/task_execution_service.openapi.yaml).

The easiest way to generate the Java SDK is with the [Docker image](https://github.com/swagger-api/swagger-codegen#swagger-codegen-cli-docker-image):

```bash
# download the TES OpenAPI spec
wget https://github.com/ga4gh/task-execution-schemas/raw/v1.1/openapi/task_execution_service.openapi.yaml

# convert the spec to JSON
python3 -c 'import sys, yaml, json; y=yaml.safe_load(sys.stdin.read()); print(json.dumps(y, indent=2, default=str))' \
< task_execution_service.openapi.yaml \
> task_execution_service.openapi.json

# generate the Java SDK
docker run -v ${PWD}:/local swaggerapi/swagger-codegen-cli-v3 generate -i /local/task_execution_service.openapi.json -l java -o /local
```
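
Before running the generator, it may be worth confirming that the converted file is valid JSON and declares the expected spec version. The following is a minimal sketch, assuming `python3` is available as in the conversion step above; `spec_version` is a hypothetical helper and not part of the plugin:

```bash
# Hypothetical helper: print the "info.version" field of an OpenAPI JSON file.
# Exits with a non-zero status if the file is missing or is not valid JSON.
spec_version() {
  python3 -c 'import json, sys; print(json.load(open(sys.argv[1]))["info"]["version"])' "$1"
}

# Example usage, run from the directory that holds the converted spec:
#   spec_version task_execution_service.openapi.json
```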
5 changes: 3 additions & 2 deletions plugins/nf-ga4gh/build.gradle
@@ -38,11 +38,12 @@ dependencies {
compileOnly 'org.pf4j:pf4j:3.10.0'

api 'javax.annotation:javax.annotation-api:1.3.2'
api 'io.swagger:swagger-annotations:1.5.15'
api 'io.swagger.core.v3:swagger-annotations:2.0.0'
api 'com.squareup.okhttp:okhttp:2.7.5'
api 'com.squareup.okhttp:logging-interceptor:2.7.5'
api 'com.google.code.gson:gson:2.10.1'
api 'joda-time:joda-time:2.9.9'
api 'io.gsonfire:gson-fire:1.8.3'
api 'org.threeten:threetenbp:1.3.5'

testImplementation(testFixtures(project(":nextflow")))
testImplementation "org.apache.groovy:groovy:4.0.21"
@@ -15,20 +15,20 @@
*/

/*
* task_execution.proto
* No description provided (generated by Swagger Codegen https://github.com/swagger-api/swagger-codegen)
* Task Execution Service
*
* OpenAPI spec version: version not set
* OpenAPI spec version: 1.1.0
*
*
* NOTE: This class is auto generated by the swagger code generator program.
* https://github.com/swagger-api/swagger-codegen.git
* Do not edit the class manually.
*/


package nextflow.ga4gh.tes.client;

import java.io.IOException;

import java.util.Map;
import java.util.List;
