-
A package registry would be a great thing for WDL. We could host the principal version at LUMC. If the packages are packaged as Ping @mlin, since he just added a packager to miniwdl.
-
@rhpvorderman @DavyCats I think this is a really interesting idea. I wonder if there is an opportunity to build on existing tools like https://dockstore.org, or embed support directly for
-
Some form of versioned WDL package management would be very welcome; the suggestion above looks nice. If possible I would like to see a system that could prebuild these imports into the imports.zip and still be satisfied on the end server. This would allow us to prebuild the library structure ahead of time without exposing the cluster to external networks/systems.
As an aside, a pattern we have been using is for each high-level workflow to provide a lightweight entrypoint.wdl and a zip of the entire workflow and its dependencies. These bundles are exposed to the user/system as predefined workflows. The zip can contain multiple packages managed and versioned in different git repositories. Version management and zip building are handled via npm (using a private npm registry and a package.json) and a few small helper scripts. This way all the tasks and workflows are fully reusable and can be built into different, versioned workflows upfront.
Advantages to this approach:
Drawbacks to this approach:
The key thing for us was to prebuild the imports.zip and not rely on the cluster's ability to access remote resources. It is very likely that there was a better way to manage this. @DavyCats's suggestion above is more elegant and may alleviate many of our issues, especially if we can prebuild.
-
I added some comments in the initial proposal based on @rhpvorderman's proposal of a WDL package specification (#499) and @microbioticajon's comment above.
-
This is something which has been brought up before (#226), but there seems to have been very little discussion on this since then, although the recent discussion about an extended library (#488) might also be related.
I think it might be useful to have some system like PyPI or Docker Hub in which versioned (collections of) WDL files can be stored and from which these specific versions can be imported. Something of this nature could currently be considered supported through the use of HTTP URIs. However, having a standardized form for indicating versions could make it easier to write and read WDL files: all versioned imports would follow the same conventions, so you wouldn't have to decode the URLs of various different sources. It would also allow for a standard protocol for localizing these WDL files for offline usage or archival purposes.
Packages
The packages themselves could essentially just be zip files (like those produced by http://github.com/biowdl/wdl-packager). They would contain one or more WDL files, potentially including subdirectories and supplementary files (e.g. a license or readme).
If a WDL inside the package imports a WDL file using the "file://" protocol then the path will point to some location within the same zip file.
Syntax
A syntax like the one used for Docker images might be a versatile and straightforward approach to take for handling import statements. Amongst the import statements one might be able to say something like the following to import a WDL file from some package:
Registry
The registry would have to provide at least the following two endpoints:
GET /v1/r/<organization>/<package>/tags/<tag>
: returns the digest the tag points to
GET /v1/r/<organization>/<package>/digests/<digest>
: returns the zip file containing the package
An implementation could add further endpoints to retrieve additional details, such as lists of available tags and digests. But these two endpoints must always be available in order for the execution engines to be able to download packages from any registry.
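To make the two-step lookup concrete, here is a minimal Python sketch of how an execution engine might resolve a tag to a digest and then download the package. The proposal does not say what the tag endpoint's response body looks like, so this assumes it is the digest as plain text. The function names and the `fetch` parameter are hypothetical, not part of any existing tool.

```python
from urllib.parse import quote
from urllib.request import urlopen


def _segment(value):
    # Percent-encode a single path segment; ":" is kept because digests
    # are often written like "sha256:..." (an assumption, not in the proposal).
    return quote(value, safe=":")


def tag_url(base, organization, package, tag):
    """URL of the endpoint that returns the digest a tag points to."""
    return "{}/v1/r/{}/{}/tags/{}".format(
        base.rstrip("/"), _segment(organization), _segment(package), _segment(tag)
    )


def digest_url(base, organization, package, digest):
    """URL of the endpoint that returns the package zip for a digest."""
    return "{}/v1/r/{}/{}/digests/{}".format(
        base.rstrip("/"), _segment(organization), _segment(package), _segment(digest)
    )


def _http_get(url):
    # Plain HTTP GET that returns the response body as bytes.
    with urlopen(url) as response:
        return response.read()


def fetch_package(base, organization, package, tag, fetch=_http_get):
    """Resolve a tag to its digest, then download that digest's zip.

    Assumes the tag endpoint answers with the digest as plain text.
    Returns a (digest, zip_bytes) tuple.
    """
    digest = fetch(tag_url(base, organization, package, tag)).decode("utf-8").strip()
    zip_bytes = fetch(digest_url(base, organization, package, digest))
    return digest, zip_bytes
```

Passing a custom `fetch` callable makes it possible to exercise the lookup against canned responses, which might also suit clusters that cannot reach external networks.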
Localization or "installing" packages
Execution engines could localize the packages to a directory defined by some environment variable (or some default location like ~/.wdl_packages). When a workflow imports something from a package, the execution engine could then download that package's zip, unpack it in an appropriate subdirectory and then import it from there. e.g.
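As a rough illustration of the localization step, the following Python sketch unpacks a downloaded package zip into a per-digest subdirectory. The environment variable name `WDL_PACKAGE_DIR` and the `<organization>/<package>/<digest>` layout are assumptions made for this example; the proposal only says "some environment variable" and "an appropriate subdirectory".

```python
import io
import os
import zipfile
from pathlib import Path

# Hypothetical variable name; the proposal does not name one.
ENV_VAR = "WDL_PACKAGE_DIR"


def package_root(environ=None):
    """Return the package directory: the env var if set, else ~/.wdl_packages."""
    environ = os.environ if environ is None else environ
    configured = environ.get(ENV_VAR)
    return Path(configured) if configured else Path.home() / ".wdl_packages"


def install_package(zip_bytes, organization, package, digest, root):
    """Unpack a package zip into root/organization/package/digest.

    The directory layout is an assumption. An already unpacked digest is
    left untouched, so repeated imports do not unpack the zip again.
    """
    target = Path(root) / organization / package / digest
    if not target.exists():
        target.mkdir(parents=True)
        with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
            archive.extractall(target)
    return target
```

Keying the directory on the digest rather than the tag would mean that a tag which is later moved to a new digest cannot silently change the contents of an already localized package.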