diff --git a/config.yaml b/config.yaml index 6cf3ca0c..294f8579 100644 --- a/config.yaml +++ b/config.yaml @@ -42,7 +42,7 @@ episodes: - 02-shell_to_cwl.Rmd - 03-dependency_graphs.Rmd - 04-reusing_tools.Rmd - - debug.md + - addendum-01-debug.Rmd - more_info.md # Information for Learners diff --git a/episodes/02-shell_to_cwl.Rmd b/episodes/02-shell_to_cwl.Rmd index 4fe761d3..b935f5b0 100644 --- a/episodes/02-shell_to_cwl.Rmd +++ b/episodes/02-shell_to_cwl.Rmd @@ -151,6 +151,8 @@ However, CWL syntax requires only that each field is properly defined, it does n :::::::::::::::::::::::::::::::::::::: challenge +### Changing the output string 🌶 + What do you need to change to print a different text on the command line? ::::::::::::::::::::::::::::::::::::::::::::: @@ -158,6 +160,7 @@ What do you need to change to print a different text on the command line? :::::::::::::::::::::::::::::::::::::: solution + To change the text on the command line, you only have to change the text in the `hello_world.yml` file. For example @@ -170,6 +173,8 @@ message_text: Good job! :::::::::::::::::::::::::::::::::::::: challenge +### Updating the arguments 🌶🌶 + How can one add the `-e` argument to the echo command to interpret backslashes? ::::::::::::::::::::::::::::::::::::::::::::: @@ -188,6 +193,8 @@ arguments: :::::::::::::::::::::::::::: challenge +### Redirecting cwltool stdout and stderr 🌶🌶 + Rerun the `echo.cwl` script but point stdout and stderr to different files. What is the difference between the stdout and stderr from the `echo.cwl` script? @@ -196,7 +203,7 @@ What is the difference between the stdout and stderr from the `echo.cwl` script? 
:::::::::::::::::::::::::::: hint -### Hint: Redirecting stdout and stderr +### Hint: Redirecting CLI stdout and stderr Use the redirectors `1>` and `2>` to redirect stdout and stderr to different files respectively @@ -212,7 +219,7 @@ $ cwltool echo.cwl hello_world.yml 1>echo_stdout.txt 2>echo_stderr.txt :::::::::::::::::::::::::::: challenge -### Specifying the outputs of the tool as an actual output +### Specifying the stdout of the tool as a CWL output 🌶🌶🌶 Using [this tutorial][capturing_stdout_tutorial] as a guide @@ -228,7 +235,7 @@ How does this change the output of the cwltool command? :::::::::::::::::::::::::::: hint -### OutputBinding +### Hint - stdout output binding Copy the 'outputBinding' from the tutorial 'verbatim'. @@ -458,6 +465,8 @@ delete the `cache` directory anytime, if you need to reclaim the disk space. :::::::::::::::::::::::::::::::::::::: challenge +### Directly embed a Commandlinetool into a file 🌶🌶 + How could one embed the fastqc tool description directly into the workflow? :::::::::::::::::::::::::::::::::::::: @@ -487,7 +496,7 @@ steps: :::::::::::::::::::::::::::::::::::::: -:::::::::::::::::::::::::::::::::::::: discussion +:::::::::::::::::::::::::::::::::::::: callout ### Embedding Tool Descriptions diff --git a/episodes/03-dependency_graphs.Rmd b/episodes/03-dependency_graphs.Rmd index 15e33468..38107599 100644 --- a/episodes/03-dependency_graphs.Rmd +++ b/episodes/03-dependency_graphs.Rmd @@ -5,19 +5,23 @@ exercises: 0 --- ::::::::::::::::::::::::::::: questions + - How can we expand to a multi-step workflow? - Iterative workflow development - Workflows as dependency graphs - How to use sketches for workflow design? 
+ ::::::::::::::::::::::::::::: ::::::::::::::::::::::::::::: objectives + - Explain that a workflow is a dependency graph - Use cwlviewer online - Generate Graphviz diagram using cwltool - Exercise with the printout of a simple workflow; draw arrows on code; hand draw a graph on another sheet of paper - Recognise that workflow development can be iterative i.e. that it doesn't have to happen all at once - Understand the flow of data between tools + ::::::::::::::::::::::::::::: @@ -82,7 +86,6 @@ the `mapping_reads` step definition with a `ResourceRequirement` to allocate a m ::::::::::::::::::::::: - The newly added `mapping_reads` step also need an input not provided by any of our other steps, therefore an additional workflow-level input is added: a directory that contains the reference genome necessary for the mapping. @@ -102,7 +105,7 @@ This `ref_fruitfly_genome` is added in the `inputs` field of the workflow and in ::::::::::::::::::::::::::::: challenge -### Exercise: Draw the workflow +### Challenge: Draw the workflow 🌶 Draw the connecting arrows in the following graph of our workflow. Also, provide the outputs/inputs of the different steps. You can use for example Paint or print out the graph. diff --git a/episodes/04-reusing_tools.Rmd b/episodes/04-reusing_tools.Rmd index 5fa5bdb4..462dd4b5 100644 --- a/episodes/04-reusing_tools.Rmd +++ b/episodes/04-reusing_tools.Rmd @@ -29,7 +29,7 @@ The last step of our workflow is counting the RNA-seq reads for which we will us :::::::::::::::::::::::::::::: challenge -### Find the featureCounts tool in the bio-cwl-tools library +### Find the featureCounts tool in the bio-cwl-tools library 🌶 Find the `featureCounts` tool in the [bio-cwl-tools library][bio-cwl-tools]. Have a look at the CWL document. Which inputs does this tool need? And what are the outputs of this tool? @@ -39,6 +39,7 @@ Have a look at the CWL document. Which inputs does this tool need? 
And what are :::::::::::::::::::::::::::::: solution The `featureCounts` CWL document can be found in the [GitHub repo][featurecounts-cwl]. + It has three inputs: - annotations - A GTF or GFF file containing the gene annotations @@ -62,13 +63,13 @@ so the tool should be located at `bio-cwl-tools/subread/featureCounts.cwl`. ::::::::::::::::::::::::::::: challenge -### Add the featureCounts tool to the workflow +### Add the featureCounts tool to the workflow 🌶🌶 Please copy the `rna_seq_workflow_2.cwl` file to create `rna_seq_workflow_3.cwl`. Add the `featureCounts` tool to the workflow as a workflow step. -**Bonus**: +**Bonus**: 🌶🌶🌶 Similar to the `STAR` tool, this tool also needs more RAM than the default. diff --git a/episodes/addendum-01-debug.Rmd b/episodes/addendum-01-debug.Rmd new file mode 100644 index 00000000..c863b373 --- /dev/null +++ b/episodes/addendum-01-debug.Rmd @@ -0,0 +1,286 @@ +--- +title: "Debugging Workflows" +teaching: 0 +exercises: 0 +--- + +:::::::::::::::::::::::::::: questions + +- "How can I check my CWL file for errors?" + +- "How can I get more information to help with solving an error?" + +- "What are some common error messages when using CWL?" + +::::::::::::::::::::::::::: + +::::::::::::::::::::::::::: objectives +- Check a CWL file for errors + +- Output debugging information + +- Interpret and fix commonly encountered error messages + +::::::::::::::::::::::::::: + +::::::::::::::::::::::::::: keypoints + +- Run the workflow with the `--validate` option to check for errors + +- The `--debug` option will output more information + +- 'Wiring' errors won't necessarily yield an error message + +::::::::::::::::::::::::::: + + +### A Firm Reality Check + +When working on a CWL workflow, you will probably encounter errors. There are many different ways for errors to occur. + +It is always very important to check the error message in the terminal, because it will give you information on the error. 
+This error message will give you the type of error as well as the line of code that contains the error. + +We will showcase some of the common errors in this episode. + +As a first step to check if your CWL script contains any errors, you can run the workflow with the `--validate` flag. + +```bash +cwltool --validate /path/to/cwl_script.cwl +``` + +It is possible for a valid script to still generate an error. + + +If you encounter an error, the best practice is to re-run the workflow with the `--debug` flag. +This will provide you with extensive information on the error you encounter. + +```bash +cwltool --debug /path/to/cwl_script.cwl /path/to/cwl_input.yaml +``` + + +## Syntax Errors + +When writing a piece of code, it is very easy to make a mistake in your YAML syntax. + +Some very common YAML errors are: + +### Tabs + +Using tabs instead of spaces. In YAML files indentations are made using spaces, not tabs. + Please download and run [this example][tab-error] which includes a tab character. + +```bash +$ cwltool tab-error.cwl workflow_input.yml +``` + +::::::::::::::::::::::::::: spoiler + +``` +ERROR Tool definition failed validation: +while scanning for the next token +file:///tab-error.cwl:5:1: found character '\t' that cannot start any token +``` + +::::::::::::::::::::::::::: + +## Field Name Typos + +Typos in field names. It is very easy to forget for example the capital letters in field names. + +Errors with typos in field names will show `invalid field`. 
+ +__rna_seq_workflow_fieldname_fail.cwl__ + +```yaml +`r xfun::file_string('files/debug/rna_seq_workflow_fieldname_fail.cwl')` +``` + + +__Validate command__ + +```bash +cwltool --validate rna_seq_workflow_fieldname_fail.cwl +``` + +:::::::::::::::::::::::::::::: spoiler + +### CWLTOOL VALIDATE ERROR MESSAGE + +``` +ERROR Tool definition failed validation: +rna_seq_workflow_fieldname_fail.cwl:1:1: Object `rna_seq_workflow_fieldname_fail.cwl` is not valid + because + tried `Workflow` but +rna_seq_workflow_fieldname_fail.cwl:35:1: the `outputs` field is not valid because +rna_seq_workflow_fieldname_fail.cwl:36:3: item is invalid because +rna_seq_workflow_fieldname_fail.cwl:38:5: invalid field `outputsource`, expected one of: + 'label', 'secondaryFiles', 'streamable', 'doc', 'id', + 'format', 'outputSource', 'linkMerge', 'pickValue', 'type' +``` + +::::::::::::::::::::::::::::::: + + +::::::::::::::::::::::::::::::: callout + +Using an IDE can help warn you of incorrect field names before you validate with the command-line tool. + +::::::::::::::::::::::::::::::: + + +## Variable Name Typos + +Typos in variable names. + +Similar to typos in field names, it is easy to make a mistake when referencing a variable. 
+These errors will show `references unknown identifier`. + + +__rna_seq_workflow_varname_fail.cwl__ + +```yaml +`r xfun::file_string('files/debug/rna_seq_workflow_varname_fail.cwl')` +``` + +__Validate command__ + +```bash +$ cwltool --validate rna_seq_workflow_varname_fail.cwl +``` + +::::::::::::::::::::::: spoiler + +### CWLTOOL VALIDATE ERROR MESSAGE + +``` +ERROR Tool definition failed validation: +rna_seq_workflow_varname_fail.cwl:8:1: checking field `steps` +rna_seq_workflow_varname_fail.cwl:29:3: checking object + `rna_seq_workflow_varname_fail.cwl#index_alignment` +rna_seq_workflow_varname_fail.cwl:31:5: checking field `in` +rna_seq_workflow_varname_fail.cwl:32:7: checking object + `rna_seq_workflow_varname_fail.cwl#index_alignment/bam_sorted` + Field `source` references unknown identifier + `mapping_reads/alignments`, tried + file:///.../rna_seq_workflow_varname_fail.cwl#mapping_reads/alignments +``` + +:::::::::::::::::::::::::: + +::::::::::::::::::::::::::::::: callout + +Using an IDE, a simple Ctrl+F on a variable can help you see where that variable is +present throughout a CWL document. Only one occurrence of a variable might mean it has been spelt differently elsewhere. + +::::::::::::::::::::::::::::::: + + +## Wiring error + +Wiring errors often occur when you forget to add an output from a workflow's step to the `outputs` section. + +This doesn't cause an error message, but the forgotten output won't appear in your output directory. +To get the desired output you have to add it to the `outputs` section and run the workflow again. + +Best practice is to check your `outputs` section before running your script to make sure all the outputs you want are there. + +::::::::::::::::::::::::::::::: callout + +All file / directory outputs of a workflow or tool will be placed into a single directory. + +Ensure all expected files and directories are there. 
For string / boolean output, splitting stderr and stdout of the cwltool command line into separate files allows the user to easily look through the stdout of a workflow without the noise of stderr. Redirects are briefly discussed in the second episode of this tutorial. + +::::::::::::::::::::::::::::::: + + +## Type mismatch + +Type errors take place when there is a mismatch in type between variables. +When you declare a variable in the `inputs` section, the type of this variable has to match the type in the YAML inputs file +and the type used in one of the workflow's steps. +The error message that is shown when this error occurs will tell you on which line the mismatch happens. + +__rna_seq_workflow_type_fail.cwl__ + +```yaml +`r xfun::file_string('files/debug/rna_seq_workflow_type_fail.cwl')` +``` + +__Validation Command__ + +```bash +$ cwltool rna_seq_workflow_type_fail.cwl workflow_input_debug.yml +``` + + +::::::::::::::: spoiler + +### Incompatible type cwltool error message + +``` +rna_seq_workflow_type_fail.cwl:5:3: Source 'rna_reads_fruitfly' of type "int" is incompatible +rna_seq_workflow_type_fail.cwl:12:7: with sink 'reads_file' of type "File" +rna_seq_workflow_type_fail.cwl:5:3: Source 'rna_reads_fruitfly' of type "int" is incompatible +rna_seq_workflow_type_fail.cwl:23:7: with sink 'ForwardReads' of type ["File", {"type": + "array", "items": "File"}] + +``` + +::::::::::::::::::::: + + + +## Format error + +Some files need a specific format that needs to be specified in the YAML inputs file, for example the fastq file in the RNA-seq analysis. + +When you don't specify a format, an error will occur. You can for example use the [EDAM](https://www.ebi.ac.uk/ols/ontologies/edam) ontology. + +:::::::::::::::::::::::::: callout + +The format attribute for a File entry is only required if the format attribute is specified +on the workflow input. 
+ +You may use `cwltool --make-template /path/to/cwl_workflow.cwl` to set the formats for +each input for you. + +:::::::::::::::::::::::::: + +__rna_seq_workflow_with_format.cwl__ +```yaml +`r xfun::file_string('files/debug/rna_seq_workflow_with_format.cwl')` +``` + +__workflow_input_undefined_format.yaml__ +```yaml +`r xfun::file_string('files/debug/workflow_input_undefined_format.yaml')` +``` + +```bash +$ cwltool rna_seq_workflow_with_format.cwl workflow_input_undefined_format.yaml +``` + +::::::::::::::::::::::::: spoiler + +### Incompatible format error message + +``` +ERROR Exception on step 'mapping_reads' +ERROR [step mapping_reads] Cannot make job: Expected value of 'ForwardReads' to have format http://edamontology.org/format_1930 but + File has no 'format' defined: { + "class": "File", + "location": "file:///.../rnaseq/GSM461177_1_subsampled.fastqsanger", + "size": 142867948, + "basename": "GSM461177_1_subsampled.fastqsanger", + "nameroot": "GSM461177_1_subsampled", + "nameext": ".fastqsanger" +} +``` + +::::::::::::::::::::::::: + + + +[tab-error]: files/debug/tab-error.cwl diff --git a/episodes/debug.md b/episodes/debug.md deleted file mode 100644 index 52808f19..00000000 --- a/episodes/debug.md +++ /dev/null @@ -1,366 +0,0 @@ ---- -title: "Debugging Workflows" -teaching: 0 -exercises: 0 -questions: -- "How can I check my CWL file for errors?" -- "How can I get more information to help with solving an error?" -- "What are some common error messages when using CWL?" -objectives: -- "Check a CWL file for errors" -- "Output debugging information" -- "Interpret and fix commonly encountered error messages" -keypoints: -- "Run the workflow with the `--validate` option to check for errors" -- "The `--debug` option will output more information" -- "'Wiring' errors won't necessarily yield an error message" ---- - -When working on a CWL workflow, you will probably encounter errors. There are many different errors possible. 
-It is always very important to check the error message in the terminal, because it will give you information on the error. -This error message will give you the type of error as well as the line of code that contains the error. -Some of these errors will be explained in this episode. - -As a first step to check if your CWL script contains any errors, you can run the workflow with the `--validate` flag. -~~~ -cwltool --validate CWL_SCRIPT.cwl -~~~ -{: .language-bash} - -It is possible that the script is validated, however, it still gets an error. -If you encounter an error, the best practice is to run the workflow with the `--debug` flag. -This will provide you with extensive information on the error you encounter. -~~~ -cwltool --debug CWL_SCRIPT.cwl -~~~ -{: .language-bash} - -### YAML errors -First of all, errors in the YAML syntax. When writing a piece of code, it is very easy to make a mistake. - -Some very common YAML errors are: - -#### Tabs -Using tabs instead of spaces. In YAML files indentations are made using spaces, not tabs. - Please download and run [this example][tab-error] which includes a tab character. - -~~~ -$ cwltool tab-error.cwl workflow_input.yml -~~~ -{: .language-bash} - -~~~ -ERROR Tool definition failed validation: -while scanning for the next token -file:///tab-error.cwl:5:1: found character '\t' that cannot start any token -~~~ -{: .error} - -#### Field Name Typos - -Typos in field names. It is very easy to forget for example the capital letters in field names. - Errors with typos in field names will show `invalid field`. 
- -__rna_seq_workflow_fieldname_fail.cwl__ -~~~ -cwlVersion: v1.2 -class: Workflow - -inputs: - rna_reads_fruitfly: File - ref_fruitfly_genome: Directory - -steps: - quality_control: - run: bio-cwl-tools/fastqc/fastqc_2.cwl - in: - reads_file: rna_reads_fruitfly - out: [html_file] - - mapping_reads: - requirements: - ResourceRequirement: - ramMin: 5120 - run: bio-cwl-tools/STAR/STAR-Align.cwl - in: - RunThreadN: {default: 4} - GenomeDir: ref_fruitfly_genome - ForwardReads: rna_reads_fruitfly - OutSAMtype: {default: BAM} - SortedByCoordinate: {default: true} - OutSAMunmapped: {default: Within} - out: [alignment] - - index_alignment: - run: bio-cwl-tools/samtools/samtools_index.cwl - in: - bam_sorted: mapping_reads/alignment - out: [bam_sorted_indexed] - -outputs: - qc_html: - type: File - outputsource: quality_control/html_file - bam_sorted_indexed: - type: File - outputSource: index_alignment/bam_sorted_indexed -~~~ -{: .language-yaml} - -__workflow_input_debug.yml__ -~~~ -rna_reads_fruitfly: - class: File - location: rnaseq/GSM461177_1_subsampled.fastqsanger - format: http://edamontology.org/format_1930 # FASTQ -ref_fruitfly_genome: - class: Directory - location: rnaseq/dm6-STAR-index -~~~ -{: .language-yaml} - - -~~~ -$ cwltool rna_seq_workflow_fieldname_fail.cwl workflow_input_debug.yml -~~~ -{: .language-bash} - -~~~ -ERROR Tool definition failed validation: -rna_seq_workflow_fieldname_fail.cwl:1:1: Object `rna_seq_workflow_fieldname_fail.cwl` is not valid - because - tried `Workflow` but -rna_seq_workflow_fieldname_fail.cwl:35:1: the `outputs` field is not valid because -rna_seq_workflow_fieldname_fail.cwl:36:3: item is invalid because -rna_seq_workflow_fieldname_fail.cwl:38:5: invalid field `outputsource`, expected one of: - 'label', 'secondaryFiles', 'streamable', 'doc', 'id', - 'format', 'outputSource', 'linkMerge', 'pickValue', 'type' -~~~ -{: .error} - -#### Variable Name Typos - Typos in variable names. 
Similar to typos in field names, it is easy to make a mistake in referencing to a variable. - These errors will show `Field references unknown identifier.` - - -__rna_seq_workflow_varname_fail.cwl__ -~~~ -cwlVersion: v1.2 -class: Workflow - -inputs: - rna_reads_fruitfly: File - ref_fruitfly_genome: Directory - -steps: - quality_control: - run: bio-cwl-tools/fastqc/fastqc_2.cwl - in: - reads_file: rna_reads_fruitfly - out: [html_file] - - mapping_reads: - requirements: - ResourceRequirement: - ramMin: 5120 - run: bio-cwl-tools/STAR/STAR-Align.cwl - in: - RunThreadN: {default: 4} - GenomeDir: ref_fruitfly_genome - ForwardReads: rna_reads_fruitfly - OutSAMtype: {default: BAM} - SortedByCoordinate: {default: true} - OutSAMunmapped: {default: Within} - out: [alignment] - - index_alignment: - run: bio-cwl-tools/samtools/samtools_index.cwl - in: - bam_sorted: mapping_reads/alignments - out: [bam_sorted_indexed] - -outputs: - qc_html: - type: File - outputSource: quality_control/html_file - bam_sorted_indexed: - type: File - outputSource: index_alignment/bam_sorted_indexed -~~~ -{: .language-yaml} - -~~~ -$ cwltool rna_seq_workflow_varname_fail.cwl workflow_input_debug.yml -~~~ -{: .language-bash} - -~~~ -ERROR Tool definition failed validation: -rna_seq_workflow_varname_fail.cwl:8:1: checking field `steps` -rna_seq_workflow_varname_fail.cwl:29:3: checking object - `rna_seq_workflow_varname_fail.cwl#index_alignment` -rna_seq_workflow_varname_fail.cwl:31:5: checking field `in` -rna_seq_workflow_varname_fail.cwl:32:7: checking object - `rna_seq_workflow_varname_fail.cwl#index_alignment/bam_sorted` - Field `source` references unknown identifier - `mapping_reads/alignments`, tried - file:///.../rna_seq_workflow_varname_fail.cwl#mapping_reads/alignments -~~~ -{: .error} - -### Wiring error -Wiring errors often occur when you forget to add an output from a workflow's step to the `outputs` section. 
-This doesn't cause an error message, but there won't be any output in your directory. -To get the desired output you have to run the workflow again. -Best practice is to check your `outputs` section before running your script to make sure all the outputs you want are there. - -### Type mismatch -Type errors take place when there is a mismatch in type between variables. -When you declare a variable in the `inputs` section, the type of this variable has to match the type in the YAML inputs file -and the type used in one of the workflows steps. -The error message that is shown when this error occurs will tell you on which line the mismatch happens. - -__rna_seq_workflow_type_fail.cwl__ -~~~ -cwlVersion: v1.2 -class: Workflow - -inputs: - rna_reads_fruitfly: int - ref_fruitfly_genome: Directory - -steps: - quality_control: - run: bio-cwl-tools/fastqc/fastqc_2.cwl - in: - reads_file: rna_reads_fruitfly - out: [html_file] - - mapping_reads: - requirements: - ResourceRequirement: - ramMin: 5120 - run: bio-cwl-tools/STAR/STAR-Align.cwl - in: - RunThreadN: {default: 4} - GenomeDir: ref_fruitfly_genome - ForwardReads: rna_reads_fruitfly - OutSAMtype: {default: BAM} - SortedByCoordinate: {default: true} - OutSAMunmapped: {default: Within} - out: [alignment] - - index_alignment: - run: bio-cwl-tools/samtools/samtools_index.cwl - in: - bam_sorted: mapping_reads/alignment - out: [bam_sorted_indexed] - -outputs: - qc_html: - type: File - outputSource: quality_control/html_file - bam_sorted_indexed: - type: File - outputSource: index_alignment/bam_sorted_indexed -~~~ -{: .language-yaml} - -~~~ -$ cwltool rna_seq_workflow_type_fail.cwl workflow_input_debug.yml -~~~ -{: .language-bash} - -~~~ -ERROR Tool definition failed validation: - -rna_seq_workflow_type_fail.cwl:5:3: Source 'rna_reads_fruitfly' of type "int" is incompatible -rna_seq_workflow_type_fail.cwl:12:7: with sink 'reads_file' of type "File" -rna_seq_workflow_type_fail.cwl:5:3: Source 'rna_reads_fruitfly' of type "int" 
is incompatible -rna_seq_workflow_type_fail.cwl:23:7: with sink 'ForwardReads' of type ["File", {"type": - "array", "items": "File"}] -~~~ -{: .error} - -### Format error -Some files need a specific format that needs to be specified in the YAML inputs file, for example the fastq file in the RNA-seq analysis. -When you don't specify a format, an error will occur. You can for example use the [EDAM](https://www.ebi.ac.uk/ols/ontologies/edam) ontology. - -__rna_seq_workflow_debug.cwl__ -~~~ -cwlVersion: v1.2 -class: Workflow - -inputs: - rna_reads_fruitfly: File - ref_fruitfly_genome: Directory - -steps: - quality_control: - run: bio-cwl-tools/fastqc/fastqc_2.cwl - in: - reads_file: rna_reads_fruitfly - out: [html_file] - - mapping_reads: - requirements: - ResourceRequirement: - ramMin: 5120 - run: bio-cwl-tools/STAR/STAR-Align.cwl - in: - RunThreadN: {default: 4} - GenomeDir: ref_fruitfly_genome - ForwardReads: rna_reads_fruitfly - OutSAMtype: {default: BAM} - SortedByCoordinate: {default: true} - OutSAMunmapped: {default: Within} - out: [alignment] - - index_alignment: - run: bio-cwl-tools/samtools/samtools_index.cwl - in: - bam_sorted: mapping_reads/alignment - out: [bam_sorted_indexed] - -outputs: - qc_html: - type: File - outputSource: quality_control/html_file - bam_sorted_indexed: - type: File - outputSource: index_alignment/bam_sorted_indexed -~~~ -{: .language-yaml} - - -__workflow_input_undefined.yml__ -~~~ -rna_reads_fruitfly: - class: File - location: rnaseq/GSM461177_1_subsampled.fastqsanger -ref_fruitfly_genome: - class: Directory - location: rnaseq/dm6-STAR-index -~~~ -{: .language-yaml} - -~~~ -$ cwltool rna_seq_workflow_debug.cwl workflow_input_undefined.yml -~~~ -{: .language-bash} - -~~~ -ERROR Exception on step 'mapping_reads' -ERROR [step mapping_reads] Cannot make job: Expected value of 'ForwardReads' to have format http://edamontology.org/format_1930 but - File has no 'format' defined: { - "class": "File", - "location": 
"file:///.../rnaseq/GSM461177_1_subsampled.fastqsanger", - "size": 142867948, - "basename": "GSM461177_1_subsampled.fastqsanger", - "nameroot": "GSM461177_1_subsampled", - "nameext": ".fastqsanger" -} -~~~ -{: .error} -{% include links.md %} -[tab-error]: {{ page.root }}/code/tab-error.cwl diff --git a/episodes/files/debug/rna_seq_workflow_fieldname_fail.cwl b/episodes/files/debug/rna_seq_workflow_fieldname_fail.cwl new file mode 100644 index 00000000..34c42519 --- /dev/null +++ b/episodes/files/debug/rna_seq_workflow_fieldname_fail.cwl @@ -0,0 +1,41 @@ +cwlVersion: v1.2 +class: Workflow + +inputs: + rna_reads_fruitfly: File + ref_fruitfly_genome: Directory + +steps: + quality_control: + run: bio-cwl-tools/fastqc/fastqc_2.cwl + in: + reads_file: rna_reads_fruitfly + out: [html_file] + + mapping_reads: + requirements: + ResourceRequirement: + ramMin: 5120 + run: bio-cwl-tools/STAR/STAR-Align.cwl + in: + RunThreadN: {default: 4} + GenomeDir: ref_fruitfly_genome + ForwardReads: rna_reads_fruitfly + OutSAMtype: {default: BAM} + SortedByCoordinate: {default: true} + OutSAMunmapped: {default: Within} + out: [alignment] + + index_alignment: + run: bio-cwl-tools/samtools/samtools_index.cwl + in: + bam_sorted: mapping_reads/alignment + out: [bam_sorted_indexed] + +outputs: + qc_html: + type: File + outputsource: quality_control/html_file + bam_sorted_indexed: + type: File + outputSource: index_alignment/bam_sorted_indexed diff --git a/episodes/files/debug/rna_seq_workflow_type_fail.cwl b/episodes/files/debug/rna_seq_workflow_type_fail.cwl new file mode 100644 index 00000000..099e9b93 --- /dev/null +++ b/episodes/files/debug/rna_seq_workflow_type_fail.cwl @@ -0,0 +1,41 @@ +cwlVersion: v1.2 +class: Workflow + +inputs: + rna_reads_fruitfly: int + ref_fruitfly_genome: Directory + +steps: + quality_control: + run: bio-cwl-tools/fastqc/fastqc_2.cwl + in: + reads_file: rna_reads_fruitfly + out: [html_file] + + mapping_reads: + requirements: + ResourceRequirement: + ramMin: 5120 + 
run: bio-cwl-tools/STAR/STAR-Align.cwl + in: + RunThreadN: {default: 4} + GenomeDir: ref_fruitfly_genome + ForwardReads: rna_reads_fruitfly + OutSAMtype: {default: BAM} + SortedByCoordinate: {default: true} + OutSAMunmapped: {default: Within} + out: [alignment] + + index_alignment: + run: bio-cwl-tools/samtools/samtools_index.cwl + in: + bam_sorted: mapping_reads/alignment + out: [bam_sorted_indexed] + +outputs: + qc_html: + type: File + outputSource: quality_control/html_file + bam_sorted_indexed: + type: File + outputSource: index_alignment/bam_sorted_indexed diff --git a/episodes/files/debug/rna_seq_workflow_varname_fail.cwl b/episodes/files/debug/rna_seq_workflow_varname_fail.cwl new file mode 100644 index 00000000..c7766b4b --- /dev/null +++ b/episodes/files/debug/rna_seq_workflow_varname_fail.cwl @@ -0,0 +1,41 @@ +cwlVersion: v1.2 +class: Workflow + +inputs: + rna_reads_fruitfly: File + ref_fruitfly_genome: Directory + +steps: + quality_control: + run: bio-cwl-tools/fastqc/fastqc_2.cwl + in: + reads_file: rna_reads_fruitfly + out: [html_file] + + mapping_reads: + requirements: + ResourceRequirement: + ramMin: 5120 + run: bio-cwl-tools/STAR/STAR-Align.cwl + in: + RunThreadN: {default: 4} + GenomeDir: ref_fruitfly_genome + ForwardReads: rna_reads_fruitfly + OutSAMtype: {default: BAM} + SortedByCoordinate: {default: true} + OutSAMunmapped: {default: Within} + out: [alignment] + + index_alignment: + run: bio-cwl-tools/samtools/samtools_index.cwl + in: + bam_sorted: mapping_reads/alignments + out: [bam_sorted_indexed] + +outputs: + qc_html: + type: File + outputSource: quality_control/html_file + bam_sorted_indexed: + type: File + outputSource: index_alignment/bam_sorted_indexed diff --git a/episodes/files/debug/rna_seq_workflow_with_format.cwl b/episodes/files/debug/rna_seq_workflow_with_format.cwl new file mode 100644 index 00000000..d96360df --- /dev/null +++ b/episodes/files/debug/rna_seq_workflow_with_format.cwl @@ -0,0 +1,41 @@ +cwlVersion: v1.2 +class: 
Workflow + +inputs: + rna_reads_fruitfly: File + ref_fruitfly_genome: Directory + +steps: + quality_control: + run: bio-cwl-tools/fastqc/fastqc_2.cwl + in: + reads_file: rna_reads_fruitfly + out: [html_file] + + mapping_reads: + requirements: + ResourceRequirement: + ramMin: 5120 + run: bio-cwl-tools/STAR/STAR-Align.cwl + in: + RunThreadN: {default: 4} + GenomeDir: ref_fruitfly_genome + ForwardReads: rna_reads_fruitfly + OutSAMtype: {default: BAM} + SortedByCoordinate: {default: true} + OutSAMunmapped: {default: Within} + out: [alignment] + + index_alignment: + run: bio-cwl-tools/samtools/samtools_index.cwl + in: + bam_sorted: mapping_reads/alignment + out: [bam_sorted_indexed] + +outputs: + qc_html: + type: File + outputSource: quality_control/html_file + bam_sorted_indexed: + type: File + outputSource: index_alignment/bam_sorted_indexed diff --git a/old_carpentries_incubator/code/tab-error.cwl b/episodes/files/debug/tab-error.cwl similarity index 100% rename from old_carpentries_incubator/code/tab-error.cwl rename to episodes/files/debug/tab-error.cwl diff --git a/episodes/files/debug/workflow_input_undefined_format.yaml b/episodes/files/debug/workflow_input_undefined_format.yaml new file mode 100644 index 00000000..7bd583b0 --- /dev/null +++ b/episodes/files/debug/workflow_input_undefined_format.yaml @@ -0,0 +1,6 @@ +rna_reads_fruitfly: + class: File + location: rnaseq/GSM461177_1_subsampled.fastqsanger +ref_fruitfly_genome: + class: Directory + location: rnaseq/dm6-STAR-index diff --git a/episodes/more_info.md b/episodes/more_info.md index df25b2a4..d9ca067a 100644 --- a/episodes/more_info.md +++ b/episodes/more_info.md @@ -1,7 +1,8 @@ --- -title: "More information" +title: "More Information" +teaching: 0 +exercises: 0 --- - If you want to know more about CWL script and workflows, you can look at one of these websites: - [CWL User Guide](http://www.commonwl.org/user_guide/index.html) diff --git a/learners/files/linux_setup.Rmd b/learners/files/linux_setup.Rmd 
index 6d643eb0..507f62ef 100644 --- a/learners/files/linux_setup.Rmd +++ b/learners/files/linux_setup.Rmd @@ -19,15 +19,11 @@ Download and install [VSCode](https://code.visualstudio.com/) [Open Benten in the marketplace][benten_vs_code_marketplace] and click the `Install` button. -If you are given the option to enable the extension on 'WSL: Ubuntu' please do so. - **Install Redhat Yaml VSCode Extension** [Open RedHad Yaml in the marketplace][redhat_yaml_vs_code_marketplace] and click the `Install` button. -If you are given the option to enable the extension on 'WSL: Ubuntu' please do so. - #### Attribute CWL files to the yaml file type diff --git a/learners/files/macos_setup.Rmd b/learners/files/macos_setup.Rmd index ff19b9b1..44322674 100644 --- a/learners/files/macos_setup.Rmd +++ b/learners/files/macos_setup.Rmd @@ -22,8 +22,6 @@ Download and install [VSCode][vs_code] [Open RedHad Yaml in the marketplace][redhat_yaml_vs_code_marketplace] and click the `Install` button. -If you are given the option to enable the extension on 'WSL: Ubuntu' please do so. - #### Attribute CWL files to the yaml file type Add the following chunk to the VSCode [user settings json][user_settings_json] to attribute CWL to the YAML file type. diff --git a/learners/files/windows_setup.Rmd b/learners/files/windows_setup.Rmd index 03155196..c9ae95c2 100644 --- a/learners/files/windows_setup.Rmd +++ b/learners/files/windows_setup.Rmd @@ -23,7 +23,7 @@ You may also wish to go through [Getting started with WSL2][getting_started_with ::::::::::::::: callout -For this tutorial, we expect you use the Ubuntu distribution. +For this tutorial, we expect you to use Ubuntu as your WSL2 distribution. :::::::::::::::: @@ -31,7 +31,7 @@ For this tutorial, we expect you use the Ubuntu distribution. 
Open PowerShell as Administrator and type in the following -``` +```bash wsl --list ``` @@ -68,6 +68,7 @@ Install Docker Desktop by following the instructions on the [Docker Desktop Inst * Make sure 'Use the WSL 2 based engine' is selected +
### VSCode Installation