Chapter 8 More examples

Here are a few examples of slightly more complex pipelines.

8.1 Two processes

Processes are connected by channels, and the order in which they run is defined in the workflow block:

mzMLFiles = Channel.fromPath( '/crex/proj/uppmax2024-2-11/metabolomics/mzMLData/*.mzML' )

process featureFinder {
    input:
    file x

    output:
    file "${x.baseName}.featureXML" // #1

    script:
    "cp $x ${x.baseName}.featureXML"
}

// This process gets input from featureFinder
process someotherprocess {
    debug true

    input:
    file x // #2

    script:
    "echo I got $x from featureFinder"
}

workflow {
    featureFinder_out = featureFinder(mzMLFiles) // #1
    someotherprocess(featureFinder_out) // #2
}

In this example, the process featureFinder sends its output to an output channel, featureFinder_out (#1), which is then passed to someotherprocess (#2) for further processing.

Create a file called main_16.nf

nano main_16.nf

Copy the above code into it. Save the file (Ctrl+O, then Enter) and exit (Ctrl+X). Now run:

nextflow main_16.nf

8.2 Two processes including collect

As mentioned earlier, the collect operator instructs Nextflow to emit the entire content of a channel in a single emission, so that the downstream process receives everything at once. For example:

mzMLFiles = Channel.fromPath( '/crex/proj/uppmax2024-2-11/metabolomics/mzMLData/*.mzML' )

process featureFinder {
    input:
    file x

    output:
    file "${x.baseName}.featureXML" // #1

    script:
    "cp $x ${x.baseName}.featureXML"
}

// This process gets input from featureFinder
process someotherprocess {
    debug true

    input:
    file x

    script:
    "echo I got $x from featureFinder"
}

workflow {
    featureFinder_out = featureFinder(mzMLFiles).collect() // #1
    someotherprocess(featureFinder_out) // #2
}

In this example, the process featureFinder sends its output to an output channel, featureFinder_out (#1), which is then passed to someotherprocess (#2) for further processing. However, someotherprocess now receives all the data at once, because the collect operator gathers every file emitted by featureFinder into a single list.
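The effect of collect can also be seen in isolation. The following is a minimal sketch rather than part of the pipeline: the three file names are made-up strings used only for illustration, and view simply prints whatever each channel emits.

```nextflow
workflow {
    // Hypothetical file names, used only for illustration
    names = Channel.of('a.featureXML', 'b.featureXML', 'c.featureXML')

    // Without collect: the channel emits three separate items
    names.view { n -> "one item: $n" }

    // With collect: a single emission holding a list of all the items
    names.collect().view { items -> "all items: $items" }
}
```

A process fed by the first channel would run once per item, while a process fed by the collected channel would run a single time with the whole list.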

Create a file called main_17.nf

nano main_17.nf

Copy the above code into it. Save the file (Ctrl+O, then Enter) and exit (Ctrl+X). Now run:

nextflow main_17.nf

What is the difference between this and the previous example?

8.3 Two processes, collect, and parameters from the user

We can also let the user pass parameters to the workflow.

mzMLFiles = Channel.fromPath(params.inputmzml) // #1

process featureFinder {
    input:
    file x

    output:
    file "${x.baseName}.featureXML"

    script:
    "cp $x ${x.baseName}.featureXML"
}

// This process gets input from featureFinder
process someotherprocess {
    debug true

    input:
    file x

    script:
    "echo I got $x from featureFinder"
}

workflow {
    featureFinder_out = featureFinder(mzMLFiles).collect()
    someotherprocess(featureFinder_out)
}

In this example, we create the mzMLFiles channel from a path supplied by the user, which the script reads as params.inputmzml (#1).

Create a file called main_18.nf

nano main_18.nf

Copy the above code into it. Save the file (Ctrl+O, then Enter) and exit (Ctrl+X). Now run:

nextflow main_18.nf --inputmzml "/crex/proj/uppmax2024-2-11/metabolomics/mzMLData/*.mzML"

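Notice how we send parameters to the workflow: an option written with a double dash on the command line (--inputmzml) becomes available in the script as params.inputmzml. The glob pattern is quoted so that the shell passes it to Nextflow unexpanded.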
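A script can also declare a default value for a parameter, so that it still runs when the option is left out. The sketch below assumes a hypothetical default path, data/*.mzML, which is not part of this tutorial; a value given with --inputmzml on the command line takes precedence over it.

```nextflow
// Hypothetical default, used only when --inputmzml is not given
params.inputmzml = 'data/*.mzML'

workflow {
    // Print each file matched by the pattern
    Channel.fromPath(params.inputmzml)
        .view { f -> "input file: $f" }
}
```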

8.4 Three processes

Similarly, we can create as many processes as we like. For example, three:

mzMLFiles = Channel.fromPath(params.inputmzml) // #1

process justcopy {
    input:
    file x

    output:
    file "${x.baseName}.featureXML"

    script:
    "cp $x ${x.baseName}.featureXML"
}

// This process gets input from justcopy
process someotherprocess {
    debug true

    input:
    file x

    output:
    file "QC.txt"

    script:
    """
    echo "${params.qc} percent!" >> QC.txt
    """
}

// This process gets input from someotherprocess
process lastone {
    debug true

    input:
    file x

    script:
    """
    (echo "Quality was:"; cat $x)
    """
}

workflow {
    justcopy_out = justcopy(mzMLFiles)
    someotherprocess_out = someotherprocess(justcopy_out.collect())
    lastone(someotherprocess_out)
}

As before, we create the mzMLFiles channel from the input supplied by the user (#1). The workflow also reads a second parameter from the user. Can you find it?

Create a file called main_19.nf

nano main_19.nf

Copy the above code into it. Save the file (Ctrl+O, then Enter) and exit (Ctrl+X). Now run:

nextflow main_19.nf --inputmzml "/crex/proj/uppmax2024-2-11/metabolomics/mzMLData/*.mzML" --qc 100

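Now let's clean up the work files and cache produced by Nextflow using the clean command. With no run name given it acts on the most recent run, and the -f option confirms the deletion: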

nextflow clean -f