Step-by-step instructions on how to run a FastQC job on DiPhyx
This guide provides step-by-step instructions on how to run a FastQC job on DiPhyx. By following these steps, you can easily set up and view the results of a FastQC analysis, which includes basic statistics, per base sequence quality, sequence length distribution, sequence duplication levels, and adapter content.
You can also watch our video tutorial on how to run FastQC on DiPhyx here:
Before you begin, make sure you have the following:
In this guide, we will be using the following files, which you can also download:
Log in to the DiPhyx platform and navigate to the Software Packages page.
Type "FastQC" in the "Filter by name" field.
Click "View Details" on the FastQC-CLI card.
On the FastQC-CLI software package page, choose the desired compute unit from the "Select compute unit" option menu. This allows you to select the suitable compute unit from the list of available units in the READY state.
Tip
If you don't have any available compute unit or need to create a new one for this project instance, we recommend following our guide on How to Create a Compute Unit on DiPhyx for step-by-step instructions.
After selecting a compute unit, click "Create Instance".
On the "Create new project" dialog:
Project Name: make sure it is unique and not already in use within the same compute unit, to avoid conflicts.
Compute-Unit Volume: specifies the directory on the compute unit that is mounted into the FastQC model. This directory becomes accessible at the Project Volume.
Run Script Name: refers to the script that is executed when the project starts. It is advisable to specify the absolute path to the script, for example, /data/run_script.sh. If you use a relative path, the system searches for the script in the working directory, which for FastQC is /data.
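The absolute-versus-relative rule for the run script can be sketched as a small shell function. This is only an illustration of the behavior described above, not DiPhyx's actual implementation; the function name resolve_run_script is hypothetical, and /data is the FastQC working directory named in this guide.

```shell
# Hypothetical helper (not part of DiPhyx): shows how a Run Script Name
# would resolve under the rule described in this guide.
resolve_run_script() {
  local name="$1"
  local workdir="${2:-/data}"   # FastQC working directory, per this guide

  case "$name" in
    /*) printf '%s\n' "$name" ;;                # absolute path: used as given
    *)  printf '%s\n' "${workdir%/}/$name" ;;   # relative path: looked up under the working directory
  esac
}

resolve_run_script /data/run_script.sh
resolve_run_script run_script.sh
```

Both calls print /data/run_script.sh, which is why the absolute form is the safer choice: it does not depend on what the working directory happens to be.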
In the "Compute-Unit Volume" field, click the browse icon to add input files for your FastQC-CLI instance.
In the file explorer dialog, click "Upload file".
You can either drag and drop your files or use "Browse files" to upload them.
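Before uploading, it can help to sanity-check the input locally. The sketch below assumes the common four-line FASTQ layout (header starting with "@", sequence, separator starting with "+", quality string); it is a rough structural check, not a full validator, and the function name looks_like_fastq is hypothetical.

```shell
# Rough structural check for an uncompressed four-line-per-record FASTQ file.
# Returns 0 if the file looks plausible, 1 otherwise.
looks_like_fastq() {
  local file="$1"

  # Reject missing or empty files.
  [ -s "$file" ] || return 1

  awk '
    NR % 4 == 1 && substr($0, 1, 1) != "@" { bad = 1 }   # header line of each record
    NR % 4 == 3 && substr($0, 1, 1) != "+" { bad = 1 }   # separator line of each record
    END { if (bad || NR % 4 != 0) exit 1 }               # line count must be a multiple of 4
  ' "$file"
}

# Example use (commented out so the block has no side effects):
# looks_like_fastq test.fastq && echo "test.fastq looks like FASTQ"
```

Files that wrap sequences over several lines would fail this check even though some tools accept them, so treat a failure as a prompt to inspect the file rather than as proof that it is invalid.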
Note: In this example, we used the sample files from the following GitHub repository (also linked for download at the beginning of this tutorial):
https://github.com/diphyx/studies/tree/main/fastqc
where test.fastq is a test case for FastQC and run_script.sh is a script that runs the following commands:
result_dir="results"            # Name of the result directory
input_file="/data/test.fastq"   # Path to the input .fastq file
mkdir -p "$result_dir"
fastqc "$input_file" -o "$result_dir"
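The script above is deliberately minimal. If you want clearer messages in the job logs when something is missing, a slightly more defensive variant might look like the sketch below. It is not the script shipped in the repository; the function name run_fastqc_job and its error messages are illustrative, while the default paths are the ones used in this guide.

```shell
# Sketch of a more defensive run script; wraps the same two commands
# (mkdir -p and fastqc ... -o ...) with basic checks.
run_fastqc_job() {
  local input_file="${1:-/data/test.fastq}"   # Path to the input .fastq file
  local result_dir="${2:-results}"            # Name of the result directory

  # Fail early if the input was not uploaded or is unreadable.
  if [ ! -r "$input_file" ]; then
    echo "Input file not found or unreadable: $input_file" >&2
    return 1
  fi

  # Fail early if the fastqc executable is not available.
  if ! command -v fastqc >/dev/null 2>&1; then
    echo "fastqc is not on PATH" >&2
    return 127
  fi

  mkdir -p "$result_dir" || return 1
  fastqc "$input_file" -o "$result_dir"
}

# In a real run script, call the function as the last line:
# run_fastqc_job /data/test.fastq results
```

Keeping the checks ahead of the fastqc call means a missing upload shows up in the logs as a one-line message instead of a tool error.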
Once the files are uploaded, click on "Done" to finish the file upload process.
Click on "Create" to start the FastQC job.
Now, on the project page, start the job by clicking the "Start" button.
Click on the project name, here "Fast qc", to view the project page.
On the Fast qc project page, you can see the list of containers that have been created in this project. Here, click "Instance" to view the details of the FastQC-CLI instance you just created.
Tip
You can monitor the progress of the job by checking its logs in the "Logs" section.
Once the job is completed, in the "Volumes" section of this page, click "Browse" to access the results.
On the file explorer dialog, click on the "results" folder to view the FastQC results.
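FastQC names its outputs after the input file: an HTML report and a zip archive, both ending in _fastqc. The sketch below derives the names you should expect to find in the results folder. It handles only the .fastq, .fq and .gz suffixes, and the function name expected_fastqc_outputs is hypothetical.

```shell
# Print the report paths FastQC is expected to write for one input file.
# Usage: expected_fastqc_outputs <input_file> <result_dir>
expected_fastqc_outputs() {
  local base
  base="$(basename "$1")"
  base="${base%.gz}"      # drop a trailing .gz, if any
  base="${base%.fastq}"   # drop a trailing .fastq, if any
  base="${base%.fq}"      # drop a trailing .fq, if any

  printf '%s/%s_fastqc.html\n' "$2" "$base"
  printf '%s/%s_fastqc.zip\n' "$2" "$base"
}

expected_fastqc_outputs /data/test.fastq results
```

For the sample input used in this guide, this prints results/test_fastqc.html and results/test_fastqc.zip.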
Click the "more options" button of a result file.
Click "View" to open the result in the File Viewer. You can visualize HTML or image files directly on the platform without downloading them.
In the DiPhyx File Viewer, you can explore the details of your FastQC results and analyze the generated reports.
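If you prefer a quick text overview, the zip archive produced by FastQC contains a tab-separated summary.txt with one line per module (status, module name, file name). The sketch below lists every module that did not pass. It assumes you have downloaded the archive and have unzip available; the function name flagged_modules is hypothetical.

```shell
# Print every FastQC module whose status is not PASS.
# Reads summary.txt content (tab-separated: status, module, file name) from stdin.
flagged_modules() {
  awk -F '\t' '$1 != "PASS" { printf "%s: %s\n", $1, $2 }'
}

# Example use with the sample results from this guide (commented out,
# since it needs the downloaded archive):
# unzip -p results/test_fastqc.zip test_fastqc/summary.txt | flagged_modules
```

An empty output means every module reported PASS; WARN and FAIL lines point to the sections of the HTML report worth opening first.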
Success
Congratulations! You have successfully run a FastQC instance on DiPhyx.