Last edit December 07, 2024

Running a FastQC-CLI Job Using DiPhyx

Step-by-step instructions for running a FastQC job on DiPhyx

This guide provides step-by-step instructions on how to run a FastQC job on DiPhyx. By following these steps, you can set up a FastQC analysis and view its results, which include basic statistics, per base sequence quality, sequence length distribution, sequence duplication levels, and adapter content.

You can also watch our video tutorial on how to run FastQC on DiPhyx here:

Prerequisites

Before you begin, make sure you have the following:

  • Access to the DiPhyx platform
  • FASTQ input files and a run script to upload

In this guide, we will be using the following files, which you can also download:

  • test.fastq (fastq, 282 B)
  • run_script.sh (sh, 863 B)
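Before uploading your own input, it can help to confirm that the file is well-formed FASTQ. The following is a minimal sketch, assuming an uncompressed file with the common four-line record layout (header starting with "@", sequence, separator starting with "+", quality string). The function name check_fastq is hypothetical and is not part of DiPhyx or FastQC.

```shell
# Hypothetical helper: quick sanity check for an uncompressed,
# four-line-per-record FASTQ file. Returns 0 if the file looks valid.
check_fastq() {
    local file="$1"
    # The file must exist and be non-empty.
    [ -s "$file" ] || { echo "missing or empty: $file" >&2; return 1; }
    awk '
        NR % 4 == 1 && substr($0, 1, 1) != "@" { bad = 1 }  # header line
        NR % 4 == 2 { seqlen = length($0) }                 # sequence line
        NR % 4 == 3 && substr($0, 1, 1) != "+" { bad = 1 }  # separator line
        NR % 4 == 0 && length($0) != seqlen   { bad = 1 }  # quality length must match sequence length
        END { if (bad || NR % 4 != 0) exit 1 }              # line count must be a multiple of 4
    ' "$file"
}
```

For example, `check_fastq test.fastq && echo "looks fine"` prints the message only when every record passes these checks. Files with wrapped (multi-line) sequences would be rejected by this sketch even though some tools accept them.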

Steps

1. Go to Software Packages Page

Log in to the DiPhyx platform and navigate to the Software Packages page.

2. Find FastQC-CLI

Type "FastQC" in the "Filter by name" field.

3. Go to FastQC-CLI Details

Click "View Details" on the FastQC-CLI card.

4. Select Compute Unit

On the FastQC-CLI software package page, choose the desired compute unit from the "Select compute unit" option menu. This allows you to select the suitable compute unit from the list of available units in the READY state.

Tip

If you don't have any available compute unit or need to create a new one for this project instance, we recommend following our guide on How to Create a Compute Unit on DiPhyx for step-by-step instructions.

5. Click "Create Instance"

After selecting a compute unit, click "Create Instance".

6. Fill in The Fields

In the "Create new project" dialog:


  • Ensure that the Project Name is unique and not already in use within the same compute unit to avoid any conflicts.
  • The Compute-Unit Volume specifies the directory on the compute unit that is mounted into the FastQC model. This directory becomes accessible at the Project Volume.
  • The Run Script Name refers to the script that will be executed upon project start. It is advisable to specify the absolute path to the script, for example, /data/run_script.sh. Should you choose to use a relative path, the system will search for the script within the working directory, which, for FastQC, is /data.

In the Compute-Unit Volume field, click the browse icon to add input files for your FastQC-CLI instance.
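The Run Script Name rule described in the list above (absolute paths are used as given, relative paths are looked up in the working directory) can be illustrated with a small sketch. This is not DiPhyx code; resolve_run_script is a hypothetical helper, and the default working directory of /data is taken from this guide.

```shell
# Hypothetical helper that mirrors the lookup rule from this guide.
resolve_run_script() {
    local script="$1"
    local workdir="${2:-/data}"   # working directory; /data for FastQC per this guide
    case "$script" in
        /*) printf '%s\n' "$script" ;;            # absolute path: used as given
        *)  printf '%s\n' "$workdir/$script" ;;   # relative path: resolved against the working directory
    esac
}
```

With these assumptions, both `resolve_run_script run_script.sh` and `resolve_run_script /data/run_script.sh` print /data/run_script.sh, which is why either form works for the sample files in this tutorial.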

7. Upload Input Files

In the file explorer dialog, click "Upload file".

8. Browse or Drag and Drop

You can either drag and drop your files or use "Browse files" to upload them.

Note: In this example, we used the sample files from the following GitHub repository (the same files linked for download at the beginning of this tutorial):

https://github.com/diphyx/studies/tree/main/fastqc

where test.fastq is a test case for FastQC and run_script.sh is a script that runs the following commands:

bash
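result_dir="results" # Name of the result directory
input_file="/data/test.fastq" # Path to the input .fastq file
mkdir -p "$result_dir" # Create the result directory if it does not exist
fastqc "$input_file" -o "$result_dir" # Run FastQC and write the reports to the result directory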
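The commands above process a single file. If you upload several FASTQ files, one possible variant loops over every .fastq file in a directory and stops at the first failure. This is a sketch only, not the script from the repository; run_fastqc_dir is a hypothetical helper name, and it assumes the fastqc command is available on the PATH, as it is in the commands above.

```shell
# Hypothetical helper: run FastQC on every .fastq file in a directory.
run_fastqc_dir() {
    local input_dir="$1"    # directory holding the .fastq files, e.g. /data
    local result_dir="$2"   # directory that receives the reports, e.g. results
    local found=0 f
    mkdir -p "$result_dir" || return 1
    for f in "$input_dir"/*.fastq; do
        [ -e "$f" ] || continue                    # skip the literal pattern when nothing matches
        found=1
        fastqc "$f" -o "$result_dir" || return 1   # stop at the first failure
    done
    # Report an error if the directory held no .fastq files at all.
    [ "$found" -eq 1 ] || { echo "no .fastq files in $input_dir" >&2; return 1; }
}
```

In a run script, this could be called as `run_fastqc_dir /data results`. Returning a non-zero status on failure makes problems easier to spot in the job logs.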

9. Click "Done"

Once the files are uploaded, click on "Done" to finish the file upload process.

10. Create The Project Instance

Click "Create" to create the FastQC-CLI project instance.

11. Start The Project

Now, on the project page, start the job by clicking the "Start" button.

12. Confirm "Start"

13. View The Project

Click the project name, here "Fast qc", to view the project page.

14. View The Instance Page

On the Fast qc project page, you can see the list of containers that have been created in this project. Here, click "Instance" to view the details of the FastQC-CLI instance you just created.

15. Check The Results and Logs

Tip

You can monitor the progress of the job by checking its logs in the "Logs" section.

Once the job is completed, in the "Volumes" section of this page, click "Browse" to access the results.

16. View The Results

On the file explorer dialog, click on the "results" folder to view the FastQC results.
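For each input file, FastQC writes an HTML report and a ZIP archive named after the input with a _fastqc suffix. The sketch below lists the report paths to look for in the "results" folder. It assumes an input whose name ends in .fastq, as in this tutorial; expected_reports is a hypothetical helper name.

```shell
# Hypothetical helper: print the report files FastQC is expected to
# create for one .fastq input in the given result directory.
expected_reports() {
    local input="$1" result_dir="$2"
    local base
    base="$(basename "$input" .fastq)"   # strip the directory and the .fastq extension
    printf '%s\n' "$result_dir/${base}_fastqc.html" "$result_dir/${base}_fastqc.zip"
}
```

For the sample input, `expected_reports /data/test.fastq results` prints results/test_fastqc.html and results/test_fastqc.zip.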

17. Click on "More Options"

Click the "More options" button of a result file.

18. View The Result Files

Click "View" to open the results in the File Viewer. You can view HTML and image files directly on the platform without downloading them.

19. Explore The Results

In the DiPhyx File Viewer, you can explore the details of your FastQC results and analyze the generated reports.
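If you download the results, the ZIP archive produced by FastQC contains a summary.txt file with one tab-separated line per analysis module: the status (PASS, WARN, or FAIL), the module name, and the input file name. As a minimal sketch, assuming summary.txt has already been extracted from the archive, the hypothetical helper below lists only the modules that did not pass.

```shell
# Hypothetical helper: list FastQC modules whose status is not PASS.
# Expects the path to a summary.txt extracted from a *_fastqc.zip archive.
flagged_modules() {
    local summary="$1"
    # Fields are tab-separated: status, module name, input file name.
    awk -F '\t' '$1 != "PASS" { print $1 ": " $2 }' "$summary"
}
```

Calling `flagged_modules summary.txt` prints lines such as "WARN: Per base sequence content" for each module that deserves a closer look in the HTML report, and prints nothing when every module passed.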

20. Congratulations!

Success

Congratulations! You have successfully run a FastQC instance on DiPhyx.