
13 Dec 2024

Strand’s Methylation Pipeline Series

1 | Strand’s Methylation Pipeline - An Overview

WRITTEN BY

Divya Anantsri

We developed a methylation pipeline on AWS HealthOmics. The first part of this two-part series outlines the steps in the pipeline for analyzing methylation data from targeted methylome sequencing (TMS) and whole methylome sequencing (WMS). The second part walks through the protocol for running this workflow on AWS HealthOmics.

Why did we choose AWS HealthOmics?

We chose AWS HealthOmics to optimize storage and compute costs. Storage costs on AWS HealthOmics are 75% lower than on S3, and compute costs are 25% lower than on EC2. The table below summarizes the additional benefits:

 

Summary of the pipeline 

Our pipeline consists of four main analysis phases, along with two optional steps:

1. Pre-alignment QC 

    • This stage trims the raw reads and evaluates FASTQ quality. BBDuk trims the FASTQ files based on base quality and the presence of adapter sequences, and FastQC generates a quality control (QC) report from the FASTQ files to assess sequencing quality.

2. Alignment and associated QC

    • The FASTQ files are aligned to the reference genome using BWAMeth.

Fragmentomics (optional): Depending on the research question, the aligned file from stage 2 can also be used for fragmentomics analysis and QC. FranaTK generates a range of fragmentomics features.

3. Methylation calling for CpGs

    • Methylation calling for CpGs is handled by MethylDackel, which generates CpG, CHG, and CHH bedGraph files.

4. Targeted panel analysis

    • The final stage runs the targeted panel analysis and produces the related QC metrics.

Fragmentomics (optional): Fragment-wise methylation calling can be run after stage 4. Patr generates fragment-wise methylation patterns and summaries.
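To make the order of stages 1 to 3 concrete, here is a minimal Python sketch that assembles one command line per tool without running anything. It is not the configuration used in our pipeline: the function name, file names, output layout, and trimming parameters (k=23, trimq=20, and so on) are illustrative assumptions, as is the choice to run FastQC on the trimmed reads. The sketch also assumes the SAM output of bwameth.py is sorted and converted to BAM in a separate step, which is not shown.

```python
from pathlib import Path


def build_stage_commands(r1, r2, reference, adapters, outdir, threads=4):
    """Return one argument list per tool for stages 1-3 (nothing is executed).

    All file names and parameter values are illustrative assumptions,
    not the settings of the pipeline described in this post.
    """
    out = Path(outdir)
    trimmed_r1 = out / "trimmed_R1.fastq.gz"
    trimmed_r2 = out / "trimmed_R2.fastq.gz"
    bam = out / "sample.sorted.bam"  # assumed to be produced from the SAM output
    prefix = out / "sample"          # prefix for the bedGraph outputs

    return {
        # Stage 1a: adapter and quality trimming with BBDuk
        "trim": [
            "bbduk.sh",
            f"in1={r1}", f"in2={r2}",
            f"out1={trimmed_r1}", f"out2={trimmed_r2}",
            f"ref={adapters}",
            "ktrim=r", "k=23", "mink=11", "hdist=1",
            "qtrim=rl", "trimq=20",
        ],
        # Stage 1b: QC report (here on the trimmed reads, an assumption)
        "fastqc": ["fastqc", "-o", str(out), str(trimmed_r1), str(trimmed_r2)],
        # Stage 2: bisulfite-aware alignment; bwameth.py writes SAM to stdout
        "align": [
            "bwameth.py",
            "--reference", str(reference),
            "--threads", str(threads),
            str(trimmed_r1), str(trimmed_r2),
        ],
        # Stage 3: methylation calling in CpG, CHG, and CHH contexts
        "call": [
            "MethylDackel", "extract",
            "--CHG", "--CHH",
            "-o", str(prefix),
            str(reference), str(bam),
        ],
    }
```

Each returned list can be passed to `subprocess.run` once the inputs exist. Keeping the commands as data makes it easy to log or review them before launching a run.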

The above pipeline is summarized in the following figure:

 

Tools in the pipeline:

  • BBDuk: Removes contaminating adapters and trims low-quality regions
  • FastQC: Generates QC distributions
  • BWAMeth: Aligns reads to the reference genome
  • Picard: Generates alignment and enrichment metrics
  • FranaTK: Generates a wide range of fragmentomics features
  • MethylDackel: Generates CpG, CHH, and CHG bedGraph files
  • Patr: Generates fragment-wise methylation patterns and summaries
  • Stats: Consolidates statistics (alignment, methylation, enrichment)
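For readers who want to inspect the methylation calls directly, the sketch below reads a per-context bedGraph file in the layout MethylDackel writes by default: a `track` header line followed by tab-separated chromosome, start, end, methylation percentage, methylated read count, and unmethylated read count. The class and function names and the `min_coverage` parameter are hypothetical helpers for illustration and are not part of our pipeline.

```python
from typing import Iterable, Iterator, NamedTuple


class CpGCall(NamedTuple):
    """One cytosine position from a MethylDackel-style bedGraph line."""
    chrom: str
    start: int
    end: int
    methylated: int
    unmethylated: int

    @property
    def coverage(self) -> int:
        return self.methylated + self.unmethylated

    @property
    def fraction(self) -> float:
        # Recomputed from the counts rather than taken from the
        # percentage column, which is rounded to an integer.
        return self.methylated / self.coverage if self.coverage else 0.0


def parse_bedgraph(lines: Iterable[str], min_coverage: int = 1) -> Iterator[CpGCall]:
    """Yield calls with at least `min_coverage` reads, skipping the track header."""
    for line in lines:
        line = line.strip()
        if not line or line.startswith("track"):
            continue
        chrom, start, end, _percent, meth, unmeth = line.split("\t")[:6]
        call = CpGCall(chrom, int(start), int(end), int(meth), int(unmeth))
        if call.coverage >= min_coverage:
            yield call
```

Passing an open file handle, for example `parse_bedgraph(open(path), min_coverage=10)`, streams the calls one at a time, so large whole-methylome outputs do not need to fit in memory.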

We have made this pipeline publicly available on our Strand Life Sciences Methylation Analysis Platform, which leverages AWS HealthOmics. Watch for Part 2 of this series, which will walk new users through the portal.
