Real-World Data

Bridging the Gap Between RWD and Powerful Clinical Insights

01/ RWD expertise

Our Expertise in Handling RWD Can Help You

01

Derive Real-World Insights from Curated Datasets

We have built discovery cohorts in multiple disease areas (oncology and non-oncology) and demonstrated real-world insights with curated datasets such as AACR GENIE in NSCLC.

02

Enable Precision Analytics for Patient Stratification

Our in-depth mutation-based and stage-specific stratification can power precision medicine using RWD, including assessment of immunotherapy outcomes. We were recently approached by a biopharmaceutical company looking to leverage public datasets for improved patient stratification based on drug response biomarkers in ulcerative colitis. Through a proprietary workflow, we analyzed ~3,000 samples and identified five markers with 100% sensitivity at 70% specificity. This involved ingesting both public and client datasets, integrating literature-derived markers with machine learning-based response prediction, and creating a Disease Activity Score.
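For readers less familiar with the metrics quoted above, the following is a minimal sketch of how sensitivity and specificity are computed for a binary responder/non-responder call. The function name and the toy labels are hypothetical and invented for illustration; they are not the study data or the proprietary workflow.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels, where 1 = responder."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # responders correctly flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # responders missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # non-responders correctly cleared
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # non-responders wrongly flagged
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Toy labels (hypothetical): 3 true responders, 10 true non-responders.
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0]

sens, spec = sensitivity_specificity(y_true, y_pred)
print(f"sensitivity={sens:.0%}, specificity={spec:.0%}")
```

In this toy case every true responder is flagged (no false negatives), while three of the ten non-responders are also flagged, which is the kind of trade-off a "100% sensitivity at 70% specificity" marker panel describes.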

03

Leverage a Robust Data Harmonization Framework

We harmonize varied datasets in a documented process, providing format uniformity, ontology consistency, and analysis preparedness at scale. The raw data we work with often varies in data format (FASTQ, BAM, CSV, etc.), experiment type (microarray, RNA-seq, ChIP-seq, single-cell sequencing, etc.), and species, among many other attributes. The harmonization process integrates and standardizes this heterogeneous data to ensure consistency, compatibility, and accuracy across datasets.
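As a rough illustration of what format uniformity and ontology consistency can mean for sample metadata, here is a minimal sketch that maps differently named fields and free-text values onto one schema. The alias tables, field names, and `harmonize_record` function are hypothetical; a production process would typically draw its vocabularies from curated ontologies rather than hand-written dictionaries.

```python
# Hypothetical alias tables, for illustration only.
COLUMN_ALIASES = {
    "sample": "sample_id",
    "sample_id": "sample_id",
    "organism": "species",
    "species": "species",
    "tissue": "tissue",
    "tissue_type": "tissue",
}

VALUE_SYNONYMS = {
    "species": {
        "human": "Homo sapiens",
        "homo sapiens": "Homo sapiens",
        "mouse": "Mus musculus",
        "mus musculus": "Mus musculus",
    },
    "tissue": {
        "whole blood": "blood",
        "peripheral blood": "blood",
        "colon biopsy": "colon",
    },
}


def harmonize_record(record):
    """Map one raw metadata record onto the unified field names and vocabulary."""
    harmonized = {}
    for key, value in record.items():
        normalized_key = key.strip().lower().replace(" ", "_")
        field = COLUMN_ALIASES.get(normalized_key)
        if field is None:
            continue  # unmapped fields are dropped here; a real pipeline might log them
        text = str(value).strip()
        synonyms = VALUE_SYNONYMS.get(field, {})
        harmonized[field] = synonyms.get(text.lower(), text)
    return harmonized


raw_a = {"Sample ID": "S1", "Organism": "human", "Tissue Type": "Whole blood"}
raw_b = {"sample": " S2 ", "species": "Mus musculus", "tissue": "Colon biopsy"}

print(harmonize_record(raw_a))
print(harmonize_record(raw_b))
```

Both records end up with the same three keys and a shared vocabulary, so they can be concatenated and queried together.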

04

Accelerate and Scale Data Curation and Harmonization through LLMs

GPT-based tools accelerate data ingestion, annotation, and curation at 95%+ accuracy, reducing manual effort and turnaround time by 3x. Our in-house tool, festiVAR, supports scalable rare disease interpretation with ACMG scoring and LLM-based gene-phenotype correlation.

02/ Offerings in RWD

Our Offerings in Real-World Data Studies and Harmonization

  • Strand has deep expertise in working with clinical omics data, built on 24+ years of experience developing genomics solutions, and we are also establishing our own in-house RWD database. We leverage this background to analyze any type of RWD dataset.
  • We have recently conducted two research studies using the AACR Project GENIE NSCLC datasets. Read more about it in this white paper.
  • If data has been obtained from various sources (EHRs, claims data) and multiple providers, we standardize it, harmonize it using a unified data model, and build data pipelines around it. This enables downstream analysis of the harmonized data.
  • Strand has been developing an in-house RWD repository by collecting clinical data from various sources. Below is an example of the workflow we have implemented to harmonize and analyze this data. The process can be customized as required.
  • Image: A solution to unify in-house clinical data into a standardized, analyzable format
03/ Getting curious about our RWD solutions

Frequently Asked Questions

Yes, we have deep expertise in genomics and clinical data, specifically in studies homing in on biomarker discovery through the integration of omics datasets. For example, we helped a biopharma client combine RNA-seq and microarray data from over 3,000 samples, using ML-based methods to identify five biomarkers with 100% sensitivity at 70% specificity. We’re confident that these skills enable us to conduct any kind of RWD study. Further, our collaborative process covers conceptualization and analytical design in addition to study execution.
Yes, Strand can help with normalizing data and creating derived variables on top of the datasets to make them analytics-ready. Further, we will ensure high accuracy for these derived variables and validate them at delivery.
We can create uniform pipelines to go from FASTQ/BAM files to variants, help you annotate those variants, and eventually feed that data into a queryable database. Following this, we can help build clinical cohorts that are linked to the genomics data, thus ensuring this data can be used for downstream translational and clinical research.
We can implement an internal common data model (CDM) that harmonizes these diverse inputs and enables seamless integration into downstream clinical research and translational workflows.
At Strand, we work seamlessly with diverse data types, routinely harmonizing and integrating them for robust, result-driven analysis.

Here is a quick summary of the data heterogeneity we work with regularly.

Image

We routinely harmonize complex multi-source datasets. In a recent project, we integrated blood and biopsy samples from two different studies, which used CITE-seq and scRNA-seq, and identified 10 gene markers that predict response to vedolizumab in ulcerative colitis patients.
We employ a combination of AI and manual effort. The pipeline we are building combines AI (LLMs), scripting, parsing, and mapping to reduce turnaround time (TAT) while ensuring accurate standardization.
Yes. We develop data ingestion pipelines and generate derived variables (e.g., line of therapy, response metrics) based on client-specific business rules to facilitate unified analysis. These pipelines can be customized to the specific analysis and interpretation needs of each customer.
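
To make the idea of a rule-based derived variable concrete, here is a minimal sketch of a line-of-therapy assignment. It assumes one deliberately simple, hypothetical business rule: a new line starts whenever the regimen differs from the previous one in date order. Real client rules are usually richer (treatment gaps, drug additions within a window, maintenance therapy), and the function name and toy records are invented for illustration.

```python
from datetime import date


def assign_lines_of_therapy(treatments):
    """Label each (start_date, regimen) record with a line-of-therapy number.

    Hypothetical rule: the line increments whenever the regimen changes
    relative to the previous record in chronological order.
    """
    ordered = sorted(treatments, key=lambda record: record[0])
    labeled = []
    line = 0
    previous_regimen = None
    for start, regimen in ordered:
        normalized = regimen.strip().lower()
        if normalized != previous_regimen:
            line += 1
            previous_regimen = normalized
        labeled.append((start, regimen, line))
    return labeled


# Toy treatment history for one patient (hypothetical).
history = [
    (date(2023, 3, 1), "Pembrolizumab"),
    (date(2022, 9, 1), "Carboplatin + Pemetrexed"),
    (date(2022, 10, 1), "carboplatin + pemetrexed"),
]

for start, regimen, line in assign_lines_of_therapy(history):
    print(start.isoformat(), regimen, f"line {line}")
```

Keeping the rule in one small function makes it straightforward to swap in a different client-specific definition without touching the ingestion code around it.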