AI/ML in Spatial Transcriptomics and Cell Segmentation
WRITTEN BY
Sakshi Shinghal
AI can be used in a myriad of ways within the realm of precision medicine and biotechnology, from metadata curation (as mentioned in a previous blog) to drug discovery to protein structure prediction. These applications can form a completely new workflow or be used to optimize and improve existing workflows.
To learn more about how Strand is integrating these technologies into workflows, one of our senior content analysts, Sakshi Shinghal, spoke to Ashish Kumar Choudhary, a scientist in the Solutions team.
Sakshi: Hi, I’m Sakshi Shinghal, a bioinformatician and geneticist who has worked with Strand for around 3 years. I recently obtained an MSc in Genomic Medicine from King’s College London, in conjunction with St George’s, University of London. I have experience in creating webpages for bioinformatics tools, such as expression visualisation and variant analysis pipelines, as well as developing variant analysis pipelines for different types of sequencing. I’m currently working at Strand as a senior content analyst, aiding in content development and marketing efforts.
Ashish: Hi, I’m Ashish Kumar Choudhary, a scientist at Strand. I focus on DNA-Seq, scRNA-Seq, and spatial transcriptomics, and have developed several pipelines to analyze these data. I am an expert in genomics data analysis and bioinformatics, bringing a unique blend of academic rigour and practical expertise to the table. I hold a master’s degree in biotechnology from the prestigious IIT Bombay and have also trained at the National Centre for Microbial Resource in Pune, where I gained hands-on experience in next-generation sequencing and metagenomics analysis. I am dedicated to leveraging genomics data to address biomedical challenges and make a meaningful impact on precision medicine.
Sakshi: Hi Ashish, thanks for taking the time to talk to me today. There’s been a lot of talk regarding spatial transcriptomics and working with 3D data, especially with Nature Methods naming spatial proteomics its Method of the Year (learn more in this blog). I heard that we have a new project where we use an automated AI-based method for cell segmentation and wanted to learn more about it.
First, taking a step back, could you briefly explain what spatial transcriptomics is and what information it can provide?
Ashish: To understand normal development and disease pathology, it is crucial to understand the relationship between cells and their relative locations within tissues. Using spatial transcriptomics, a cutting-edge technique in molecular profiling, researchers can map where genes are active in a tissue sample and measure each gene's activity at those locations.
Spatial transcriptomics techniques use intact tissue sections, spatial barcoding, or in situ hybridization to retain positional information. The position of any given cell, relative to its neighbours and non-cellular structures, can provide helpful information for defining cellular phenotype, cell state, and ultimately cell and tissue function. Spatial transcriptomics has become essential for biomedical research, particularly in developmental biology, cancer, immunology, and neuroscience.
Fig 1: A sample tissue which has been coloured using gene expression (transcriptomic) data to showcase spatial information
Sakshi: I see, so it’s important to know not only which cells are active, and when, but also where they are in relation to other cells within the tissue. One really important aspect of this, then, would be delineating, or segmenting, cell boundaries to distinguish between individual cells. Could you explain what exactly cell segmentation is and how scientists go about it?
Ashish: Cell segmentation, a fundamental computational technique, distinguishes and defines individual cell boundaries within biomedical images. It plays a crucial role in bioimage informatics, forming the basis for various detailed studies. Its primary goal is to approximate the boundaries between cells, enabling the accurate assignment of transcripts to individual cells.
Spatial transcriptomics involves mapping gene expression data to specific locations within a tissue. Cell segmentation ensures that the spatially resolved gene expression profiles are accurately assigned to the correct cells.
Segmentation offers insights into the spatial relationships and interactions between cells. For instance, researchers can examine how tumor cells engage with immune cells or the spatial organization of specific cell types in both healthy and diseased tissues.
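The transcript-to-cell assignment described above can be sketched in a few lines. This is a minimal illustration, not the Strand pipeline: the label mask, the gene names, and the `assign_transcripts` helper are all hypothetical, and it assumes the segmentation output is a grid in which 0 marks background and each positive integer marks one cell.

```python
from collections import Counter, defaultdict

# Hypothetical 4x6 label mask: 0 = background, 1 and 2 = segmented cells.
mask = [
    [0, 1, 1, 0, 0, 0],
    [0, 1, 1, 0, 2, 2],
    [0, 0, 0, 0, 2, 2],
    [0, 0, 0, 0, 0, 0],
]

# Hypothetical detected transcripts as (gene, row, col) pixel coordinates.
transcripts = [
    ("CD3E", 0, 1),
    ("CD3E", 1, 2),
    ("EPCAM", 1, 4),
    ("EPCAM", 2, 5),
    ("KRT8", 2, 4),
    ("CD3E", 3, 0),  # lands on background, so it stays unassigned
]


def assign_transcripts(mask, transcripts):
    """Count transcripts per gene for each cell label in the mask."""
    counts = defaultdict(Counter)
    unassigned = 0
    for gene, row, col in transcripts:
        cell_id = mask[row][col]
        if cell_id == 0:
            unassigned += 1
        else:
            counts[cell_id][gene] += 1
    return dict(counts), unassigned


cell_counts, unassigned = assign_transcripts(mask, transcripts)
for cell_id, genes in sorted(cell_counts.items()):
    print(cell_id, dict(genes))
print("unassigned:", unassigned)
```

If a boundary is drawn in the wrong place, the lookup `mask[row][col]` returns the wrong cell, which is why segmentation quality directly affects the per-cell expression profile.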
Sakshi: Ah, I see. So you could describe it as taking a biomedical image that already has expression data (such as transcripts) at specific locations and outlining the cell boundaries, so that you can see which transcripts belong to which cells and better understand how the cells work and interact with each other. Could you describe how cell segmentation is carried out (at a high level)?
Ashish: There are two main approaches to cell segmentation:
1. Traditional Techniques: These methods rely on the intensity and gradient of images to differentiate cells from their surroundings.
Thresholding: This method sets a specific intensity value as a threshold to separate cells from the background, classifying pixels above this threshold as cells.
Watershed Transformation: Here, the image is treated as a topographic map with varying elevations, where the light intensities represent these elevations. The image is segmented into distinct regions based on these topographic variations, which helps separate touching or overlapping cells.
2. Deep Learning Approaches: The advent of artificial intelligence has popularized deep learning techniques, such as convolutional neural networks (CNNs) and models like Mask R-CNN. These methods automatically learn from large datasets and are especially good at segmenting complex, overlapping cells. For instance, tools like Cellpose and DeepCell have been recognized for their effectiveness in various experimental settings.
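To make the traditional route concrete, here is a minimal sketch of intensity thresholding followed by connected-component labelling. The labelling step, the toy image, and the cutoff values are assumptions added for illustration; the watershed transformation itself is not implemented here.

```python
from collections import deque


def threshold(image, cutoff):
    """Mark pixels brighter than the cutoff as foreground (True)."""
    return [[value > cutoff for value in row] for row in image]


def label_components(binary):
    """Label 4-connected foreground regions 1, 2, 3, ... (0 = background)."""
    n_rows, n_cols = len(binary), len(binary[0])
    labels = [[0] * n_cols for _ in range(n_rows)]
    next_label = 0
    for r in range(n_rows):
        for c in range(n_cols):
            if not binary[r][c] or labels[r][c]:
                continue
            # Start a new region and flood-fill it breadth-first.
            next_label += 1
            labels[r][c] = next_label
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < n_rows and 0 <= nx < n_cols
                            and binary[ny][nx] and not labels[ny][nx]):
                        labels[ny][nx] = next_label
                        queue.append((ny, nx))
    return labels, next_label


# Toy image: two bright cells joined by a dimmer bridge pixel (value 60).
image = [
    [10, 12, 11, 10, 10, 11, 10],
    [10, 90, 95, 10, 10, 80, 10],
    [11, 92, 97, 60, 85, 88, 10],
    [10, 12, 10, 10, 82, 86, 11],
]

labels_low, n_low = label_components(threshold(image, 50))
labels_high, n_high = label_components(threshold(image, 70))
print(n_low, n_high)
```

With the lower cutoff the bridge pixel counts as foreground and the two cells merge into one region; with the higher cutoff they separate. This sensitivity to a single global value is one reason touching cells call for watershed or learned methods.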
Fig 2: Different approaches to cell segmentation
Choosing the right segmentation method involves evaluating each method's performance across various metrics, such as accuracy, adaptability to different imaging modalities, and efficiency with various cell types. Recent studies have shown that deep learning methods generally offer superior performance over traditional techniques, particularly when trained on diverse and large datasets.
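One common way to quantify accuracy is to compare predicted cells against hand-annotated ones using intersection over union (IoU). The sketch below is an assumption about how such a comparison could look, not a metric named in this interview: the masks, the 0.5 IoU cutoff, and the greedy matching in `score_segmentation` are all illustrative.

```python
def pixels_by_label(mask):
    """Map each non-zero label to the set of (row, col) pixels it covers."""
    cells = {}
    for r, row in enumerate(mask):
        for c, label in enumerate(row):
            if label != 0:
                cells.setdefault(label, set()).add((r, c))
    return cells


def iou(a, b):
    """Intersection over union of two pixel sets."""
    return len(a & b) / len(a | b)


def score_segmentation(truth, predicted, min_iou=0.5):
    """Greedily match predicted cells to annotated cells by IoU.

    Returns (true positives, false positives, false negatives).
    """
    truth_cells = pixels_by_label(truth)
    pred_cells = pixels_by_label(predicted)
    matched_truth = set()
    true_pos = 0
    for pred_pixels in pred_cells.values():
        best_label, best_iou = None, 0.0
        for label, truth_pixels in truth_cells.items():
            if label in matched_truth:
                continue
            score = iou(pred_pixels, truth_pixels)
            if score > best_iou:
                best_label, best_iou = label, score
        if best_label is not None and best_iou >= min_iou:
            matched_truth.add(best_label)
            true_pos += 1
    false_pos = len(pred_cells) - true_pos
    false_neg = len(truth_cells) - true_pos
    return true_pos, false_pos, false_neg


# Hypothetical annotation with two cells.
truth = [
    [1, 1, 0, 0, 0],
    [1, 1, 0, 2, 2],
    [0, 0, 0, 2, 2],
]
# Hypothetical prediction: cell 1 mostly found, cell 2 missed, cell 3 spurious.
predicted = [
    [1, 1, 0, 0, 0],
    [1, 0, 0, 0, 0],
    [0, 0, 3, 0, 0],
]

tp, fp, fn = score_segmentation(truth, predicted)
print(tp, fp, fn)
```

The false positive and false negative counts correspond to the two failure modes a segmentation method can show: capturing cells that are not there and missing cells that are.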
Sakshi: Wow, that’s fascinating! So what was traditionally a map-making endeavour (of sorts) now uses machine learning, trained on large historical datasets, to do the cell segmentation. That’s such an innovative approach to automating this rather complex problem. I take it that at Strand we have been able to develop a method that uses this technology to show improved results for segmentation. Could you tell me a little more about it?
Ashish: At Strand, we use our in-house segmentation pipeline, which is based on Cellpose3, for spatial imaging. The Cellpose library offers several pretrained models specifically designed to detect cells, each trained on one of nine datasets. Due to data complexity, we have occasionally encountered segmentation results that include false positives. However, we can train the model on new datasets to accurately identify true cells. Our pipeline excels at detecting and analyzing cells within complex images using these trained models. We have rigorously trained our model on a diverse range of images to ensure it accurately captures a high number of cells, even in challenging areas. This level of accuracy is crucial for in-depth cellular studies and enhances our potential to drive significant advancements in cellular research. We actually discussed the pipeline in an earlier post you can check out here.
Fig 3: Some results from the Strand pipeline, showing regions where cells were otherwise missed or false cells were captured.
Sakshi: I see, it sounds like this pipeline can produce some really robust results, being able to segment at a greater scale, and picking up those easy-to-miss cells. What was it that achieved these results, and are we working on improving them further?
Ashish: The Strand pipeline is built on Cellpose3, which specifically focuses on image restoration for improved cell segmentation. Traditional restoration methods rely on linear and nonlinear filtering, while modern approaches employ deep neural networks. However, these networks need a training dataset of paired clean and noisy images to learn how to predict clean images from noisy inputs. By using Cellpose3 we were able to capture the more obscure cells that may have been missed earlier.
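As a small illustration of the traditional filtering mentioned above, the sketch below compares a linear filter (a sliding mean) with a nonlinear one (a sliding median) on a single row of pixels containing one noise spike. The data and the `sliding_filter` helper are hypothetical, and this is not how Cellpose3 restores images.

```python
from statistics import mean, median


def sliding_filter(signal, reducer, width=3):
    """Apply reducer to a window centred on each sample (clipped at the edges)."""
    half = width // 2
    filtered = []
    for i in range(len(signal)):
        window = signal[max(0, i - half): i + half + 1]
        filtered.append(reducer(window))
    return filtered


# One row of pixel intensities with a single bright noise spike.
row = [10, 10, 10, 255, 10, 10, 10]

mean_filtered = sliding_filter(row, mean)      # linear: smears the spike
median_filtered = sliding_filter(row, median)  # nonlinear: removes the spike

print(mean_filtered)
print(median_filtered)
```

The mean spreads the spike across its neighbours, while the median discards it. Neither filter knows what a cell looks like, which is the gap that learned restoration models aim to fill.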
Cellpose has a couple of pre-trained models, all of which were trained with the regions of interest (ROIs) resized to a diameter of 30.0 (diam_mean = 30), except the ‘nuclei’ model which was trained with a diameter of 17.0 (diam_mean = 17).
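The role of `diam_mean` can be shown with simple arithmetic. The sketch below assumes that an image is resized by the ratio of the model's training diameter to the cell diameter measured in the image; the helper names and the example image size are hypothetical.

```python
# Training diameters stated above: 17.0 for 'nuclei', 30.0 for the other models.
DIAM_MEAN = {"nuclei": 17.0}
DEFAULT_DIAM_MEAN = 30.0


def rescale_factor(model_name, cell_diameter_px):
    """Resize factor that brings cells to the diameter the model was trained on."""
    diam_mean = DIAM_MEAN.get(model_name, DEFAULT_DIAM_MEAN)
    return diam_mean / cell_diameter_px


def rescaled_shape(shape, factor):
    """Image shape (rows, cols) after applying the resize factor."""
    return tuple(round(side * factor) for side in shape)


# Cells of about 60 px seen by a model trained at 30 px: shrink by half.
factor = rescale_factor("cyto3", 60.0)
print(factor, rescaled_shape((1024, 1024), factor))

# Nuclei of about 8.5 px seen by the 'nuclei' model: enlarge two-fold.
print(rescale_factor("nuclei", 8.5))
```

A diameter estimate that is far off makes cells appear at a scale the model was not trained on, which is one source of missed or merged cells.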
We are currently looking into training a new model. During training, Cellpose will compute the flow field representation for each mask image. As we are building the model from scratch, we are also training it such that the mean diameter used for rescaling can be set as desired, with the --diam_mean flag (or the diam_mean parameter when defining an instance of the Cellpose model class). Training a model is similar to teaching it how to define cell boundaries based on the available data, such as the diameter of the cell, the channel to be used, cell pixels, and a few other parameters.
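To give a feel for what a flow field representation of a mask is, the sketch below builds a deliberately simplified one. Cellpose derives its flows from a simulated diffusion process inside each mask; here that is swapped for a much simpler stand-in, a unit vector at every cell pixel pointing toward the cell's centroid. The mask and the `centroid_flows` helper are hypothetical.

```python
import math


def centroid_flows(mask):
    """Return {(row, col): (dy, dx)}, unit vectors pointing at each cell's centroid.

    Simplified stand-in for a learned flow target, not Cellpose's own method.
    """
    cells = {}
    for r, row in enumerate(mask):
        for c, label in enumerate(row):
            if label != 0:
                cells.setdefault(label, []).append((r, c))

    flows = {}
    for pixels in cells.values():
        centre_y = sum(r for r, _ in pixels) / len(pixels)
        centre_x = sum(c for _, c in pixels) / len(pixels)
        for r, c in pixels:
            dy, dx = centre_y - r, centre_x - c
            norm = math.hypot(dy, dx)
            # The centre pixel itself has no direction to move in.
            flows[(r, c)] = (dy / norm, dx / norm) if norm > 0 else (0.0, 0.0)
    return flows


# Hypothetical mask with a single 3x3 cell.
mask = [
    [1, 1, 1],
    [1, 1, 1],
    [1, 1, 1],
]

flows = centroid_flows(mask)
print(flows[(0, 1)], flows[(1, 0)], flows[(1, 1)])
```

The idea carried over from the real method is that pixels of one cell all flow toward a common point, so following the vectors groups pixels into cells even when two cells touch.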
The output format will remain the same as what we received from the existing model. The main difference is that by training and implementing our own model, we can capture more true cells compared to the existing model.
Sakshi: Oh wow, so with this newly trained model we’ll be able to really customize the information we want to capture from these images using these additional parameters. This work promises to be really fascinating, and I’ll definitely keep my eyes peeled for further updates. Thanks, Ashish, for taking the time to talk to me.
To learn more about Strand’s work, check out us.strandls.com and reach out to us at bioinformatics@strandls.com.