
DeepSpot: A deep neural network for RNA spot enhancement in single-molecule fluorescence in-situ hybridization microscopy images

Published online by Cambridge University Press:  19 April 2022

Emmanuel Bouilhol*
Affiliation:
CNRS, IBGC, UMR 5095, Université de Bordeaux, Bordeaux, France; Bordeaux Bioinformatics Center, Université de Bordeaux, Bordeaux, France
Anca F. Savulescu
Affiliation:
IDM, Faculty of Health Sciences, University of Cape Town, Cape Town, South Africa
Edgar Lefevre
Affiliation:
Bordeaux Bioinformatics Center, Université de Bordeaux, Bordeaux, France
Benjamin Dartigues
Affiliation:
Bordeaux Bioinformatics Center, Université de Bordeaux, Bordeaux, France
Robyn Brackin
Affiliation:
Advanced Medical Bioimaging CF, Charité—Universitätsmedizin, Berlin, Germany
Macha Nikolski*
Affiliation:
CNRS, IBGC, UMR 5095, Université de Bordeaux, Bordeaux, France; Bordeaux Bioinformatics Center, Université de Bordeaux, Bordeaux, France
*Corresponding authors. E-mails: [email protected], [email protected]

Abstract

Detection of RNA spots in single-molecule fluorescence in-situ hybridization microscopy images remains a difficult task, especially when applied to large volumes of data. The variable intensity of RNA spots combined with the high noise level of the images often requires manual adjustment of the spot detection thresholds for each image. In this work, we introduce DeepSpot, a Deep Learning-based tool specifically designed for RNA spot enhancement that enables spot detection without the need to resort to per-image parameter tuning. We show how our method can enable downstream accurate spot detection. DeepSpot’s architecture is inspired by small object detection approaches. It incorporates dilated convolutions into a module specifically designed for context aggregation for small objects and uses Residual Convolutions to propagate this information along the network. This enables DeepSpot to enhance all RNA spots to the same intensity, and thus circumvents the need for parameter tuning. We evaluated how easily spots can be detected in images enhanced with our method by testing DeepSpot on 20 simulated and 3 experimental datasets, and showed that an accuracy of more than 97% is achieved. Moreover, comparison with an alternative deep learning approach for mRNA spot detection (deepBlink) indicated that DeepSpot provides more precise mRNA detection. In addition, we generated single-molecule fluorescence in-situ hybridization images of mouse fibroblasts in a wound healing assay to evaluate whether DeepSpot enhancement can enable seamless mRNA spot detection and thus streamline studies of localized mRNA expression in cells.

Type
Research Article
Creative Commons
Creative Commons License - CC BY
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted re-use, distribution, and reproduction in any medium, provided the original work is properly cited.
Copyright
© The Author(s), 2022. Published by Cambridge University Press

Impact Statement

Our paper introduces DeepSpot, a Deep Learning-based tool specifically designed to enhance RNA spots, which enables downstream spot detection without the need to resort to per-image parameter tuning. DeepSpot’s architecture is inspired by small object detection approaches: it integrates dilated convolutions into a module specifically designed for context aggregation for small objects and uses Residual Convolutions to propagate this information along the network.

1. Introduction

Single cell microscopy together with RNA single-molecule fluorescence in-situ hybridization (smFISH) technologies allows gene expression profiling at subcellular precision for determining the molecular states of various cell types(1), and that in a high-throughput fashion(2). The repertoire of mRNA expression quantification methods is large and includes smFISH, clamp-FISH, amp-FISH, and multiplexed versions, such as MerFISH, all allowing the localization of RNA at the subcellular level. There are technological differences between these methods in terms of the number of detected RNAs and the number of processed cells; however, all produce imaging data with mRNA spots that can be further matched to the spots’ x and y coordinates. With such increased image acquisition automation and the consequent growing number of high-throughput projects focused on spatially resolved transcriptomics, the need for automated and highly accurate detection of mRNA spots in fluorescent microscopy images has become increasingly important.

smFISH is a method for visualizing individual RNA transcripts in fixed cells. smFISH is based on targeting RNA molecules with a set of 24–48 oligonucleotide probes, each individually labeled with one fluorophore. The combined fluorescence intensity level obtained from this high number of probes makes each RNA transcript visible as a spot that can be computationally identified and quantified(3).

Despite the progress made in recent years, it is still difficult to detect the localization of spots corresponding to different mRNAs in a fully automated manner. First, background intensity is often irregular due to various factors, including autofluorescence, which can be caused by intrinsic cell properties, fixative-induced fluorescence(4), or growth medium and buffers. An additional contributor to the background noise is off-target binding of probes, which depends on a number of parameters, including the length of the transcript, its sequence, and the cell type used, among others. Second, spot detection is affected by the nonhomogeneous intensity distribution and indistinct spot boundaries relative to the background. Moreover, fluorescence in-situ hybridization (FISH) images may have a low signal-to-noise ratio (SNR), and the boundary between background (noise) and signal (spots) is usually not evident(5).

The main drawback of classical mRNA spot detection methods is the requirement for substantial human input to determine the best parameters to handle properties that vary from image to image, such as the SNR and the presence of artifacts. Even small differences in these characteristics lead to the necessity for parameter fine-tuning(6). Besides being time-consuming, the quality of detection largely depends on the capacity of the user to correctly choose the method’s parameters according to each image’s properties (contrast, spots, artifacts, and noise). Some recent deep-learning-based approaches for mRNA spot detection, such as deepBlink(7), try to circumvent this limitation.

Here, we introduce DeepSpot, a convolutional neural network (CNN) method dedicated to the enhancement of fluorescent spots in microscopy images, thus enabling downstream mRNA spot detection by conventional, widely used tools without the need for parameter fine-tuning. With DeepSpot, we show that it is possible to avoid the manual parameter tuning steps by enhancing the signal of all spots so that they have the same intensity throughout all images, regardless of the contrast, noise, or spot shape. All the code as well as the pretrained models are available on GitHub, and a plug-in for the image analysis tool Napari is also distributed (https://github.com/cbib/DeepSpot).

DeepSpot gives a new twist to the residual network (ResNet) architecture and learns to automatically enhance the mRNA spots, bringing them all to the same intensity. In parallel, a multipath network architecture is integrated, trained by minimizing the binary cross-entropy (BCE) while providing context for mRNA spots thanks to atrous convolutions. We evaluated the impact of spot enhancement on the downstream mRNA spot detection by performing spot detection with Icy, using fixed parameters, on both simulated images and manually annotated experimental images. Moreover, we compared the quality of mRNA spot detection from images enhanced by DeepSpot with deepBlink, and have shown that our method achieves greater generalization and effectively handles the full variability of smFISH data. Finally, to illustrate the end-to-end use of DeepSpot in projects where detecting subcellular localization with high precision is essential, we generated smFISH images from mouse fibroblasts in a wound healing assay, where enrichment of $ \beta $ -Actin expression toward the location of the wound is expected in the migrating 3T3 mouse fibroblasts.

2. Related Work

Methodologically, mRNA spot detection is related to the topic of small object detection in image analysis. The goal of spot detection is to find small regions in an image that differ from the surroundings with respect to certain properties, such as brightness or shape; more precisely, regions with at least one local extremum(8). Spots can be considered as a particular case of more or less circular objects of small extent. Object detection has been one of the key topics in computer vision, whose goal is to find the position of objects. However, detection of small objects, such as mRNA spots, remains difficult because of low resolution and limited pixels(9).

In this work, we propose a deep learning network inspired by small object detection approaches for mRNA spot enhancement and we show how it can enable the downstream accurate detection of spots.

2.1. mRNA spot detection

In the case of FISH images, mRNA spots are small, compact, and below the resolution limit of the microscope(10); therefore, images of mRNA spots correspond to the maximum intensity pixel surrounded by the diffraction of the fluorescence signal defined by the point spread function (PSF), which can be modeled by a Gaussian within a disk of small radius (see Figure 1b). This radius depends on several imaging parameters and optical properties of the microscope, such as the diffraction limit or the excitation state of the fluorophore.
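To make this model concrete, here is a minimal illustrative sketch (not code from the paper) that renders a diffraction-limited spot as an isotropic 2D Gaussian; the patch size, sigma, and amplitude values are arbitrary assumptions.

```python
import numpy as np

def gaussian_spot(size=15, sigma=1.5, amplitude=200.0):
    """Render a (size, size) patch with a Gaussian-shaped spot at its center,
    a common approximation of the microscope PSF."""
    ax = np.arange(size) - size // 2
    xx, yy = np.meshgrid(ax, ax)
    return amplitude * np.exp(-(xx ** 2 + yy ** 2) / (2.0 * sigma ** 2))

patch = gaussian_spot()
print(patch.max(), patch[0, 0].round(3))  # bright center, near-zero edges
```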

Figure 1. (a) RNA spots on a noisy background. (b) Spots’ intensity is increased after the enhancement by $ e $ .

While there is no universal solution to the detection of small objects such as mRNA spots in fluorescent cellular images, a large body of work is available on the subject. A number of approaches have gained wide popularity thanks to the development of software tools embedding the algorithms and providing users with a graphical interface. In particular, ImageJ/Fiji(11,12) is widely used, largely due to its plug-in-based architecture, recordable macro language, and programmable Java API. Other popular tools include CellProfiler(13), based on similar paradigms, and FISH-Quant(14). A more recent platform, Icy(15), also provides the possibility to develop new algorithms as well as a user interface for image analysis, including the Icy spot detector for the detection of mRNA spots, a method based on wavelet transform decomposition(10).

Deep learning networks have been introduced for mRNA spot detection(16,17), and more recently deepBlink(7). The latter is a fully convolutional neural network based on the U-Net architecture. deepBlink not only provides the code, but also annotated smFISH data, and implements a threshold-independent localization of spots.

2.2. Detection and enhancement of small objects

Consistent with the difficulty of mRNA spot detection, detection of small objects remains a challenging part of the general object detection problem due to the limited information contained in small regions of interest. For instance, it has been shown that object size has a major impact on the accuracy of Deep Learning object detection networks such as VGG, ResNet, or Inception V2(9). Indeed, small objects do not contain sufficient semantic information(18), and thus the challenge is to capture semantic features while minimizing spatial information attenuation(19).

Expectedly, adding more context improves the detection of small objects(19–21). An elegant solution is to use the dilated convolution (a.k.a. atrous convolution), because the receptive field can be expanded without loss of resolution, capturing additional context without loss of spatial information(22,23).

Of particular interest to our work is signal enhancement, an image processing technique aiming to reinforce the signal only in those regions of the image where the actual objects of interest are, and potentially to weaken the noise or the signal from other structures(24). In our case, the objects of interest are mRNA spots.

Image enhancement is the transformation of one image $ X $ into another image $ e(X) $ (see Figure 1). Pixel values (intensities) of $ X $ at spot locations are modified according to the transformation function $ e $ , with the resulting pixel values in image $ e(X) $ measuring the certainty of mRNA presence at that position. Thus, $ e(X) $ can be considered as a probability map that describes possible mRNA spot locations(24).

Small object enhancement has been developed in fields other than fluorescence imaging of mRNA spots, such as astronomy, to enhance stars or galaxies over the cosmic microwave background(25), or biomedical imaging, to facilitate the human detection of larger spots such as nodules(26). In the microscopy field, Deep-STORM(27) enhances the resolution of single-molecule localization microscopy images using deep learning. However, these images do not have the same characteristics as smFISH data in terms of noise or signal. For example, human nodules are much larger objects than typical mRNA spots. As for the star enhancement method proposed by Sadr et al.(25), it is not suited for low-intensity spots(28), which is a major concern in spot detection for smFISH images.

3. Materials and Methods

3.1. Materials

3.1.1. smFISH datasets with alternatively established spots’ localization

We constructed a dataset $ {\mathrm{DS}}_{\mathrm{exp}} $ of 1,553 images from the experimental smFISH data acquired in Reference (29), by extracting $ 256\times 256 $ pixel patches to better fit in the GPU memory (see Table 1). The authors performed spot detection using image analysis techniques, such as local maximum detection, implemented in the BIG-FISH pipeline(30). Along with the scripts, the authors provided a list of 57 parameter combinations that were used to detect the spots in 57 different image acquisition series of 32 genes used in their study. We ran the pipeline with these parameters and performed an additional manual curation to keep patches with a number of spots between 10 and 150 and remove those with a visually obvious over- or under-identification of spots. The resulting $ {\mathrm{DS}}_{\mathrm{exp}} $ dataset of 1,553 images representing 27 different genes is thus experimentally generated and guarantees high confidence in the ground truth annotation of mRNA spot coordinates.

Table 1. List of datasets used for training (22 training datasets) and evaluation (21 test datasets) of the DeepSpot network. Images in the $ {\mathrm{DS}}_{\mathrm{var}}^i $ datasets have spot intensities between 160 and 220 for each image. For the fixed intensity datasets $ {\mathrm{DS}}_{\mathrm{fixed}}^i $ , the spot intensity is set to one value within [160…220] for a given image, but varies from image to image. Ten variable intensity $ {\mathrm{DS}}_{\mathrm{var}}^i $ and 10 fixed intensity $ {\mathrm{DS}}_{\mathrm{fixed}}^i $ datasets are named according to the number of spots in the images, $ i $ . The dataset $ {\mathrm{DS}}_{\mathrm{hybrid}} $ combines $ {\mathrm{DS}}_{\mathrm{exp}} $ with $ 25\% $ simulated images.

We have also downloaded the dataset $ {\mathrm{DS}}_{\mathrm{dB}} $ from the deepBlink publication(7), composed of 129 smFISH images acquired from four different cell culture conditions of the HeLa-11ht cell line. The authors provided their annotation of mRNA spots’ locations, which was performed using TrackMate and curated by experts.

To estimate the variability of spot intensity, we also calculated the coefficient of variation (CV) of spot intensities for each gene of $ {\mathrm{DS}}_{\mathrm{exp}} $ as well as for the entire $ {\mathrm{DS}}_{\mathrm{exp}} $ and $ {\mathrm{DS}}_{\mathrm{dB}} $ datasets (Supplementary Figure S1). While, for most images and genes, the CV ranges from 0.25 to 0.50, some have a very high CV. The global CV over all spots of all images is 0.72 for the $ {\mathrm{DS}}_{\mathrm{exp}} $ dataset and 0.22 for $ {\mathrm{DS}}_{\mathrm{dB}} $ .
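For reference, the CV is simply the ratio of the standard deviation to the mean of the spot intensities; a minimal sketch, assuming the intensities have already been extracted into an array:

```python
import numpy as np

def coefficient_of_variation(spot_intensities):
    """CV = standard deviation / mean of the spot intensities."""
    x = np.asarray(spot_intensities, dtype=float)
    return x.std() / x.mean()
```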

3.1.2. Novel smFISH dataset from a wound healing assay

To evaluate whether DeepSpot enables precise mRNA spot detection in a biological context, we made use of the wound healing assay(31) to generate a second experimental dataset $ {\mathrm{DS}}_{\mathrm{wound}} $ . In the wound healing assay, migrating cells, such as fibroblasts, are grown on a coverslip and serum starved for synchronization. A scratch mimicking a wound is then generated in the middle of the coverslip, and the cells are induced to polarize and migrate toward the wound to close it, by replacing the serum-starvation medium with medium containing 10% FBS. We used 3T3 mouse fibroblasts in a wound healing assay, followed by cell fixation. Fixed samples were taken for smFISH experiments to visualize and quantify $ \beta $ -Actin mRNA and imaged on a custom-built Nikon Ti Eclipse widefield TIRF microscope. $ \beta $ -Actin has been previously shown to be enriched in neuronal growth cones of extending axons, as well as in the leading edges of migrating cells, and this enrichment has typically been associated with cell polarity and neuronal plasticity(32–34). Based on this, we hypothesized that $ \beta $ -Actin would be enriched in the leading edge of migrating 3T3 fibroblasts. The dataset is composed of 96 images: 48 images of nonmigrating cells (control) and 48 images of migrating cells. Each image was divided into four patches, yielding a total of 384 patches of $ 256\times 256 $ pixels.

3.1.3. Simulated and hybrid datasets

In addition to the experimental datasets, we have built 20 simulated datasets with images of $ 256\times 256 $ pixels, the same size as the patches of $ {\mathrm{DS}}_{\mathrm{exp}} $ . Briefly, the background was generated by a combination of Poisson noise and Perlin noise with a random intensity between [80, 150]. Elastic transformations were added to this noise to approximate the variety of textured background noise in the experimental images. Spots were generated as circles, randomly placed in the image, and then convolved with a Gaussian function that approximates the PSF. Their size randomly ranges from 4 to 9 pixels in diameter, including Gaussian smoothing.
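A simplified sketch of such a generator is given below. It is illustrative only, not the authors’ code: Perlin noise is approximated here by a Gaussian-blurred random field, the elastic transformations are omitted, and the noise amplitudes are assumptions.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

rng = np.random.default_rng(0)

def simulate_image(n_spots=50, size=256, bg_level=100, spot_intensity=200):
    # Textured background: Poisson noise plus a smooth low-frequency field
    # standing in for Perlin noise.
    background = rng.poisson(bg_level, (size, size)).astype(float)
    background += gaussian_filter(rng.normal(0, 30, (size, size)), sigma=16)
    # Spots: small disks at random positions, blurred to approximate the PSF.
    spots = np.zeros((size, size))
    coords = rng.integers(8, size - 8, (n_spots, 2))
    for r, c in coords:
        radius = rng.integers(2, 5)  # diameters of roughly 4-9 px after blur
        rr, cc = np.ogrid[-radius:radius + 1, -radius:radius + 1]
        disk = (rr ** 2 + cc ** 2 <= radius ** 2) * float(spot_intensity)
        spots[r - radius:r + radius + 1, c - radius:c + radius + 1] += disk
    spots = gaussian_filter(spots, sigma=1.0)
    return np.clip(background + spots, 0, 255), coords
```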

Two different types of simulated datasets were generated, $ {\mathrm{DS}}_{\mathrm{fixed}} $ and $ {\mathrm{DS}}_{\mathrm{var}} $ , each containing 10 datasets defined according to the number of spots per image $ i\in \left\{10,20,\dots, 100\right\} $ (see Table 1). For example, in the $ {\mathrm{DS}}_{\mathrm{fixed}}^{20} $ dataset, each image contains 20 spots, and in the $ {\mathrm{DS}}_{\mathrm{var}}^{70} $ dataset, each image contains 70 spots. Each image in the $ {\mathrm{DS}}_{\mathrm{fixed}} $ dataset has the same fixed spot intensity for all spots, randomly chosen in the interval [160, 220], whereas in the $ {\mathrm{DS}}_{\mathrm{var}} $ dataset, the spot intensity is randomly chosen from the same interval for each spot, resulting in images with variable spot intensity.

In addition to the experimental and simulated datasets, we have built a hybrid dataset $ {\mathrm{DS}}_{\mathrm{hybrid}} $ where the experimental data from $ {\mathrm{DS}}_{\mathrm{exp}} $ are augmented by appending an additional $ 25\% $ of simulated images generated with both variable spots’ intensity and variable spots’ number per image, within the [160, 220] and [10, 100] intervals, respectively. The intensity of the spots is calculated with respect to the intensity of the background noise, so that the generated images have an SNR between 10 and 40. These values correspond to the minimum and median SNR values in our experimental images, respectively (Supplementary Figure S2).

3.1.4. Ground truth

Since the goal of our network is to learn to transform an image $ X $ into $ e(X) $ , where intensity at spots’ location is enhanced, the training step has to be provided with the enhanced counterpart of each image in the training set. That is, training sets include pairs of images $ \left\langle X,e(X)\right\rangle $ where $ e $ is the procedure that is used to produce ground truth enhanced images: for each spot (Figure 1a), the ground truth enhancement procedure $ e $ is applied at the spots’ locations $ A(X) $ , resulting in images where spots are enhanced as shown in Figure 1b.

In this work, we implement $ e $ as a kernel of $ 3\times 3 $ pixels at all locations where the spots were annotated in the experimental dataset $ {\mathrm{DS}}_{\mathrm{exp}} $ or generated for $ {\mathrm{DS}}_{\mathrm{fixed}} $ , $ {\mathrm{DS}}_{\mathrm{var}} $ , and $ {\mathrm{DS}}_{\mathrm{hybrid}} $ (see Table 1). The kernel has the same pixel values for all the spots, in order to drive the network to learn to enhance all spots up to the same level of intensity, regardless of the initial intensity in the acquired data. The enhancement kernel of DeepSpot is smaller than the smallest spot size in our datasets; therefore, it is not expected to increase the spot size. This is particularly important for spatially close spots. Moreover, the background is kept the same between $ X $ and $ e(X) $ , so the transformation does not affect it.
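A minimal sketch of this ground-truth construction (the kernel value of 255 is an assumption consistent with the output normalization described in Section 3.2.1):

```python
import numpy as np

def make_ground_truth(image, spot_coords, value=255):
    """Stamp a fixed-value 3x3 kernel at each annotated spot location,
    leaving the background of the input image untouched."""
    target = image.astype(float).copy()
    for r, c in spot_coords:
        target[max(r - 1, 0):r + 2, max(c - 1, 0):c + 2] = value
    return target
```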

3.2. Method

In this section, we present the DeepSpot enhancement network in detail. We first overview the network architecture and then we discuss the custom loss function.

3.2.1. Network architecture

The DeepSpot network is composed of two main components, as shown in Figure 2. The first component is a multipath network, shown in Panel A. The second component is an adapted residual network, shown in Panel B.

Figure 2. DeepSpot network architecture is composed of the context aggregation for small objects module constituted of a multipath network (Panel A) and a customized ResNet component (Panel B). A custom loss function is used for training the network.

Context aggregation module

As pointed out in Reference (22), finding small objects is fundamentally more challenging than finding large ones, because the signal is necessarily weaker. To solve this problem in the context of mRNA spot detection, we developed a new module that we call the context aggregation for small objects (CASO) module. Hu and Ramanan(35) demonstrated that using image evidence beyond the object extent (context) always enhances small object detection results, and we therefore designed our CASO module to aggregate context around the mRNA spots. The CASO module is a multipath network, as shown in Panel A of Figure 2. It takes the input image and processes it along three different paths, each with different types of convolution blocks collecting specific information from the input image. Each path contains three convolution blocks.

  1. The first path is composed of traditional convolution blocks (2D convolution, batch normalization, activation, and max pooling). These blocks, often used in CNNs, are particularly efficient at reinforcing the semantic information at the expense of spatial information.

  2. The second path uses only 2D convolutions, batch normalization, and activation. As max pooling is known to keep mostly the maximum intensities in images, some of the faint spots may be eliminated during the max pooling operation. In this path, we therefore used strided 2D convolutions instead of a max pooling layer, to preserve the information from low-intensity spots. To obtain the same receptive field as in the first path, we set the stride to 2.

  3. The third path makes use of atrous convolution pooling(22), implemented as a 2D convolution with a dilation rate of 2. The following layers are batch normalization, activation, and max pooling.

The CASO module is a multipath neural network and can learn more comprehensive and complementary features than a single path. In particular, the goal of the atrous convolution is to bring more context around the small spots (see Section 2.2), while the two other paths aggregate the semantic information of the bright spots and the faint spots for the first and second paths, respectively. The results of the three encoding paths are then concatenated to construct a longer feature vector containing the information extracted by each path. For all convolutional blocks, the activation function is the rectified linear unit (ReLU). The numbers of filters for the 2D convolutions in the CASO module are 32, 64, and 128 for the first, second, and third paths, respectively.
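The sketch below illustrates this three-path structure in TensorFlow/Keras, following the description above; kernel sizes, padding, and the placement of pooling are assumptions, and the reference implementation lives in the DeepSpot repository.

```python
import tensorflow as tf
from tensorflow.keras import layers

def caso_module(x):
    # Path 1: conv -> batch norm -> ReLU -> max pooling (semantic information).
    p1 = x
    for _ in range(3):
        p1 = layers.Conv2D(32, 3, padding="same")(p1)
        p1 = layers.BatchNormalization()(p1)
        p1 = layers.ReLU()(p1)
        p1 = layers.MaxPooling2D(2)(p1)
    # Path 2: strided convolutions instead of max pooling, to keep faint spots.
    p2 = x
    for _ in range(3):
        p2 = layers.Conv2D(64, 3, strides=2, padding="same")(p2)
        p2 = layers.BatchNormalization()(p2)
        p2 = layers.ReLU()(p2)
    # Path 3: dilated (atrous) convolutions for extra context, then pooling.
    p3 = x
    for _ in range(3):
        p3 = layers.Conv2D(128, 3, dilation_rate=2, padding="same")(p3)
        p3 = layers.BatchNormalization()(p3)
        p3 = layers.ReLU()(p3)
        p3 = layers.MaxPooling2D(2)(p3)
    # All three paths end at the same spatial resolution and are concatenated.
    return layers.Concatenate()([p1, p2, p3])

inputs = tf.keras.Input(shape=(256, 256, 1))
features = caso_module(inputs)  # shape (None, 32, 32, 224)
```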

Custom ResNet

For the second component (Panel B of Figure 2), we customized the ResNet architecture to create a residual neural network composed of 10 consecutive convolutional residual blocks (ResBlocks), using the full pre-activation blocks described in Reference (36). The authors attribute the better results of full pre-activation blocks to the pre-activation by batch normalization, which improves the regularization of the model because the inputs to all weight layers have been normalized. Each ResBlock is composed of three subblocks, as shown in Figure 3. A subblock consists of a batch normalization followed by an activation (ReLU) and a 2D convolution. After the three subblocks, a spatial dropout layer with a rate of 0.2 is applied. Each ResBlock ends with the residual connection.

Figure 3. Full pre-activation residual block, composed of batch normalization, activation, and convolution, repeated three times before dropout and residual connection.

To obtain an output image with the same size as the input image, we used a particular type of convolutional block. Recently, Wang et al.(37) demonstrated that the use of up residual blocks (UpResBlocks) instead of classic up-convolution blocks improves the performance of generative networks by preserving the effective features from the low-dimensional feature space to the high-dimensional feature space. Our decoding path is composed of three UpResBlocks and reconstitutes an output with the same size as the input, while propagating low-dimensional feature information of the enhanced spots from the custom ResNet to the last layer. Each UpResBlock consists of three subblocks containing a 2D transposed convolution, batch normalization, and activation (ReLU). The three subblocks are followed by a spatial dropout layer with a rate of 0.2. UpResBlocks then end with a residual connection. A sigmoid activation function is applied to the last convolution, so that all the pixels have values in the $ \left[0,1\right] $ interval. The final image is obtained by normalizing the pixel intensities between 0 and 255.
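Continuing the sketch above, the two residual block types could look as follows; the filter count of 128 comes from the hyperparameters reported in Section 4, and performing the 2x upsampling in the first subblock of each UpResBlock is an assumption.

```python
import tensorflow as tf
from tensorflow.keras import layers

def res_block(x, filters=128):
    """Full pre-activation ResBlock: three (BN -> ReLU -> Conv) subblocks,
    spatial dropout, then the residual connection. Assumes the input already
    has `filters` channels so the addition is shape-compatible."""
    y = x
    for _ in range(3):
        y = layers.BatchNormalization()(y)
        y = layers.ReLU()(y)
        y = layers.Conv2D(filters, 3, padding="same")(y)
    y = layers.SpatialDropout2D(0.2)(y)
    return layers.Add()([x, y])

def up_res_block(x, filters):
    """UpResBlock: three (transposed conv -> BN -> ReLU) subblocks, spatial
    dropout, and a residual connection through an upsampled shortcut."""
    shortcut = layers.Conv2DTranspose(filters, 1, strides=2, padding="same")(x)
    y = x
    for i in range(3):
        strides = 2 if i == 0 else 1  # upsample once per block
        y = layers.Conv2DTranspose(filters, 3, strides=strides, padding="same")(y)
        y = layers.BatchNormalization()(y)
        y = layers.ReLU()(y)
    y = layers.SpatialDropout2D(0.2)(y)
    return layers.Add()([shortcut, y])
```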

3.2.2. Loss

We defined our custom loss function as a combination of BCE and mean squared error (MSE) functions. The main term of the loss function is the BCE loss, defined by $ {\mathrm{\mathcal{L}}}_{\mathrm{BCE}}\left(x,\hat{x}\right) = -\left(x\log \left(\hat{x}\right)+\left(1-x\right)\log \left(1-\hat{x}\right)\right) $ , which measures the difference between the images predicted by the network $ \hat{x} $ and the ground truth images $ x $ . While mostly used for classification, it can also be used for segmentation and enhancement due to its performance for pixel-level classification(38).

To this main $ {\mathrm{\mathcal{L}}}_{\mathrm{BCE}} $ term, we added a regularization term defined by the MSE, $ {\mathrm{\mathcal{L}}}_{\mathrm{MSE}} = \frac{1}{n}{\sum}_{i=1}^n{\left(\max \left({x}_i\right)-\max \left({\hat{x}}_i\right)\right)}^2 $ , computed between the maximum value of the predicted image $ \max \left({\hat{x}}_i\right) $ and the maximum value of the ground truth image $ \max \left({x}_i\right) $ . This regularization drives the network to produce spots whose intensity is close to 255 (see Table 2), and therefore standardizes the signal enhancement intensity in the output images, which in turn facilitates the downstream automatic detection of the spots. The total loss function is $ {\mathrm{\mathcal{L}}}_{\mathrm{BCE}}+{\mathrm{\mathcal{L}}}_{\mathrm{MSE}} $ .
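A minimal sketch of this loss in TensorFlow, assuming images normalized to [0, 1] as produced by the sigmoid output described above (the reduction choices are assumptions):

```python
import tensorflow as tf

def deepspot_loss(y_true, y_pred):
    """Pixel-wise BCE plus an MSE regularizer on the per-image maximum."""
    bce = tf.reduce_mean(tf.keras.losses.binary_crossentropy(y_true, y_pred))
    # Compare the brightest pixel of each predicted image with the brightest
    # pixel of the corresponding ground-truth image.
    max_true = tf.reduce_max(y_true, axis=[1, 2, 3])
    max_pred = tf.reduce_max(y_pred, axis=[1, 2, 3])
    mse = tf.reduce_mean(tf.square(max_true - max_pred))
    return bce + mse
```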

Table 2. Spot enhancement performance in terms of resulting spot intensity. The measures displayed correspond to the spot intensity between [0, 255] after enhancement by the neural network and averaged by category of models and datasets. Between brackets are shown the $ 95\% $ confidence intervals. Model categories are listed in rows, whereas columns correspond to the dataset categories on which the different models were applied.

4. Results

We trained the DeepSpot network on our 20 simulated training datasets $ {\mathrm{DS}}_{\mathrm{fixed}} $ and $ {\mathrm{DS}}_{\mathrm{var}} $ as well as on the experimental and hybrid datasets $ {\mathrm{DS}}_{\mathrm{exp}} $ and $ {\mathrm{DS}}_{\mathrm{hybrid}} $ , resulting in 22 models $ {\mathrm{M}}_{\mathrm{fixed}} $ , $ {\mathrm{M}}_{\mathrm{var}} $ , $ {\mathrm{M}}_{\mathrm{exp}} $ , and $ {\mathrm{M}}_{\mathrm{hybrid}} $ . Training parameters were optimized with the HyperOpt algorithm(39) and the ASHA scheduler(40). The best configuration, obtained and used for further trainings, had a learning rate of 0.0001, a dropout rate of 0.2, a batch size of 32, and 128 filters per convolution.

Each of the resulting 22 models was evaluated on the 21 test datasets from Table 1, yielding 462 enhanced test datasets (8,100 images in total). To assess whether DeepSpot enhancement enables easy spot detection, we applied the Icy spot detector(10) to the images enhanced by the different models. Moreover, we defined a unique set of Icy parameters that matches the shape and intensity of the enhancing kernel of the DeepSpot network: scale 3 and sensitivity 20, and scale 7 and sensitivity 100. We then evaluated whether the detected spots from the enhanced images matched well with the annotated ground truth of spots’ coordinates.

4.1. Evaluation procedure

We denote by $ D(X) = \left\{{p}_1,\dots, {p}_n\right\} $ the point pattern detected by Icy from the enhancement of an image $ X $ by DeepSpot and by $ A(X) = \left\{{q}_1,\dots, {q}_m\right\} $ the ground truth annotation of spot coordinates. Notice that $ m $ is not necessarily equal to $ n $ , corresponding to under- or over-detection, and even for a well-detected spot, the coordinates in $ A $ and $ D $ may slightly differ. To account for this, we used the $ k $ -d tree algorithm(41) to query the detection $ D $ for the nearest neighbors of the points in $ A $ , as proposed in References (42,43). The number of neighbors was set to 1 and the matching radius $ t $ to 3 (coherent with the enhancement kernel for ground truth images).

This allows us to establish a matching for all annotated points under $ t=3 $ and thus also defines the number of False Negatives and False Positives, corresponding to the missing matches from $ D(X) $ and $ A(X) $ , respectively. True Negatives are defined by all pixels $ p $ of the confusion matrix such that $ p\in X\backslash \left\{A(X)\cup D(X)\right\} $ . However, since the background vastly outnumbers the spots, that is, $ \mid X\backslash \left\{A(X)\cup D(X)\right\}\mid \gg \mid A(X)\cup D(X)\mid $ , the TN values are inflated, which makes measures such as accuracy, AUC, and the ROC curve irrelevant.

The drawback of matching $ m $ versus $ n $ points is the possibility of ambiguous matching. With the $ k $ -d tree approach, this happens when two points $ {q}_i,{q}_j\in A $ match to one $ {p}_k\in D $ (see Figure 4). This can occur if the annotated spots $ {q}_i,{q}_j $ are close and the detected matching point for both of them, $ {p}_k $ , lies within the same distance $ t $ , which can correspond to an over-enhancement and thus blurring between the two spots in the enhanced image. While alternative solutions such as the Linear Assignment Problem can be used, they do not avoid the problem of matching two different numbers of points. The $ k $ -d tree approach has the advantage of keeping the ambiguous matches (AMs) explicit, so that this effect can be measured. We thus also report the number of AMs.
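A minimal sketch of this evaluation using SciPy’s cKDTree as a stand-in for the k-d tree (the tie and ambiguity handling here is a simplification of the procedure described above):

```python
import numpy as np
from scipy.spatial import cKDTree

def match_spots(detected, annotated, t=3.0):
    """Match annotated spots A to detected spots D with a 1-nearest-neighbor
    k-d tree query and matching radius t; return precision, recall, F1, and
    the number of ambiguous matches (AMs)."""
    tree = cKDTree(detected)                # build the tree on D(X)
    dist, idx = tree.query(annotated, k=1)  # query each point of A(X)
    matched = dist <= t
    tp = int(matched.sum())
    fn = len(annotated) - tp                           # annotations left unmatched
    fp = len(detected) - len(np.unique(idx[matched]))  # detections never matched
    # Ambiguous matches: one detected spot claimed by several annotations.
    counts = np.bincount(idx[matched], minlength=len(detected))
    ams = int((counts > 1).sum())
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1, ams
```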

Figure 4. Spot matching by 1-neighbor $ k $ -d tree between the detected mRNA spots $ \left\{{p}_1,\dots, {p}_9\right\} $ depicted in blue and annotated spots $ \left\{{q}_1,\dots, {q}_7\right\} $ depicted in red. The $ k $ -d tree construction for $ \left\{{p}_1,\dots, {p}_9\right\} $ is shown on the left. Using the matching radius depicted by circles, the $ k $ -d tree queries for $ {q}_1 $ and $ {q}_2 $ , shown in red, lead to the same leaf $ {p}_2 $ and correspond to an ambiguous match, while the query for $ {q}_3 $ leads to a unique match. mRNA spots $ {p}_1,{p}_4 $ , and $ {p}_8 $ are the False Positives.

4.2. DeepSpot enhances the mRNA spot signal to the same intensity

To avoid the manual selection of the detection threshold, it is imperative to have a homogeneous spot intensity across the whole dataset, so that a unique set of parameters can be used for all images. Table 2 summarizes the intensities obtained after enhancement by DeepSpot for each category of datasets described in Table 1 (experimental, hybrid, and simulated with variable and fixed intensities).

As expected, intensities were closer to 255 when the training and test datasets belonged to the same category. For example, models trained on data with fixed intensities $ {\mathrm{M}}_{\mathrm{fixed}} $ and applied to data with fixed intensities $ {\mathrm{DS}}_{\mathrm{fixed}} $ produced enhanced spot intensities very close to 255. Similarly, enhancement close to 255 could be observed when models $ {\mathrm{M}}_{\mathrm{var}} $ were evaluated on $ {\mathrm{DS}}_{\mathrm{var}} $ . Of particular interest are the enhancement results from the $ {\mathrm{M}}_{\mathrm{exp}} $ training, which are close to 250 for every dataset, experimental or simulated. The hybrid model enhances intensities to 241 on the experimental dataset. In general, the enhanced spot intensities were between 241 and 255, a variation of only 5.4% from the maximum intensity, which is sufficient to fully separate smFISH spots from the background in the enhanced images.

4.3. DeepSpot enables accurate mRNA spot detection

The summary statistics of the performance of each model type are reported in Table 3, including the mean F1-score, precision, recall, and AMs, with 95% confidence intervals. For a given model $ M $ , each metric was computed for all enhanced images (8,100 in total). The mean metric value $ \overline{x} $ and the 95% confidence interval $ C $ were then calculated separately for each model type. Due to the high prevalence of True Negatives, instead of the accuracy measure, we calculated the F1-score, which does not include True Negatives and therefore indicates model accuracy with a better balance between classes. We compared the F1-scores for each of the 14 genes present in the test dataset; no major difference in DeepSpot performance can be observed for these genes (Supplementary Figure S3). This consistency in performance across different SNR values (Supplementary Figure S2) shows that DeepSpot has the capacity to enhance spots in images with varying characteristics.

Table 3. Models’ performance per model type. Metrics (F1-score, precision, recall, and ambiguous matches [AMs]) were calculated by averaging the values obtained for each image of the 21 test datasets. Top values in cells correspond to the mean value, whereas bottom values between brackets show the 95% confidence interval. Best values are highlighted in bold.

The results in Table 3 indicate that the number of FP is very low for all models, given that both precision and recall are high. Importantly, $ {\mathrm{M}}_{\mathrm{hybrid}} $ showed the best overall performance in terms of precision, recall, and F1-score, indicating that the mRNA spot enhancement by $ {\mathrm{M}}_{\mathrm{hybrid}} $ leads to the lowest FP and FN counts in the downstream mRNA spot detection.

Figure 5 shows the mean F1-scores for each of the 22 models evaluated on the 21 datasets. This heat map indicates that (a) models trained on the $ {\mathrm{DS}}_{\mathrm{fixed}}^i $ training sets perform better on the corresponding $ {\mathrm{DS}}_{\mathrm{fixed}}^i $ test sets rather than on variable intensity test sets, and (b) models trained on $ {\mathrm{DS}}_{\mathrm{var}}^i $ training sets show better performance on the corresponding test sets rather than on fixed intensity test sets. It also shows that $ {\mathrm{M}}_{\mathrm{var}} $ models are globally better than $ {\mathrm{M}}_{\mathrm{fixed}} $ models. A plausible hypothesis is that training on variable intensities makes models better at generalizing on other data. Finally, $ {\mathrm{M}}_{\mathrm{hybrid}} $ is the model that has the best overall performance, including the experimental dataset. Again, the diversity of training data drives this model to be more robust to newly encountered data.

Figure 5. Heat map of the F1-scores obtained by each of the 22 models when evaluated on the 21 test datasets described in Table 1.

More generally, given that the F1-score is above 90% for all the models, we can conclude that the architecture of the DeepSpot neural network is particularly suited for the task of mRNA spot enhancement.

4.4. DeepSpot enables more accurate spot detection compared with deepBlink

The main objective of our DeepSpot method is to circumvent parameter fine-tuning and to enable downstream spot detection with a unique parameter set. As such, this objective fits well with the one pursued by the authors of deepBlink(7), despite the fact that the latter proposes a new spot detection method, while our goal is to fit a spot enhancement step into commonly used workflows. Consequently, deepBlink constitutes a relevant comparison target.

We compared the accuracy of spot detection by the model $ {\mathrm{M}}_{\mathrm{dB}} $ made available on the deepBlink GitHub repository with that of DeepSpot when trained on hybrid data $ {\mathrm{M}}_{\mathrm{hybrid}} $ , both on our datasets as well as on the dataset provided by the authors of deepBlink, $ {\mathrm{DS}}_{\mathrm{dB}} $ . Table 4 shows the F1-scores for each dataset category. It should be noted that deepBlink’s precision is close to 99%, while its recall is very low for certain datasets (Supplementary Figure S4), meaning that the main drawback of deepBlink is in terms of false negatives (missed true spots). We illustrate this in Figure 6, where we provide a comparison of deepBlink and DeepSpot on both experimental and simulated data examples.

Table 4. Models’ performance for deepBlink and DeepSpot for smFISH spot detection. Overall F1-scores are calculated by averaging the values obtained for each image of the test datasets corresponding to each dataset category. Top values in each cell correspond to the mean value, whereas bottom values between brackets show the 95% confidence interval. Best values are highlighted in bold.

Figure 6. Examples of the results obtained with DeepSpot and deepBlink on the experimental test datasets DSexp (first row) and deepBlink (second row), and the simulated spots with fixed (DSfixed) and variable (DSvar) intensity datasets for the third and fourth rows, respectively. Colored circles indicate where the spots were detected by DeepSpot and deepBlink (blue and green, respectively). In the ground truth column, pink circles indicate the spots that were previously annotated as ground truth by alternative methods. The last two columns show the magnification at the positions indicated by the colored rectangles for DeepSpot and deepBlink, respectively.

DeepSpot clearly outperformed deepBlink on our datasets (Table 4). However, we found that $ {\mathrm{M}}_{\mathrm{dB}} $ performed better on experimental data than on simulated data, presumably because the deepBlink model has only been trained on experimental data. $ {\mathrm{M}}_{\mathrm{hybrid}} $ results were consistent on all of our datasets. We have also applied our $ {\mathrm{M}}_{\mathrm{hybrid}} $ model to the smFISH test dataset $ {\mathrm{DS}}_{\mathrm{dB}} $ provided by the authors of deepBlink. The results reported in Table 4 for the detection of spots by $ {\mathrm{M}}_{\mathrm{dB}} $ from the images of the $ {\mathrm{DS}}_{\mathrm{dB}} $ dataset are those obtained by the authors and originally reported in Reference (7). Not surprisingly, the deepBlink model performs better on their own smFISH images; however, DeepSpot managed to achieve an F1-score of nearly 88%, a noticeable achievement since the model has not been trained on the $ {\mathrm{DS}}_{\mathrm{dB}} $ data. Together, the results of Table 4 indicate that DeepSpot is a robust methodology that offers a generalist model for mRNA spot enhancement and ensures high-quality downstream spot detection without parameter tuning.

4.5. DeepSpot’s use in an end-to-end smFISH experiment

To evaluate whether DeepSpot can be effectively used in an end-to-end smFISH experiment, we have performed a wound healing assay in which cells migrate toward a wound to close it. To investigate whether $ \beta $ -Actin was enriched at the leading edges of 3T3 migrating fibroblast cells (see Section 3.1.2), the wound location was manually annotated as shown in a typical image example in Panel A of Figure 7.

Figure 7. Processing steps for the end-to-end wound healing assay. Panel A shows a typical smFISH image of the wound healing assay. Panel B shows the cell quantization procedure in three sections and the direction of the wound (red arrow). $ {\mathrm{S}}_{\mathrm{wound}} $ section is oriented toward the wound and is shown in light blue, and other sections are in dark blue. Panel C represents how the cell and nucleus were manually segmented and mRNA spots were counted after enhancement within the cytoplasmic portion of each section. Panel D presents the Wound Polarity Index (WPI) of cytoplasmic mRNA transcripts in $ {\mathrm{S}}_{\mathrm{wound}} $ compared to other sections for the $ \beta $ -Actin RNA. WPI was calculated in migrating and nonmigrating cells. The bars correspond to the median and the error bars to the standard deviation from the median for 100 bootstrapped WPI estimates.

All the cell images were segmented manually, yielding cell and nucleus masks. The wound location defined the cell migration direction, as schematically shown in Panel C of Figure 7. We partitioned the cell masks into three 120° sections centered at the nucleus centroid, anchoring one of these sections, $ {\mathrm{S}}_{\mathrm{wound}} $ , as oriented toward the wound location, that is, spanning 60° to the left and to the right of the line between the nucleus centroid and the wound location. This partition allowed us to compute the normalized number of detected mRNA spots in the cytoplasmic part of each section $ {S}_1,{S}_2,{S}_3 $ .
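A sketch of this sector assignment (the coordinate and angle conventions here are assumptions; the DypFISH framework used below provides the authors’ implementation):

```python
import numpy as np

def sector_of_spots(spots, nucleus_centroid, wound_point):
    """Assign each mRNA spot to one of three 120-degree sections around the
    nucleus centroid; section 0 (S_wound) spans +/-60 degrees around the
    direction of the wound."""
    v = np.asarray(spots, dtype=float) - nucleus_centroid
    spot_angles = np.arctan2(v[:, 1], v[:, 0])
    w = np.asarray(wound_point, dtype=float) - nucleus_centroid
    wound_angle = np.arctan2(w[1], w[0])
    # Angle of each spot relative to the wound direction, wrapped to [-pi, pi).
    rel = (spot_angles - wound_angle + np.pi) % (2 * np.pi) - np.pi
    return np.where(np.abs(rel) <= np.pi / 3, 0, np.where(rel > 0, 1, 2))
```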

Using the DypFISH framework(44), we further compared the cytoplasmic mRNA relative density in $ {\mathrm{S}}_{\mathrm{wound}} $ (light blue) and in the sections that are not oriented toward the wound (dark blue), as shown in Panel B of Figure 7. We used the Polarity Index(44), which measures the enrichment of mRNA in different sections. Briefly, the Polarity Index measures how frequently the relative concentration within the wound section is higher than in the non-wound sections. The Polarity Index lies in $ \left[-1,1\right] $ : a positive value implies a wound-correlated enrichment of RNA transcripts, whereas a negative value implies enrichment away from the wound, and a value of zero implies no detectable enrichment.
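One plausible reading of this definition, sketched below, is to take the fraction $ f $ of cells whose relative density in $ {\mathrm{S}}_{\mathrm{wound}} $ exceeds that of the other sections and rescale it to $ \left[-1,1\right] $ as $ 2f-1 $; the exact DypFISH formula may differ (see Reference (44)).

```python
import numpy as np

def wound_polarity_index(density_wound, density_other):
    """density_wound, density_other: per-cell cytoplasmic mRNA densities in
    S_wound and in the non-wound sections, respectively."""
    dw = np.asarray(density_wound, dtype=float)
    do = np.asarray(density_other, dtype=float)
    f = np.mean(dw > do)   # frequency of wound-oriented enrichment
    return 2.0 * f - 1.0   # rescale [0, 1] -> [-1, 1]
```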

$ \beta $ -Actin mRNA was highly enriched in the leading edge of migrating cells, whereas almost no detectable enrichment of $ \beta $ -Actin was found in the leading edge of control cells. This is in line with previously published data showing enrichment of $ \beta $ -Actin in the leading edges of migrating fibroblasts(45).

5. Discussion and Conclusion

Recent FISH microscopy methods are capable of generating thousands of images, and it has thus become imperative to introduce algorithms capable of streamlining the detection of mRNA spots, and in particular of avoiding the manual fine-tuning of numerous parameters.

5.1. Limitation

The impact of nonspecifically bound probes was not evaluated in this work. Consequently, even though DeepSpot is trained to distinguish between multiple signal intensity variations, in cases where the signal of the nonspecifically bound probes is too similar in shape and intensity to the signal of the specifically bound probes, DeepSpot will not be able to enhance only the spots corresponding to specifically bound probes. This issue, common to all detection methods, deserves further investigation.

Moreover, it remains to be investigated whether DeepSpot would be suitable for more complex systems such as tissues, and what adaptations of its architecture and training data such an improvement would require. In addition, even though the DeepSpot architecture was designed to be easily extensible to 3D by changing the 2D convolution layers to 3D layers in TensorFlow, the performance of DeepSpot in 3D has not been evaluated.

5.2. Discussion

In this work, we introduced DeepSpot, a novel CNN architecture specifically designed to enhance RNA spots in FISH images, thus enabling the downstream use of well-known spot detection algorithms, such as the Icy spot detector, without parameter tuning. In particular, the architecture of our network introduces the CASO module, which relies on dilated (atrous) convolutions to provide more context for the enhancement of small objects corresponding to mRNA spots. The DeepSpot network has been trained and tested on 20 simulated datasets, all with different signal and noise characteristics, as well as on a previously published experimental dataset that was annotated for spot locations. We have shown that (a) our approach achieves better performance when the training is performed on data with highly variable intensity and (b) performing training on a combination of experimental and simulated data is a viable approach in a real-life setting.

Furthermore, we compared the performance of combining DeepSpot and Icy to that of the state-of-the-art deep-learning-based method deepBlink and have shown that, on average, DeepSpot enables a substantially better detection of mRNA spots than deepBlink. We found that the DeepSpot/Icy workflow provided excellent quality spot detection on the test datasets corresponding to the datasets on which it had been trained, with an average F1-score above 97%, but also achieved high-precision results on fully unknown datasets, with an F1-score of 88% for the datasets provided with the deepBlink publication. Taken together, the good results on both known and unknown data indicate that DeepSpot is a more generalist model than deepBlink and that it achieves a good balance between overfitting and underfitting. We hypothesize that this generalization capacity is due to both the strong regularization within the network and the diversity of signal provided by the carefully constructed training data.

To evaluate how well our method is suited for end-to-end biological investigations, we have shown the efficiency of the DeepSpot model trained on the combination of experimental and simulated data in the context of an independent study of cell migration. We performed smFISH to detect $ \beta $ -Actin in mouse fibroblasts in a wound healing assay and enhanced the resulting images using our combination model, which allowed us to detect that the $ \beta $ -Actin mRNA enrichment is specific to the leading edge of migrating cells, in contrast to its expression in nonmigrating cells.

To conclude, we have shown that DeepSpot enhancement enables automated detection and accurate localization of mRNA spots for downstream analysis methods and can thus be a useful tool to streamline not only spot detection, but also studies of localized mRNA enrichment within cells.

Acknowledgment

We thank Dr. Arthur Imbert for sharing the experimental smFISH data and helping to run the BIG-FISH pipeline.

Competing Interests

The authors declare no competing interests exist.

Authorship Contributions

Conceptualization: E.B. and M.N.; Data curation: A.F.S.; Investigation: E.B. and M.N.; Methodology: E.B. and M.N.; Resources: A.F.S. and R.B.; Software: E.B., E.L., and B.D.; Supervision: M.N.; Validation: A.F.S., E.L., B.D., and M.N.; Visualization: E.B.; Writing: E.B., A.F.S., and M.N. All authors approved the final submitted draft.

Funding Statement

This research received no specific grant from any funding agency, commercial, or not-for-profit sectors.

Data Availability Statement

The DeepSpot network along with the code for training and for mRNA spot enhancement is fully open source and available on GitHub at https://github.com/cbib/DeepSpot. The pretrained models used in this study are also available on our GitHub page, as well as the Napari plug-in. Data for the simulated images of $ {\mathrm{DS}}_{\mathrm{var}} $ , $ {\mathrm{DS}}_{\mathrm{fixed}} $ , and $ {\mathrm{DS}}_{\mathrm{hybrid}} $ are available on Zenodo at https://doi.org/10.5281/zenodo.5724466.

Hardware and Framework

All the analyses presented in this paper were performed on one Tesla T4 GPU with 16 GB of memory, on a dedicated machine with two Intel Xeon Silver 4114 CPUs and 128 GB of RAM. Training curves and a comparison with a vanilla U-Net architecture are available in Supplementary Figure S5 and Supplementary Table S1. This work is implemented in Python 3.8, and we used TensorFlow 2.4 for creating and training the neural networks.

Supplementary Materials

To view supplementary material for this article, please visit http://doi.org/10.1017/S2633903X22000034.

References

1. Ke R, Mignardi M, Pacureanu A, et al. (2013) In situ sequencing for RNA analysis in preserved tissue and cells. Nat Methods 10(9), 857–860.
2. Battich N, Stoeger T & Pelkmans L (2013) Image-based transcriptomics in thousands of single human cells at single-molecule resolution. Nat Methods 10(11), 1127–1133.
3. Raj A, Van Den Bogaard P, Rifkin SA, Van Oudenaarden A & Tyagi S (2008) Imaging individual mRNA molecules using multiple singly labeled probes. Nat Methods 5(10), 877–879.
4. Rich RM, Stankowska DL, Maliwal BP, et al. (2013) Elimination of autofluorescence background from fluorescence tissue images by use of time-gated detection and the azadioxatriangulenium (ADOTA) fluorophore. Anal Bioanal Chem 405(6), 2065–2075.
5. Zhang M (2015) Small Blob Detection in Medical Images. Tempe, AZ: Arizona State University.
6. Caicedo JC, Cooper S, Heigwer F, et al. (2017) Data-analysis strategies for image-based cell profiling. Nat Methods 14(9), 849.
7. Eichenberger BT, Zhan Y, Rempfler M, Giorgetti L & Chao JA (2021) deepBlink: threshold-independent detection and localization of diffraction-limited spots. Nucleic Acids Res 49(13), 7292–7297.
8. Lindeberg T (1993) Detecting salient blob-like image structures and their scales with a scale-space primal sketch: a method for focus-of-attention. Int J Comput Vis 11(3), 283–318.
9. Huang J, Rathod V, Sun C, et al. (2017) Speed/accuracy trade-offs for modern convolutional object detectors. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, pp. 7310–7311. IEEE.
10. Olivo-Marin J-C (2002) Extraction of spots in biological images using multiscale products. Pattern Recognit 35(9), 1989–1996.
11. Abràmoff MD, Magalhães PJ & Ram SJ (2004) Image processing with ImageJ. Biophotonics Int 11(7), 36–42.
12. Schindelin J, Arganda-Carreras I, Frise E, et al. (2012) Fiji: an open-source platform for biological-image analysis. Nat Methods 9(7), 676.
13. Kamentsky L, Jones TR, Fraser A, et al. (2011) Improved structure, function and compatibility for CellProfiler: modular high-throughput image analysis software. Bioinformatics 27(8), 1179–1180.
14. Mueller F, Senecal A, Tantale K, et al. (2013) FISH-quant: automatic counting of transcripts in 3D FISH images. Nat Methods 10(4), 277–278.
15. De Chaumont F, Dallongeville S, Chenouard N, et al. (2012) Icy: an open bioimage informatics platform for extended reproducible research. Nat Methods 9(7), 690.
16. Gudla PR, Nakayama K, Pegoraro G & Misteli T (2017) SpotLearn: convolutional neural network for detection of fluorescence in situ hybridization (FISH) signals in high-throughput imaging approaches. In Cold Spring Harbor Symposia on Quantitative Biology, vol. 82, pp. 57–70. New York: Cold Spring Harbor Laboratory Press.
17. Mabaso MA, Withey DJ & Twala B (2018) Spot detection in microscopy images using convolutional neural network with sliding-window approach. In Proceedings of the 11th International Joint Conference on Biomedical Engineering Systems and Technologies, Funchal, Portugal. SciTePress.
18. Liu Z, Gao G, Sun L & Fang Z (2021) HRDNet: high-resolution detection network for small objects. In 2021 IEEE International Conference on Multimedia and Expo (ICME), Virtual, pp. 1–6. IEEE.
19. Fu K, Li J, Ma L, Mu K & Tian Y (2020) Intrinsic relationship reasoning for small object detection. Preprint, arXiv:2009.00833.
20. Bell S, Zitnick CL, Bala K & Girshick R (2016) Inside-outside net: detecting objects in context with skip pooling and recurrent neural networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, pp. 2874–2883. IEEE.
21. Noh J, Bae W, Lee W, Seo J & Kim G (2019) Better to follow, follow to be better: towards precise supervision of feature super-resolution for small object detection. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Seoul, Korea (South), pp. 9725–9734. IEEE.
22. Hamaguchi R, Fujita K, Nemoto A, Imaizumi T & Hikosaka S (2017) Effective use of dilated convolutions for segmenting small object instances in remote sensing imagery. Preprint, arXiv:1709.00179.
23. Yu F & Koltun V (2016) Multi-scale context aggregation by dilated convolutions. In International Conference on Learning Representations (ICLR), San Juan, Puerto Rico.
24. Smal I, Loog M, Niessen W & Meijering E (2009) Quantitative comparison of spot detection methods in fluorescence microscopy. IEEE Trans Med Imaging 29(2), 282–301.
25. Sadr A, Vos EE, Bassett BA, Hosenie Z, Oozeer N & Lochner M (2019) DeepSource: point source detection using deep learning. Mon Not R Astron Soc 484(2), 2793–2806.
26. Liu J, White JM & Summers RM (2010) Automated detection of blob structures by Hessian analysis and object scale. In 2010 IEEE International Conference on Image Processing, Hong Kong, China, pp. 841–844. IEEE.
27. Nehme E, Weiss LE, Michaeli T & Shechtman Y (2018) Deep-STORM: super-resolution single-molecule microscopy by deep learning. Optica 5(4), 458–464.
28. Pino C, Sortino R, Sciacca E, Riggi S & Spampinato C (2021) Semantic segmentation of radio-astronomical images. In Progress in Artificial Intelligence and Pattern Recognition, pp. 393–403. Cham: Springer International.
29. Chouaib R, Safieddine A, Pichon X, et al. (2020) A dual protein-mRNA localization screen reveals compartmentalized translation and widespread co-translational RNA targeting. Dev Cell 54(6), 773–791.e5.
30. Imbert A, Ouyang W, Safieddine A, et al. (2022) FISH-quant v2: a scalable and modular analysis tool for smFISH image analysis. RNA 28(6), 786–795.
31. Moutasim KA, Nystrom ML & Thomas GJ (2011) Cell migration and invasion assays. In Cancer Cell Culture, pp. 333–343. Springer.
32. Zhang H, Singer R & Bassell GJ (1999) Neurotrophin regulation of $ \beta $ -actin mRNA and protein localization within growth cones. J Cell Biol 147(1), 59–70.
33. Lapidus K, Wyckoff J, Mouneimne G, et al. (2007) ZBP1 enhances cell polarity and reduces chemotaxis. J Cell Sci 120(18), 3173–3178.
34. Condeelis J & Singer RH (2005) How and why does $ \beta $ -actin mRNA target? Biol Cell 97(1), 97–110.
35. Hu P & Ramanan D (2017) Finding tiny faces. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, pp. 951–959. IEEE.
36. He K, Zhang X, Ren S & Sun J (2016) Identity mappings in deep residual networks. In European Conference on Computer Vision, Amsterdam, The Netherlands, pp. 630–645. Springer.
37. Wang Y, Guo X, Liu P & Wei B (2021) Up and down residual blocks for convolutional generative adversarial networks. IEEE Access 9, 26051–26058.
38. Jadon S (2020) A survey of loss functions for semantic segmentation. In IEEE Conference on Computational Intelligence in Bioinformatics and Computational Biology (CIBCB), Via del Mar, Chile, pp. 1–7. IEEE.
39. Bergstra J, Yamins D & Cox D (2013) Making a science of model search: hyperparameter optimization in hundreds of dimensions for vision architectures. In International Conference on Machine Learning, Atlanta, GA, USA, pp. 115–123. PMLR.
40. Li L, Jamieson K, Rostamizadeh A, et al. (2020) A system for massively parallel hyperparameter tuning. Preprint, arXiv:1810.05934.
Bentley, J (1975) Multidimensional binary search trees used for associative searching. Commun ACM 18(9), 509517.CrossRefGoogle Scholar
Samet, H (1990) The Design and Analysis of Spatial Data Structures, vol. 85. Reading, MA: Addison-Wesley.Google Scholar
De Berg, M, Cheong, O, Van Kreveld, M & Overmars, M (2008) Computational Geometry: Algorithms and Applications. Springer.CrossRefGoogle Scholar
Savulescu, AF, Brackin, R, Bouilhol, E, et al. (2021) Interrogating RNA and protein spatial subcellular distribution in smFISH data with DypFISH. Cell Rep Methods 1(5), 100068.CrossRefGoogle ScholarPubMed
Kislauskis, EH, Li, Z, Singer, RH & Taneja, KL (1993) Isoform-specific 3′-untranslated sequences sort alpha-cardiac and beta-cytoplasmic actin messenger RNAs to different cytoplasmic compartments. J Cell Biol 123(1), 165172.CrossRefGoogle ScholarPubMed

Figure 1. (a) RNA spots on a noisy background. (b) Spot intensity is increased after enhancement by $ e $.

Table 1. List of datasets used for training (22 training datasets) and evaluation (21 test datasets) of the DeepSpot network. In the $ {\mathrm{DS}}_{\mathrm{var}}^i $ datasets, spot intensities vary between 160 and 220 within each image. In the fixed intensity datasets $ {\mathrm{DS}}_{\mathrm{fixed}}^i $, the spot intensity is set to a single value within [160…220] for a given image, but varies from image to image. The 10 variable intensity $ {\mathrm{DS}}_{\mathrm{var}}^i $ and 10 fixed intensity $ {\mathrm{DS}}_{\mathrm{fixed}}^i $ datasets are named according to the number of spots per image, $ i $. The $ {\mathrm{DS}}_{\mathrm{hybrid}} $ dataset combines $ {\mathrm{DS}}_{\mathrm{exp}} $ with $ 25\% $ simulated images.
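
To make the distinction between the two dataset families concrete, a minimal toy generator is sketched below, assuming Gaussian spot profiles on a Gaussian noise background; the image size, spot width, and background statistics are illustrative assumptions, not the parameters used to build the actual datasets.

```python
import numpy as np

def simulate_image(n_spots, fixed_intensity=None, size=256, sigma=1.5,
                   background=100.0, noise=20.0, seed=None):
    """Toy smFISH simulator. If fixed_intensity is given, every spot in the
    image shares that peak value (DS_fixed style); otherwise each spot draws
    its own peak from [160, 220] (DS_var style)."""
    rng = np.random.default_rng(seed)
    img = rng.normal(background, noise, size=(size, size))
    yy, xx = np.mgrid[0:size, 0:size]
    for _ in range(n_spots):
        cy, cx = rng.uniform(0, size, size=2)
        peak = fixed_intensity if fixed_intensity is not None else rng.uniform(160, 220)
        spot = peak * np.exp(-((yy - cy) ** 2 + (xx - cx) ** 2) / (2 * sigma ** 2))
        img = np.maximum(img, spot)  # spot peak intensity equals `peak`
    return np.clip(img, 0, 255).astype(np.uint8)

rng = np.random.default_rng(0)
var_img = simulate_image(n_spots=50, seed=1)                        # DS_var-style image
fixed_img = simulate_image(n_spots=50, seed=2,
                           fixed_intensity=rng.uniform(160, 220))   # DS_fixed-style image
```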

Figure 2. The DeepSpot network architecture is composed of the context aggregation for small objects module, built as a multipath network (Panel A), and a customized ResNet component (Panel B). A custom loss function is used to train the network.
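
For readers who prefer code to diagrams, the multipath idea of Panel A can be sketched as follows. This is a minimal PyTorch illustration of context aggregation through parallel dilated convolutions fused by a 1 × 1 convolution; the channel widths and dilation rates are assumptions made for the sketch, not the published architecture.

```python
import torch
import torch.nn as nn

class ContextAggregation(nn.Module):
    """Parallel 3x3 convolutions with increasing dilation rates enlarge the
    receptive field around small spots without downsampling; the paths are
    concatenated and fused by a 1x1 convolution."""

    def __init__(self, in_ch=1, ch=32, dilations=(1, 2, 4)):
        super().__init__()
        self.paths = nn.ModuleList(
            nn.Sequential(
                nn.Conv2d(in_ch, ch, kernel_size=3, padding=d, dilation=d),
                nn.ReLU(inplace=True),
            )
            for d in dilations
        )
        self.fuse = nn.Conv2d(ch * len(dilations), ch, kernel_size=1)

    def forward(self, x):
        return self.fuse(torch.cat([path(x) for path in self.paths], dim=1))

out = ContextAggregation()(torch.randn(1, 1, 256, 256))
print(out.shape)  # torch.Size([1, 32, 256, 256])
```

With `padding=d` matched to `dilation=d`, each path preserves the spatial resolution, which is what allows the aggregated context to stay aligned with individual spots.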

Figure 3. Full pre-activation residual block, composed of batch normalization, activation, and convolution repeated three times, followed by dropout and the residual connection.
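
The caption translates almost line by line into code; a minimal PyTorch sketch of the full pre-activation ordering follows, where the channel width and dropout rate are illustrative assumptions.

```python
import torch
import torch.nn as nn

class PreActResidualBlock(nn.Module):
    """Full pre-activation residual block: (BN -> ReLU -> Conv) repeated
    three times, then dropout, then the identity shortcut is added back."""

    def __init__(self, channels=32, p_drop=0.2):
        super().__init__()
        layers = []
        for _ in range(3):
            layers += [
                nn.BatchNorm2d(channels),
                nn.ReLU(inplace=True),
                nn.Conv2d(channels, channels, kernel_size=3, padding=1),
            ]
        layers.append(nn.Dropout2d(p_drop))
        self.body = nn.Sequential(*layers)

    def forward(self, x):
        return x + self.body(x)  # residual connection

x = torch.randn(1, 32, 256, 256)
print(PreActResidualBlock()(x).shape)  # torch.Size([1, 32, 256, 256])
```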

Table 2. Spot enhancement performance in terms of resulting spot intensity. The values displayed correspond to the spot intensity in the range [0, 255] after enhancement by the neural network, averaged per category of models and datasets. The $ 95\% $ confidence intervals are shown in brackets. Model categories are listed in rows, whereas columns correspond to the dataset categories on which the different models were applied.

Figure 4. Spot matching by 1-neighbor $ k $-d tree between the detected mRNA spots $ \left\{{p}_1,\dots, {p}_9\right\} $, depicted in blue, and the annotated spots $ \left\{{q}_1,\dots, {q}_7\right\} $, depicted in red. The $ k $-d tree construction for $ \left\{{p}_1,\dots, {p}_9\right\} $ is shown on the left. Using the matching radius depicted by circles, the $ k $-d tree queries for $ {q}_1 $ and $ {q}_2 $, shown in red, lead to the same leaf $ {p}_2 $ and correspond to an ambiguous match, while the query for $ {q}_3 $ leads to a unique match. The detected spots $ {p}_1,{p}_4 $, and $ {p}_8 $ match no annotated spot and are therefore False Positives.
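
The matching procedure of Figure 4 can be reproduced with a few lines of SciPy; in the sketch below, the matching radius and the way ambiguous matches are counted are illustrative assumptions.

```python
import numpy as np
from scipy.spatial import cKDTree

def match_spots(detected, annotated, radius=3.0):
    """Query each annotated spot against a k-d tree built on the detected
    spots; neighbors farther than `radius` are reported as unmatched."""
    tree = cKDTree(detected)
    dist, idx = tree.query(annotated, k=1, distance_upper_bound=radius)
    hits = idx[np.isfinite(dist)]       # annotated spots that reached a leaf
    tp = len(np.unique(hits))           # each detection is matched at most once
    ambiguous = len(hits) - tp          # several queries led to the same leaf
    fn = len(annotated) - len(hits)     # annotated spots with no detection nearby
    fp = len(detected) - tp             # detections that match no annotation
    return tp, fp, fn, ambiguous

detected = np.random.default_rng(0).uniform(0, 100, (9, 2))   # {p_1, ..., p_9}
annotated = np.random.default_rng(1).uniform(0, 100, (7, 2))  # {q_1, ..., q_7}
tp, fp, fn, ambiguous = match_spots(detected, annotated)
precision = tp / (tp + fp) if tp + fp else 0.0
recall = tp / (tp + fn) if tp + fn else 0.0
f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
```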

Table 3. Model performance per model type. Metrics (F1-score, precision, recall, and ambiguous matches [AMs]) were calculated by averaging the values obtained for each image of the 21 test datasets. The top value in each cell corresponds to the mean, and the bottom value in brackets shows the 95% confidence interval. Best values are highlighted in bold.
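
How such cells could be filled can be sketched as a mean over per-image scores with a Student's $ t $ 95% confidence interval; since the exact interval construction is not restated in the caption, the snippet below is an illustrative assumption.

```python
import numpy as np
from scipy import stats

def mean_with_ci(per_image_scores, confidence=0.95):
    """Mean of a per-image metric plus a Student's t confidence interval."""
    scores = np.asarray(per_image_scores, dtype=float)
    mean = scores.mean()
    half_width = stats.sem(scores) * stats.t.ppf((1 + confidence) / 2,
                                                 len(scores) - 1)
    return mean, (mean - half_width, mean + half_width)

f1_per_image = np.random.default_rng(0).uniform(0.9, 1.0, size=200)
mean_f1, ci = mean_with_ci(f1_per_image)
```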

Figure 5. Heat map of the F1-scores obtained by each of the 22 models when evaluated on the 21 test datasets described in Table 1.

Table 4. Performance of deepBlink and DeepSpot models for smFISH spot detection. Overall F1-scores are calculated by averaging the values obtained for each image of the test datasets in each dataset category. The top value in each cell corresponds to the mean, and the bottom value in brackets shows the 95% confidence interval. Best values are highlighted in bold.

Figure 6. Examples of the results obtained with DeepSpot and deepBlink on the experimental test datasets $ {\mathrm{DS}}_{\mathrm{exp}} $ (first row) and deepBlink (second row), and on the simulated datasets with fixed ($ {\mathrm{DS}}_{\mathrm{fixed}} $, third row) and variable ($ {\mathrm{DS}}_{\mathrm{var}} $, fourth row) spot intensity. Colored circles indicate where spots were detected by DeepSpot (blue) and deepBlink (green). In the ground truth column, pink circles indicate the spots previously annotated as ground truth by alternative methods. The last two columns show magnifications of the positions indicated by the colored rectangles for DeepSpot and deepBlink, respectively.

Figure 7. Processing steps of the end-to-end wound healing assay. Panel A shows a typical smFISH image from the wound healing assay. Panel B shows the cell quantization procedure into three sections and the direction of the wound (red arrow); the $ {\mathrm{S}}_{\mathrm{wound}} $ section is oriented toward the wound and shown in light blue, while the other sections are shown in dark blue. Panel C illustrates how the cell and nucleus were manually segmented and how mRNA spots were counted after enhancement within the cytoplasmic portion of each section. Panel D presents the Wound Polarity Index (WPI) of cytoplasmic mRNA transcripts in $ {\mathrm{S}}_{\mathrm{wound}} $ compared to the other sections for the $ \beta $-actin RNA, calculated in migrating and nonmigrating cells. Bars correspond to the median and error bars to the standard deviation from the median over 100 bootstrapped WPI estimates.
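
The bootstrap behind the bars and error bars in Panel D can be sketched in a few lines; here `wpi_migrating` is a hypothetical array of per-cell WPI values, and the resampling scheme is an assumption.

```python
import numpy as np

def bootstrap_median(values, n_boot=100, seed=None):
    """Median of the observed values, plus the standard deviation of
    n_boot bootstrapped medians (the error bar in Panel D)."""
    rng = np.random.default_rng(seed)
    values = np.asarray(values, dtype=float)
    boot_medians = np.array([
        np.median(rng.choice(values, size=len(values), replace=True))
        for _ in range(n_boot)
    ])
    return np.median(values), boot_medians.std()

# Hypothetical per-cell WPI values for migrating cells:
wpi_migrating = np.random.default_rng(0).normal(0.3, 0.1, size=40)
bar_height, error_bar = bootstrap_median(wpi_migrating, n_boot=100, seed=1)
```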

Supplementary material: Bouilhol et al. supplementary material (PDF, 464.6 KB).