12 October 2017

Identification of cancer cells

A computer has learned to recognize individual cells, including cancer cells

Dmitry Trunin, N+1

The new SCENIC algorithm reconstructs a gene regulatory network and identifies the state of a single cell. With this program, developed by Belgian scientists, it was possible to distinguish individual types of cancer cells in oligodendroglioma and melanoma. The article was published in the journal Nature Methods (Aibar et al., SCENIC: single-cell regulatory network inference and clustering).

RNA synthesis in a cell depends on a gene regulatory network in which a limited number of transcription factors and cofactors regulate each other and their downstream target genes. The proteins synthesized as a result determine how the various cells of the body function, which makes it possible to recognize and distinguish them. Several methods have already been developed to determine, from the RNA sequencing data of an individual cell, which proteins it produces, but these methods did not use regulatory sequence analysis to predict interactions between transcription factors and target genes. The new algorithm takes these interactions into account.

Strictly speaking, SCENIC (Single-CEll regulatory Network Inference and Clustering) is not a standalone program but a workflow built on three new packages of the Bioconductor project. The first, GENIE3, identifies potential targets of each transcription factor based on co-expression. The second, RcisTarget, uses regulatory motif analysis to select the most significant factors and determine their direct targets (regulons). The third, AUCell, scores the activity of the regulons in individual cells. The scientists also used the GRNBoost tool as an alternative to GENIE3 on large datasets. Detailed documentation for all of these programs, along with the source code and tutorials, can be found on the project website.
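The last step, scoring regulon activity in individual cells, can be illustrated with a small Python sketch. It is not AUCell's code or API. It only shows the general idea of a rank-based recovery score: rank the genes of one cell by expression, then measure how early the regulon's genes appear near the top of that ranking. The function name `recovery_auc`, the `top_k` cutoff, and the gene and cell names are all hypothetical and chosen purely for illustration.

```python
def recovery_auc(expression, regulon, top_k):
    """Score how strongly one cell expresses a gene set (hypothetical helper).

    expression: dict mapping gene name -> expression value in a single cell.
    regulon:    iterable of gene names (a transcription factor's target set).
    top_k:      how many top-ranked genes to scan.
    Returns a value in [0, 1]; 1 means the regulon genes occupy the very top.
    """
    # Only genes actually measured in this cell can be recovered.
    targets = set(regulon) & set(expression)
    # Rank genes from highest to lowest expression; ties are broken by name.
    ranked = sorted(expression, key=lambda g: (-expression[g], g))
    top_k = min(top_k, len(ranked))

    hits = 0
    area = 0
    for gene in ranked[:top_k]:
        if gene in targets:
            hits += 1
        area += hits  # height of the recovery curve at this rank

    # Best case: all target genes sit at the very top of the ranking.
    n = min(len(targets), top_k)
    max_area = sum(min(rank, n) for rank in range(1, top_k + 1))
    return area / max_area if max_area else 0.0


# Toy data: two cells, four genes, one regulon (all names are made up).
cells = {
    "cell_1": {"geneA": 9.0, "geneB": 8.0, "geneC": 1.0, "geneD": 0.0},
    "cell_2": {"geneC": 9.0, "geneD": 8.0, "geneA": 1.0, "geneB": 0.0},
}
regulon = {"geneA", "geneB"}

scores = {name: recovery_auc(expr, regulon, top_k=3) for name, expr in cells.items()}
print(scores)
```

On this toy input the regulon scores 1.0 in cell_1, where both of its genes are the most highly expressed, and 0.2 in cell_2, where they sit near the bottom of the ranking. Because the score depends only on the ranking of genes within a cell, it does not depend on the absolute scale of the expression values.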

To evaluate the algorithm's performance, the scientists tested it on RNA sequencing data from well-studied mouse brain cells. The program identified 151 regulons out of 1046 initial co-expression modules. By evaluating the activity of the regulons, the scientists identified the expected cell types along with a list of potential master regulators for each type. Moreover, clustering the cells by type in this way proved more accurate than many specialized single-cell clustering methods.

The researchers also used SCENIC to determine cell states in a dataset obtained by RNA sequencing of 4043 oligodendroglioma cells (taken from six different tumors) and 1252 melanoma cells (taken from 14 lesions). Because of tumor-specific mutations and complex chromosomal aberrations, recognizing the states of cancer cells proved harder than recognizing normal cell states. Conventional cluster analysis can establish only the type of tissue from which a cancer cell was taken, whereas SCENIC revealed a more complex picture. It found three distinct cell types in oligodendroglioma and two in melanoma.

Figure: Different types of cancer cells found in tissues. Top: data for oligodendroglioma; bottom: data for melanoma. Figure from the article in Nature Methods.

The algorithm will make it possible to build a more accurate map of the cells of the human body. It will also help researchers better understand the processes that control the production and activity of various cell types, and will allow various types of cancer to be recognized more effectively.

Earlier we wrote about how digital image analysis and machine learning helped in the early diagnosis of melanoma.

Portal "Eternal youth" http://vechnayamolodost.ru

