Balance Theory of Signed Genetic Interactions Reveals Differences in Cancerous and Healthy Cells

January 11, 2020

Time: Monday 13.01.2020 13.30 – 14.00
Place: Meeting room A142, T-building
Speaker: Abbas K. Rizi

Abstract:

Genes do not function independently in the cell; their expression levels are strongly correlated with one another. They interact through various regulatory effects, which give rise to complex structures in the cell. These structures are expected to differ between healthy and cancerous cells. To study the differences in the case of breast cancer, we investigated the Gene Regulatory Network (GRN) of cells, inferred from RNA-sequencing data using the maximum entropy principle. The GRN is a signed, weighted network whose edge signs correspond to inductive or inhibitory interactions.
In this presentation, I will focus on a particular set of motifs in the GRN, the triangles. A triangle is imbalanced if it contains an odd number of negative interactions and balanced otherwise. I will show that the network in cancerous cells has fewer imbalanced triangles than the network in healthy cells. Moreover, in healthy cells, imbalanced triangles are isolated from the main part of the network, whereas in cancerous cells such motifs are part of the giant component of the network.
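
The balance rule for triangles can be illustrated with a minimal sketch. It is not the code used in the study: the function name `triangle_balance`, the edge-dictionary representation, and the toy gene labels and weights are all assumptions made for illustration. A triangle is counted as imbalanced when it contains an odd number of negative edges.

```python
from itertools import combinations


def triangle_balance(weights):
    """Split the triangles of a signed, weighted graph into balanced and imbalanced.

    weights: dict mapping frozenset({u, v}) -> nonzero signed edge weight
             (hypothetical representation, chosen for this sketch).
    """
    nodes = sorted({node for edge in weights for node in edge})
    balanced, imbalanced = [], []
    for a, b, c in combinations(nodes, 3):
        edges = [frozenset((a, b)), frozenset((b, c)), frozenset((a, c))]
        if not all(edge in weights for edge in edges):
            continue  # the three nodes do not form a closed triangle
        negatives = sum(1 for edge in edges if weights[edge] < 0)
        # Odd number of negative interactions -> imbalanced, otherwise balanced.
        if negatives % 2 == 1:
            imbalanced.append((a, b, c))
        else:
            balanced.append((a, b, c))
    return balanced, imbalanced


# Toy network with made-up gene labels and weights (not real data).
toy = {
    frozenset(("g1", "g2")): 0.8,
    frozenset(("g2", "g3")): -0.5,
    frozenset(("g1", "g3")): -0.3,
    frozenset(("g3", "g4")): 0.4,
    frozenset(("g2", "g4")): 0.6,
}
balanced, imbalanced = triangle_balance(toy)
print("balanced:", balanced)
print("imbalanced:", imbalanced)
```

In the toy network, the triangle (g1, g2, g3) has two negative edges and is balanced, while (g2, g3, g4) has one negative edge and is imbalanced. Enumerating all node triples is only practical for small networks such as the few-hundred-gene network described here.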


Fundamental Papers on Balance Theory:


Biological Network Inference – Using the Principle of Max. Entropy


Graphical Lasso


Data

The mRNA expression-level data for 20532 genes in breast cancer (BRCA: Breast invasive carcinoma) were downloaded from The Cancer Genome Atlas (TCGA) project. For each gene, there are 114 normal and 764 cancerous samples, and the expression levels were measured by RNA sequencing (RNA-Seq). We used the RPKM (Reads Per Kilobase of transcript per Million mapped reads) normalized data. RPKM normalizes both by sample and by gene: it accounts for the library size (the sum of each column) and for the gene length. We had to reduce the number of genes because handling a 20532 × 20532 matrix is computationally difficult. For each gene, we calculated the variance of its expression level over its samples, and we kept the 483 genes with the highest variance, since these genes show the most varied activity patterns. Note that there are so-called housekeeping genes that are typically transcribed continually. These genes are required for the maintenance of basic cellular function and are expressed in all cells of an organism under normal and patho-physiological conditions. Some housekeeping genes are expressed at relatively constant rates in most non-pathological situations.
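
The normalization and gene selection described above can be sketched as follows. This is an illustrative example under stated assumptions, not the pipeline used for the TCGA data (which were already RPKM-normalized when downloaded). The helper names `rpkm` and `top_variance_genes`, the toy counts, and the gene lengths are hypothetical, and the use of the population variance is also an assumption, since the text does not say which variance estimator was used.

```python
from statistics import pvariance


def rpkm(counts, gene_lengths_bp):
    """Convert raw read counts to RPKM.

    counts[g][s]: raw read count of gene g in sample s (genes in rows, samples in columns).
    gene_lengths_bp[g]: length of gene g in base pairs.
    RPKM = reads * 1e9 / (library size * gene length in bp).
    """
    n_samples = len(counts[0])
    # Library size of a sample is the sum of its column.
    library_sizes = [sum(row[s] for row in counts) for s in range(n_samples)]
    return [
        [counts[g][s] * 1e9 / (library_sizes[s] * gene_lengths_bp[g])
         for s in range(n_samples)]
        for g in range(len(counts))
    ]


def top_variance_genes(expression, names, k):
    """Return the names of the k genes with the highest variance across samples."""
    ranked = sorted(zip(names, expression),
                    key=lambda pair: pvariance(pair[1]),
                    reverse=True)
    return [name for name, _ in ranked[:k]]


# Toy data: 3 genes, 2 samples (made-up numbers, not TCGA values).
names = ["geneA", "geneB", "geneC"]
counts = [[10, 40], [30, 20], [60, 40]]
lengths = [1000, 2000, 1000]

values = rpkm(counts, lengths)
top = top_variance_genes(values, names, k=2)
print(top)
```

In the setting of the text, `k` would be 483 and the expression matrix would have 20532 rows. The nearly constant gene in the toy data ranks last, which mirrors why genes expressed at relatively constant rates tend to be dropped by a variance filter.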

Slides

Balance-Theory-of-Signed-Genetic-Interactions1