- Title: The Research on Molecular Network Models of Complex Disease
- Abstract: Complex diseases such as cardiovascular disease, cancer, and diabetes are major threats to human health. Unlike single-gene disorders, complex diseases are generally believed to arise from interactions among genes and between genes and environmental factors. The underlying complex pathogenesis makes both early diagnosis and treatment difficult. Research on complex diseases is therefore one of the major challenges of biomedical research in this century. Recently, the rapid accumulation of biological knowledge and multi-level high-throughput omics data has revolutionized the research paradigm for complex diseases. Instead of focusing on single molecules, researchers are gradually extending their work to the systematic analysis of genome-wide biomolecular interactions, i.e. biomolecular networks. In this context, biomolecular networks are a powerful tool for studying complex diseases, because they enable the systematic integration of high-throughput biological data with a large body of biological knowledge. A large number of biological networks have been constructed, including protein interaction networks, transcriptional regulation networks, signal transduction networks, metabolic networks, and so on. These studies have played a significant role in revealing the pathogenesis of complex diseases and in promoting their treatment and prevention. However, biomolecular network research on complex diseases is still in its infancy. Owing to the complexity of disease mechanisms and of disease-related biological data, many challenges remain to be explored and resolved. In this thesis, we use mathematical models of biomolecular networks to address some of the main challenges in the research of complex diseases.
Specifically, we study molecular networks step by step, moving from static networks to node dynamics, edge dynamics, and module dynamics in both network construction and network analysis, and we build several mathematical models to deepen the study of complex diseases through the concept of biomolecular networks. The main results of this thesis are summarized as follows.
- 1. We construct a molecular network model called the Disease-Aging Network to reveal the relationships among diseases. It is well known that the association between genetic disease and the aging process is critical for uncovering disease mechanisms and exploring the nature of aging. For the first time, we study this association from the viewpoint of network biology by constructing the Disease-Aging Network, and find that (1) compared with random controls, aging and disease overlap significantly at the molecular and network levels; (2) diseases can be divided into two types according to their relationship with aging: aging-related diseases and non-aging-related diseases, and the genes related to the two disease types differ in function, evolution, importance, etc.; (3) aging genes contribute significantly to connecting disease genes.
- 2. We propose a Gaussian graphical model to construct the Transcription Factor Activity Network (TFAN). Traditional network inference models generally fall into two categories: identification of active sub-networks at the protein level, which cannot distinguish cause from effect, and reconstruction of transcriptional regulatory networks, which can reflect causal relationships but cannot guarantee a stable, unique solution. We propose a novel graphical model that integrates multi-source data through a hybrid protein-gene network to overcome these shortcomings. Specifically, we infer causal relations from the protein level to the transcription level, and theoretically prove the conditions under which the optimal solution is unique. Compared with traditional methods, the new model has higher inference accuracy and a wider scope of application. Furthermore, the new model has been applied to the study of diabetes. We construct the TFAN of diabetes and find that SP1 is an important factor in the progression of diabetes. Preliminary experiments have confirmed our predictions.
- 3. We establish a sliding-window-based instantaneous regulatory network inference model to uncover the dynamic changes of molecular interactions during disease progression. Biologists and medical scientists believe that dynamic changes during the progression of complex diseases are important, but quantitative research is difficult due to a lack of data. We propose a sliding window model that takes advantage of the characteristics of time series data to infer transient regulatory relationships in transcriptional regulation networks. After validation on yeast cell cycle data, the new model is applied to the 3T3-L1 adipocyte differentiation model. We build a network reflecting the dynamic changes during adipocyte differentiation, which may shed light on the mechanisms of obesity and diabetes. The results are confirmed by the literature and by several well-designed control studies.
- 4. We establish a revised simulated-annealing-based algorithm to solve a constrained optimization model for identifying network communities. Community detection is an important way to analyze complex networks, and can potentially be applied to find network biomarkers for complex diseases. A number of efficient methods exist for identifying communities in complex networks, but many of them suffer from the resolution limit and from misidentification. Here, we construct a constrained optimization model to address these limitations, and solve the model with a revised simulated-annealing-based algorithm. We then apply the new model to the study of mouse teratoma, and explain the observed phenotype with a dynamic module model. Furthermore, the new method is available as freely accessible software at http://www.aporc.org/doc/wiki/ModularityOptimization.
- 5. We develop a Network Ontology Analysis (NOA) model to analyze the function of biological networks, which is important for connecting complex disease phenotypes with biomolecular networks. Current tools for the analysis of biological networks are limited to analyzing the set of genes or proteins involved in the network, and ignore the function of links and the network topology. We construct a multi-objective programming model to define the function of edges in biological networks, and then use statistical tests to assess the enrichment of GO terms for a given biological network. The new method is shown to be more efficient than traditional ones in both static and dynamic networks. We further apply it to aging, cancer, and Alzheimer's disease networks, and find several important implications for research on the related diseases. Furthermore, we develop a freely accessible web server for NOA at http://www.aporc.org/noa/.
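
The idea in item 5 of testing GO term enrichment on edges rather than on genes can be illustrated with a minimal Python sketch. It is not the thesis's method: the multi-objective programming model that defines edge function is not reproduced here. As a loudly simplified assumption, an edge is taken to carry a term only when both of its endpoints are annotated with that term, and enrichment is assessed with a one-sided hypergeometric test against a reference network. The function names (`hypergeom_pvalue`, `edge_terms`, `edge_enrichment`), the gene names, and the term labels `T1` and `T2` are all hypothetical.

```python
from math import comb


def hypergeom_pvalue(k, n, K, N):
    """Return P(X >= k) when drawing n edges from N, of which K carry the term."""
    upper = min(n, K)
    total = comb(N, n)
    # math.comb returns 0 when the second argument exceeds the first,
    # so impossible configurations contribute nothing to the sum.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


def edge_terms(edge, annotations):
    """Assumed convention: an edge carries the terms shared by both endpoints."""
    u, v = edge
    return annotations.get(u, set()) & annotations.get(v, set())


def count_terms(edges, annotations):
    """Count, for each term, how many of the given edges carry it."""
    counts = {}
    for edge in edges:
        for term in edge_terms(edge, annotations):
            counts[term] = counts.get(term, 0) + 1
    return counts


def edge_enrichment(test_edges, reference_edges, annotations):
    """P-value per term for the test edges, which must be a subset of the reference."""
    ref_counts = count_terms(reference_edges, annotations)
    test_counts = count_terms(test_edges, annotations)
    N, n = len(reference_edges), len(test_edges)
    return {
        term: hypergeom_pvalue(k, n, ref_counts[term], N)
        for term, k in test_counts.items()
    }


# Toy data (entirely made up for illustration).
annotations = {
    "a": {"T1"}, "b": {"T1"}, "c": {"T1"},
    "d": {"T2"}, "e": {"T2"},
}
reference_edges = [("a", "b"), ("b", "c"), ("a", "c"),
                   ("d", "e"), ("a", "d"), ("c", "e")]
test_edges = [("a", "b"), ("b", "c"), ("a", "c")]

pvalues = edge_enrichment(test_edges, reference_edges, annotations)
print(pvalues)
```

In practice the raw p-values would need correction for multiple testing across terms, and the choice of reference network strongly affects the result; both are left out of this sketch.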