Some Principles of Computational Sciences
Approaches that integrate multiple, independent, convergent data sets are remarkably powerful. For example, the discovery of gene regulatory networks depends on systematic assessments of whether transcription factors co-regulate a particular set of co-expressed genes. Computational methods exist for the rapid identification of DNA sites where transcription factors bind. However, much less is known about the transcription factors themselves. Thus, additional approaches are needed to identify and characterize the transcription factors, and then to integrate those findings with the binding site data sets.
Different types of networks serve different purposes, and networks have proven convenient for describing relationships within large data sets. Theoretical analysis of different types of networks reveals interesting features. Notably, in many of them the frequency of nodes with a given number of neighbors follows a power-law distribution. These “scale-free” networks have been used to describe “interactomes”. For gene expression data, probabilistic models such as Bayesian networks have been applied to understand the modular structure of genes. An examination of various gene regulatory networks reveals key motifs such as feedback loops, feed-forward loops, and auto-regulation.
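The three motifs named above can be made concrete with a small sketch. The network below is a made-up toy example (the node names A to D and their edges are hypothetical, not drawn from any real data set), and the brute-force search is only meant to illustrate the definitions, not to scale to genome-wide networks.

```python
from itertools import permutations

# Hypothetical toy network: regulator -> set of targets (names are made up).
edges = {
    "A": {"A", "B", "C"},
    "B": {"C"},
    "C": {"D"},
    "D": {"C"},
}

def has_edge(src, dst):
    """Return True if src directly regulates dst in the toy network."""
    return dst in edges.get(src, set())

# Every node that appears as a regulator or as a target.
nodes = sorted(set(edges) | {t for targets in edges.values() for t in targets})

# Auto-regulation: a node that regulates itself.
auto_regulated = [n for n in nodes if has_edge(n, n)]

# Two-node feedback loop: X -> Y and Y -> X, each pair reported once.
feedback_pairs = [
    (x, y)
    for x in nodes
    for y in nodes
    if x < y and has_edge(x, y) and has_edge(y, x)
]

# Feed-forward loop: X -> Y, Y -> Z, and X -> Z, with three distinct nodes.
feed_forward = [
    (x, y, z)
    for x, y, z in permutations(nodes, 3)
    if has_edge(x, y) and has_edge(y, z) and has_edge(x, z)
]

print("auto-regulation:", auto_regulated)
print("feedback loops:", feedback_pairs)
print("feed-forward loops:", feed_forward)
```

In this toy network, A is auto-regulated, C and D form a feedback loop, and A, B, C form a feed-forward loop. Longer feedback cycles are not covered by this sketch.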
To be broadly useful, the vast amount of information contained in a network must be in an accessible format. Multiple well-annotated databases already exist: the Gene Expression Omnibus and ArrayExpress for gene expression data, and the Database of Interacting Proteins and the Biomolecular Interaction Network Database for protein interactions. However, integrating data from these different sources is difficult, owing to distinct formats, non-unique identifiers, and incompatible user interfaces. Collecting and displaying diverse but relevant data in a modular and easily accessible manner is crucial for our understanding of these networks.
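The identifier problem can be illustrated with a minimal sketch, assuming a hand-built alias table that maps source-specific identifiers to one canonical symbol. All identifiers, field names, and values below are hypothetical; they do not reflect the actual schemas of the databases named above.

```python
# Hypothetical alias table: source-specific identifier -> canonical gene symbol.
alias_to_canonical = {
    "probe_001": "GENE1",
    "P00001": "GENE1",
    "probe_002": "GENE2",
}

# Hypothetical records from an expression source and an interaction source.
expression_records = [
    {"id": "probe_001", "log_ratio": 1.8},
    {"id": "probe_002", "log_ratio": -0.4},
    {"id": "probe_999", "log_ratio": 0.1},  # has no entry in the alias table
]
interaction_records = [
    {"id": "P00001", "partner": "P00002"},
]

def integrate(expression, interactions, aliases):
    """Group records from both sources under one canonical identifier.

    Returns the merged view plus the identifiers that could not be mapped,
    so that unmapped records are reported instead of silently dropped.
    """
    merged = {}
    unmapped = []
    for rec in expression:
        key = aliases.get(rec["id"])
        if key is None:
            unmapped.append(rec["id"])
            continue
        entry = merged.setdefault(key, {"expression": [], "interactions": []})
        entry["expression"].append(rec["log_ratio"])
    for rec in interactions:
        key = aliases.get(rec["id"])
        if key is None:
            unmapped.append(rec["id"])
            continue
        entry = merged.setdefault(key, {"expression": [], "interactions": []})
        entry["interactions"].append(rec["partner"])
    return merged, unmapped

merged, unmapped = integrate(expression_records, interaction_records, alias_to_canonical)
print(merged)
print("unmapped:", unmapped)
```

Keeping the unmapped identifiers visible is a deliberate choice in this sketch: with non-unique or missing identifiers, knowing what failed to merge matters as much as the merged result.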
Computational Biology & Genomics Team
- Dave Gifford
- Martha Bulyk
- Jim Collins
- Peter Park
- Jon Seidman
- Jagesh Shah
- Shamil Sunyaev