I’ve been reading a few papers on computational methods for finding regulatory networks from gene expression data. From “Module networks revisited”:
> Following Hartwell et al. (1999) a ‘module’ is to be viewed as a discrete entity composed of many types of molecules and whose function is separable from that of other modules. Understanding the general principles that determine the structure and function of modules and the parts they are composed of can be considered one of the main problems of contemporary systems biology (Hartwell et al., 1999). The module network method of Segal et al. (2003) addresses this problem using gene expression data as its input. It has yielded novel biological insights in a number of complex eukaryotic systems (…) and has been the source of inspiration for numerous computational approaches to network inference as evidenced by its high number of citations.
The JMLR paper, “Learning Module Networks”, applies the module/regulator framework to stock data as well as gene expression. The Module Networks algorithm alternates between two steps: 1) building a regression tree for each module, using the expression of known regulator genes to predict the expression of the genes in that module, and 2) reassigning each gene to a module that explains its expression with higher likelihood. The implementation has many subtleties, including preventing cycles between regulators and modules, merging and splitting modules, and cheaper updates that refit the distributions at the leaves of a regression tree without rebuilding the tree.
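To make the two alternating steps concrete, here is a minimal sketch in plain Python. It is a toy under loud assumptions, not the algorithm from the papers: each "tree" is a single split on one regulator (a stump) with a Gaussian at each leaf, the score is a plain maximum-likelihood fit rather than the Bayesian score the papers use, and the cycle checks and module merging/splitting are left out entirely. All function names (`fit_stump`, `learn_modules`, and so on) and the toy data are made up for illustration.

```python
import math

def leaf_stats(values):
    """Mean and variance of a leaf; the variance is floored to avoid log(0)."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, max(var, 1e-3)

def gauss_loglik(x, mean, var):
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)

def fit_stump(module_rows, regulators):
    """Step 1 (simplified): pick the regulator/threshold split whose two
    Gaussian leaves best explain the expression of the module's genes.
    Assumes at least one regulator takes more than one value."""
    n_cond = len(regulators[0])
    best = None
    for r, reg in enumerate(regulators):
        for thr in sorted(set(reg))[:-1]:  # dropping the max keeps both sides non-empty
            sides = ([c for c in range(n_cond) if reg[c] <= thr],
                     [c for c in range(n_cond) if reg[c] > thr])
            leaves, score = [], 0.0
            for conds in sides:
                vals = [row[c] for row in module_rows for c in conds]
                mean, var = leaf_stats(vals)
                leaves.append((mean, var))
                score += sum(gauss_loglik(v, mean, var) for v in vals)
            if best is None or score > best[0]:
                best = (score, r, thr, leaves[0], leaves[1])
    return best[1:]  # (regulator index, threshold, left leaf, right leaf)

def gene_loglik(row, stump, regulators):
    """Log-likelihood of one gene's expression profile under a module's stump."""
    r, thr, left, right = stump
    total = 0.0
    for c, x in enumerate(row):
        mean, var = left if regulators[r][c] <= thr else right
        total += gauss_loglik(x, mean, var)
    return total

def learn_modules(genes, regulators, n_modules, n_iter=20):
    assign = [i % n_modules for i in range(len(genes))]  # arbitrary starting point
    stumps = {}
    for _ in range(n_iter):
        # Step 1: refit one stump per non-empty module.
        stumps = {}
        for m in range(n_modules):
            rows = [g for g, a in zip(genes, assign) if a == m]
            if rows:
                stumps[m] = fit_stump(rows, regulators)
        # Step 2: move each gene to the module that gives it the highest likelihood.
        new_assign = [max(stumps, key=lambda m: gene_loglik(g, stumps[m], regulators))
                      for g in genes]
        if new_assign == assign:
            break  # converged: stumps were fit on this exact assignment
        assign = new_assign
    return assign, stumps

# Toy data: 8 conditions, 2 regulators, 3 genes driven by each regulator.
reg0 = [0, 0, 0, 0, 1, 1, 1, 1]
reg1 = [0, 1, 0, 1, 0, 1, 0, 1]
genes = [[2 * x + 0.1 * k for x in reg] for reg in (reg0, reg1) for k in range(3)]
assign, stumps = learn_modules(genes, [reg0, reg1], n_modules=2)
print(assign, {m: s[0] for m, s in stumps.items()})
```

On this toy data the genes start out interleaved across the two modules, and the loop should separate them so that each module's stump splits on the regulator that actually drives its genes. The real method replaces the stump with a full regression tree and restricts reassignments so that the regulator/module graph stays acyclic.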
- Module networks: identifying regulatory modules and their condition-specific regulators from gene expression data (2003)
- Learning Module Networks (2005)
- Computational methods for discovering gene networks from expression data (2009)
- Module networks revisited: computational assessment and prioritization of model predictions (2009)