Hierarchical Canonical Correlation Analysis Reveals Phenotype, Genotype, and Geoclimate Associations in Plants
Hierarchical canonical correlation analysis (HCCA). The figure shows the steps of HCCA. (a) At the first level, a condition number is calculated for each pair of the four datasets. The pair with the largest condition number (the gene expression and geoclimate datasets) is selected, and CCA is performed on it to find a feature representation that maximizes the correlation between the two datasets. (b) At level 2, condition numbers are calculated between the DNA methylation dataset, the mutation dataset, and the new dataset combined from the gene expression and geoclimate datasets at level 1. The pair with the larger condition number (the DNA methylation dataset and the level-1 combined dataset) is coprojected into a new dataset with CCA. (c) At the last level, the mutation dataset and the level-2 combined dataset are coprojected into the final combined dataset.
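The level-by-level procedure in the caption can be sketched in a few lines of NumPy. This is an illustrative sketch under stated assumptions, not the authors' implementation. The caption does not say how the condition number of a dataset pair is defined or how the two projections are merged, so the sketch assumes (1) the condition number of the sample cross-covariance matrix and (2) averaging of the paired canonical variates. It also reads the caption as comparing, after level 1, only pairs that involve the current combined dataset. The function names `pair_condition_number`, `cca`, `coproject`, and `hcca`, the ridge term `reg`, and the number of components `k` are all hypothetical.

```python
from itertools import combinations

import numpy as np


def pair_condition_number(X, Y):
    # ASSUMPTION: the caption does not define the pairwise condition number;
    # here it is the 2-norm condition number of the sample cross-covariance.
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    return np.linalg.cond(Xc.T @ Yc / (X.shape[0] - 1))


def _inv_sqrt(S):
    # Inverse matrix square root of a symmetric positive definite matrix.
    w, V = np.linalg.eigh(S)
    return V @ np.diag(w ** -0.5) @ V.T


def cca(X, Y, k, reg=1e-6):
    # Classical CCA via SVD of the whitened cross-covariance matrix.
    # `reg` is a small ridge term (an assumption) that keeps the covariances invertible.
    n = X.shape[0]
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    Sxx = Xc.T @ Xc / (n - 1) + reg * np.eye(X.shape[1])
    Syy = Yc.T @ Yc / (n - 1) + reg * np.eye(Y.shape[1])
    Sxy = Xc.T @ Yc / (n - 1)
    Kx, Ky = _inv_sqrt(Sxx), _inv_sqrt(Syy)
    U, s, Vt = np.linalg.svd(Kx @ Sxy @ Ky)
    # Canonical variates of X and Y, plus the canonical correlations.
    return Xc @ (Kx @ U[:, :k]), Yc @ (Ky @ Vt.T[:, :k]), s[:k]


def coproject(X, Y, k):
    # ASSUMPTION: the combined dataset is the average of paired canonical variates.
    Zx, Zy, _ = cca(X, Y, k)
    return (Zx + Zy) / 2.0


def hcca(datasets, k=2):
    # datasets: dict mapping a name to an (n_samples, n_features) array;
    # all arrays must describe the same samples in the same row order.
    remaining = dict(datasets)
    # Level 1: choose the pair with the largest condition number.
    a, b = max(
        combinations(remaining, 2),
        key=lambda p: pair_condition_number(remaining[p[0]], remaining[p[1]]),
    )
    combined = coproject(remaining.pop(a), remaining.pop(b), k)
    order = [(a, b)]
    # Later levels: pair the combined dataset with the remaining dataset
    # that gives the larger condition number, then coproject again.
    while remaining:
        name = max(
            remaining,
            key=lambda m: pair_condition_number(remaining[m], combined),
        )
        combined = coproject(remaining.pop(name), combined, k)
        order.append((name, "combined"))
    return combined, order


# Usage with synthetic data (random numbers, not real measurements).
rng = np.random.default_rng(0)
data = {
    "expression": rng.normal(size=(60, 5)),
    "geoclimate": rng.normal(size=(60, 4)),
    "methylation": rng.normal(size=(60, 6)),
    "mutation": rng.normal(size=(60, 3)),
}
final, order = hcca(data, k=2)
```

With four input datasets the loop runs three levels, matching panels (a) to (c); `order` records which pair was merged at each level, and `final` holds one row per sample. Averaging the variates keeps the combined dataset at `k` columns at every level; concatenating them instead would be an equally plausible reading of "coprojected".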