Using Tree-to-Tree Comparisons to Rank Glycan Structures
Glycans (carbohydrates) are one of the four major building blocks of biology and are critical contributors to diversity through post-translational protein modifications. Their branched primary structure differentiates glycans from linear biological molecules such as nucleic acids and proteins. Bioinformatics tools developed for linear molecules therefore do not carry over to glycans, and new tools that accommodate branching structures must be produced. Here we present a structure comparison algorithm as a contribution to the growing collection of bioinformatics tools available specifically for glycans. The tool represents glycans as mathematical trees and applies a tree-to-tree comparison algorithm to produce measures of structural similarity. Comparing glycan structures is useful in many applications; we evaluated two: ranking glycan search engine results and predicting lectin binding affinity. The algorithm ranked search engine results in a logical order, with the most similar structures appearing earliest in the results, but achieved only partial success in predicting lectin binding affinity. In the future, altering parameters within the algorithm may improve binding affinity predictions.
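To make the idea of ranking by tree similarity concrete, the following is a minimal sketch and not the algorithm presented in this work. It assumes each glycan is a rooted tree with the reducing-end residue as the root, labels nodes by monosaccharide name only (linkage and anomeric information are ignored), and scores two trees by the size of their largest shared top-down subtree, normalized as a Dice coefficient. The names `Node`, `common`, `similarity`, and `rank` are hypothetical, as are the example structures.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from itertools import permutations


@dataclass
class Node:
    """One monosaccharide residue; children are the residues attached to it."""
    label: str
    children: list[Node] = field(default_factory=list)


def size(tree: Node) -> int:
    """Total number of residues in the tree."""
    return 1 + sum(size(child) for child in tree.children)


def common(a: Node, b: Node) -> int:
    """Node count of the largest shared top-down subtree rooted at a and b.

    Children are treated as unordered. Every assignment of the smaller child
    list onto the larger one is tried, which is affordable only because
    glycan residues carry very few branches.
    """
    if a.label != b.label:
        return 0
    small, large = sorted((a.children, b.children), key=len)
    best = 0
    for assignment in permutations(large, len(small)):
        best = max(best, sum(common(x, y) for x, y in zip(small, assignment)))
    return 1 + best


def similarity(a: Node, b: Node) -> float:
    """Dice-style score in [0, 1]; 1.0 means the trees are identical."""
    return 2 * common(a, b) / (size(a) + size(b))


def rank(query: Node, candidates: dict[str, Node]) -> list[tuple[str, float]]:
    """Candidates sorted from most to least similar to the query."""
    scored = [(name, similarity(query, tree)) for name, tree in candidates.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


def mannose_arm() -> Node:
    """Illustrative fragment: GlcNAc carrying a Man that branches into two Man."""
    return Node("GlcNAc", [Node("Man", [Node("Man"), Node("Man")])])


# Illustrative structures only; they are not data from this study.
core = Node("GlcNAc", [mannose_arm()])
fucosylated = Node("GlcNAc", [Node("Fuc"), mannose_arm()])
truncated = Node("GlcNAc", [Node("GlcNAc", [Node("Man")])])
unrelated = Node("Gal", [Node("GlcNAc")])

ranked = rank(core, {
    "unrelated": unrelated,
    "truncated": truncated,
    "core+Fuc": fucosylated,
    "core": core,
})
for name, score in ranked:
    print(f"{name}: {score:.3f}")
```

In this sketch the query matches itself with a score of 1.0, the fucosylated variant follows, then the truncated structure, and a structure with a different root residue scores 0. A fuller comparison would likely also weigh linkage positions and partial matches below mismatched residues, which is the kind of parameter that could be tuned for binding affinity prediction.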