It is now possible to identify over 30 functional
subfamilies among the WD-repeat-containing proteins found
in the completed genomes. The majority of these subfamilies
have at least one member for which experimental data allow
assignment to a cellular pathway or process. Half of the 63
WD-repeat-containing proteins in Saccharomyces cerevisiae,
half of the 70 in Caenorhabditis elegans, and a
third of the more than 100 predicted in Drosophila
can be assigned to 23 of these functional subfamilies.
Perhaps indicative of the future, 33 WD-repeat-containing
proteins from the partial genome of Arabidopsis thaliana
can now be assigned to 18 of these subfamilies. These assignments
have been made possible by combining traditional sequence
similarity with an implied common beta propeller structural
context to obtain measures of protein–protein surface
similarity. The beta propeller structural context is represented
in the form of a hidden Markov model. The procedure is
fully automated.
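
As a minimal illustration of the idea of surface similarity, and not a
reproduction of the procedure summarized above, the Python sketch below
compares two aligned repeat fragments only at positions flagged as
surface-exposed. The function name `surface_identity`, the toy sequences,
and the mask are all hypothetical; in the approach described here such
positional labels would come from the hidden Markov model of the beta
propeller context, which is assumed rather than implemented.

```python
def surface_identity(seq_a, seq_b, surface_mask):
    """Fraction of identical residues at positions flagged as surface-exposed.

    seq_a, seq_b  : aligned sequences of equal length ('-' marks a gap)
    surface_mask  : per-column flags (truthy = assumed surface position)
    """
    if not (len(seq_a) == len(seq_b) == len(surface_mask)):
        raise ValueError("aligned sequences and mask must have equal length")

    # Keep only columns that are flagged as surface and ungapped in both sequences.
    pairs = [
        (a, b)
        for a, b, is_surface in zip(seq_a, seq_b, surface_mask)
        if is_surface and a != "-" and b != "-"
    ]
    if not pairs:
        return 0.0
    return sum(a == b for a, b in pairs) / len(pairs)


# Toy, made-up aligned fragments and a made-up surface mask (assumptions).
seq_a = "GHSDWVTS"
seq_b = "GHTDAVTS"
mask = [0, 1, 1, 0, 1, 0, 1, 1]

# Similarity restricted to the flagged surface columns.
surface_score = surface_identity(seq_a, seq_b, mask)

# Plain identity over every column, for comparison.
overall_score = surface_identity(seq_a, seq_b, [1] * len(seq_a))
```

In this toy case the two fragments look more alike over all columns than
over the flagged surface columns alone, which is the kind of distinction a
surface-restricted measure is meant to expose. A real measure would
presumably use a substitution matrix rather than strict identity.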