Takuya Kirioka, Panyavut Aumpuchin and Takeshi Kikuchi
The information on the 3D structure of a protein, including its folding mechanism, is encoded in its amino acid sequence. β-Trefoil proteins are well known for a remarkable structural property: pseudo three-fold symmetry without clear hydrophobic packing. It is therefore of interest to investigate how the information on the folding mechanism that produces such a topology is encoded in the amino acid sequence. In this study, analyses based on inter-residue average distance statistics and on the conservation of hydrophobic residues are performed for the sequences of 26 β-trefoil proteins to identify sites that are significant for the initial folding, and the results are compared with the native 3D structures. The conserved hydrophobic residues are defined by multiple sequence alignment based on the 3D structures. It is confirmed that a conserved hydrophobic residue is always located in a β-strand. In particular, the analyses based on the inter-residue average distance statistics indicate that β-strands 5 and 6 are significant for the initial folding. These results coincide well with the experimental data obtained so far for the folding of some β-trefoil proteins. It is also confirmed that the conserved hydrophobic residues defined in this study generally contribute to the formation of hydrophobic packing in β-trefoil proteins. Twelve conserved hydrophobic residue pairs are almost always observed to form packing in the 26 β-trefoil proteins, which come from different superfamilies. We elucidate how the conserved hydrophobic residues in β-strands 5 and 6 contribute to the initial stage of folding of a β-trefoil protein. The common packing of the 12 conserved hydrophobic residue pairs is significant for the formation of the whole β-trefoil fold.
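As an illustration of how conserved hydrophobic positions might be read off a multiple sequence alignment, the following is a minimal sketch rather than the procedure used in this study. The hydrophobic residue set, the 0.8 conservation threshold, the function name and the toy alignment are all assumptions introduced here for illustration only.

```python
# Illustrative sketch only: flag alignment columns in which hydrophobic
# residues are conserved. The residue set and the default threshold are
# assumptions, not values taken from the study.
HYDROPHOBIC = set("AVLIMFWC")  # assumed definition of "hydrophobic"


def conserved_hydrophobic_columns(alignment, min_fraction=0.8):
    """Return 0-based column indices where at least `min_fraction`
    of the aligned sequences carry a hydrophobic residue."""
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")

    conserved = []
    for col in range(length):
        # Gaps ('-') and non-hydrophobic residues both count against the column.
        n_hydrophobic = sum(1 for seq in alignment if seq[col] in HYDROPHOBIC)
        if n_hydrophobic / len(alignment) >= min_fraction:
            conserved.append(col)
    return conserved


# Toy alignment (hypothetical sequences, not real β-trefoil proteins).
toy_alignment = ["AVLK-", "AILKG", "VVFRG", "LKLEG"]
print(conserved_hydrophobic_columns(toy_alignment))
```

In a structure-based alignment, the flagged columns could then be mapped back to the secondary structure of each protein to check whether they fall within β-strands.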