Collagen has a unique folding mechanism that begins with the formation of a triple-helical structure near its C terminus, followed by propagation of this structure to the N terminus. To elucidate factors that affect the folding of collagen, we explored the folding pathway of collagen-like model peptides using detailed molecular simulations with explicit solvent. Using biased molecular dynamics, we examined the latter stages of folding of a peptide model of native collagen, (Pro-Hyp-Gly)10, and a peptide that models a Gly → Ser mutation found in several forms of osteogenesis imperfecta, (Pro-Hyp-Gly)3-Pro-Hyp-Ser-(Pro-Hyp-Gly)6. Starting from an unfolded state that contains a C-terminal nucleated trimer,
(Pro-Hyp-Gly)10 folds to a structure in which two of the three chains associate through water-mediated hydrogen bonds and the third is relatively separated from this dimer. Calculated free-energy profiles for folding from this intermediate to the final triple-helical structure suggest that further folding occurs at a rate of approximately one Pro-Hyp-Gly triplet per millisecond. In contrast, after 6 ns of biased dynamics, the region N-terminal to the Ser residue in (Pro-Hyp-Gly)3-Pro-Hyp-Ser-(Pro-Hyp-Gly)6 folds to a structure in which the three chains form close contacts near the N terminus, away from the mutation site. Further folding to an ideal triple-helical structure at the site of the mutation is unfavorable, because the free energy of a triple-helical conformation at this position is more than 20 kcal/mol higher than that of a structure with unassociated chains. These data provide insights into the folding pathway of native collagen and the events underlying the formation of misfolded structures.
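
To give a sense of scale for the free-energy difference of more than 20 kcal/mol reported at the mutation site, the following minimal sketch converts a free-energy gap into an equilibrium Boltzmann population ratio. It is an illustration only and not part of the reported simulations: the temperature of 300 K is an assumption (the abstract does not state one), and the function name `boltzmann_ratio` is hypothetical.

```python
import math

# Molar gas constant in kcal/(mol*K).
R_KCAL_PER_MOL_K = 1.987204e-3


def boltzmann_ratio(delta_g_kcal_per_mol, temperature_k=300.0):
    """Equilibrium population ratio (higher-energy state / lower-energy state).

    delta_g_kcal_per_mol: free-energy gap between the two states, in kcal/mol.
    temperature_k: absolute temperature; 300 K is an assumed value, not one
    taken from the abstract.
    """
    return math.exp(-delta_g_kcal_per_mol / (R_KCAL_PER_MOL_K * temperature_k))


# The abstract reports a gap of "more than 20 kcal/mol", so the value computed
# for exactly 20 kcal/mol is an upper bound on the triple-helical population
# relative to the unassociated state at the mutation site.
ratio = boltzmann_ratio(20.0)
print(f"population ratio at 20 kcal/mol, 300 K: {ratio:.1e}")
```

Under these assumptions the ratio is on the order of 10^-15, consistent with the statement that an ideal triple helix at the mutation site is strongly disfavored.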