Cathepsin B is a lysosomal thiol proteinase that may have additional extralysosomal functions. To further our investigations on the structure, mode of biosynthesis, and intracellular sorting of this enzyme, we have determined the complete coding sequences for human and mouse preprocathepsin B by using cDNA clones isolated from human hepatoma and kidney phage libraries. The nucleotide sequences predict that the primary structure of
preprocathepsin B contains 339 amino acids organized as follows: a 17-residue NH2-terminal prepeptide sequence followed by a 62-residue propeptide region, 254 residues in mature (single-chain) cathepsin B, and a 6-residue extension at the COOH terminus. A comparison of
procathepsin B sequences from three species (human, mouse, and rat) reveals that the propeptides are relatively well conserved, with a minimum of 68% sequence identity. In particular, two conserved features of the propeptide that may be functionally significant are a potential glycosylation site and a single cysteine at position 59. Comparative analysis of the three sequences also suggests that processing of
procathepsin B occurs in multiple steps, during which enzymatically active intermediate forms may be generated. The availability of the cDNA clones will facilitate the identification of possible active or inactive processing intermediates as well as studies on the transcriptional regulation of the cathepsin B gene.
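
The segment arithmetic and the identity measure described above can be illustrated with a minimal sketch. It assumes residues are numbered from position 1 of the full preproenzyme, which the text does not state, so the resulting boundaries are illustrative only. The helper names (`segment_ranges`, `percent_identity`) and the short test strings are hypothetical and are not cathepsin B sequences. The identity function expects sequences that are already aligned and does not perform alignment itself.

```python
# Segment lengths (in residues) as stated in the text.
SEGMENTS = [
    ("prepeptide", 17),
    ("propeptide", 62),
    ("mature cathepsin B", 254),
    ("COOH-terminal extension", 6),
]


def segment_ranges(segments):
    """Return (name, start, end) tuples with 1-based inclusive positions.

    Assumption: numbering starts at residue 1 of the full preproenzyme.
    """
    ranges = []
    start = 1
    for name, length in segments:
        end = start + length - 1
        ranges.append((name, start, end))
        start = end + 1
    return ranges


def percent_identity(seq_a, seq_b):
    """Percent identity of two pre-aligned, equal-length sequences.

    Columns in which either sequence has a gap ('-') are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    if not pairs:
        return 0.0
    matches = sum(1 for a, b in pairs if a == b)
    return 100.0 * matches / len(pairs)


layout = segment_ranges(SEGMENTS)
total_length = sum(length for _, length in SEGMENTS)

if __name__ == "__main__":
    for name, start, end in layout:
        print(f"{name}: residues {start}-{end}")
    print("total residues:", total_length)
    # Hypothetical toy alignment, not real propeptide sequence.
    print(percent_identity("ACDE", "ACDF"))
```

The four segment lengths sum to the 339 residues predicted for preprocathepsin B. For real sequences, the aligned inputs would come from a dedicated alignment tool rather than from hand-written strings.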