The nucleotide sequence of the Mr = 28,500 flagellin gene of Caulobacter crescentus
Journal of Biological Chemistry
The DNA sequences which encode the Mr = 28,500 flagellin polypeptide of Caulobacter crescentus CB15 have been determined. The size of the protein, deduced from its DNA sequence (276 amino acids), is in agreement with its apparent molecular weight as measured by sodium dodecyl sulfate-polyacrylamide gel electrophoresis. The distribution of arginine residues within the protein sequence encoded by the gene correlates with their relative location as predicted by peptide alignment analysis (Gill, P.R., and Agabian, N. (1982) J. Bacteriol. 150, 925-933). DNA sequences 5' and 3' to the coding sequence were also determined. In the 5' region, DNA sequences homologous to consensus sequences associated with RNA polymerase recognition and transcription initiation sites in Escherichia coli (Pribnow box) are found. These are centered around 60, 90, and 120 base pairs upstream from the ATG codon at the beginning of the structural gene. Sequences 3' to the coding region were identified which might signal transcription termination. A typical E. coli 16 S ribosomal binding site (Shine-Dalgarno sequence) is located just 5' to the coding sequence, and for most of the amino acids there is a strong codon usage preference. Although this protein is exported from the cell (Gill, P.R., and Agabian, N. (1982) J. Bacteriol. 150, 925-933), the encoded NH2-terminal amino acid sequence is not different from that of the mature product.