General Designs Reveal a Purine-Pyrimidine Structural Code in Human DNA
The human genome carries a vast amount of information within its DNA sequences. The chemical bases A, T, C, and G are the basic units of information content, that are arranged into patterns and codes. Expansive areas of the genome contain codes that are not yet well understood. To decipher these, mathematical and computational tools are applied here to study genomic signatures or general designs of sequences. A novel binary components analysis is devised and utilized. This seeks to isolate the
... hysical and chemical properties of DNA bases, which reveals sequence design and function. Here, information theory tools break down the information content within DNA bases, in order to study them in isolation for their genomic signatures and non-random properties. In this way, the RY (purine/pyrimidine), WS (weak/strong), and KM (keto/amino) general designs are observed in the sequences. The results show that RY, KM, and WS components have a similar and stable overall profile across all human chromosomes. It reveals that the RY property of a sequence is most distant from randomness in the human genome with respect to the genomic signatures. This is true across all human chromosomes. It is concluded that there exists a widespread potential RY code, and furthermore, that this is likely a structural code. Ascertaining this feature of general design, and potential RY structural code has far-reaching implications. This is because it aids in the understanding of cell biology, growth, and development, as well as downstream in the study of human disease and potential drug design.