A CORRELATION INVESTIGATION OF BACTERIAL DNA CODING SEQUENCES

OANA ZAINEA, V.V. MORARIU

Department of Molecular and Biomolecular Physics, National Institute for R&D of Isotopic and Molecular Technology, 400293, Cluj-Napoca, Romania

The series of coding sequence (CDS) lengths in the genomes of E. coli and B. subtilis were considered. Their organization was investigated by detrended fluctuation analysis (DFA), which characterizes the correlations in a series. Over the whole genome, the CDS length series show a low level of correlation, which suggests near randomness, that is, an almost complete lack of organization at the genome level. However, this result is only apparent. If the correlation characteristic were uniform throughout the genome series, it would remain constant when different segments of the series are analysed. We segmented each genome series into four quarters and performed DFA on each segment. The results show a non-uniform correlation characteristic throughout the genome, ranging from high correlation or anti-correlation to near randomness. High correlation is present in the second quarter of the B. subtilis genome and in the first quarter of the E. coli genome. This suggests that genes of similar length are located preferentially in these segments.
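
The segment-wise analysis described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not state the detrending order, the window sizes, or the software used, so the linear detrending (`order=1`), the logarithmically spaced window sizes, and the function names `dfa_exponent` and `quarter_exponents` are all assumptions made for illustration. The sketch estimates the DFA scaling exponent, for which a value near 0.5 indicates an uncorrelated series, a value above 0.5 indicates correlation, and a value below 0.5 indicates anti-correlation. The input in the usage part is synthetic noise, not genome data.

```python
import numpy as np


def dfa_exponent(series, scales=None, order=1):
    """Estimate the DFA scaling exponent of a 1-D series.

    Assumptions (not taken from the paper): polynomial detrending of the
    given order in non-overlapping windows, with window sizes spaced
    logarithmically between 8 and len(series) // 4.
    """
    x = np.asarray(series, dtype=float)
    n = x.size
    if n < 64:
        raise ValueError("series too short for this sketch (need >= 64 points)")

    # Step 1: integrate the mean-subtracted series to obtain the profile.
    profile = np.cumsum(x - x.mean())

    if scales is None:
        scales = np.unique(
            np.logspace(np.log10(8), np.log10(n // 4), 12).astype(int)
        )

    fluctuations = []
    for s in scales:
        n_win = n // s
        # Step 2: cut the profile into non-overlapping windows of length s.
        windows = profile[: n_win * s].reshape(n_win, s).T  # shape (s, n_win)
        t = np.arange(s)
        # Step 3: fit and remove a polynomial trend in every window at once.
        coeffs = np.polyfit(t, windows, order)        # shape (order + 1, n_win)
        trend = np.vander(t, order + 1) @ coeffs      # shape (s, n_win)
        residuals = windows - trend
        # Step 4: root-mean-square fluctuation at this window size.
        fluctuations.append(np.sqrt(np.mean(residuals ** 2)))

    # Step 5: the exponent is the slope of log F(s) against log s.
    slope, _intercept = np.polyfit(np.log(scales), np.log(fluctuations), 1)
    return slope


def quarter_exponents(series):
    """Split a series into four consecutive quarters and run DFA on each."""
    quarters = np.array_split(np.asarray(series, dtype=float), 4)
    return [dfa_exponent(q) for q in quarters]


if __name__ == "__main__":
    # Synthetic stand-in for a CDS length series (hypothetical data).
    rng = np.random.default_rng(0)
    lengths = rng.normal(size=4000)
    print("whole series:", dfa_exponent(lengths))
    print("quarters:", quarter_exponents(lengths))
```

Comparing the exponent of the whole series with the four quarter exponents is the check the abstract relies on: a uniformly organized series would give similar values in every quarter, whereas markedly different values point to segments with different correlation characteristics.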

Corresponding author’s e-mail: oanaz@itim-cj.ro
