In the complement of complete RefSeq genomes, the full set of ribosomal RNAs and tRNAs has been added, either as functional genes or as potential pseudogenes (Figure 1B). The only cases where this minimal standard could not be met were due either to issues with the sequence (sequencing or assembly) or to genuine biological causes, as in the small, compact genomes of endosymbionts. For example, Candidatus Hodgkinia cicadicola Dsem is missing several key functional tRNAs due to codon recoding [66].

Table 3 Selected annotation report examples

Figure 1 Selected comparisons of genome measures. Principal component analysis showed expected relationships among the different measures (data not shown). Selected examples are plotted as double y-axis scatterplots. Legends indicate first or second y-axis for …
Further examination of the annotation measures across all genomes shows how other measures interact. For example, increasing coding density (more genes per kbp) results from an increase in the ratio of short proteins (proteins shorter than 150 amino acids / total proteins; Figure 2C). As the coding density and the ratio of short proteins increase, the average protein length decreases, a logical result because the increased coding density is due to an increase in short, overlapping predicted ORFs. A more subtle effect is that, with increasing coding density, the ratio of hypothetical to total proteins in the genome increases, whereas use of the standard ATG start codon decreases (Figure 2D). Increasing GC content also coincides with the use of alternative start codons such as GTG.
However, increasing GC content and increasing genome length do not generally result in an increase in the hypothetical protein ratio (data not shown), suggesting that these trends are due to differences in annotation quality.

Figure 2 Heatmap of selected annotation report measures for gammaproteobacteria. A set of measures was chosen corresponding to those used in principal component analysis (data not shown) but restricted to INSDC genomes from gammaproteobacteria. A two-dimensional …

Although genome streamlining can affect these measures (for example, many genomes from the genus Prochlorococcus exhibit increased coding density), other factors are at play [64,67,68]. This is more clearly seen when closely related genomes are compared in a heatmap [69]. Selected annotation measures for the gammaproteobacteria are compared in a heatmap in Figure 2. In several cases, increases or decreases in physical measures (length, GC content) or derived measures are due to biological causes. For example, gammaproteobacterial endosymbionts such as Buchnera spp. exhibit reduced genome size and decreased GC content [70,71].
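
The derived measures discussed above (coding density, short-protein ratio, average protein length, hypothetical-protein ratio and ATG start usage) can be illustrated with a minimal Python sketch. It is not the pipeline used to produce the annotation reports. The `Protein` record, the `annotation_measures` function and the toy values are hypothetical, the protein count is assumed to stand in for the gene count, and a protein is assumed to be "hypothetical" when its product name contains that word. Only the 150 amino acid threshold is taken from the text.

```python
from typing import Dict, List, NamedTuple


class Protein(NamedTuple):
    """Hypothetical minimal record for one annotated protein."""
    length_aa: int      # protein length in amino acids
    product: str        # product name from the annotation
    start_codon: str    # start codon of the coding sequence, e.g. "ATG"


SHORT_PROTEIN_AA = 150  # threshold for "short" proteins, as stated in the text


def annotation_measures(genome_length_bp: int,
                        proteins: List[Protein]) -> Dict[str, float]:
    """Compute simple derived annotation measures for one genome.

    Assumption: each protein corresponds to one gene, so the protein
    count is used as the gene count for coding density.
    """
    if genome_length_bp <= 0:
        raise ValueError("genome_length_bp must be positive")
    if not proteins:
        raise ValueError("at least one protein is required")

    total = len(proteins)
    short = sum(1 for p in proteins if p.length_aa < SHORT_PROTEIN_AA)
    # Assumption: "hypothetical" in the product name marks a hypothetical protein.
    hypothetical = sum(1 for p in proteins if "hypothetical" in p.product.lower())
    atg = sum(1 for p in proteins if p.start_codon.upper() == "ATG")

    return {
        "genes_per_kbp": total / (genome_length_bp / 1000),
        "short_protein_ratio": short / total,
        "mean_protein_length_aa": sum(p.length_aa for p in proteins) / total,
        "hypothetical_ratio": hypothetical / total,
        "atg_start_ratio": atg / total,
    }


if __name__ == "__main__":
    # Toy values for illustration only; not taken from any real genome.
    toy = [
        Protein(100, "hypothetical protein", "GTG"),
        Protein(300, "DNA polymerase", "ATG"),
        Protein(120, "hypothetical protein", "ATG"),
        Protein(480, "ATP synthase subunit", "TTG"),
    ]
    for name, value in annotation_measures(4000, toy).items():
        print(f"{name}: {value:.3f}")
```

Computing these values per genome and then comparing them across closely related genomes is one way to see the trends described in the text, for example a rising short-protein ratio accompanying a rising coding density.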