Oral Presentation 44th Lorne Genome Conference 2023

Epigenetic signatures of invertebrate CpG island-like sequences (#23)

Allegra Angeloni 1,2, Deniz Kaya 3, Kevin Pang 4, Andreas Hejnol 4, Linda Dansereau 1,5, Andrew Philp 1,5, Ulrich Technau 6, Vanessa Liang 7, Greg Neely 7, William Hatleberg 8, Bernard Degnan 9, Robert Klose 3, Alex de Mendoza 10, Ozren Bogdanovic 2,11
  1. Garvan Institute of Medical Research, Sydney, Australia
  2. School of Biotechnology and Biomolecular Sciences, UNSW, Sydney, Australia
  3. Department of Biochemistry, University of Oxford, Oxford, United Kingdom
  4. Department of Biological Sciences, University of Bergen, Bergen, Norway
  5. St Vincent's Clinical School, UNSW, Sydney, Australia
  6. Department of Molecular Evolution and Development, Centre for Organismal Systems Biology, University of Vienna, Vienna, Austria
  7. Charles Perkins Centre, School of Life and Environmental Sciences, University of Sydney, Sydney, Australia
  8. Department of Biological Sciences, Carnegie Mellon University, Pittsburgh, United States
  9. Centre for Marine Science, School of Biological Sciences, University of Queensland, Brisbane, Australia
  10. School of Biological and Chemical Sciences, Queen Mary University of London, London, United Kingdom
  11. Centro Andaluz de Biología del Desarrollo, Seville, Spain

CpG islands (CGIs) are a DNA sequence class conserved at vertebrate gene regulatory elements. A defining feature of CGIs is the lack of DNA methylation (5-methylcytosine, 5mC), an epigenetic modification associated with gene silencing. CGIs have been studied exclusively in the context of hypermethylated vertebrate genomes; however, it is unclear whether 5mC is the primary determinant of CGI regulatory function. Unlike vertebrate genomes, invertebrate genomes are typically sparsely methylated, so the possibility that invertebrate genomes contain CGIs has not been considered. This study aims to establish whether CGIs are a vertebrate-specific innovation or a deeply conserved feature of metazoan gene regulatory elements that exists independently of genomic 5mC content.

Non-methylated CpG island-like sequences (NMIs) were sequenced from eight invertebrate genomes using BioCAP-seq, a biochemical method based on protein affinity pulldown of CpG-rich DNA. We selected invertebrate genomes with varying 5mC levels, ranging from the demosponge Amphimedon queenslandica to the chordate Branchiostoma lanceolatum. Analysis of invertebrate NMIs revealed close similarities to vertebrate CGIs identified experimentally and through sequence-based algorithms. Enriched BioCAP-seq signal was present at computationally predicted invertebrate CGIs, verifying the presence of CGI-like sequence features at invertebrate NMIs. Bisulfite sequencing and ATAC-seq confirmed NMI hypomethylation and association with accessible chromatin, respectively. NMIs were predominantly localized to promoters and gene bodies. Promoter-associated NMIs contained binding motifs for methyl-sensitive and chromatin remodeling transcription factors and were more highly conserved than non-NMI promoters (phastCons, p < 0.001). Finally, we examined the functional conservation of CGIs in invertebrates by validating the capacity of candidate NMIs to drive transgenic expression in the vertebrate zebrafish.

In summary, NMIs identified in sparsely methylated invertebrate genomes resemble CGIs in heavily methylated vertebrate genomes, challenging the long-standing assumption that 5mC determines CGI function. Elucidating the epigenetic factors necessary for CGI evolution provides valuable insights into the fundamental mechanisms controlling gene expression.