Project D01

Dr. Georg Künze

Leipzig University
Institute for Drug Discovery

JProf. Dr. Panagiotis Kastritis

MLU Halle-Wittenberg
Inst. of Biochemistry & Biotechnology

Modeling, understanding, and predicting >2 million Arabidopsis proteoforms in the post-genomic era

For most of the more than 2 million non-synonymous single nucleotide polymorphisms (SNPs) found in the assembled Arabidopsis genomes of the 1001 Genomes Project, the effect on protein structure and function is unknown. The single amino acid variations encoded by these SNPs can have complex consequences, including changes in protein expression, stability, transport, post-translational modification, and molecular interactions. To obtain a comprehensive overview of the landscape of single amino acid variations in Arabidopsis, our project performs proteome-wide bioinformatic analyses of SNP effects across all Arabidopsis proteins. Structural modeling with AlphaFold and classical biophysics-based approaches reveals the 3D distribution and energetic impact of amino acid variations in Arabidopsis proteins and protein complexes. In silico deep mutational profiling with artificial intelligence methods allows the identification of regions with high evolutionary constraint and functional importance. Furthermore, our project develops bioinformatics tools for 3D visualization and effect prediction of protein variants. These tools are implemented in the SNPstar database, which provides a user-friendly interface for exploring the repertoire of SNPs in Arabidopsis.