
Pan-genomic open reading frames: A potential supplement of single nucleotide polymorphisms in estimation of heritability and genomic prediction


Authors: Zhengcao Li aff001; Henner Simianer aff001
Author affiliations: Animal Breeding and Genetics Group, Center for Integrated Breeding Research, Department of Animal Sciences, University of Goettingen, Goettingen, Germany aff001; State Key Laboratory of Biocontrol, School of Life Sciences, Sun Yat-Sen University, Guangzhou, China aff002
Published in: Pan-genomic open reading frames: A potential supplement of single nucleotide polymorphisms in estimation of heritability and genomic prediction. PLoS Genet 16(8): e1008995. doi:10.1371/journal.pgen.1008995
Category: Research Article
doi: https://doi.org/10.1371/journal.pgen.1008995

Abstract

Pan-genomic open reading frames (ORFs) potentially carry information on protein-coding genes and coding variants in a population. In this study, we propose that pan-genomic ORFs are promising predictors for the estimation of heritability and for genomic prediction. A Saccharomyces cerevisiae dataset with whole-genome SNPs, pan-genomic ORFs, and the copy numbers of those ORFs is used to test the effectiveness of ORF data as a predictor in three prediction models for 35 traits. Our results show that ORF-based heritability captures more genetic effects than SNP-based heritability for all traits. Compared to SNP-based genomic prediction (GBLUP), pan-genomic ORF-based genomic prediction (OBLUP) is distinctly more accurate for all traits, and on average the predictive abilities are more than doubled across traits. For four traits, prediction based on ORF copy numbers (CBLUP) is more accurate than OBLUP. When different numbers of isolates are used in the training sets for ORF-based prediction, the predictive abilities for all traits increase as more isolates are added, suggesting that with very large training sets the prediction accuracy will be in the range of the square root of the heritability. We conclude that pan-genomic ORFs have the potential to supplement single nucleotide polymorphisms in the estimation of heritability and in genomic prediction.
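The abstract does not spell out the prediction models, so the following is only a minimal sketch of how a BLUP-type predictor of this kind is commonly set up. It assumes that a relationship matrix is built from a centered feature matrix, and that GBLUP, OBLUP, and CBLUP differ only in which features (SNP genotypes, ORF presence/absence, or ORF copy numbers) go into that matrix. The function names, the simulated data, and the fixed heritability value are all hypothetical and are not taken from the study.

```python
import numpy as np

def relationship_matrix(features):
    """Relationship matrix from an isolates-by-features matrix.

    `features` may hold SNP genotypes, ORF presence/absence (0/1),
    or ORF copy numbers; only the input differs between the models.
    """
    X = np.asarray(features, dtype=float)
    Z = X - X.mean(axis=0)                 # center every column
    return Z @ Z.T / Z.var(axis=0).sum()   # scale so the mean diagonal is 1

def blup_predict(K, y, train, test, h2):
    """Predict phenotypes of `test` isolates from `train` isolates.

    h2 is an assumed heritability; it sets the shrinkage (1 - h2) / h2.
    """
    lam = (1.0 - h2) / h2
    mu = y[train].mean()
    K_tt = K[np.ix_(train, train)]
    K_st = K[np.ix_(test, train)]
    weights = np.linalg.solve(K_tt + lam * np.eye(len(train)), y[train] - mu)
    return mu + K_st @ weights

# Simulated toy data (NOT the yeast dataset): 300 isolates, 50 ORFs.
rng = np.random.default_rng(0)
n, p, h2 = 300, 50, 0.6
orf = rng.integers(0, 2, size=(n, p))          # ORF presence/absence
g = orf @ rng.normal(size=p)                   # simulated genetic values
e = rng.normal(size=n) * np.sqrt(g.var() * (1 - h2) / h2)
y = g + e                                      # simulated phenotypes

train, test = np.arange(240), np.arange(240, 300)
K = relationship_matrix(orf)
pred = blup_predict(K, y, train, test, h2)

# Predictive ability: correlation between predicted and observed phenotypes.
ability = np.corrcoef(pred, y[test])[0, 1]
print(f"predictive ability: {ability:.2f}; sqrt(h2): {np.sqrt(h2):.2f}")
```

Replacing `orf` with a SNP genotype matrix or a copy-number matrix would give the analogous SNP-based or copy-number-based predictor under the same assumptions. In practice the heritability would be estimated from the data rather than fixed in advance.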

Keywords:

Gene prediction – Genetics – Genomics – Heredity – Human genomics – Principal component analysis – Saccharomyces cerevisiae – Single nucleotide polymorphisms



Article published in the journal

PLOS Genetics


2020, Issue 8