Genome Assembly Page

Since the release of the first full soybean genome assembly in 2010, assemblies have been generated for more than 50 accessions, including multiple assemblies for the first reference, Williams 82 (Wm82).

There are several nomenclature patterns for the assemblies and annotations. The pattern used by the DOE-JGI and SoyBase has generally taken the form Wm82.a4.v1, with the middle field ("a4") indicating the assembly version and the last field ("v1") indicating the annotation version. Within the SoyBase and LegumeInfo Data Store, the pattern takes the form Wm82.gnm4.ann1, where again the middle field ("gnm4") indicates the assembly version and the last field ("ann1") indicates the annotation version.
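The mapping between the two naming patterns can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `to_data_store_name` and the regular expression are hypothetical, the optional "Glyma." prefix is assumed to be dropped (as in Glyma.Lee.a1 versus Lee.gnm1 on this page), and a minor annotation revision such as the trailing ".1" in Wm82.a1.v1.1 is assumed to be discarded.

```python
import re

# Assumed shape of a common-style name (hypothetical pattern, not an official spec):
#   [Glyma.]<accession>.a<assembly>[.v<annotation>[.<minor revision>]]
_COMMON_NAME = re.compile(
    r"^(?:Glyma\.)?(?P<acc>[A-Za-z0-9]+)\.a(?P<asm>\d+)"
    r"(?:\.v(?P<ann>\d+)(?:\.\d+)?)?$"
)


def to_data_store_name(common_name: str) -> str:
    """Convert e.g. 'Wm82.a4.v1' to the Data Store style 'Wm82.gnm4.ann1'."""
    match = _COMMON_NAME.match(common_name)
    if match is None:
        raise ValueError(f"unrecognized assembly name: {common_name!r}")
    name = f"{match['acc']}.gnm{match['asm']}"
    # The annotation field is optional: 'Wm82.a4' names the assembly alone.
    if match["ann"] is not None:
        name += f".ann{match['ann']}"
    return name


print(to_data_store_name("Wm82.a4.v1"))
print(to_data_store_name("Wm82.a2"))
```

Names that do not fit the assumed pattern raise a `ValueError` rather than being guessed at, which seems the safer choice when identifiers are used to look up files.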

See below for details about the main reference genotypes and assemblies in use in the U.S.

Williams 82 Reference Genome Assemblies

The soybean Williams 82 (Wm82) reference genome has been released in three assembly versions, beginning with a whole-genome shotgun assembly in 2010.

Assembly (Common)  Assembly (Data Store)  Annotation (Common)  Annotation (Data Store)  Annotation (JGI)
Wm82.a4            Wm82.gnm4              Wm82.a4.v1           Wm82.gnm4.ann1           Wm82.a4.v1
Wm82.a2            Wm82.gnm2              Wm82.a2.v1           Wm82.gnm2.ann1           Wm82.a2.v1
Wm82.a1            Wm82.gnm1              Wm82.a1.v1.1         Wm82.gnm1.ann1           Glyma1

Wm82 Genome Assembly Details

The soybean Williams 82 (Wm82) reference genome has been released in three assembly versions, beginning with an 8X whole-genome shotgun assembly in 2010. The soybean community has adopted naming standards to accommodate multiple Glycine species, cultivars, and genome assemblies.

Wm82.a1.v1.1

Originally generated by the Joint Genome Institute (JGI) in 2008 as an 8X whole-genome shotgun assembly. The assembly was officially reported in Nature in 2010. This assembly is known as Wm82.a1.v1.1 at SoyBase. It differs from Wm82.a1.v1 by the inclusion of more expression data used to construct the gene models.

Accession Source
GCF_000004515.3 RefSeq
GCA_000004515.2 GenBank
Glycine max v1.1 JGI/Phytozome
Wm82.gnm1 SoyBase/LIS Data Store

Wm82.a2.v1

The original assembly (Glyma1, known at SoyBase as Wm82.a1.v1.1) was generated by the Joint Genome Institute (JGI) in 2008. The Wm82.a2.v1 assembly replaces it, correcting it with additional EST information and a denser genetic map produced by Perry Cregan and Qijian Song at the Beltsville Agricultural Research Center West, USDA, ARS (Song Q, Jenkins J, Jia G, et al. BMC Genomics. 2016;17:33). The new map corrects several issues in pseudomolecule reconstruction in the Glyma1 assembly. The Wm82.a2.v1 gene set was created by integrating ~1.6 million ESTs, some 454-sequenced ESTs, and 1.5 billion paired-end Illumina RNA-seq reads with homology-based gene predictions. This is the version known as Wm82.a2.v1 at SoyBase.

Accession Source
GCF_000004515.5 RefSeq
GCA_000004515.2 GenBank
Glycine max Wm82.a2.v1 JGI/Phytozome
Wm82.gnm2 SoyBase/LIS Data Store

Wm82.a4.v1

Originally generated by the Joint Genome Institute (JGI) in 2019. This version differs from Wm82.a2.v1 by the inclusion of more RNA evidence for the new gene models and by the use of optical mapping to resolve structural discrepancies with other assemblies. The assembly was further improved with long PacBio reads. This is the version known as Wm82.a4.v1 at SoyBase.

Accession Source
GCF_000004515.5 RefSeq
GCA_000004515.2 GenBank
Glycine max Wm82.a4.v1 JGI/Phytozome
Wm82.gnm4 SoyBase/LIS Data Store

Other Cultivar Genome Assembly Details

CV Fiskeby III Assembly Details

The soybean cultivar Fiskeby III (PI 438471) reference genome was released in 2020 by the Joint Genome Institute (JGI). The plant introduction was developed in Sweden in the early 1970s. It has been a focus of development because of its tolerance to many biotic and abiotic stresses.

Glyma.FiskebyIII.a1.v1.1

The Joint Genome Institute (JGI) released the genome sequence of Fiskeby III in 2020. It is known as Glyma.FiskebyIII.a1.v1.1 at SoyBase.

Accession Source
Glycine max Fiskeby III v1 JGI/Phytozome
FiskebyIII.gnm1 SoyBase/LIS Data Store

CV Lee Genome Assembly Details

The soybean cultivar Lee reference genome was produced by the Joint Genome Institute (JGI) and released in 2018. Lee is a representative of the Southern soybean germplasm, whereas Williams 82 is a representative of the Northern germplasm. The Southern germplasm has been developed for different biotic and abiotic phenotypes than the Northern germplasm, so comparing the genome sequence of Lee with that of Williams 82 may shed light on the source of the adaptive phenotypes in the Southern germplasm.

Glyma.Lee.a1.v1.1

Originally generated by the Joint Genome Institute (JGI) in 2018, where it is known as Glycine max Lee v1.1. This assembly is known as Glyma.Lee.a1.v1.1 at SoyBase.

Accession Source
GCA_002905335.1 GenBank
Glycine max Lee.v1.1 JGI/Phytozome
Lee.gnm1 SoyBase/LIS Data Store

Assembly Chromosome Lengths

Chr. Wm82.a1 Wm82.a2 Wm82.a4 Glyma.Lee.a1 Glyma.ZH13.a1 Glyma.FiskebyIII.a1
Gm01 55,915,596 56,831,625 57,932,356 58,711,476 59,644,097 58,603,989
Gm02 51,656,714 48,577,506 50,400,359 52,519,506 51,554,906 51,753,574
Gm03 47,781,077 45,779,782 46,951,867 48,043,252 47,131,600 47,330,301
Gm04 49,243,853 52,389,147 51,203,390 53,766,020 53,205,140 52,141,793
Gm05 41,936,505 42,234,499 42,274,531 43,551,194 44,072,942 43,497,425
Gm06 50,722,822 51,416,487 50,945,865 52,961,590 52,461,415 51,481,042
Gm07 44,683,158 44,630,647 44,949,257 46,256,006 47,739,153 45,972,095
Gm08 46,995,533 47,837,941 47,227,185 49,267,166 50,214,344 47,814,857
Gm09 46,843,751 50,189,765 50,572,669 50,397,292 51,981,488 49,460,384
Gm10 50,969,636 51,566,899 51,638,688 53,727,963 53,404,89 53,385,864
Gm11 39,172,791 34,766,868 39,643,746 39,810,553 40,685,020 39,393,093
Gm12 40,113,141 40,091,315 41,531,200 43,006,440 43,299,438 43,161,464
Gm13 44,408,972 45,874,163 45,225,049 46,215,650 46,899,706 46,185,279
Gm14 49,711,205 49,042,193 49,893,279 51,385,243 51,865,536 53,070,468
Gm15 50,939,161 51,756,344 53,754,296 55,555,992 54,175,411 52,060,485
Gm16 37,397,386 37,887,015 38,112,071 39,184,906 38,219,656 38,284,019
Gm17 41,906,775 41,641,367 41,740,657 43,086,465 42,702,338 41,905,860
Gm18 62,308,141 58,018,743 58,286,271 60,485,748 60,216,011 61,311,078
Gm19 50,589,442 50,746,917 51,272,881 52,627,897 52,410,246 51,186,492
Gm20 46,773,168 47,904,182 47,846,027 50,153,687 52,417,242 50,785,628
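
As a small illustration of how the table above can be used, the following Python sketch computes the change in chromosome length between two Wm82 assemblies. Only the first three rows of the table are copied in, to keep the example short; the dictionary `LENGTHS` and the function `length_change` are hypothetical names introduced here, and the values are assumed to be in base pairs.

```python
# Chromosome lengths copied from the first three rows of the table
# (Wm82 columns only); values are assumed to be base pairs.
LENGTHS = {
    "Gm01": {"Wm82.a1": 55_915_596, "Wm82.a2": 56_831_625, "Wm82.a4": 57_932_356},
    "Gm02": {"Wm82.a1": 51_656_714, "Wm82.a2": 48_577_506, "Wm82.a4": 50_400_359},
    "Gm03": {"Wm82.a1": 47_781_077, "Wm82.a2": 45_779_782, "Wm82.a4": 46_951_867},
}


def length_change(chrom: str, old: str, new: str) -> int:
    """Return length(new) - length(old) for one chromosome."""
    return LENGTHS[chrom][new] - LENGTHS[chrom][old]


for chrom in LENGTHS:
    delta = length_change(chrom, "Wm82.a2", "Wm82.a4")
    print(f"{chrom}: {delta:+,}")
```

A positive value means the chromosome is longer in the newer assembly. A difference in length alone does not say whether sequence was added, removed, or rearranged.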