PGDSpider version 2.0.7.2 (October 2014)


An automated data conversion tool for connecting population genetics and genomics programs


Introduction

System requirements

Download and Installation Instructions

Formats supported by PGDSpider

Help

Screenshot

Links

How to cite PGDSpider and License

Contact and bug report



Introduction

PGDSpider is a powerful automated data conversion tool for population genetics and genomics programs. It facilitates data exchange between programs for a vast range of data types (e.g. DNA, RNA, NGS, microsatellite, SNP, RFLP, AFLP, multi-allelic data, allele frequency or genetic distances). Besides the conventional population genetics formats, PGDSpider integrates population genomics data formats commonly used to store and handle next-generation sequencing (NGS) data. Currently, PGDSpider is not meant to convert very large NGS files, because it loads the whole input file into memory and the file size may exceed the available RAM. However, since PGDSpider can convert specific subsets of these NGS files into any other format, this feature can be used to calculate parameters or statistics for specific regions, and thus to perform sliding-window analyses over large genomic regions.
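The sliding-window idea can be sketched as a small helper that produces the coordinates of successive windows, each of which could then be converted as its own subset. This is only an illustration: the function name, the 1-based inclusive coordinates and the window and step parameters are assumptions, and how a subset is actually selected in PGDSpider is not described here.

```python
def sliding_windows(region_start, region_end, window_size, step):
    """Yield (start, end) windows covering region_start..region_end.

    Coordinates are assumed to be 1-based and inclusive (an assumption
    made for this sketch). The last window is truncated at region_end.
    """
    if window_size <= 0 or step <= 0:
        raise ValueError("window_size and step must be positive")
    start = region_start
    while start <= region_end:
        end = min(start + window_size - 1, region_end)
        yield (start, end)
        if end == region_end:
            break  # the region is fully covered
        start += step


# Hypothetical usage: overlapping windows of 4 positions, moved by 2.
for window_start, window_end in sliding_windows(1, 10, 4, 2):
    print(window_start, window_end)
```

Each window is small enough to be converted separately, so the whole NGS file never has to be handled in one conversion.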

PGDSpider uses a newly developed PGD (Population Genetics Data) format as an intermediate step in the conversion process. PGD is a file format designed to store various kinds of population genetics data, including different data types (e.g. DNA sequences, microsatellites, AFLP or SNPs) and ploidy levels. PGD is based on the XML format and is therefore independent of any particular computer system and extensible for future needs. PGDSpider uses PGD to connect population genetics and genomics programs like a spider knits a web.

PGDSpider is written in Java and is therefore platform independent. It is user friendly due to its intuitive graphical user interface. PGDSpider allows users to store their preferred conversion settings for repeated conversions of similar input formats. A command line version of PGDSpider is also provided, making it possible to embed PGDSpider in data analysis pipelines.
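As a rough sketch of how the command line version might be embedded in a pipeline, the helper below assembles a command and hands it to the operating system. The jar name and the option names (`-inputfile`, `-inputformat`, `-outputfile`, `-outputformat`, `-spid`) are not given on this page; they are assumptions that should be checked against the PGDSpider manual, as should the exact spelling of the format names. Only `-Xmx` and `-jar` are standard Java launcher options.

```python
import subprocess


def build_pgdspider_command(jar_path, input_file, input_format,
                            output_file, output_format,
                            spid_file=None, max_heap_mb=1024):
    """Assemble the argument list for one conversion.

    The PGDSpider option names used here are assumptions (see the text
    above); -Xmx is the standard JVM option for the maximum heap size.
    """
    command = [
        "java", f"-Xmx{max_heap_mb}m", "-jar", jar_path,
        "-inputfile", input_file, "-inputformat", input_format,
        "-outputfile", output_file, "-outputformat", output_format,
    ]
    if spid_file is not None:
        # Stored conversion settings, assumed to be passed with -spid.
        command += ["-spid", spid_file]
    return command


def run_conversion(command):
    """Run the conversion and raise CalledProcessError if it fails."""
    subprocess.run(command, check=True)


# Hypothetical file names, for illustration only.
example = build_pgdspider_command(
    "PGDSpider2-cli.jar", "sample.vcf", "VCF",
    "sample.gen", "GENEPOP", spid_file="vcf_to_genepop.spid")
print(" ".join(example))
```

Building the command as a list rather than a single string avoids shell quoting problems when file names contain spaces.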



System requirements

PGDSpider is written in Java and is therefore platform independent, but the Sun Java 1.6 Runtime Environment (Java6 RE) or a newer version has to be installed. Java6 RE can be downloaded from the following link:

http://www.oracle.com/technetwork/java/javase/downloads/index.html
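One way to check this requirement from a script is to parse the output of `java -version`. The sketch below is not part of PGDSpider and the function names are made up for illustration. It assumes the usual output format, in which old releases report `1.6.0_45` (Java 6) and newer ones report `11.0.2` (Java 11), and it relies on the launcher printing the version to standard error.

```python
import re
import subprocess


def java_feature_version(version_output):
    """Return the Java feature version (6, 8, 11, ...) or None.

    Handles both the old "1.x" scheme and the newer "x.y.z" scheme.
    """
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        return None
    first = int(match.group(1))
    second = int(match.group(2)) if match.group(2) else 0
    return second if first == 1 else first


def installed_java_feature_version():
    """Query the local java launcher; None if java is not on the PATH."""
    try:
        result = subprocess.run(["java", "-version"],
                                capture_output=True, text=True)
    except FileNotFoundError:
        return None
    # The java launcher writes its version banner to standard error.
    return java_feature_version(result.stderr)


def meets_requirement(feature_version, minimum=6):
    """True if the detected version satisfies the Java 1.6 requirement."""
    return feature_version is not None and feature_version >= minimum
```

A pipeline could call `meets_requirement(installed_java_feature_version())` before starting a conversion and stop with a clear message if it returns `False`.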



Download and Installation Instructions


1st step:

Install the Java6 RE


Windows:

Linux:

Mac:

2nd step:

Download the PGDSpider application and unzip it on the local drive: PGDSpider (2.0.7.2) - Download


Execute PGDSpider GUI:

Execute PGDSpider-cli (command line):

Java Web Start:

Additionally, PGDSpider can be downloaded and run from the web via Java Web Start. Java Web Start provides an easy, one-click activation of PGDSpider and guarantees that you are always running the latest version.
Java Web Start is included in the Java Runtime Environment. See the 1st step of the installation instructions for information on how to obtain Java6 RE (or a newer version).

Launch PGDSpider using Java Web Start:

Limitations: When PGDSpider is started via Java Web Start, it is not possible to change the amount of memory PGDSpider is allowed to use (by default it is set to 1 GB).



Formats supported by PGDSpider:

PGDSpider is able to parse 31 and to write 34 different file formats:

  Data format | Version | References and Links | External dependency | Input format | Output format
  PGD | 1.0 | Lischer and Excoffier, 2012 | - | x | x
  Arlequin | 3.5 (24.2.2010) | Excoffier and Lischer, 2010 | - | x | x
  BAM | (17.4.2011) | Li et al., 2009 | SAMtools, BCFtools | x | x
  BAMOVA | 1.02 (27.9.2011) | Gompert and Buerkle, 2011; Gompert et al., 2010 | - | - | x
  BAPS | 5.4 (29.4.2010) | Tang et al., 2009 | - | x | x
  BATWING | (2003) | Wilson et al., 2003 | - | x | x
  BCF | (14.5.2011) | - | SAMtools, BCFtools | x | x
  CONVERT | 1.31 (March 2005) | Glaubitz, 2004 | - | x | -
  EIGENSOFT | 5.0.2 (April 2014) | Patterson et al., 2006; Price et al., 2006 | - | x | x
  FASTA | - | Pearson, 1990 | - | x | x
  FASTQ | - | Cock et al., 2010 | - | x | x
  FDist2 (datacal) | - | Beaumont and Nichols, 1996; Flint et al., 1999 | - | - | x
  FSTAT | 2.9.3.2 (February 2002) | Goudet, 2001 | - | x | x
  GDA | 1.1 (7.1.2002) | Lewis, 2001 | - | x | x
  GENELAND | (12.04.2011) | Guillot, 2008; Guillot et al., 2005; Guillot and Santos, 2009; Guillot and Santos, 2010; Guillot et al., 2008 | - | x | x
  GENEPOP | 4.1 (24.03.2011) | Rousset, 2008 | - | x | x
  GENETIX | 4.05 (5.5.2004) | Belkhir, 1996-2004 | - | x | x
  GESTE / BayeScan | GESTE: 2.0; BayeScan: 2.01 (December 2010) | GESTE: Foll and Gaggiotti, 2006; BayeScan: Fischer et al., 2011; Foll et al., 2010; Foll and Gaggiotti, 2008 | - | - | x
  HGDP | Stanford | - | - | x | -
  HGDP-CEPH (Arlequin + log file) | 3.0 | - | - | x | -
  Immanc (BayesAss) | 5.0 (8.10.1998) | Rannala and Mountain, 1997; Wilson and Rannala, 2003 | - | x | x
  IM (IMa) | (17.12.2009) | Hey and Nielsen, 2004, 2007; Nielsen and Wakeley, 2001 | - | x | x
  IMa2 | (26.08.2011) | Hey, 2010 | - | x | x
  KML | 2.2 | Google, 2009 | - | - | x
  MEGA | 5 (24.4.2011) | Tamura et al., 2011 | - | x | x
  MIGRATE | 3.2.6 (13.10.2010) | Beerli, 2009 | - | x | x
  MSA | 4.05 | Dieringer and Schlotterer, 2003 | - | x | x
  MSVar | 0.4.1.b (7.4.1999) | Beaumont, 1999 | - | - | x
  NewHybrids | 1.1 beta (7.4.2003) | Anderson and Thompson, 2002 | - | x | x
  NEXUS | - | Maddison et al., 1997 (able to read CharSet definitions within a MrBayes block) | - | x | x
  ONeSAMP | 1.2 | Tallmon et al., 2008 | - | x | x
  PED | - | Purcell et al., 2007 | - | x | x
  PHYLIP (RAxML) | 3.69 (September 2009) | Felsenstein, 1989; Felsenstein, 2004 | - | x | x
  SAM | 1.4 (17.4.2011) | Li et al., 2009 | SAMtools, BCFtools | x | x
  Structurama | - | Huelsenbeck et al., 2011 | - | - | x
  STRUCTURE (fastSTRUCTURE) | 2.3.4 (July 2012) | Falush et al., 2003; Falush et al., 2007; Pritchard et al., 2000; Hubisz et al., 2009; Raj et al., 2014 | - | x | x
  VCF | 4.1 (2.8.2012) | (without structural variants; only SNPs and INDELs) | SAMtools, BCFtools | x | x

Note that PGDSpider is currently not meant to convert large NGS files, because it loads the whole input file into memory, which may lead to memory issues. However, PGDSpider can convert specific subsets of these NGS files into any other format, and this approach can be used to perform sliding-window analyses on large NGS files.
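Before starting a conversion, a pipeline could make a rough guess as to whether an input file is likely to fit into the memory given to Java. The sketch below is a heuristic only: the function names are hypothetical, and the `overhead_factor` (how much larger the data is assumed to be in memory than on disk) is an assumed placeholder value, not a figure measured for PGDSpider.

```python
import os


def exceeds_heap_budget(file_size_bytes, heap_mb=1024, overhead_factor=3.0):
    """Guess whether a file is too large for the given Java heap.

    heap_mb is the maximum heap size in megabytes. overhead_factor is
    an assumed ratio between in-memory size and on-disk size; it is a
    placeholder and should be tuned from experience.
    """
    heap_bytes = heap_mb * 1024 * 1024
    return file_size_bytes * overhead_factor > heap_bytes


def should_split_before_conversion(path, heap_mb=1024, overhead_factor=3.0):
    """Apply the same guess to a file on disk."""
    return exceeds_heap_budget(os.path.getsize(path), heap_mb,
                               overhead_factor)
```

If the check returns `True`, converting the file in smaller subsets is the safer route.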


Help

If you have any problems:


Screenshot

PGDSpider GUI:


SPID Editor:

 


How to cite PGDSpider and License

Lischer HEL and Excoffier L (2012) PGDSpider: An automated data conversion tool for connecting population genetics and genomics programs. Bioinformatics 28: 298-299.


Copyright (c) 2007-2014, Heidi E.L. Lischer. All rights reserved.

PGDSpider is distributed under the BSD 3-Clause License. For the full text of the license, see the file LICENSE.txt.
By using, modifying or distributing this software you agree to be bound by the terms of this license.

 


Contact and bug report

If you find a bug, please send me an e-mail with a short description of the bug and the input and output file formats. If possible, also attach the input file that caused the problem.

PGDSpider is an ongoing project. For any comments or suggestions of further file formats, please send me an e-mail.


e-mail: heidi.lischer(at)iee.unibe.ch


Heidi Lischer and Laurent Excoffier

Computational and Molecular Population Genetics lab (CMPG)
Institute of Ecology and Evolution (IEE)
University of Berne
3012 Bern
Switzerland

members of the Swiss Institute of Bioinformatics (SIB)


10.10.2014