1000 genomes genome downloader | Dassault Systèmes®

PG 2015-05-08

Here you will find a component that may save you some time. It is designed to retrieve full sequences for a sample ID from the 1000 Genomes Project (http://www.1000genomes.org/). A couple of examples showing how to use it are also included.

The component accepts a sampleID and downloads one or more of the following features for that sampleID:

sequence_read
alignment
exome_alignment

If available for the selected dataset, the component will retrieve:

high coverage
low coverage
exon targeted
exome
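As a rough illustration of how a sampleID and a set of data types could be matched against an index file, here is a minimal Python sketch. The column names (FILE, SAMPLE, GROUP), the function name select_files, and the demo rows are all hypothetical; the real index layout and the component's internals are not described in this post.

```python
import csv
import io


def select_files(index_text, sample_id, groups):
    """Return file paths for one sample, limited to the requested groups.

    Assumes a tab-separated index with hypothetical columns
    FILE, SAMPLE and GROUP; the real index layout may differ.
    """
    reader = csv.DictReader(io.StringIO(index_text), delimiter="\t")
    wanted = {group.lower() for group in groups}
    return [
        row["FILE"]
        for row in reader
        if row["SAMPLE"] == sample_id and row["GROUP"].lower() in wanted
    ]


# Made-up index content, for illustration only.
DEMO_INDEX = (
    "FILE\tSAMPLE\tGROUP\n"
    "data/S1/a.fastq.gz\tS1\tlow coverage\n"
    "data/S1/b.fastq.gz\tS1\texome\n"
    "data/S2/c.fastq.gz\tS2\texome\n"
)

exome_files = select_files(DEMO_INDEX, "S1", ["exome"])
```

Passing several groups, for example ["low coverage", "exome"], returns every matching file for the sample in index order, and an unknown sample simply yields an empty list.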

The component also lets you select the server closest to you:

ftp.1000genomes.ebi.ac.uk
ftp.ncbi.nlm.nih.gov
Custom
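The server choice can be pictured as a simple lookup with a fallback for a user-supplied host. The sketch below is only an assumption about how such a selection might work: the labels "EBI" and "NCBI" and the function resolve_host are hypothetical names, and only the two host names come from the list above.

```python
# Host names taken from the list above; the short labels are hypothetical.
SERVERS = {
    "EBI": "ftp.1000genomes.ebi.ac.uk",
    "NCBI": "ftp.ncbi.nlm.nih.gov",
}


def resolve_host(choice, custom_host=None):
    """Map a server choice to a host name.

    "Custom" requires the caller to supply the host explicitly.
    """
    if choice == "Custom":
        if not custom_host:
            raise ValueError("a custom server needs a host name")
        return custom_host
    if choice not in SERVERS:
        raise ValueError(f"unknown server choice: {choice!r}")
    return SERVERS[choice]


host = resolve_host("EBI")
```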

After the information is retrieved, you can use the downloaded files to start your NGS experiment, as in the example shown below.

The first time you use the component on your system, you may notice a delay before the files start to arrive, although it is negligible compared with the size of the files being retrieved. The delay is caused by the index file being downloaded to your system. Once the index of all available 1000 Genomes samples is on your server, you can reuse it (the default option) or force the retrieval of a fresh index file.
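The reuse-or-refresh behaviour of the index can be sketched in a few lines of Python. This is not the component's actual implementation: the function load_index, the force_refresh flag, and the stand-in fetcher are hypothetical, and the real download step is replaced by a callable so the example runs without network access.

```python
import os
import tempfile


def load_index(cache_path, fetch, force_refresh=False):
    """Return the index text, reusing a cached copy by default.

    fetch is any zero-argument callable that returns the index as text;
    it is only called when no cached copy exists or a refresh is forced.
    """
    if not force_refresh and os.path.exists(cache_path):
        with open(cache_path, encoding="utf-8") as handle:
            return handle.read()
    text = fetch()
    with open(cache_path, "w", encoding="utf-8") as handle:
        handle.write(text)
    return text


# Stand-in for the real download, so the sketch needs no network access.
calls = []


def fake_fetch():
    calls.append(1)
    return f"index v{len(calls)}"


with tempfile.TemporaryDirectory() as workdir:
    path = os.path.join(workdir, "samples.index")
    first = load_index(path, fake_fetch)    # downloads and caches
    second = load_index(path, fake_fetch)   # reuses the cached copy
    third = load_index(path, fake_fetch, force_refresh=True)  # downloads again
```

Only the first and third calls reach the fetcher, which mirrors the one-time delay described above and the option to force a fresh index.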

Any suggestions or improvements are welcome.

Pedro