1000 Genomes Genome Downloader

Here you will find a component that may save you some time. It is designed to retrieve full sequences for a sample ID from the 1000 Genomes Project (http://www.1000genomes.org/), along with a couple of examples of how to use it.

1000Genomes Genome Downloader.PNG

1000Genomes Genome Downloader (Bam file).PNG

The component accepts a sample ID and downloads one or more of the following features for that sample ID:

  • sequence_read
  • alignment
  • exome_alignment

If available for the selected dataset, the component will retrieve:

  • high coverage
  • low coverage
  • exon targeted
  • exome
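To make the selection logic concrete, here is a minimal Python sketch of filtering an index by sample ID, feature and coverage type. It is not the component's actual implementation. The four-column tab-separated layout, the column names, the file paths and the sample IDs are all hypothetical; the real 1000 Genomes index files have more columns and different names.

```python
import csv
import io

# Hypothetical, simplified index (tab-separated): path, sample, feature, coverage.
# Paths and sample IDs are made up for illustration only.
SAMPLE_INDEX = (
    "path\tsample\tfeature\tcoverage\n"
    "data/NA00001/a.fastq.gz\tNA00001\tsequence_read\tlow coverage\n"
    "data/NA00001/b.bam\tNA00001\talignment\tlow coverage\n"
    "data/NA00001/c.bam\tNA00001\texome_alignment\texome\n"
    "data/NA00002/d.bam\tNA00002\talignment\thigh coverage\n"
)


def select_files(index_text, sample_id, features, coverages=None):
    """Return the paths for one sample, restricted to the requested features.

    coverages=None means "take whatever coverage types are available".
    """
    reader = csv.DictReader(io.StringIO(index_text), delimiter="\t")
    wanted = set(features)
    selected = []
    for row in reader:
        # Skip rows for other samples or for features that were not requested.
        if row["sample"] != sample_id or row["feature"] not in wanted:
            continue
        # Apply the optional coverage filter.
        if coverages is not None and row["coverage"] not in coverages:
            continue
        selected.append(row["path"])
    return selected


paths = select_files(SAMPLE_INDEX, "NA00001", ["alignment", "exome_alignment"])
print(paths)  # ['data/NA00001/b.bam', 'data/NA00001/c.bam']
```

Passing a set such as {"exome"} as coverages narrows the result to that coverage type only.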

The component also helps you select the server closest to you:

  • ftp.1000genomes.ebi.ac.uk
  • ftp.ncbi.nlm.nih.gov
  • Custom
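If you are not sure which server is closest, one rough way to find out is to time how long it takes to open a connection to each one. The Python sketch below does that. It is only an illustration and not something the component does for you; the function names, the use of the FTP port (21) and the 5 second timeout are my own assumptions.

```python
import socket
import time

# The two public hosts listed above; add your own host for the "Custom" case.
HOSTS = ["ftp.1000genomes.ebi.ac.uk", "ftp.ncbi.nlm.nih.gov"]


def connect_time(host, port=21, timeout=5.0):
    """Seconds needed to open a TCP connection, or infinity if it fails.

    Port 21 (FTP) and the timeout value are assumptions for this sketch.
    """
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return time.monotonic() - start
    except OSError:
        # Unreachable or unresolvable hosts sort last.
        return float("inf")


def closest_server(hosts, probe=connect_time):
    """Return the host with the smallest probe value."""
    return min(hosts, key=probe)


# Usage (opens real network connections, so it is left commented out):
# print(closest_server(HOSTS))
```

Connection time is only a proxy for download speed, so treat the result as a hint rather than a guarantee.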

After the files are retrieved, you can use them to start your NGS experiment, as in the example shown below.

The first time you use the component on your system, you may notice a delay before the files start downloading, although it is small compared with the time needed for the files themselves. The delay comes from the index file being downloaded to your system. Once the index of all available 1000 Genomes samples is stored locally, you can reuse it (the default option) or force the retrieval of a fresh index file.
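The reuse-or-refresh behaviour can be pictured with a small Python sketch. This is an assumption about the general pattern, not the component's real code: load_index, fetch and force_refresh are hypothetical names, and fake_fetch stands in for the actual download of the index file.

```python
import tempfile
from pathlib import Path


def load_index(cache_path, fetch, force_refresh=False):
    """Return the index text, calling fetch() only when needed.

    fetch is any callable returning the index as a string (for example,
    a function that downloads it from the chosen server).
    """
    cache_path = Path(cache_path)
    if force_refresh or not cache_path.exists():
        cache_path.parent.mkdir(parents=True, exist_ok=True)
        cache_path.write_text(fetch(), encoding="utf-8")
    return cache_path.read_text(encoding="utf-8")


# Demonstration with a stand-in for the real download.
calls = []


def fake_fetch():
    calls.append(1)  # count how many times a "download" happens
    return "index contents\n"


with tempfile.TemporaryDirectory() as tmp:
    cache = Path(tmp) / "cache" / "sample.index"
    load_index(cache, fake_fetch)                      # first use: downloads
    load_index(cache, fake_fetch)                      # default: reuses the cached copy
    load_index(cache, fake_fetch, force_refresh=True)  # forced: downloads again

print(len(calls))  # 2
```

Only the first call and the forced call reach the server, which is why the delay shows up once and then disappears.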

Any suggestions or improvements are welcome.

Pedro