Here you will find a component that may save you some time. It is designed to retrieve full sequences for a sample ID from the 1000 Genomes Project (http://www.1000genomes.org/). You will also find a couple of examples of how to use it.
The component accepts a sample ID and downloads one or more of the following features for that sample ID:
- sequence_read
- alignment
- exome_alignment
If available for the selected dataset, the component will retrieve:
- high coverage
- low coverage
- exon targeted
- exome
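As a rough illustration of the selection described above, the sketch below filters a tab-separated index by sample ID and, optionally, by coverage type. This is not the component's own code. The column names (FILE, SAMPLE_NAME, ANALYSIS_GROUP), the file paths in the sample rows and the function name select_files are all assumptions made for the example, and the real index file may be laid out differently.

```python
import csv
import io

# Hypothetical index excerpt: column names and paths are illustrative only.
SAMPLE_INDEX = (
    "FILE\tSAMPLE_NAME\tANALYSIS_GROUP\n"
    "data/HG00096/sequence_read/run1.fastq.gz\tHG00096\tlow coverage\n"
    "data/HG00096/sequence_read/run2.fastq.gz\tHG00096\texome\n"
    "data/HG00097/sequence_read/run3.fastq.gz\tHG00097\tlow coverage\n"
)


def select_files(index_text, sample_id, analysis_group=None):
    """Return the file paths listed for one sample.

    When analysis_group is given (for example "exome"), only rows
    of that coverage type are kept.
    """
    reader = csv.DictReader(io.StringIO(index_text), delimiter="\t")
    selected = []
    for row in reader:
        if row["SAMPLE_NAME"] != sample_id:
            continue
        if analysis_group is not None and row["ANALYSIS_GROUP"] != analysis_group:
            continue
        selected.append(row["FILE"])
    return selected


all_files = select_files(SAMPLE_INDEX, "HG00096")
exome_files = select_files(SAMPLE_INDEX, "HG00096", analysis_group="exome")
```

Here all_files holds both entries for the sample, while exome_files keeps only the exome one. A sample ID that does not appear in the index gives an empty list.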
The component also lets you select the server closest to you:
- ftp.1000genomes.ebi.ac.uk
- ftp.ncbi.nlm.nih.gov
- Custom
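The server choice above could be modelled along these lines. It is only a sketch: the keys "ebi", "ncbi" and "custom", the ftp:// scheme and the function name resolve_server are assumptions for the example, not the component's actual option names. The custom host in the usage lines is a placeholder.

```python
# Host names come from the list above; keys and URL scheme are assumptions.
SERVERS = {
    "ebi": "ftp://ftp.1000genomes.ebi.ac.uk",
    "ncbi": "ftp://ftp.ncbi.nlm.nih.gov",
}


def resolve_server(choice, custom_url=None):
    """Map a server choice to a base URL; "custom" requires custom_url."""
    if choice == "custom":
        if not custom_url:
            raise ValueError("a custom server needs a URL")
        # Drop any trailing slash so paths can be appended consistently.
        return custom_url.rstrip("/")
    try:
        return SERVERS[choice]
    except KeyError:
        raise ValueError(f"unknown server choice: {choice!r}") from None


ebi_base = resolve_server("ebi")
mirror_base = resolve_server("custom", custom_url="ftp://mirror.example.org/1000genomes/")
```

Keeping the mirrors in a single dictionary means a new mirror can be added without changing the lookup logic.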
After the information is retrieved, you can use the downloaded files to start your NGS experiment, as in the example shown below.
The first time you use the component on your system, you may notice a delay before the files are retrieved, although it is negligible compared with the size of the files themselves. That delay is due to the index file being downloaded to your system. Once the index of all available 1000 Genomes samples is on your server, you can reuse it (the default option) or force the retrieval of a fresh index file.
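The reuse-or-refresh behaviour of the index can be sketched as follows. This is a minimal illustration under stated assumptions, not the component's implementation: load_index, fetch_index and force_refresh are hypothetical names, and a stand-in function replaces the real download so the example runs without network access.

```python
import os
import tempfile


def load_index(cache_path, fetch_index, force_refresh=False):
    """Return the index text, downloading it only when needed.

    fetch_index is any callable that returns the index as a string;
    it is called when no cached copy exists or a refresh is forced.
    """
    if force_refresh or not os.path.exists(cache_path):
        text = fetch_index()
        with open(cache_path, "w", encoding="utf-8") as handle:
            handle.write(text)
        return text
    with open(cache_path, encoding="utf-8") as handle:
        return handle.read()


# Stand-in downloader that records how often it is called.
calls = []


def fake_fetch():
    calls.append(1)
    return "FILE\tSAMPLE_NAME\n"


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "index.tsv")
    load_index(path, fake_fetch)                      # first use: downloads
    load_index(path, fake_fetch)                      # default: reuses the cached copy
    load_index(path, fake_fetch, force_refresh=True)  # forced: downloads again
    download_count = len(calls)
```

Only the first call and the forced call reach the downloader, which matches the one-off delay described above.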
Any suggestions or improvements are welcome.
Pedro