Created
January 7, 2016 20:34
konrad created this gist
Jan 7, 2016
# Problem: You have an NCBI GEO accession and would like to get the URL of the SRA file that contains the sequencing data.
# The sed command that removes the last character of the string is essential, as there is an invisible character that
# otherwise breaks the downstream steps.

GEO_ACCESSION="GSM1655353" # set your GEO accession here
# Fetch the brief text record, keep the line with the SRA FTP host, drop the first 31 characters, then the trailing character.
SRA_FTP_URL=$(curl "http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=${GEO_ACCESSION}&targ=self&form=text&view=brief" 2>/dev/null | grep ftp-trace.ncbi.nlm.nih.gov | cut -c 32- | sed 's/.$//')
# List the FTP directory to find the subfolder, then list the subfolder to find the SRA file.
FTP_SUB_FOLDER=$(ncftpls "${SRA_FTP_URL}/")
SRA_FILE=$(ncftpls "${SRA_FTP_URL}/${FTP_SUB_FOLDER}/")
echo "$GEO_ACCESSION" "${SRA_FTP_URL}/${FTP_SUB_FOLDER}/${SRA_FILE}"
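The extraction step in the script depends on a fixed column offset (`cut -c 32-`) and on blindly deleting the last character. A minimal sketch of a less position-dependent variant follows. It assumes the invisible character is a carriage return and that the matching line has the form `key = ftp://...`; neither is confirmed by the gist. The helper name `extract_sra_url` and the sample record are hypothetical and only serve as illustration.

```shell
# Sketch only: read GEO "brief" text on stdin and print the first SRA FTP URL.
# extract_sra_url is a hypothetical helper name, not part of the original script.
# Steps: delete carriage returns (assumed to be the invisible character),
# keep lines naming the SRA FTP host, strip everything up to "= ",
# and keep the first match in case several lines qualify.
extract_sra_url() {
    tr -d '\r' \
        | grep -F 'ftp-trace.ncbi.nlm.nih.gov' \
        | sed 's/^[^=]*= *//' \
        | head -n 1
}

# Made-up sample record; the path is illustrative and not a real SRA location.
printf '%s\r\n' '!Sample_supplementary_file_1 = ftp://ftp-trace.ncbi.nlm.nih.gov/sra/example/SRX000000' \
    | extract_sra_url
```

In the script, the `grep | cut | sed` part of the pipeline could be swapped for this function, so that a change in the key name's length or a missing trailing character would not silently corrupt the URL.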