Tue Apr 7, 2015
Sequenceserver is a web server that runs locally and provides a slick interface to BLAST for any sequence data you may have. Any lab with a fair amount of sequence data will find it useful. Here are a few notes on our setup.
    # We're using a local machine with RHEL 6.4. Make sure blast and ruby are installed.
    yum install ncbi-blast
    yum install ruby # probably already installed
    gem install sequenceserver # Boom! simple as that
Sequenceserver’s config file.
~/.sequenceserver.conf allows you to specify the folder with your blast databases, change the port it is served on, and the number of threads/cpus you make available to blast. I put the sequence server folder in its own directory and link all of the blast databases to it. That way I can keep big NCBI databases like NT and NR in their own folders. I can also keep lab-generated data organized in their own folders with their own documentation.
    # example directory structure
    .
    ├── blastdbs
    │   ├── db1
    │   ├── db2
    │   ├── ncbinr
    │   └── ncbint
    └── sequenceserver
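The layout above can be created in one step. This is a minimal sketch that assumes the folder names from the example tree; `BASE` is a hypothetical variable (not part of the original setup) that defaults to the current directory.

```shell
# BASE is a hypothetical variable for this sketch; it defaults to the current directory.
BASE="${BASE:-$PWD}"

# One folder per database, plus the folder that sequenceserver will read from.
mkdir -p "$BASE/blastdbs/db1" \
         "$BASE/blastdbs/db2" \
         "$BASE/blastdbs/ncbinr" \
         "$BASE/blastdbs/ncbint" \
         "$BASE/sequenceserver"
```

Because `mkdir -p` does not complain about existing folders, the snippet is safe to rerun when a new database folder is added to the list.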
Now if I have a file labdata.fna in the db1 folder, I can generate a BLAST database with the following commands. The -title option will be visible in the sequenceserver interface, so keeping it short but descriptive is a good idea. -parse_seqids is necessary for local databases if you want to recover the sequences from the web interface. Also, do not start your FASTA names with numbers.
    # add some documentation
    echo "data about these files in here" > Readme.md

    # make the blast database
    makeblastdb -dbtype nucl \
      -title "This is Contig Data from Awesome Project XX generated April 2015" \
      -parse_seqids \
      -in labdata.fna

    # link it to the sequence server folder
    cp -l labdata.fna* ../../sequenceserver/

    # resulting file structure
    .
    ├── blastdbs
    │   ├── db1
    │   │   ├── labdata.fna
    │   │   ├── labdata.fna.nhr
    │   │   ├── labdata.fna.nin
    │   │   ├── labdata.fna.nog
    │   │   ├── labdata.fna.nsd
    │   │   ├── labdata.fna.nsi
    │   │   ├── labdata.fna.nsq
    │   │   └── Readme.md
    │   ├── db2
    │   ├── ncbinr
    │   └── ncbint
    └── sequenceserver # these are all links
        ├── labdata.fna
        ├── labdata.fna.nhr
        ├── labdata.fna.nin
        ├── labdata.fna.nog
        ├── labdata.fna.nsd
        ├── labdata.fna.nsi
        └── labdata.fna.nsq
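The warning about FASTA names that start with numbers is easy to check before building a database. Here is a rough sketch; the function name `check_fasta_ids` is made up for this example and is not part of BLAST or sequenceserver.

```shell
# Hypothetical helper: list FASTA headers whose name starts with a digit.
# Returns 0 when the file is clean, 1 when offending headers are found,
# and 2 when the file cannot be read.
check_fasta_ids() {
    if [ ! -r "$1" ]; then
        echo "Cannot read file: $1" >&2
        return 2
    fi
    # grep -n prints each matching header prefixed with its line number.
    if grep -n '^>[0-9]' "$1"; then
        echo "Rename the sequences listed above before running makeblastdb." >&2
        return 1
    fi
    return 0
}
```

It could be run as `check_fasta_ids labdata.fna` ahead of the makeblastdb step, so that a bad header is caught before the database is built and linked.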
Lastly, because this is run on the server across the lab/campus, I usually launch it by ssh-ing into the machine and activating sequenceserver remotely. As anyone who has had a process interrupted when the ssh connection breaks can attest, you need a way to let the process continue running after you log out. I have been using screen for this purpose with a workflow as follows.
    # login
    ssh username@localserver
    # install screen (if not installed)
    yum install screen
    # activate screen
    screen
    # turn on sequence server
    sequenceserver
    # profit
    http://localserver:4567
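A variation on this workflow is to start sequenceserver in a screen session that is already detached, so there is nothing to detach from before logging out. This is only a sketch: the wrapper function `start_sequenceserver` and the session name `seqserv` are hypothetical names picked for the example.

```shell
# Hypothetical wrapper: launch sequenceserver in a detached screen session.
# -d -m starts the session detached, -S gives it a name (default: seqserv).
start_sequenceserver() {
    screen -dmS "${1:-seqserv}" sequenceserver
}

# Typical use after ssh-ing into the machine:
#   start_sequenceserver      # launch in the background
#   screen -ls                # list running sessions
#   screen -r seqserv         # reattach; press Ctrl-a then d to detach again
```

Naming the session makes it easy to find and reattach to later, which helps when several people in the lab use screen on the same machine.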