zach charlop-powers

Sequence Server

Tue Apr 7, 2015

Sequenceserver is a web server that you can run locally to get a slick interface to BLAST for any sequence data you may have. Any lab with a fair amount of sequence data will find this tool useful. Here are a few notes on our setup.

#We're using a local machine with RHEL 6.4. Make sure blast and ruby are installed.
yum install ncbi-blast
yum install ruby           # probably already installed
gem install sequenceserver # Boom! simple as that

Sequenceserver’s config file, ~/.sequenceserver.conf, lets you specify the folder that holds your blast databases, the port it is served on, and the number of threads/cpus you make available to blast. I put the sequenceserver folder in its own directory and link all of the blast databases to it. That way I can keep big NCBI databases like NT and NR in their own folders. I can also keep lab-generated data organized in their own folders with their own documentation.

#example directory structure
├── blastdbs
│   ├── db1
│   ├── db2
│   ├── ncbinr
│   └── ncbint
└── sequenceserver
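
As a minimal sketch, the layout above could be created as follows. The `BLAST_ROOT` variable is an assumption for illustration (it falls back to a temporary directory here); point it at wherever your lab actually keeps its data.

```shell
# Hypothetical root directory; replace with your own data location.
BLAST_ROOT="${BLAST_ROOT:-$(mktemp -d)}"

# One folder per database collection.
for db in db1 db2 ncbinr ncbint; do
    mkdir -p "$BLAST_ROOT/blastdbs/$db"
done

# The folder that sequenceserver will be pointed at.
mkdir -p "$BLAST_ROOT/sequenceserver"

# Show what was created.
ls "$BLAST_ROOT" "$BLAST_ROOT/blastdbs"
```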

Now if I have a file labdata.fna in the db1 folder, I can generate a blast database with the following commands. The -title option will be visible in the sequenceserver interface, so keeping it short but descriptive is a good idea. -parse_seqids is necessary for local databases if you want to recover the sequences from the web interface. Also, do not start your fasta names with numbers.

#add some documentation
echo "data about these files in here" >

#make the blast database
makeblastdb -dbtype nucl \
            -title "This is Contig Data from Awesome Project XX generated April 2015" \
            -parse_seqids \
            -in labdata.fna 

#link it to the sequence server folder
cp -l labdata.fna* ../../sequenceserver/
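
The advice about fasta names can be checked before the database is built. The sketch below is one possible way to do it; `check_fasta_names` is a hypothetical helper name, and the demo file only stands in for labdata.fna.

```shell
# Hypothetical helper: list FASTA headers whose name begins with a digit.
check_fasta_names() {
    # Prints "line:header" for each offending record.
    # grep exits non-zero when no such header is found.
    grep -n '^>[0-9]' "$1"
}

# Small demo file standing in for labdata.fna.
demo=$(mktemp)
printf '>contig_1\nACGT\n>2nd_contig\nGGCC\n' > "$demo"

if bad=$(check_fasta_names "$demo"); then
    echo "rename these records before running makeblastdb:"
    echo "$bad"
fi
```

On a real file you would call `check_fasta_names labdata.fna` and rename any records it reports.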

#resulting file structure
├── blastdbs
│   ├── db1
│   │   ├── labdata.fna
│   │   ├── labdata.fna.nhr
│   │   ├── labdata.fna.nin
│   │   ├── labdata.fna.nog
│   │   ├── labdata.fna.nsd
│   │   ├── labdata.fna.nsi
│   │   ├── labdata.fna.nsq
│   │   └──
│   ├── db2
│   ├── ncbinr
│   └── ncbint
└── sequenceserver #these are all links
    ├── labdata.fna
    ├── labdata.fna.nhr
    ├── labdata.fna.nin
    ├── labdata.fna.nog
    ├── labdata.fna.nsd
    ├── labdata.fna.nsi
    └── labdata.fna.nsq
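
Because `cp -l` creates hard links rather than copies, both paths point at the same data on disk, which also means both folders have to live on the same filesystem. A small sketch for confirming this, using throwaway files in a temporary directory and a hypothetical `inode_of` helper:

```shell
# Throwaway layout that mimics the two folders above.
work=$(mktemp -d)
mkdir -p "$work/blastdbs/db1" "$work/sequenceserver"
echo ">contig_1" > "$work/blastdbs/db1/labdata.fna"

# ln without -s makes a hard link, the same result cp -l gives for a single file.
ln "$work/blastdbs/db1/labdata.fna" "$work/sequenceserver/labdata.fna"

# Hypothetical helper: print the inode number of a file.
inode_of() {
    ls -i "$1" | awk '{print $1}'
}

# Hard-linked paths share one inode.
if [ "$(inode_of "$work/blastdbs/db1/labdata.fna")" = "$(inode_of "$work/sequenceserver/labdata.fna")" ]; then
    linked=yes
else
    linked=no
fi
echo "hard linked: $linked"
```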

Lastly, because this runs on a server shared across the lab/campus, I usually launch it by ssh-ing into the machine and starting sequenceserver remotely. As anyone who has had a process interrupted when the ssh connection breaks can attest, you need a way to let the process continue running after you log out. I have been using screen for this purpose with a workflow as follows.

ssh username@localserver

#install screen (if not installed)
yum install screen

#activate screen

#turn on sequence server
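
One possible way to carry out these last two steps is sketched below. It is an assumption about the workflow, not a record of the exact commands used here: `start_sequenceserver` is a hypothetical helper, and the session name `sequenceserver` is arbitrary. The interactive alternative is to run `screen`, start `sequenceserver` inside it, and detach with Ctrl-a followed by d.

```shell
# Hypothetical helper: start sequenceserver inside a detached, named screen session.
start_sequenceserver() {
    # -d -m starts the session already detached; -S gives it a name to find it by later.
    screen -dmS sequenceserver sequenceserver
}

# On the server you would then run:
#   start_sequenceserver
#   screen -ls                  # list running sessions
#   screen -r sequenceserver    # reattach; press Ctrl-a then d to detach again
```

The session keeps running after you log out of ssh, so the web interface stays up.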


All content copyright 2014 zach charlop-powers unless otherwise noted. Licensed under Creative Commons.
