Sequence Server
Sequenceserver is a web server that can be run locally and provides a slick interface to BLAST for any sequence data you may have. Any lab with a fair amount of sequence data will find this tool useful. Here are a few notes on our setup.
#We're using a local machine with RHEL 6.4. Make sure blast and ruby are installed.
yum install ncbi-blast
yum install ruby # probably already installed
gem install sequenceserver # Boom! simple as that
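Before going further, it can help to confirm that everything landed on the PATH. The following is a minimal sketch, assuming the BLAST package provides `makeblastdb` and `blastn` and that the gem installs a `sequenceserver` executable; `check_tool` is a helper name made up for this example.

```shell
# Report whether a tool is on the PATH.
# check_tool is a hypothetical helper, not part of blast or sequenceserver.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "found: $1"
    else
        echo "missing: $1"
        return 1
    fi
}

# Tools this setup relies on; "|| true" keeps the loop going after a miss.
for tool in ruby gem makeblastdb blastn sequenceserver; do
    check_tool "$tool" || true
done
```

Anything reported as missing needs to be installed, or added to the PATH, before sequenceserver will start.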
Sequenceserver’s config file, ~/.sequenceserver.conf, allows you to specify the folder with your blast databases, the port it is served on, and the number of threads/CPUs you make available to blast. I put the sequenceserver folder in its own directory and link all of the blast databases to it. That way I can keep big NCBI databases like NT and NR in their own folders. I can also keep lab-generated data organized in its own folders, each with its own documentation.
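As a rough illustration of those three settings, here is a sketch that writes an example config to a scratch file. The key names (`database`, `port`, `num_threads`) are an assumption based on older sequenceserver releases and may differ in your version, and the database path is hypothetical; compare against the example config that ships with your install before copying anything to ~/.sequenceserver.conf.

```shell
# Write an example config to a scratch file for review only.
conf_example="${TMPDIR:-/tmp}/sequenceserver.conf.example"

cat > "$conf_example" <<'EOF'
# ASSUMPTION: key names follow older sequenceserver releases; verify for your version.
# Folder holding the linked blast databases (hypothetical path).
database: /data/sequenceserver
# Port the web interface is served on.
port: 4567
# Threads/CPUs made available to blast.
num_threads: 4
EOF

cat "$conf_example"
```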
#example directory structure
.
├── blastdbs
│ ├── db1
│ ├── db2
│ ├── ncbinr
│ └── ncbint
└── sequenceserver
Now if I have a file labdata.fna in the db1 folder, I can generate a blast database with the following commands. The -title option will be visible in the sequenceserver interface, so keeping it short but descriptive is a good idea. -parse_seqids is necessary for local databases if you want to recover the sequences from the web interface. Also, do not start your fasta names with numbers.
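The advice about fasta names is easy to check before building the database. This is a small sketch, not part of blast itself: `check_fasta_names` is a made-up helper that lists header lines whose name begins with a digit, and the demo sequences are invented.

```shell
# Print fasta header lines whose name starts with a digit.
# Returns 0 if none are found, 1 otherwise.
check_fasta_names() {
    if grep -n '^>[0-9]' "$1"; then
        echo "rename the sequences above before running makeblastdb" >&2
        return 1
    fi
    return 0
}

# Demo on a tiny made-up file (sequence names are hypothetical).
demo_fasta="$(mktemp)"
printf '>contig1\nACGT\n>2contig\nGGCC\n' > "$demo_fasta"
check_fasta_names "$demo_fasta" || true
```

On real data this would be run as `check_fasta_names labdata.fna` from the db1 folder.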
#add some documentation
echo "data about these files in here" > Readme.md
#make the blast database
makeblastdb -dbtype nucl \
-title "This is Contig Data from Awesome Project XX generated April 2015" \
-parse_seqids \
-in labdata.fna
#link it to the sequence server folder
cp -l labdata.fna* ../../sequenceserver/
#resulting file structure
.
├── blastdbs
│ ├── db1
│ │ ├── labdata.fna
│ │ ├── labdata.fna.nhr
│ │ ├── labdata.fna.nin
│ │ ├── labdata.fna.nog
│ │ ├── labdata.fna.nsd
│ │ ├── labdata.fna.nsi
│ │ ├── labdata.fna.nsq
│ │ └── Readme.md
│ ├── db2
│ ├── ncbinr
│ └── ncbint
└── sequenceserver #these are all links
    ├── labdata.fna
    ├── labdata.fna.nhr
    ├── labdata.fna.nin
    ├── labdata.fna.nog
    ├── labdata.fna.nsd
    ├── labdata.fna.nsi
    └── labdata.fna.nsq
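Note that `cp -l` makes hard links rather than copies, so the files in the sequenceserver folder take no extra disk space, but it only works when both folders sit on the same filesystem. The sketch below rebuilds the layout in a throwaway directory (the file contents are made up) and uses the shell's `-ef` test to confirm that both paths point at the same file.

```shell
# Build a throwaway copy of the layout: blastdbs/db1 plus sequenceserver.
scratch="$(mktemp -d)"
mkdir -p "$scratch/blastdbs/db1" "$scratch/sequenceserver"
cd "$scratch/blastdbs/db1" || exit 1
printf '>contig1\nACGT\n' > labdata.fna

# Same linking command as in the walkthrough: hard-link instead of copying.
cp -l labdata.fna* ../../sequenceserver/

# -ef is true when both paths refer to the same file on disk.
if [ labdata.fna -ef ../../sequenceserver/labdata.fna ]; then
    echo "hard link: no extra disk space used"
else
    echo "separate copy"
fi
```

If the databases live on a different filesystem, `cp -l` will fail. Symbolic links (`ln -s`) are the usual alternative, though I have not checked here how sequenceserver handles them.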
Lastly, because this runs on a server shared across the lab/campus, I usually launch it by ssh-ing into the machine and starting sequenceserver remotely. As anyone who has had a process interrupted when the ssh connection breaks can attest, you need a way to let the process continue running after you log out. I have been using screen for this purpose, with a workflow as follows.
#login
ssh username@localserver
#install screen (if not installed)
yum install screen
#activate screen
screen
#turn on sequence server
sequenceserver
#profit
http://localserver:4567
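The workflow above leaves the detach step implicit: inside screen, Ctrl-a d detaches the session so you can log out while sequenceserver keeps running. A variation, sketched here under the assumption that `screen` and `sequenceserver` are both on the PATH, is to start a named session that is already detached. The session name `seqserv` and the helper `start_sequenceserver` are arbitrary choices for this example.

```shell
session="seqserv"   # arbitrary session name chosen for this example

# Start sequenceserver inside a detached, named screen session.
start_sequenceserver() {
    if ! command -v screen >/dev/null 2>&1; then
        echo "screen is not installed" >&2
        return 1
    fi
    # -d -m starts the session already detached; -S gives it a name.
    screen -dmS "$session" sequenceserver
}

# Later, from any new ssh login:
#   screen -ls                    # list running sessions
#   screen -r seqserv             # reattach; Ctrl-a d detaches again
#   screen -S seqserv -X quit     # end the session and stop sequenceserver
```

Run `start_sequenceserver` once after logging in. A named session is easier to find again than the default numbered ones when several people share the machine.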