 ____  _           _
/ ___|| |__   ___ | |_ __ _ _   _ _ __
\___ \| '_ \ / _ \| __/ _` | | | | '_ \
 ___) | | | | (_) | || (_| | |_| | | | |
|____/|_| |_|\___/ \__\__, |\__,_|_| |_|
                      |___/

Version 2.1
Copyright (c) U.C.S.F., 1997-1999

================
Introduction
================

Shotgun 2.1 is a new, non-iterative version of the Shotgun program.
For more information and citations, refer to:

    Pegg, Scott CH., Babbitt, Patricia C. (1999) Shotgun: Getting more
    from sequence similarity searches. Bioinformatics, 15(9):729-740.

================
Installing
================

Shotgun is written in the platform-independent language Python, so the
first step in installation is to download and install Python (free from
www.python.org).

Shotgun calls and reads the WashU version of the BLAST program (blastp),
or the current version of FASTA (fasta3_t). You can either make sure the
program you intend to use is installed and accessible on your system, or
you can change the call to BLAST within the Shotgun code (see the
"Modifying" section below).

================
Changes from 1.x
================

Shotgun 2.1 is no longer dependent upon GCG software. The only outside
programs used are "blastp" and "fasta3".

The interactive mode of Shotgun 1.x has been removed. All information is
provided on the command line (see the "Syntax" section below).

The output has been simplified and narrowed. The BLAST/FASTA alignment
scores are no longer included.

Matrices representing the number of proteins found by pairs of query
sequences can be output at the end of the output file. The first matrix
represents all proteins in the output, while the second represents only
those found at expectation values less than 0.01.

================
Syntax
================

Shotgun 2.1 uses the following syntax:

    shotgun2.1 [-L size] [-D database] [-R cutoff] [-O outfile] [-FMU] files

The files must be at the end of the command, and must be in FASTA format.
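
Because the input files must be in FASTA format, it can be handy to check
them before starting a long run. The following is only a minimal sketch
and is not part of Shotgun: the function name "looks_like_fasta" is made
up for illustration, and it assumes nothing more than the usual FASTA
convention that a record begins with a ">" header line. Shotgun's own
parsing may be stricter or more lenient.

```python
def looks_like_fasta(path: str) -> bool:
    """Return True if the first non-blank line of the file starts with '>'.

    Hypothetical helper, not part of Shotgun. It checks only the header
    marker and does not validate the sequence lines that follow.
    """
    with open(path, "r") as handle:
        for line in handle:
            if line.strip():
                # The first line with content decides the answer.
                return line.startswith(">")
    return False  # an empty (or all-blank) file is not FASTA
```

For example, looks_like_fasta("query.tfa") could be called on each input
file, and any file for which it returns False left off the command line.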

The command line flags are:

    flag               usage       default
    listsize           -L<+int>    250
    BLAST or FASTA     -F          BLAST
    database           -D          nr
    reporting cutoff   -R<+int>    2
    show matrix        -M          false
    output file        -O          shotgun.out
    use old files      -U          false

================
Examples
================

To perform a Shotgun run on all of the .tfa files in the current
directory, using BLAST against the NCBI nr database, with 250 hits per
output file, reporting hits with Shotgun scores of 2 and above, not
showing matrices, and writing the output to "shotgun.out", simply type:

    shotgun2.1 *.tfa

To perform a Shotgun run on all of the .tfa files in the current
directory, using FASTA against the NCBI nr database, with 1000 hits per
output file, reporting hits with Shotgun scores of 3 and above, showing
the matrices, and writing the output to "foo.out", type:

    shotgun2.1 -F -L1000 -R3 -M -Ofoo.out *.tfa

================
Modifying
================

You can modify the Shotgun program very easily to accommodate your
particular needs. The most likely change you'll want to make is to
change the similarity search.

Changing from WashU BLAST to NCBI BLAST is fairly easy:

    (1) Go to line 235 and uncomment the call to NCBI BLAST and comment
        out the call to WashU BLAST.

    (2) Go to lines 313-314 and uncomment the section labelled
        "for NCBI". Then comment out the section labelled "WashU" on
        lines 309-310.

Note: The output format of NCBI BLAST has changed several times
recently. If you decide to use NCBI BLAST, keep a close eye on the file
parsing.

To change the similarity search to a program other than BLAST or FASTA,
you'll need to alter the system call on line 235 accordingly (or add
your own function that calls the program), and include your own file
parser (or change the current one). The program is small and modular, so
it shouldn't be hard to make such modifications.
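
As a rough illustration of the last point, the sketch below pairs a
function that calls an external search program with a small parser for
its report. None of it is taken from the Shotgun source: the names
"run_search", "parse_hits" and "Hit" are hypothetical, the report layout
(one "name e-value" pair per line) is an assumption made purely for the
example, and the default list size of 250 simply mirrors the -L default.
A real replacement would have to match the output of whichever program
you call.

```python
import subprocess
from typing import List, NamedTuple, Sequence


class Hit(NamedTuple):
    """One database hit: sequence name plus expectation value."""
    name: str
    evalue: float


def run_search(command: Sequence[str]) -> str:
    """Run an external search program and return its standard output.

    'command' is the full argument list for the program of your choice;
    it is supplied by the caller, so no particular flags are assumed.
    """
    result = subprocess.run(command, capture_output=True, text=True, check=True)
    return result.stdout


def parse_hits(report: str, listsize: int = 250) -> List[Hit]:
    """Parse an ASSUMED two-column report: '<name> <evalue>' per line."""
    hits: List[Hit] = []
    for line in report.splitlines():
        fields = line.split()
        if len(fields) != 2 or line.startswith("#"):
            continue  # skip blank lines, comments, and malformed rows
        try:
            hits.append(Hit(fields[0], float(fields[1])))
        except ValueError:
            continue  # second column was not a number
        if len(hits) == listsize:
            break  # keep at most 'listsize' hits, like the -L flag
    return hits
```

Keeping the program call and the parser in separate functions means a
change in the search program's output format only touches the parser,
which is the part the note on NCBI BLAST warns about.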