Homepage of Jörg Rüdenauer


How to execute the BLAST search

Now that the preparatory work is done, we can finally start implementing the BLAST search itself. The SeqSimilaritySearcher interface also defines a method that returns the databases that can be searched; this is quickly implemented with the SimpleSequenceDBInstallation class from the package org.biojava.bio.seq.db. I fill this installation in the constructor of the Blast class, using a constant array of file names. This suffices for my purposes; of course, it would be nicer if the class looked for the available files itself...

    public Blast() {
        this.seqDBs = new SimpleSequenceDBInstallation();
        for (int i = 0; i < BLAST_DBS.length; i++) {
            try {
                // register each FASTA protein file as a searchable DB
                // (null: no additional identifiers for this DB)
                this.seqDBs.addSequenceDB(
                    SearchFactory.fastaProt2DB(BLAST_DBS[i]), null);
            } catch (java.io.FileNotFoundException e) {
                e.printStackTrace();
            }
        }
    }

    public Set getSearchableDBs() {
        return this.seqDBs.getSequenceDBs();
    }
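The wish expressed above, that the class should find the available files itself, could be approached roughly as follows. This is only a sketch: the class name BlastDbFinder, the method findDatabaseFiles and the idea of selecting files by a suffix such as ".fasta" are assumptions of mine and not part of the original Blast class.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not part of the original implementation.
final class BlastDbFinder {

    /**
     * Returns the names of all regular files in dir whose name ends with
     * the given suffix, sorted alphabetically. Returns an empty list if
     * dir does not exist or is not a directory.
     */
    static List<String> findDatabaseFiles(File dir, String suffix) {
        List<String> names = new ArrayList<>();
        File[] files = dir.listFiles(); // null if dir is not a readable directory
        if (files == null) {
            return names;
        }
        for (File f : files) {
            if (f.isFile() && f.getName().endsWith(suffix)) {
                names.add(f.getName());
            }
        }
        Collections.sort(names);
        return names;
    }
}
```

The constructor could then loop over the returned names instead of over the constant array.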

The important method doing the search has the following signature:

public SeqSimilaritySearchResult search(SymbolList querySeq, SequenceDB db, Map searchParameters) throws BioException {

For a detailed description of the parameters and the usage of the method, I refer you to the Biojava API documentation. I'll just mention here that the parameters must be quite generic, so an IllegalArgumentException is thrown if they are not compatible with BLAST. In this implementation, the default search mode is blastp for protein searches.

The assembly of the command line and the call to the external program are straightforward; I'll omit the safety checks (are there parameters? is the program specified? etc.) here. The environment variable BLASTDB must be set to the path of the data files for BLAST.

    final String BLAST_PROGRAM = "blastall";

    // program path, followed by one "-key value" pair per search parameter
    String commandLine = SearchFactory.getBlastPath();
    commandLine += BLAST_PROGRAM + " ";
    Iterator it = searchParameters.entrySet().iterator();
    while (it.hasNext()) {
        Map.Entry entry = (Map.Entry) it.next();
        commandLine += " -" + entry.getKey() + " " + entry.getValue();
    }
    commandLine += " -d " + db.getName();

    // tell BLAST where its data files are
    String[] envs = {"BLASTDB=" + SearchFactory.getBlastDBPath()};
    Process blast = Runtime.getRuntime().exec(commandLine, envs);
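As an alternative to concatenating one long string, the command can be built as a list of arguments and started with java.lang.ProcessBuilder (available since Java 5, so not an option when this page was written). The following is a sketch under assumptions: the class BlastCommand and its methods are hypothetical names, and the parameter map is assumed to hold plain strings.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of the original implementation.
final class BlastCommand {

    /** Builds the argument list: program, "-key value" pairs, then "-d dbName". */
    static List<String> build(String program, Map<String, String> params, String dbName) {
        List<String> cmd = new ArrayList<>();
        cmd.add(program);
        for (Map.Entry<String, String> e : params.entrySet()) {
            cmd.add("-" + e.getKey());
            cmd.add(e.getValue());
        }
        cmd.add("-d");
        cmd.add(dbName);
        return cmd;
    }

    /** Starts the process with BLASTDB added to the inherited environment. */
    static Process start(List<String> cmd, String blastDbPath) throws IOException {
        ProcessBuilder pb = new ProcessBuilder(cmd);
        pb.environment().put("BLASTDB", blastDbPath);
        return pb.start();
    }
}
```

Two differences from the listing above are worth noting. Each argument is passed to the program as its own element, so values containing spaces are not split. And ProcessBuilder starts from a copy of the current environment and adds BLASTDB to it, whereas Runtime.exec(String, String[]) hands the child only the variables listed in the array.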

The next problem is how to give the input sequence to BLAST. Luckily, the program can read it from standard input. The correct format (so that BLAST terminates and the results can be parsed) is a '>', followed by the sequence ID, a line break, the sequence itself, another line break, an empty line, and an end-of-input symbol (ctrl-d). Omit the ctrl-d, and BLAST waits for another sequence; omit the >ID, and the parsing of the output goes wrong. So we have:

    String qSeqID = SearchFactory.createID(((Sequence) querySeq).getName());
    PrintWriter writer = new PrintWriter(new BufferedWriter(
        new OutputStreamWriter(blast.getOutputStream())));
    writer.println(">" + qSeqID);          // FASTA header line
    writer.println(querySeq.seqString());  // the sequence itself
    writer.println();                      // empty line
    writer.print('\u0004');                // ctrl-d
    writer.flush();
    writer.close();
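The writing step can be isolated in a small helper that is easy to test without starting BLAST. This is a sketch, with FastaQuery and write as hypothetical names. It writes the header, the sequence and the empty line described above, and then relies on closing the stream to signal end of input, which is how pipes normally behave. That the ctrl-d character is then unnecessary is an assumption I have not verified against blastall; if BLAST behaves as described above, add the character back before closing.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of the original implementation.
final class FastaQuery {

    /**
     * Writes ">id", the sequence and an empty line to out, then closes out.
     * Closing the stream is assumed to be enough to signal end of input.
     */
    static void write(OutputStream out, String id, String sequence) throws IOException {
        try (Writer w = new BufferedWriter(
                new OutputStreamWriter(out, StandardCharsets.US_ASCII))) {
            w.write(">" + id + "\n");
            w.write(sequence + "\n");
            w.write("\n");
        } // try-with-resources flushes and closes the writer and the stream
    }
}
```

In the search method this would be called as FastaQuery.write(blast.getOutputStream(), qSeqID, querySeq.seqString()).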

All that remains is to wait for BLAST to finish and to determine whether everything went OK (return code 0). If there was an error, you could try to parse the output or error stream for more sophisticated error handling; I just assume that a parameter was wrong, so an IllegalArgumentException gets thrown.

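    try {
        blast.waitFor();
    } catch (InterruptedException e3) {
        // restore the interrupt flag instead of silently swallowing it
        Thread.currentThread().interrupt();
    }
    if (blast.exitValue() != 0) {
        throw new IllegalArgumentException(
            "BLAST exited with code " + blast.exitValue());
    }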
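For the more sophisticated error handling mentioned above, one simple step is to read the error stream and put its text into the exception message. The sketch below uses hypothetical names (BlastExit, drain, checkExit) and assumes the error output is plain ASCII text. Note also that the documentation of java.lang.Process warns that a child process may block if its output is not read promptly, so reading the streams before or while waiting can matter for large results.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper, not part of the original implementation.
final class BlastExit {

    /** Reads the stream until end of input and returns its content as text. */
    static String drain(InputStream in) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        byte[] chunk = new byte[4096];
        int n;
        while ((n = in.read(chunk)) != -1) {
            buf.write(chunk, 0, n);
        }
        return buf.toString("US-ASCII");
    }

    /** Throws IllegalArgumentException with the error text if exitCode is not 0. */
    static void checkExit(int exitCode, String errorOutput) {
        if (exitCode != 0) {
            throw new IllegalArgumentException(
                "BLAST exited with code " + exitCode + ": " + errorOutput.trim());
        }
    }
}
```

A call could look like BlastExit.checkExit(blast.exitValue(), BlastExit.drain(blast.getErrorStream())).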



URL of this page: http://www.joerg-ruedenauer.de/Software/blast/blast2.html
Author of this page: Jörg Rüdenauer
Last modified: 14.07.2002