These tests simulate one round of mirscan filtering of candidate miRNAs using both a single-species candidate set and a two-species candidate set. At each step, files with expected results are provided for comparison. Those files can also be used to test a later step of analysis even if an early one fails.
Step 1. Install the core mirscan scripts in the current working directory.
Step 2. Download the sample file package and place the decompressed files in a subdirectory called "sampleFiles/".
Step 3. Move the files "oneSeq.rubyEtAl.py" and "twoSeq.rubyEtAl.py" to the current working directory.
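Before running the later steps, it can help to confirm that the working directory is laid out as Steps 1 to 3 describe. The sketch below is not part of mirscan. The script names are the ones invoked by the commands in this document, the sample names are the input files those commands read, and the function name missing_files is a hypothetical helper introduced for illustration.

```python
from pathlib import Path

# Scripts invoked by the commands in this document (expected in the working directory).
REQUIRED_SCRIPTS = [
    "mirscanTrainer.py",
    "mirscanExecute.py",
    "score_cut.py",
    "oneSeq.rubyEtAl.py",
    "twoSeq.rubyEtAl.py",
]

# Input files read by those commands (expected under sampleFiles/).
REQUIRED_SAMPLES = [
    "oneSeq.round0.train",
    "twoSeq.round0.train",
    "sample1.round0.fax",
    "sample2.round0.fax",
]


def missing_files(workdir="."):
    """Return the expected paths that do not exist as files under workdir."""
    root = Path(workdir)
    expected = [root / name for name in REQUIRED_SCRIPTS]
    expected += [root / "sampleFiles" / name for name in REQUIRED_SAMPLES]
    return [str(path) for path in expected if not path.is_file()]


if __name__ == "__main__":
    for path in missing_files():
        print("missing:", path)
```

An empty result means every file named above is in place. The lists cover only the files named in these steps, so the sample package may contain more.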
Step 4. Train the single-species sample with the following command:
    python mirscanTrainer.py sampleFiles/oneSeq.round0.train oneSeq.rubyEtAl.py 0 sampleFiles/oneSeq.round0.test.matrix
Compare sampleFiles/oneSeq.round0.matrix to sampleFiles/oneSeq.round0.test.matrix. The counts for the training set (columns 3 and 5) should be identical to the values in the sample file. The remaining columns, including the scores, will likely differ: although the set of background candidates used in this training session is consistent, the microRNA 5p ends for those candidates are unknown and are therefore assigned randomly (see the supplemental text of Ruby et al., Genome Res. 2007, for more details).
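Because only columns 3 and 5 are expected to match, a plain diff of the two matrix files will report differences even when training succeeded. The sketch below compares just those columns. It is not part of mirscan and rests on assumptions about the matrix file layout that this document does not state: fields are whitespace-delimited, columns are counted from 1, and comment lines start with "#". The function names are hypothetical.

```python
def read_columns(path, columns=(3, 5), comment="#"):
    """Return the selected 1-based columns of every non-comment, non-blank row."""
    rows = []
    with open(path) as handle:
        for line in handle:
            # Assumption: comment lines start with "#"; blank lines carry no data.
            if not line.strip() or line.startswith(comment):
                continue
            fields = line.split()
            if len(fields) < max(columns):
                # Keep short rows whole so they still take part in the comparison.
                rows.append(tuple(fields))
            else:
                rows.append(tuple(fields[c - 1] for c in columns))
    return rows


def counts_match(expected_path, test_path, columns=(3, 5)):
    """True when the selected columns agree row by row in both files."""
    return read_columns(expected_path, columns) == read_columns(test_path, columns)
```

For this step the call would be counts_match("sampleFiles/oneSeq.round0.matrix", "sampleFiles/oneSeq.round0.test.matrix"). If the matrix files use a different delimiter or comment marker, adjust the parsing accordingly.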
Step 5. Score the foreground and background sets with the following commands:
    python mirscanExecute.py sampleFiles/oneSeq.round0.train oneSeq.rubyEtAl.py sampleFiles/oneSeq.round0.matrix sampleFiles/oneSeq.round0.test.scr
    python mirscanExecute.py sampleFiles/sample1.round0.fax oneSeq.rubyEtAl.py sampleFiles/oneSeq.round0.matrix sampleFiles/sample1.round0.test.scr
Compare sampleFiles/oneSeq.round0.scr to sampleFiles/oneSeq.round0.test.scr and sampleFiles/sample1.round0.scr to sampleFiles/sample1.round0.test.scr. Their un-commented contents should be identical.
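The comparison of un-commented contents can be automated with a short script. The sketch below is not part of mirscan. It assumes that a commented line is one that begins with "#", which this document does not state, and the function names are hypothetical.

```python
def uncommented_lines(path, comment="#"):
    """Return the lines of a file that do not start with the comment marker."""
    with open(path) as handle:
        # Assumption: a commented line is one that begins with "#".
        # Line endings are stripped so that they do not affect the comparison.
        return [line.rstrip("\r\n") for line in handle
                if not line.startswith(comment)]


def same_uncommented(expected_path, test_path, comment="#"):
    """True when both files have identical un-commented lines in the same order."""
    return (uncommented_lines(expected_path, comment)
            == uncommented_lines(test_path, comment))
```

For example, same_uncommented("sampleFiles/oneSeq.round0.scr", "sampleFiles/oneSeq.round0.test.scr") should return True after a successful scoring run. The same check applies to any step that asks for identical un-commented contents.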
Step 6. Cut the background set with the following command:
    python score_cut.py sampleFiles/oneSeq.round0.scr sampleFiles/sample1.round0.scr sampleFiles/sample1.round0.fax sampleFiles/sample1.round1.test.fax
Compare sampleFiles/sample1.round1.fax to sampleFiles/sample1.round1.test.fax. Their un-commented contents should be identical.
Step 7. Train the two-species sample with the following command:
    python mirscanTrainer.py sampleFiles/twoSeq.round0.train twoSeq.rubyEtAl.py 0 sampleFiles/twoSeq.round0.test.matrix
Compare sampleFiles/twoSeq.round0.matrix to sampleFiles/twoSeq.round0.test.matrix. The counts for the training set (columns 3 and 5) should be identical to the values in the sample file. The remaining columns, including the scores, will likely differ: although the set of background candidates used in this training session is consistent, the microRNA 5p ends for those candidates are unknown and are therefore assigned randomly (see the supplemental text of Ruby et al., Genome Res. 2007, for more details).
Step 8. Score the foreground and background sets with the following commands:
    python mirscanExecute.py sampleFiles/twoSeq.round0.train twoSeq.rubyEtAl.py sampleFiles/twoSeq.round0.matrix sampleFiles/twoSeq.round0.test.scr
    python mirscanExecute.py sampleFiles/sample2.round0.fax twoSeq.rubyEtAl.py sampleFiles/twoSeq.round0.matrix sampleFiles/sample2.round0.test.scr
Compare sampleFiles/twoSeq.round0.scr to sampleFiles/twoSeq.round0.test.scr and sampleFiles/sample2.round0.scr to sampleFiles/sample2.round0.test.scr. Their un-commented contents should be identical.
Step 9. Cut the background set with the following command:
    python score_cut.py sampleFiles/twoSeq.round0.scr sampleFiles/sample2.round0.scr sampleFiles/sample2.round0.fax sampleFiles/sample2.round1.test.fax
Compare sampleFiles/sample2.round1.fax to sampleFiles/sample2.round1.test.fax. Their un-commented contents should be identical.
If results differ substantially from expected, consider the following:
If results are close to those that are expected but not exactly the same, here are some things to consider: