UserManual

Mirscan runs on Linux and requires the Python interpreter. To run a script, type python at the command line, followed by the name of the script and its arguments.

The sections below describe a complete mirscan cycle, which has three steps: (1) Training: evaluation of the differences between a foreground and a background set of miRNA hairpins/candidates; (2) Evaluation: scoring of candidate hairpins based on the scoring matrices developed during training; (3) Cutting: elimination of candidates with low scores. The user may bypass the first step by downloading or sharing a pre-derived scoring matrix. In that case, it is up to the user to judge whether the scoring matrix is relevant to the candidate set being evaluated.



(1) Training

mirscanTrainer.py builds a table of scores that reflect how often each returned value occurs in the foreground (training) set of miRNA candidates compared with the background (candidate) set. The foreground set is defined by the specified .train-format file; the background set is defined by the set of .fax-format files listed in the .train file. The criteria being evaluated are defined by the contents of the specified .py-formatted criteria file. Because the background set is often far larger than is practical or useful to evaluate in full, the number of randomly selected background hairpins to include can be given as a command-line argument (if "0" is specified, the entire background set is included). The scoring matrix is written to the specified .matrix-format file.
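
The idea behind training can be sketched in a few lines of Python. This is a minimal illustration, not the code in mirscanTrainer.py: it assumes that each criterion is a function returning a discrete value for a hairpin, and that the score for a value is the log2 ratio of its foreground frequency to its background frequency, smoothed with a pseudocount. The names sample_background, train_matrix and pseudocount are hypothetical, and the file formats are not modeled.

```python
import math
import random
from collections import Counter


def sample_background(candidates, n, seed=None):
    """Return n randomly chosen background candidates.

    Following the manual's convention, n == 0 means "use the entire set".
    """
    candidates = list(candidates)
    if n == 0 or n >= len(candidates):
        return candidates
    return random.Random(seed).sample(candidates, n)


def train_matrix(foreground, background, criteria, pseudocount=1.0):
    """Build {criterion name: {returned value: score}}.

    ASSUMPTION: the score is a log2 odds ratio of foreground to background
    frequency; the pseudocount keeps values seen in only one set finite.
    """
    matrix = {}
    for name, func in criteria.items():
        fg = Counter(func(h) for h in foreground)
        bg = Counter(func(h) for h in background)
        values = set(fg) | set(bg)
        fg_total = sum(fg.values()) + pseudocount * len(values)
        bg_total = sum(bg.values()) + pseudocount * len(values)
        matrix[name] = {
            v: math.log2(
                ((fg[v] + pseudocount) / fg_total)
                / ((bg[v] + pseudocount) / bg_total)
            )
            for v in values
        }
    return matrix
```

Under these assumptions, a value that is more common in the foreground than in the background receives a positive score, and a value that is more common in the background receives a negative one.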



(2) Evaluation

mirscanExecute.py applies a scoring matrix and set of procedures to a set of miRNA candidates. The set of candidates can be either the foreground set (defined by a .train-format file) or a file from the background set (defined by a .fax-format file). The hairpins will be evaluated based on the set of criteria defined in the specified .py-formatted criteria file using scores from the specified .matrix-format file. Note: the scoring matrix must have been built using the specified criteria file! The scores will be written out to the specified .scr-format score file.
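
The evaluation step can be pictured as a table lookup per criterion. The sketch below is an illustration under stated assumptions, not the code in mirscanExecute.py: it assumes the matrix is a nested dictionary of the form {criterion name: {returned value: score}}, that a candidate's total score is the sum of its per-criterion scores, and that a value absent from the matrix contributes a default of 0. The function name score_candidate and the default parameter are hypothetical.

```python
def score_candidate(hairpin, criteria, matrix, default=0.0):
    """Return the total score of one hairpin.

    ASSUMPTION: the total is the sum of the per-criterion scores, and values
    never seen during training contribute `default`.
    """
    # The matrix must have been built from the same criteria set.
    if set(criteria) != set(matrix):
        raise ValueError("scoring matrix was not built from this criteria set")
    return sum(
        matrix[name].get(func(hairpin), default)
        for name, func in criteria.items()
    )
```

The explicit check on the criterion names mirrors the note above: a matrix built from a different criteria file has no meaningful scores for the values these criteria return.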



(3) Cutting

score_cut.py calculates a minimum passing score for miRNA candidates based on the distribution of scores among the foreground set. It then writes a copy of the specified background file from which the candidates scoring below that threshold have been removed. The foreground and background scores are taken from two specified .scr-format files. The background .fax-format file to be filtered and the name of the new .fax-format file to be written can be given as well. If these two filenames are omitted, the script prints statistics on the threshold score and the number of candidates that would be cut, but makes no cut. Note: the specified background score file must contain scores from the specified background candidate file!
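
The cutting logic can be sketched as follows. The manual does not state how the threshold is derived from the foreground distribution, so this sketch assumes one plausible rule: choose the highest score that still lets a given fraction of the foreground candidates pass. The names passing_threshold, cut and keep_fraction are hypothetical, and scores are held in plain Python structures rather than read from .scr-format files.

```python
import math


def passing_threshold(foreground_scores, keep_fraction=0.95):
    """Return a minimum passing score.

    ASSUMPTION: the threshold is the highest score at which at least
    keep_fraction of the foreground candidates still pass.
    """
    ranked = sorted(foreground_scores, reverse=True)
    index = max(0, math.ceil(keep_fraction * len(ranked)) - 1)
    return ranked[index]


def cut(background_scores, threshold):
    """Split background candidates into (kept, removed) lists of names.

    background_scores maps a candidate name to its score.
    """
    kept = [n for n, s in background_scores.items() if s >= threshold]
    removed = [n for n, s in background_scores.items() if s < threshold]
    return kept, removed
```

Calling cut and reporting only the lengths of the two lists corresponds to the statistics-only mode described above, in which no filtered file is written.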