###############################################################################
### Scripts for the analysis of CPE combinatorial codes
###############################################################################

This package contains three scripts for analysing the CPE combinatorial codes
in a set of sequences (in multi-fasta format).

FastaToTbl - converts the (multi)fasta format to tabular format. The tabular format contains one sequence record per line: the first field is the sequence identifier and the second field is the sequence.
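
To make the tabular layout concrete, here is a minimal sketch of the same kind of conversion written with awk. It is not the FastaToTbl script shipped in this package: the function name fasta_to_tbl is hypothetical, and the tab separator and the choice of the first word of the header as identifier are assumptions made for illustration.

```shell
# Hypothetical stand-in for FastaToTbl (illustration only).
# Assumption: fields are tab-separated and the identifier is the
# first word after ">" on the header line.
fasta_to_tbl() {
    awk '
        /^>/ {
            if (id != "") print id "\t" seq   # flush the previous record
            id = substr($1, 2)                # drop the leading ">"
            seq = ""
            next
        }
        { gsub(/[ \t\r]/, ""); seq = seq $0 } # join wrapped sequence lines
        END { if (id != "") print id "\t" seq }
    ' "$@"
}

# Two records, the first one wrapped over two lines:
printf '>seq1 description\nACGT\nTTGA\n>seq2\nGGCC\n' | fasta_to_tbl
# prints:
# seq1	ACGTTTGA
# seq2	GGCC
```

Each record ends up on a single line, which is what makes the output easy to feed line by line into a downstream script.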


motifsearchTreeDecomposed - this script encodes all the CPE motifs and checks for their occurrence in a given set of sequences. It outputs the classification of each sequence according to the classes defined in the paper, together with the motifs that matched.


These two scripts can be combined in a pipeline in the following way:

* USAGE: FastaToTbl (multi)fastafile | motifsearchTreeDecomposed.pl

The results can be redirected to an output file:

* EX1: FastaToTbl input.fasta | motifsearchTreeDecomposed.pl > results.out


filterSeqsbySignal - use this script to recover, in fasta format, the sequences that belong to a specific class. It writes a multi-fasta file containing all the sequences assigned to that class.

Note: this script imports the module Bio::SeqIO, so it requires the BioPerl package to be installed on your computer!
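
Before running the filter script it can help to confirm that Perl can actually load Bio::SeqIO. The following is a small sketch of such a check; the function name check_bioperl and its messages are hypothetical, only the perl -M switch and the module name come from Perl and BioPerl themselves.

```shell
# Hypothetical helper: reports whether the Bio::SeqIO module can be loaded.
# "perl -MBio::SeqIO -e 1" loads the module and runs an empty program;
# it fails if perl or the module is not available.
check_bioperl() {
    if perl -MBio::SeqIO -e 1 2>/dev/null; then
        echo "Bio::SeqIO found"
    else
        echo "Bio::SeqIO missing"
        return 1
    fi
}
```

Running check_bioperl first gives a clear message instead of a Perl compilation error when BioPerl is not installed.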

* USAGE: perl filterSeqsbySignal.pl filter_name results_file input_sequence_file output_sequence_file

* EX2: perl filterSeqsbySignal-v0.pl 'activation_early_strong3_final' results.out input.fasta subset.fasta

This command retrieves all the sequences in input.fasta (the input of EX1) that contain the signal 'activation_early_strong3_final'. The file results.out contains the classification produced in EX1, and subset.fasta receives the set of sequences that match this signal.
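
The selection step itself can be illustrated without BioPerl. The sketch below is not filterSeqsbySignal: it assumes you already have a plain list of sequence identifiers, one per line, for the class of interest. How to extract that list from results.out depends on the layout of that file, which is not described here. The function name select_fasta_by_ids is hypothetical.

```shell
# Hypothetical helper (illustration only).
# $1: file with one sequence identifier per line (assumed to be prepared separately)
# $2: multi-fasta file to filter
select_fasta_by_ids() {
    awk '
        NR == FNR { if ($1 != "") wanted[$1] = 1; next }  # first file: load identifiers
        /^>/      { keep = (substr($1, 2) in wanted) }     # header: decide for this record
        keep                                               # print lines of kept records
    ' "$1" "$2"
}

# Example call with hypothetical file names:
# select_fasta_by_ids ids.txt input.fasta > subset.fasta
```

Identifiers are compared against the first word of each header line, so descriptions after the identifier are kept in the output but ignored for matching.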


These scripts should run on any Linux system with BioPerl installed.
If you are using Windows, you can install ActivePerl and use the ppm program (command line) to install BioPerl.



