Sliding MinPD

Building evolutionary networks of serial samples via a recombination detection approach

Source Code

The C source code for version 1.2 (last updated: 06/25/2009) is available for download here:

Windows executable and C source code with makefile

New Parameters for version 1.2

[markers for clustering] g"markers.txt" [Yes: add g and markers filename, No: g]
[clustering distance threshold] T0.001
[clustering option] j4 [0:no clustering, 1:by amino acid, 2:by bases, 3:only by distance, 4:like 1 but post-clustering, 5:like 2 but post, 6:like 3 but post]
 

Descriptions of the new variables:

  • Markers for Clustering: It is used to cluster similar sequences that share the same amino acids at specific positions (the markers). A sample markers file is included in the tar file above.

  • Clustering distance threshold: Specifies how similar two sequences must be to be clustered. Two sequences are clustered if their distance is less than this threshold. The default is 0.001, which corresponds to approximately 1 nucleotide difference per 1000 sites.

  • Clustering Option: Allows the clustering of highly similar sequences before (options 1 to 3) or after (options 4 to 6) distance calculations. For improved recombination detection, pre-clustering (which removes sequences) is recommended; otherwise use post-clustering (which does not remove sequences).
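
The threshold rule can be illustrated with a minimal Python sketch. It assumes a plain proportion of differing sites (p-distance) purely for illustration; the program itself computes distances with the chosen substitution model, so real values will differ. The function names p_distance and should_cluster are hypothetical and not part of Sliding MinPD.

```python
def p_distance(seq_a, seq_b):
    """Proportion of differing sites between two aligned sequences.

    Assumption: a simple p-distance stands in for the model-based
    distance that the program actually uses.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    diffs = sum(1 for a, b in zip(seq_a, seq_b) if a != b)
    return diffs / len(seq_a)


def should_cluster(seq_a, seq_b, threshold=0.001):
    """True when the distance is strictly below the threshold (the T parameter)."""
    return p_distance(seq_a, seq_b) < threshold


base = "ACGT" * 500            # 2000 aligned sites
one_diff = "T" + base[1:]      # differs from base at 1 site
three_diff = "TAA" + base[3:]  # differs from base at 3 sites

print(p_distance(base, one_diff), should_cluster(base, one_diff))
print(p_distance(base, three_diff), should_cluster(base, three_diff))
```

With 2000 sites, a single difference falls below the default threshold, while three differences do not.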

Parameters from version 1.0

The parameter file contains all the parameter settings. The parameters in Sliding MinPD as they appear in the parameter file (vars.txt) are: 

[input file] i"in.fas"
[output file] o"out.txt"
[report all distances] d0 [0:No, 1:Yes]
[activate recombination detection] f1 [0:No, 1:Yes]
[recombination detection option] r2 [1:RIP, 2:B-RIP, 3:SB]
[crossover option] c1 [0:many, 1:only one]
[PCC threshold] p0.4
[window size] w200
[step size] s20
[bootstrap recomb. tiebreaker option] t0 [0:No, 1:Yes]
[bootscan seed] e-3
[bootscan threshold] h92
[bootstrap:1/bootknife:0] b0
[substitution model] vTN93 [options: JC69, K2P, TN93]
[gamma shape - rate heterogeneity] a0.5 [a0.5]
[show bootstrap values] k0 [0:No, 1:Yes]

The program is run with "minpd <vars.txt", which reads the parameter file from standard input.
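
For scripting around the program, the parameter file layout shown above can be read with a short Python sketch. It assumes each line has the form "[label] flag-letter followed by value", optionally followed by a bracketed hint, as in the listing. The function name parse_vars is hypothetical, and this is not how the program itself reads the file.

```python
import re

# Matches "[label] <flag letter><value>"; any trailing "[...]" hint is ignored.
LINE_RE = re.compile(r'^\[(?P<label>[^\]]+)\]\s+(?P<flag>[A-Za-z])(?P<value>\S*)')


def parse_vars(text):
    """Return {flag: (label, value)} for each parameter line in text."""
    params = {}
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # skip blank or unrecognized lines
        value = match.group('value').strip('"')  # drop quotes around file names
        params[match.group('flag')] = (match.group('label'), value)
    return params


sample = '''[input file] i"in.fas"
[output file] o"out.txt"
[window size] w200
[step size] s20
[bootscan seed] e-3
[substitution model] vTN93 [options: JC69, K2P, TN93]
'''

params = parse_vars(sample)
for flag, (label, value) in params.items():
    print(flag, label, value)
```

Values are kept as strings so that the caller decides how to interpret each one (file name, integer, or model name).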

Descriptions of the variables:

  • Report All Distances: writes all pairwise distances between ancestor/descendants to a file

  • Activate Recombination Detection: This option turns the recombination detection feature on/off.

  • Recombination detection: r2 and r3 run the Bootscan methods, which require the bootscan seed, the bootscan threshold, and the bootstrap or bootknife option to be specified.

  • Crossover detection: c1 will detect only one crossover, c0 will detect more than one.

  • PCC threshold: Specifies the Pearson Correlation Coefficient value used as a threshold to reduce the pool of candidates. A lower value that is still larger than zero decreases the number of false positives.

  • Bootscan tiebreaker option: Bootstrap values may add up to more than 100 when ancestors with the same distance are present. Use only in conjunction with crossover option 1.

  • Bootscan Threshold: A threshold used for r2 and r3 to reduce the pool of candidates.

  • Bootstrap or Bootknife: Bootstrap is the standard bootstrap method. Bootknife (as implemented in an earlier version of RDP2) is a hybrid between a bootstrap and a jackknife: it removes 25% to 50% of sites and replaces them with other randomly picked sites.

  • Substitution model and gamma shape for distance calculations: These are required for all recombination detection options. The standard bootscan builds NJ trees from a distance matrix and therefore also requires specification of these variables.

  • The option to "show bootstrap values" is needed to display the bootstrap values in the network when using the network drawer (see below).
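
The difference between the bootstrap and bootknife options described above can be sketched in Python as two ways of resampling alignment column indices. This is a loose reading of the description, not the RDP2 or Sliding MinPD implementation: in particular, drawing the replacement sites from the sites that were kept is an assumption, and the function names are hypothetical.

```python
import random


def bootstrap_sites(n_sites, rng):
    """Standard bootstrap: draw n_sites column indices with replacement."""
    return [rng.randrange(n_sites) for _ in range(n_sites)]


def bootknife_sites(n_sites, rng, low=0.25, high=0.5):
    """Remove a random 25% to 50% of sites, then refill to n_sites.

    Assumption: the replacements are drawn at random (with replacement)
    from the sites that were kept.
    """
    n_removed = int(n_sites * rng.uniform(low, high))
    removed = set(rng.sample(range(n_sites), n_removed))
    kept = [i for i in range(n_sites) if i not in removed]
    refill = [rng.choice(kept) for _ in range(n_removed)]
    return sorted(kept + refill)


rng = random.Random(3)          # fixed seed so the run is repeatable
boot = bootstrap_sites(200, rng)
knife = bootknife_sites(200, rng)
print(len(boot), len(set(boot)))    # total and distinct columns, bootstrap
print(len(knife), len(set(knife)))  # total and distinct columns, bootknife
```

Both resamples have the same length as the original window; under this reading the bootknife always retains between 50% and 75% of the distinct original sites.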

Input File

The input file should be a set of already aligned sequences in FASTA format. Each sequence ID must consist of the sampling time as a prefix, followed by a dot, followed by a sequence identifier. The sequences do not need to appear in sampling order. See the sample file "in.fas".
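
As a pre-flight check before running the program, the ID convention and the alignment requirement can be verified with a small Python sketch. The sequence IDs below are hypothetical examples, the function names are not part of Sliding MinPD, and the sketch assumes the sampling time itself contains no dot, so the ID is split at the first dot.

```python
def read_fasta(text):
    """Parse FASTA text into a list of (header, sequence) pairs."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith('>'):
            if header is not None:
                records.append((header, ''.join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, ''.join(chunks)))
    return records


def split_id(header):
    """Split '<time>.<identifier>' at the first dot (assumed convention)."""
    time, sep, name = header.partition('.')
    if not sep or not time or not name:
        raise ValueError(f"expected '<time>.<identifier>', got {header!r}")
    return time, name


def check_input(text):
    """Return [((time, name), sequence), ...] or raise if the input is malformed."""
    records = read_fasta(text)
    if len({len(seq) for _, seq in records}) > 1:
        raise ValueError("sequences are not aligned to a single length")
    return [(split_id(header), seq) for header, seq in records]


sample = """>2.cloneB
ACGTAC-T
>1.cloneA
ACGTACGT
"""

checked = check_input(sample)
print(checked)
```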

Output File

The output of Sliding MinPD is a list of the ancestor-descendant relationships and a collection of trees to build the evolutionary network.

The first part contains a list of four semicolon-delimited values:
Descendant;Ancestor;Distance;Bootstrap support

The recombinant sequences are shown with the notation
Ancestor=A1|BKP1|A2|BKP2|...|BKPn|An
where A1 is the left ancestral donor, BKP1 is the first breakpoint position, A2 is the ancestral donor between breakpoints BKP1 and BKP2, and so on. An is the right-hand ancestral donor.
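
The list format and the recombinant notation can be unpacked with a short Python sketch. The example lines are hypothetical and not taken from a real run. The sketch assumes that the ancestor field holds only the pipe-delimited part (A1|BKP1|...|An) and that breakpoint positions are integers. Distance and bootstrap support are left as strings because their exact formatting is not specified here.

```python
def parse_ancestor(field):
    """Split 'A1|BKP1|A2|...|An' into (donors, breakpoints)."""
    parts = field.split('|')
    if len(parts) % 2 == 0:
        raise ValueError(f"donors and breakpoints must alternate: {field!r}")
    donors = parts[0::2]                        # A1, A2, ..., An
    breakpoints = [int(p) for p in parts[1::2]]  # BKP1, ..., BKPn-1
    return donors, breakpoints


def parse_relationship(line):
    """Parse one 'Descendant;Ancestor;Distance;Bootstrap support' line."""
    descendant, ancestor, distance, support = line.strip().split(';')
    donors, breakpoints = parse_ancestor(ancestor)
    return {
        'descendant': descendant,
        'donors': donors,
        'breakpoints': breakpoints,
        'distance': distance,   # kept as text
        'support': support,     # kept as text
        'recombinant': len(donors) > 1,
    }


# Hypothetical example lines, for illustration only.
plain = parse_relationship("2.cloneB;1.cloneA;0.004;100")
recomb = parse_relationship("3.cloneC;1.cloneA|412|2.cloneB;0.012;95")
print(plain)
print(recomb)
```

A non-recombinant line yields a single donor and no breakpoints, so the same parser handles both cases.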

Network Drawer

The network can be built with the online network drawer: http://72.17.173.2:800/minpd/. With larger alignments, a smaller scale factor (0.5 instead of 1) is recommended. The input file is the output file created by MinPD.

The network drawer outputs the network as a FIG graphic file. Winfig, a free Windows viewer, can be used to view FIG files. Other viewers can be found here: http://homepage.usask.ca/~ijm451/fig/

Old Version

Version 1.0 is available for download here: 

Download C source code with makefile

Download command line (DOS) Windows executable