PSFIND finds and lists the polymorphic sites in a collection of aligned DNA sequences. The output file is used as input for other programs such as HAPPLOT and STEPHENS.
The first step is to set up your input sequence file using Notepad. Copy and paste the sequences from Editseq or another file format.
The input file for PSFIND must have the following format:
Line 1            title
Line 2            allele name (OTU name)
Line 3 and on     sequence in upper case letters A, T, C, G, N (missing data), and '-' for deletions; the sequence must end with ;
Later line        next allele name
Later line +1     next sequence, ending in ;
And so forth.
Lines must end with carriage returns before column 80. There can be no blank lines in the file after the last semicolon; if there are, the program will expect another allele sequence. This can be checked in Notepad.
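The format rules above can be checked before running PSFIND. The following Python sketch is not part of PSFIND; the function name parse_psfind_input is hypothetical, and the exact way PSFIND itself reads the file is an assumption. It treats a line of 80 or more characters as an error, which is one reading of "before column 80".

```python
VALID_CHARS = set("ATCGN-")  # nucleotides, N for missing data, '-' for deletions


def parse_psfind_input(text):
    """Return (title, {allele_name: sequence}) or raise ValueError.

    Hypothetical helper: mirrors the documented layout, not PSFIND's own parser.
    """
    lines = text.splitlines()
    if not lines:
        raise ValueError("empty input")
    for number, line in enumerate(lines, start=1):
        if len(line) >= 80:
            raise ValueError(f"line {number} does not end before column 80")

    title = lines[0]
    alleles = {}
    name = None   # allele currently being read, None when a name is expected
    chunks = []   # sequence pieces collected for the current allele
    for number, line in enumerate(lines[1:], start=2):
        if name is None:
            if not line.strip():
                # e.g. a blank line after the last semicolon
                raise ValueError(f"blank line {number} where an allele name was expected")
            name = line.strip()
            chunks = []
            continue
        body = line.strip()
        finished = body.endswith(";")
        if finished:
            body = body[:-1]
        bad = set(body) - VALID_CHARS
        if bad:
            raise ValueError(f"line {number} has invalid characters: {sorted(bad)}")
        chunks.append(body)
        if finished:
            alleles[name] = "".join(chunks)
            name = None
    if name is not None:
        raise ValueError(f"sequence for {name!r} is missing its closing ';'")
    return title, alleles
```

Calling parse_psfind_input on the contents of an input file returns the title and the sequences keyed by allele name; a trailing blank line after the last semicolon raises an error instead of being silently accepted.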
See example sequence data in the file fliC6.txt
To execute, type PSFIND and respond to queries.
Query 1. Input file name is case sensitive and must be fewer than 20 characters long. I use text files (.txt) for input.
Query 2. Output file name is your choice. I use the extension .psi as a convention for these files.
Query 3. Y (Yes) will skip deletions whereas N (No) will include deletions as possible polymorphic sites. I recommend No for HAPPLOT.
The output file includes a list of polymorphic sites, the location of each site (Loc), the consensus sequence (C), and the polymorphic nucleotides of each allele. Matches to the consensus sequence are denoted by '.'. This output file is used as the input file for other programs. See an example output file in flic6.psi
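To illustrate what the output describes, here is a minimal Python sketch of finding polymorphic sites in aligned sequences. It is not PSFIND's code. The function name find_polymorphic_sites is hypothetical, and several choices are assumptions: the consensus is taken as the most common character at a site, N is ignored as missing data, locations are counted from 1, and "skip deletions" is read as skipping any site where some allele has '-'.

```python
from collections import Counter


def find_polymorphic_sites(alleles, skip_deletions=False):
    """Return (names, sites); each site is (loc, consensus, row).

    row has one character per allele, with '.' marking a match to the
    consensus. Hypothetical sketch, not PSFIND's actual algorithm.
    """
    names = list(alleles)
    seqs = [alleles[name] for name in names]
    if len({len(seq) for seq in seqs}) > 1:
        raise ValueError("aligned sequences must all have the same length")

    sites = []
    length = len(seqs[0]) if seqs else 0
    for pos in range(length):
        column = [seq[pos] for seq in seqs]
        if skip_deletions and "-" in column:
            continue  # assumed meaning of answering Y to Query 3
        observed = Counter(c for c in column if c != "N")  # N = missing data
        if len(observed) < 2:
            continue  # every allele with data agrees: not polymorphic
        consensus = observed.most_common(1)[0][0]
        row = "".join("." if c == consensus else c for c in column)
        sites.append((pos + 1, consensus, row))
    return names, sites


example = {"A1": "ATCGAT-N", "A2": "ATCGATGN", "A3": "ATTGATGA"}
names, sites = find_polymorphic_sites(example)
for loc, consensus, row in sites:
    print(loc, consensus, row)
```

With the three made-up sequences above, sites 3 and 7 are reported; passing skip_deletions=True drops site 7 because allele A1 has a deletion there.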
Default parameters for PSFIND are:
MXSQ=50 max. no. of sequences (OTUs)
If the parameter values are exceeded, the program will print an error message.
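The documented limits can also be checked before a run. This short Python sketch is an illustration only; the function name preflight is hypothetical, and it uses just the two limits stated in this document (MXSQ=50 sequences and an input file name of fewer than 20 characters).

```python
MXSQ = 50             # default maximum number of sequences (OTUs)
NAME_LIMIT = 20       # input file name must be shorter than this


def preflight(input_name, n_sequences, mxsq=MXSQ):
    """Return a list of problems found; an empty list means none.

    Hypothetical helper, not part of PSFIND.
    """
    problems = []
    if len(input_name) >= NAME_LIMIT:
        problems.append(f"file name {input_name!r} has {len(input_name)} characters")
    if n_sequences > mxsq:
        problems.append(f"{n_sequences} sequences exceed MXSQ={mxsq}")
    return problems
```

For example, preflight("fliC6.txt", 6) returns an empty list, while a file with 51 sequences is flagged.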