PS Find - Find list of polymorphic sites

    Sample In | Sample Out | Download (28.0 KB Zip File)
    Instructions

    PSFIND finds and lists the polymorphic sites in a collection of aligned DNA sequences. The output file serves as input for other programs, such as HAPPLOT and STEPHENS.

    The first step is to set up your input sequence file in Notepad. Copy and paste the sequences from Editseq or another file format.

    The input file for PSFIND must have the following format:

    Line 1            title
    Line 2            allele name (OTU name)
    Line 3 and on     sequence in upper case letters:
                      A, T, C, G, N (missing data), and '-' for deletions;
                      the sequence must end with ;
    Later line        next allele name
    Later line + 1    next sequence, ending in ;
    And so forth
    

    Lines must end with carriage returns before column 80. There can be no blank lines in the file after the last semicolon; if there are, the program will expect another allele sequence. This can be checked in Notepad.
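
The format rules above can be sketched as a small Python helper that builds the input text from a set of aligned sequences. This is an illustrative sketch, not part of PSFIND: the function name `format_psfind_input` and the wrap width of 70 are assumptions, and the sketch assumes a single line ending after the final semicolon is acceptable.

```python
def format_psfind_input(title, alleles, width=70):
    """Return PSFIND-style input text for a dict of allele name -> aligned sequence.

    Hypothetical helper; the width of 70 is an arbitrary choice below column 80.
    """
    allowed = set("ATCGN-")
    lines = [title]
    for name, seq in alleles.items():
        seq = seq.upper()  # sequences must be in upper case letters
        bad = set(seq) - allowed
        if bad:
            raise ValueError(f"{name}: unexpected characters {sorted(bad)}")
        body = seq + ";"  # each sequence must end with a semicolon
        lines.append(name)
        # Wrap the sequence so that every line ends before column 80.
        lines.extend(body[i:i + width] for i in range(0, len(body), width))
    # One line ending after the last semicolon and no trailing blank lines.
    return "\n".join(lines) + "\n"
```

The returned string can then be written to a .txt file. Whether PSFIND requires Windows-style line endings is not stated in these instructions, so that choice is left to the caller.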

    See example sequence data in the file fliC6.txt

    To execute, type PSFIND and respond to queries.

    Query 1. The input file name is case sensitive and must be less than 20 characters long. I use a text file (.txt) for input.

    Query 2. Output file name is your choice. I use the extension .psi as a convention for these files.

    Query 3. Y (Yes) will skip deletions whereas N (No) will include deletions as possible polymorphic sites. I recommend No for HAPPLOT.

    The output file includes a list of polymorphic sites, the location of each site (Loc), the consensus sequence (C), and the polymorphic nucleotides of each allele. Matches to the consensus sequence are denoted by '.'. This output file is used as the input file for other programs. See an example output file in flic6.psi.
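
To make the idea of a polymorphic site concrete, here is a minimal Python sketch of how such sites could be located in an alignment. It is not PSFIND's actual algorithm, and several behaviors are assumptions: locations are counted from 1, N (missing data) is not treated as a variant, the consensus is the most common symbol in a column (ties go to the symbol seen first), and answering Yes to the deletion query is modeled as skipping any column that contains '-'. The function name `find_polymorphic_sites` is hypothetical.

```python
from collections import Counter

def find_polymorphic_sites(seqs, skip_deletions=False):
    """Return (location, consensus, pattern) for each polymorphic column.

    seqs is a list of aligned, equal-length strings. The pattern shows '.'
    where an allele matches the consensus and the allele's own symbol otherwise.
    """
    if len({len(s) for s in seqs}) != 1:
        raise ValueError("sequences must be aligned to the same length")
    sites = []
    for loc, column in enumerate(zip(*seqs), start=1):
        if skip_deletions and "-" in column:
            continue  # assumption: a column with any deletion is skipped
        # Assumption: N (missing data) does not count as a variant.
        counts = Counter(c for c in column if c != "N")
        if len(counts) < 2:
            continue  # only one symbol observed, so the site is not polymorphic
        consensus = counts.most_common(1)[0][0]
        pattern = "".join("." if c == consensus else c for c in column)
        sites.append((loc, consensus, pattern))
    return sites
```

For the three sequences ACGT, ACGA, and AC-T, this sketch reports sites 3 and 4 when deletions are included and only site 4 when they are skipped.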

    Default parameters for PSFIND are:

    MXSQ=50 max. no. of sequences (OTUs)
    MXB=600 max. no. of bases per sequence

    If the parameter values are exceeded, the program will print an error message.
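
A data set can be checked against these defaults before running the program. The following Python sketch is a hypothetical pre-check, not part of PSFIND: the function name `check_psfind_limits` is invented for illustration, and it assumes that deletions ('-') count toward the number of bases and that "exceeded" means strictly greater than the limit. It also checks the file name length rule from Query 1.

```python
MXSQ = 50   # default max. number of sequences (OTUs)
MXB = 600   # default max. number of bases per sequence

def check_psfind_limits(input_name, alleles, mxsq=MXSQ, mxb=MXB):
    """Return a list of problems; an empty list means no limit is exceeded.

    alleles is a dict of allele name -> aligned sequence (without the ';').
    """
    problems = []
    if len(input_name) >= 20:
        problems.append(f"file name {input_name!r} is not under 20 characters")
    if len(alleles) > mxsq:
        problems.append(f"{len(alleles)} sequences exceeds MXSQ={mxsq}")
    for name, seq in alleles.items():
        # Assumption: gap characters count toward the base limit.
        if len(seq) > mxb:
            problems.append(f"{name}: {len(seq)} bases exceeds MXB={mxb}")
    return problems
```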

    Contact stec@cvm.msu.edu
    Operated by Shannon D. Manning at Michigan State University