How do I use Seahorse Sleuth?

If you have a tissue sample from a seahorse, you can obtain expert and reliable identification of the species in two steps:
  1. Use standard molecular laboratory techniques to obtain a nucleotide sequence from one of the supported regions.
  2. Submit the sequence to this site and select the appropriate reference sequence dataset for comparison. The advanced cluster search option lets you perform a bootstrap analysis, while the maximum likelihood option performs a more rigorous statistical analysis when placing your query sequence on the tree. Both the advanced cluster and maximum likelihood options will send you the results by email.

Try some of our example sequences to see how it works.
Be aware that there are issues of interpretation which you must bear in mind when using this site.

Search Strategy
You will have the greatest success if you use a hierarchical, iterative approach to identifying the source of your sequence. There are several reference sets to choose from, and a reference set may include representatives for more than one region.
  • Using the simple search strategy, start with the first reference set to determine the suborder or family which is most closely related to your sequence.
  • Then choose one of the more specific and more detailed reference sets to fine-tune your analysis.
  • Then, repeat the search using the advanced mode, and use bootstrap resampling to evaluate the robustness of your identification.

Submitting a Sequence
To submit a sequence for analysis:
  • click on the Simple search link
  • paste your sequence into the Data Entry window
  • select the genomic locus in the pulldown menu beneath the window
  • select the reference dataset in the pulldown menu beneath the window
  • click on the Submit button

Your sequence must be either in FASTA format or a plain-text nucleotide sequence. Both UPPER and lowercase letters are accepted. For example:

>mysample
ACCATAATAGTACAGCTGAAGGAATCTGTAGAAATTAAACCATAATAGT
ACAGCTGAAGGAATCGTAGAAATTAAACCATAATAGTACAGCTGAAGG
AATCTGTAGAAATTAAACCATAATAGTACAGCTGAAGGAATCTGTAGAA
ATTAA

or

ACCATAATAGTACAGCTGAAGGAATCTGTAGAAATTAAACCATAATAGTA
CAGCTGAAGGAATCGTAGAAATTAAACCATAATAGTACAGCTGAAGGAAT
CTGTAGAAATTAAACCATAATAGTACAGCTGAAGGAATCTGTAGAAATTAA
  • Only one sequence may be submitted at a time.

If your sequence contains illegal characters, that is, characters not included in the IUPAC nucleotide codes, it will be rejected with an error message. If your sequence contains any of the IUPAC ambiguity codes, they will be used both in aligning the sequence and in calculating evolutionary distances.
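
The input rules above can be checked locally before submitting. The following is a minimal sketch, not the code used by this site: the function name `read_query`, the default name `query`, and the error messages are assumptions made for illustration. It accepts FASTA or plain text, ignores case, rejects more than one sequence, and rejects any character outside the IUPAC nucleotide codes.

```python
# Hypothetical pre-submission check; names and messages are illustrative only.
IUPAC_CODES = set("GATCURYMKSWHBVDN")


def read_query(text):
    """Return (name, sequence) from FASTA or plain-text nucleotide input."""
    lines = [ln.strip() for ln in text.strip().splitlines() if ln.strip()]
    if not lines:
        raise ValueError("no sequence supplied")

    name = "query"  # assumed default when no FASTA header is present
    if lines[0].startswith(">"):
        name = lines[0][1:].strip() or "query"
        lines = lines[1:]

    # A second header line means more than one sequence was pasted.
    if any(ln.startswith(">") for ln in lines):
        raise ValueError("only one sequence may be submitted at a time")

    # Join the wrapped lines, drop internal whitespace, and normalise case.
    sequence = "".join("".join(lines).split()).upper()
    if not sequence:
        raise ValueError("no sequence supplied")

    illegal = sorted(set(sequence) - IUPAC_CODES)
    if illegal:
        raise ValueError("illegal characters: " + ", ".join(illegal))
    return name, sequence
```

For example, `read_query(">mysample\nACCATAATAG\nacgtn")` returns the name `mysample` and the upper-cased, joined sequence, while an input containing a character such as `Z` raises a `ValueError`.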

Your sequence will be analysed automatically. Please wait about 15 seconds and then click the Retrieve Results button to view your results. It will take longer for results to become available if full alignment and/or bootstrap resampling are requested.


IUPAC Nucleotide Codes
Ambiguous  Symbol  Meaning           Origin of designation
           G       G                 Guanine
           A       A                 Adenine
           T       T                 Thymine
           C       C                 Cytosine
           U       U                 Uracil
X          R       G or A            puRine
X          Y       T or C            pYrimidine
X          M       A or C            aMino
X          K       G or T            Keto
X          S       G or C            Strong interaction (3 H bonds)
X          W       A or T            Weak interaction (2 H bonds)
X          H       A or C or T       not-G, H follows G in the alphabet
X          B       G or T or C       not-A, B follows A
X          V       G or C or A       not-T (not-U), V follows U
X          D       G or A or T       not-C, D follows C
X          N       G or A or T or C  aNy
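
The table can also be written as a lookup from each symbol to the bases it stands for. The sketch below is illustrative only: the names `IUPAC_BASES` and `compatible` are assumptions, U is treated as T for comparison purposes, and the rule "two symbols are compatible if they share at least one base" is one common convention, not necessarily how this site treats ambiguity codes when aligning or calculating distances.

```python
# Each IUPAC symbol mapped to the set of bases it can represent.
# Assumption: U (RNA) is treated as equivalent to T for comparison.
IUPAC_BASES = {
    "G": "G", "A": "A", "T": "T", "C": "C", "U": "T",
    "R": "GA", "Y": "TC", "M": "AC", "K": "GT", "S": "GC", "W": "AT",
    "H": "ACT", "B": "GTC", "V": "GCA", "D": "GAT", "N": "GATC",
}


def compatible(a, b):
    """True if symbols a and b could represent the same base."""
    return bool(set(IUPAC_BASES[a.upper()]) & set(IUPAC_BASES[b.upper()]))
```

Under this convention `compatible("R", "A")` is true because R covers A, while `compatible("R", "Y")` is false because purines and pyrimidines share no base.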

Advanced search and bootstrapping
The Advanced search window adds the following functions to the search process:

Bootstrapping
To perform a bootstrap analysis:

  • click on the Advanced search link
  • paste your sequence into the Data Entry window
  • select the genomic locus in the pulldown menu beneath the window
  • select the reference dataset in the pulldown menu beneath the window
  • select the number of bootstrap replicates you require
  • optionally enter an email address to which the results will be sent
  • click on the Submit button
A bootstrap analysis will take longer than a simple search. The length of time depends on the number of pseudoreplicates you have chosen and on the load on our server. The page will not reload automatically; it is up to you to click the 'reload' button, so please do not reload excessively. Alternatively, you may choose to have the results sent to you by email.
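
To show why the run time grows with the number of pseudoreplicates, here is a minimal sketch of how bootstrap pseudoreplicates are conventionally generated: alignment columns are sampled with replacement, and the same columns are used for every sequence. This is a generic illustration, not this site's code; the function names and the dictionary layout of the alignment are assumptions. Each pseudoreplicate then needs its own tree to be built, which is where the time goes.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Resample the columns of an alignment with replacement.

    alignment: dict mapping sequence name -> aligned sequence (equal lengths).
    """
    length = len(next(iter(alignment.values())))
    # One shared list of column indices keeps the sequences aligned.
    columns = [rng.randrange(length) for _ in range(length)]
    return {
        name: "".join(seq[i] for i in columns)
        for name, seq in alignment.items()
    }


def pseudoreplicates(alignment, n, seed=0):
    """Return n bootstrap pseudoreplicates of the alignment."""
    rng = random.Random(seed)
    return [bootstrap_alignment(alignment, rng) for _ in range(n)]
```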

Emailed response
You can choose to have the results sent to you by email. If you enter an optional email address, you can close your browser once the search has been submitted.

Maximum Likelihood Analysis

The reference alignment, and the associated phylogenetic tree, are considered to be prior knowledge about the relationships among the reference organisms. Potentially the query sequence can be joined to that tree on any branch. We seek the connection point that has the highest statistical likelihood, thereby giving the maximum likelihood estimate of the relationship between the query and reference sequences. The maximum likelihood connection point is represented in the output by a dashed branch. For a particular connection point the determined likelihood score is the maximum likelihood estimate under the associated topology (that is, all the branch lengths are reoptimised for each connection point).

The Shimodaira-Hasegawa (SH) test is used to assess a confidence limit on the connection point with the highest expected likelihood. The expected likelihood of a connection point is the expectation of its likelihood (as a random variable) under the true process of evolution. The SH test calculates such a confidence limit by simulating replicate datasets under an approximation of the least favourable configuration (LFC), in which all connection points have equal expected likelihoods, and comparing the observed differences in likelihood with the distribution of differences expected under the LFC.

The implementation of the SH test used here simulates 1000 non-parametric bootstrap replicates, and uses the RELL (Shimodaira and Hasegawa 1999) approximation. Branches that represent connection points within the confidence limit are coloured red. A critical value of α = 0.05 is used (95% confidence limit).
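
The procedure can be sketched in a few lines. This is a simplified illustration of the SH test with RELL resampling, not the implementation running on this site. It assumes the per-site log-likelihoods for each candidate connection point have already been computed; the function name `sh_test` and the input layout are hypothetical. RELL resamples those per-site values instead of reoptimising each replicate, and centring each connection point's bootstrap totals approximates the LFC.

```python
import random


def sh_test(site_lnl, n_boot=1000, seed=0):
    """Return one SH p-value per connection point.

    site_lnl[i][s] is the log-likelihood of site s under connection point i.
    """
    rng = random.Random(seed)
    k = len(site_lnl)
    n_sites = len(site_lnl[0])

    # Observed statistic: how far each point falls below the best one.
    totals = [sum(row) for row in site_lnl]
    best = max(totals)
    observed = [best - t for t in totals]

    # RELL: resample sites with replacement and re-sum the stored values.
    boot = []
    for _ in range(n_boot):
        sites = [rng.randrange(n_sites) for _ in range(n_sites)]
        boot.append([sum(row[s] for s in sites) for row in site_lnl])

    # Centre each point's bootstrap totals so all have equal expectation.
    means = [sum(b[i] for b in boot) / n_boot for i in range(k)]
    exceed = [0] * k
    for b in boot:
        centred = [b[i] - means[i] for i in range(k)]
        top = max(centred)
        for i in range(k):
            if top - centred[i] >= observed[i]:
                exceed[i] += 1
    return [count / n_boot for count in exceed]
```

With a critical value of 0.05, a connection point whose p-value is at least 0.05 lies within the 95% confidence limit. The connection point with the highest likelihood always has a p-value of 1 in this sketch, because its observed difference is zero.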

The Results

The results will be displayed first as a phylogenetic tree in which the differences between sequences are proportional to the lengths of the horizontal branches separating the tips. The names of the reference species are colour-coded to help you identify close relatives. To save a copy of the tree as a PNG-format file, right-click (PC) or control-click (Mac) on the image and choose Download Image to Disk, or similar, from the pop-up menu.

If you have performed a bootstrap analysis, the resulting phylogenetic tree will display numbers at some of the nodes. These numbers are the percentage of bootstrap pseudoreplicates that contain the clade formed by the subtree starting at that node. This measure of "bootstrap support" is displayed only when at least 50% of the pseudoreplicates contain the clade. The phylogenetic tree displayed is the estimated tree, and not the consensus of the bootstrap pseudoreplicate trees.
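
The calculation behind these numbers can be sketched as follows. The example is a generic illustration under stated assumptions, not this site's code: each tree is represented simply as a collection of clades, each clade being a `frozenset` of tip names, and the function name `bootstrap_support` is hypothetical. Support is computed only for clades of the estimated tree, and values below the 50% threshold are not reported.

```python
def bootstrap_support(estimated_clades, replicate_trees, threshold=50.0):
    """Percentage of replicate trees containing each estimated clade.

    estimated_clades: iterable of frozensets of tip names.
    replicate_trees: list of sets of frozensets, one set per pseudoreplicate.
    Only clades with support >= threshold are returned.
    """
    support = {}
    for clade in estimated_clades:
        hits = sum(1 for tree in replicate_trees if clade in tree)
        percent = 100.0 * hits / len(replicate_trees)
        if percent >= threshold:
            support[clade] = percent
    return support
```

For instance, a clade found in 3 of 4 pseudoreplicate trees is reported with 75% support, while a clade found in only 1 of 4 is omitted.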

If you scroll down past the tree, you will also find a table showing the evolutionary distances between the user-submitted sequence and each of the sequences in the reference set. Sites having IUPAC ambiguity codes are included in the calculation of evolutionary distances. To save the contents of the table to disk, select all of the table, copy it, open a text document on your computer (e.g. in Notepad or SimpleText) and then paste it in.

If you scroll down further again, there is a text version of the phylogenetic tree in Newick format. To save this to disk, select the contents of the text box in which it is displayed, copy it, open a text document on your computer (e.g. in Notepad or SimpleText) and then paste it in.
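
Once saved, the Newick text can be read by most phylogenetics software. As a small illustration, the sketch below lists the tip names in a Newick string using a regular expression. It assumes unquoted tip labels without spaces, and the function name `newick_tips` and the example tree are hypothetical; a dedicated Newick parser is the better choice for anything beyond a quick check.

```python
import re


def newick_tips(newick):
    """Return the tip labels of a Newick tree, in order of appearance.

    A tip label follows "(" or ","; internal node labels follow ")"
    and are therefore skipped. Assumes unquoted labels.
    """
    return re.findall(r"[(,]\s*([^\s(),:;]+)", newick)
```

Given the tree `((query:0.01,ref_A:0.02)85:0.03,(ref_B:0.04,ref_C:0.05):0.06);`, the function returns the four tip names and skips both the branch lengths and the bootstrap value 85.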

You can fine-tune your analysis by clicking on the Submit a sequence link to return to the Data Entry page, where you can choose a different reference set.

Issues of Interpretation

Is it a seahorse?
Seahorse Sleuth is an online service for the identification of seahorses by phylogenetic analysis. Its scope is limited to the seahorses, and any submitted sequence will be treated as if it were from a seahorse. A simple system has been implemented to flag sequences which might give unreliable results. Nevertheless, it remains the responsibility of the user to decide whether a phylogenetic analysis is appropriate in their individual case. The user should also seek other evidence to corroborate that any DNA sequence which they submit is actually of seahorse origin, perhaps by searching GenBank.

Phylogenetic Robustness
The loci and method of phylogenetic analysis (evolutionary distances + neighbour-joining tree) used here are geared specifically to addressing questions of species or population identity. They may not be as well suited to the robust reconstruction of higher-level relationships among more distantly related seahorse species. As such, many of the higher-level relationships suggested by Seahorse Sleuth are unstable and should not be considered an accurate reflection of the evolutionary relationships among these taxa.