| WWW-QUERY & CROSS-TAXA HELP |
ID Locus entry (EMBL, SWISS-PROT, NRSub)
LOCUS Locus entry (GenBank, Hovergen, EMGLib)
CDS .PE protein coding region (all)
RRNA .RR mature ribosomal RNA (all)
TRNA .TR mature transfer RNA (all)
MISC_RNA .RN other structural RNA coding region (EMBL, GenBank, Hovergen, NRSub, EMGLib)
SNRNA .SN small nuclear RNA (EMBL, GenBank, Hovergen, EMGLib)
SCRNA .SC small cytoplasmic RNA (EMBL, GenBank, Hovergen, NRSub, EMGLib)
3'INT .3I 3' intron (Hovergen)
3'NCR .3F 3' non-coding region (Hovergen)
5'INT .5I 5' intron (Hovergen)
5'NCR .5F 5' non-coding region (Hovergen)
CPG .CG region > 200 bp with CpGobs/CpGexp > 0.5 (Hovergen)
INT_INT .IN internal intron (Hovergen)
Each entry of a FEATURE TABLE describing a coding region of a DNA
fragment gives rise to a subsequence equal to the fragments described in the
location of the feature. The type of the resulting subsequence equals the key of
the corresponding feature table entry. The name of the resulting subsequence is
built by adding to the parent sequence's name an extension uniquely identifying
this particular feature.
Sequences of a given type are generally subsequences, i.e., fragments of parent sequences, unless the coding region covers the parent sequence entirely, in which case ACNUC does not create a subsequence.
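The naming rule described above can be sketched in Python as follows. The extensions are copied from the type table above; the numeric rank used to make each extension unique, the function name subsequence_name and the parent name HSGLOB are assumptions made for illustration, not the documented ACNUC scheme.

```python
# Type extensions copied from the table above (a subset).
TYPE_EXTENSION = {
    "CDS": ".PE",
    "RRNA": ".RR",
    "TRNA": ".TR",
    "MISC_RNA": ".RN",
    "SNRNA": ".SN",
    "SCRNA": ".SC",
}


def subsequence_name(parent_name, feature_key, rank):
    """Build a subsequence name: parent name + type extension + rank.

    The rank suffix is a hypothetical way of making the extension unique
    among the features of the same type in one parent sequence.
    """
    return f"{parent_name}{TYPE_EXTENSION[feature_key.upper()]}{rank}"


# HSGLOB is a made-up parent sequence name.
print(subsequence_name("HSGLOB", "CDS", 1))
```

The type of the resulting subsequence is simply the feature key (here CDS); only the name needs to be built.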
CHLOROPLAST Chloroplast genome (EMBL, GenBank, NBRF, Hovergen)
MITOCHONDRION Mitochondrial genome (EMBL, GenBank, NBRF, Hovergen)
KINETOPLAST Kinetoplast genome (EMBL, GenBank, Hovergen)
NUCLEAR Nuclear genome (all)
DNA Sequenced molecule is DNA (all)
RNA Sequenced molecule is RNA (all)
MRNA Sequenced molecule is mRNA (GenBank, Hovergen)
RRNA Sequenced molecule is rRNA (GenBank, Hovergen)
TRNA Sequenced molecule is tRNA (GenBank, Hovergen)
URNA Sequenced molecule is snRNA (GenBank, Hovergen)
Document                   Format                        Example
Journal article            journal_code/volume/1st_page  jme/34/17
Book                       book/year/1st_author          book/1980/broker
Thesis                     thesis/year/1st_author        thesis/1984/wildgruber
Patent                     patent/patent_coded_number    patent/ep0238993
Unpublished, or submitted  unpubl/year/1st_author        unpubl/1993/cho
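As an illustration, the reference formats listed above could be assembled with a few small Python helpers. The function names are hypothetical, and converting the result to lower case is an assumption based on the examples shown, not a documented rule.

```python
def journal_ref(journal_code, volume, first_page):
    """Journal article: journal_code/volume/1st_page."""
    return f"{journal_code}/{volume}/{first_page}".lower()


def dated_ref(kind, year, first_author):
    """Book, thesis or unpublished work: kind/year/1st_author."""
    if kind not in ("book", "thesis", "unpubl"):
        raise ValueError(f"unexpected document kind: {kind}")
    return f"{kind}/{year}/{first_author}".lower()


def patent_ref(coded_number):
    """Patent: patent/patent_coded_number."""
    return f"patent/{coded_number}".lower()


print(journal_ref("JME", 34, 17))
print(dated_ref("thesis", 1984, "Wildgruber"))
print(patent_ref("EP0238993"))
```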
PRELIMINARY Preliminary annotated sequence
STANDARD Fully annotated sequence
UNANNOTATED Only DE, AC and R[NPXATL]
UNREVIEWED Sequence with unreviewed annotation
The left page ("Taxon selection") is used to build a Cross-Taxa query, which retrieves all gene families that are shared by a given set of taxa (the upper list) and that are not associated with another set of taxa (the lower list).
The right page ("Taxonomy helper") can be used to check the taxonomy of the species of interest.
Cross-Taxa gives access to a family retrieval system based on taxonomic criteria.
Its web interface is composed of two text fields.
It retrieves all gene families that are shared, strictly or not, by a first set of taxa defined in
the first field and that are not associated with a second set of taxa defined in the second field.
Taxa of any level can be mixed to compose the query (e.g., Homo sapiens, Primate, Mammalia).
For example, it is possible to retrieve the families of bacterial genes specific to a toxic strain of Escherichia coli,
to retrieve the gene families found in mammals but not in birds, or
to retrieve the gene families found in mammals only.
The first set of taxa can be used for an inclusive or
exclusive selection of families.
Warning! Cross-Taxa queries can take a long time. For simple queries on
families (for example, to retrieve all the families containing a sequence from Mammalia), we recommend using WWW-Query.
It is also possible to pre-select the families by their number of sequences or species,
as shown in this example.
Two types of search are available:
Inclusive Search:
Any family containing at least one species from each taxon of the list will be selected
Usage:
Exclusive Search:
Any family containing only species from all the taxa of the list (i.e. none from other taxa) will be selected
Usage:
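The two search modes, together with the second (excluded) set of taxa, can be sketched in Python as shown below. This is a toy model, not the server's implementation: the lineage table, the family contents and all function names are hypothetical, and the exclusive search is read here as "at least one species from each listed taxon, and no species outside the listed taxa".

```python
# Hypothetical toy data: species -> taxa found in its lineage.
LINEAGE = {
    "Homo sapiens": {"Homo sapiens", "Primates", "Mammalia"},
    "Mus musculus": {"Mus musculus", "Rodentia", "Mammalia"},
    "Gallus gallus": {"Gallus gallus", "Aves"},
}


def in_taxon(species, taxon):
    return taxon in LINEAGE[species]


def inclusive(family_species, taxa):
    """At least one species of the family belongs to each listed taxon."""
    return all(any(in_taxon(s, t) for s in family_species) for t in taxa)


def exclusive(family_species, taxa):
    """Inclusive match, and every species belongs to some listed taxon."""
    only_listed = all(any(in_taxon(s, t) for t in taxa) for s in family_species)
    return inclusive(family_species, taxa) and only_listed


def cross_taxa(families, upper, lower=(), strict=False):
    """Families matching the upper list and holding no species of the lower list."""
    match = exclusive if strict else inclusive
    return sorted(
        name
        for name, species in families.items()
        if match(species, upper)
        and not any(in_taxon(s, t) for s in species for t in lower)
    )


FAMILIES = {
    "FAM1": {"Homo sapiens", "Mus musculus"},
    "FAM2": {"Homo sapiens", "Gallus gallus"},
    "FAM3": {"Gallus gallus"},
}

print(cross_taxa(FAMILIES, ["Mammalia"]))                  # inclusive search
print(cross_taxa(FAMILIES, ["Mammalia"], lower=["Aves"]))  # mammals but not birds
print(cross_taxa(FAMILIES, ["Mammalia"], strict=True))     # mammals only
```

In this toy data set, the last two queries select the same family; on real data they differ as soon as a family contains species from a taxon that appears in neither list.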
Nota Bene:
The numbers of sequences and taxa displayed with the list of families are correct for protein sequences only.
If you are using a nucleic database, the real numbers of sequences and taxa in the family
(as given on the page associated with the family) can be different.
Moreover, slight differences can occasionally appear between the precalculated numbers of taxa and sequences given with the list and the real ones given on the family page, even for protein databases.
The lists are stored in a sub-directory of /ftp/ftpdir/pub/ADE-User/data/ created via a cookie for the user (your data are currently stored in the directory /ftp/ftpdir/pub/ADE-User/data/339228469; you can check your previous operations here).
It is up to the user to give a name to a list. If no name is given, the system uses the default name list. Be aware that some lists are
created automatically by the system. These lists are always called list
and overwrite any list previously defined with this name. The sequence list of a family
"FAMILY_NAME" is automatically called "FAMILY_NAME_lst" (or "P_FAMILY_NAME_lst" after a species selection).
Note that files older than 1 week in the user's directory are automatically deleted.
You should go to the WWW-Query page (here).
This is an "expert-user" page allowing complex queries.
You can
Several databases are available on the server:
First of all, your sequence may simply not be present in the databases you are querying (for example, if you are looking for a protein sequence in EMBL, for an animal sequence in Hobacprot/Hobacnucl, or for a CDS in Hobacprot). See this question for more information about the different databases.
The name and the accession number of the sequence may also have been confused when using WWW-Query.
WWW-Query allows you to search for a sequence by its name or by its accession number;
if, for example, an accession number is given instead of the name, the sequence will not be retrieved.
Alternatively, Quick Search allows you to retrieve all the sequences associated with a word,
which can be a name, an accession number, a keyword, a species, etc.
The results are thus more exhaustive than with WWW-Query.
Finally, in several databases, such as Hoverprot and Hobacprot, the sequence names can be slightly different from the SWISS-PROT ones,
due to the duplication
of the sequences. To avoid this problem, use the accession number instead of the sequence name to retrieve your sequence.
back to FAQ
In construction...
back to FAQ
Under WWW-Query and Cross-Taxa, the result of each query is saved in a file stored locally on our server.
In this way, it is not immediately lost, and the user can re-use it to build other queries or to perform further processing.
Because each user has a separate storage area, there is no conflict when several users generate lists with the same name at the same time.
The lists (of sequences or families) are stored in a sub-directory of ftp://pbil.univ-lyon1.fr/pub/ADE-User/data created via a cookie for the user (for example, your data are currently stored in the directory ftp://pbil.univ-lyon1.fr/pub/ADE-User/data/339228469, and you can check your previous operations here).
It is up to the user to give a name to the list to be generated.
If no name is given, the system uses the default name list.
Be aware that some lists are created automatically by the system.
These lists are always called list and overwrite any list previously defined
with this name.
The sequence list of a family named "FAMILY_NAME" is automatically called "FAMILY_NAME_lst"
(or "P_FAMILY_NAME_lst" after a species selection).
Other data, such as alignment files or phylogenetic tree files, are stored in the user directory as well.
Partial alignments are stored in a sub-directory of the user directory called ALN.
Note that files older than 1 week in the user's directory are automatically deleted.
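The naming and clean-up rules above can be summarized in a short Python sketch. The function names and the family name HBG000123 are hypothetical, and the expiry test only illustrates the one-week rule; it is not the server's clean-up code.

```python
from datetime import datetime, timedelta

DEFAULT_LIST_NAME = "list"     # used when the user gives no name
RETENTION = timedelta(weeks=1)  # files older than this are removed


def family_list_name(family_name, species_selected=False):
    """Name given automatically to the sequence list of a family."""
    prefix = "P_" if species_selected else ""
    return f"{prefix}{family_name}_lst"


def is_expired(modified, now):
    """True if a file last modified at `modified` is older than 1 week."""
    return now - modified > RETENTION


# HBG000123 is a made-up family name.
print(family_list_name("HBG000123"))
print(family_list_name("HBG000123", species_selected=True))
print(is_expired(datetime(2020, 1, 1), datetime(2020, 1, 9)))
```

Since automatically created lists always take the default name, giving each list of interest an explicit name keeps it from being overwritten.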
back to FAQ
You can download all your data from the URL ftp://pbil.univ-lyon1.fr/pub/ADE-User/data/339228469
It is recommended that you use a dedicated FTP client to retrieve them instead of a Web browser like Netscape or Internet Explorer.
You can also retrieve sequence data with the Retrieve button.
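For users who prefer a script to an FTP client, a minimal sketch using Python's standard ftplib module is given below. It assumes that the server accepts anonymous logins and that the directory listing returns plain file names; the function names and the destination directory are hypothetical.

```python
from ftplib import FTP, error_perm
from pathlib import Path
from urllib.parse import urlsplit


def split_ftp_url(url):
    """Split an ftp:// URL into (host, remote directory)."""
    parts = urlsplit(url)
    return parts.hostname, parts.path


def download_user_files(url, dest_dir):
    """Download every plain file of the user directory into dest_dir."""
    host, remote_dir = split_ftp_url(url)
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with FTP(host) as ftp:
        ftp.login()  # anonymous login (assumed to be allowed)
        ftp.cwd(remote_dir)
        for name in ftp.nlst():
            target = dest / name
            try:
                with open(target, "wb") as out:
                    ftp.retrbinary(f"RETR {name}", out.write)
            except error_perm:
                # Not a plain file (for example a sub-directory): skip it.
                target.unlink(missing_ok=True)


# Example call (not executed here):
# download_user_files("ftp://pbil.univ-lyon1.fr/pub/ADE-User/data/339228469", "my_data")
```

This sketch skips sub-directories such as ALN; they would have to be downloaded with a separate call on their own URL.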
back to FAQ
WWW-Query allows you to retrieve sequences or families; Cross-Taxa is used to retrieve families only.
WWW-Query retrieves all the sequences which fulfil several criteria of different kinds, then
generates the list of these sequences, or the list of families associated with these sequences.
Cross-Taxa retrieves families on a taxonomic basis, allowing a more precise taxonomic selection than WWW-Query.
It is possible to combine results from Cross-Taxa and WWW-Query
(for example, to cross a family list generated with Cross-Taxa with a family list generated with WWW-Query).
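Assuming that "crossing" two family lists means keeping the families present in both, the combination can be sketched in Python as a set intersection. The function name and the family names are hypothetical.

```python
def cross_lists(first, second):
    """Return the family names present in both lists, sorted."""
    return sorted(set(first) & set(second))


cross_taxa_families = ["FAM1", "FAM2", "FAM5"]  # e.g. from Cross-Taxa
www_query_families = ["FAM2", "FAM3", "FAM5"]   # e.g. from WWW-Query
print(cross_lists(cross_taxa_families, www_query_families))
```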
back to FAQ