Gene Annotation at TGD


  • Gene Names must conform to published guidelines: three capital letters followed by one or two digits (e.g. PDD1). This makes it possible for programs and search engines to easily identify Gene Names on this site and in publications.
  • A Gene Name is often an abbreviation of a phrase describing the gene's function. For example: PDD = Programmed DNA Degradation. The number following the abbreviation allows one abbreviation to be used for more than one gene. This is useful when describing genes that belong to the same family (the myosin genes MYO1, MYO2, ... MYO13) or that were identified during the same functional study (PDD1, PDD2, PDD3). Please do not assign different meanings to the same abbreviation; choose a different abbreviation for your new gene instead.
  • Name Descriptions describe the abbreviation used in the Gene Name. These descriptions are limited to 150 characters.
  • Before choosing an abbreviation for your gene, you may wish to check Saccharomyces Genome Database to see if your favored abbreviation means something drastically different in yeast. Some yeast abbreviations are famous. Please don't make the mistake of using the letters "PEX" to stand for something like "Perinuclear Exonuclease".
  • The Name Description field can contain letters, numbers, periods, underscores, dashes, and forward slashes (A-Z,0-9,.,_,-,/). Other punctuation marks can lead to security problems and are not allowed.
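The Gene Name and Name Description rules above can be checked mechanically. The following is a minimal sketch, not TGD's actual validation code: the function names are hypothetical, and accepting spaces in a Name Description is an assumption (the list of allowed characters does not mention them, but descriptions such as "Programmed DNA Degradation" contain them).

```python
import re

# Three capital letters followed by one or two digits, e.g. PDD1 or MYO13.
GENE_NAME_RE = re.compile(r"[A-Z]{3}[0-9]{1,2}")

# Letters, digits, periods, underscores, dashes, and forward slashes.
# Assumption: spaces are also accepted, because a description is a phrase.
NAME_DESCRIPTION_RE = re.compile(r"[A-Za-z0-9._/\- ]*")
MAX_NAME_DESCRIPTION_LENGTH = 150


def is_valid_gene_name(name: str) -> bool:
    """Return True if the whole string matches the Gene Name pattern."""
    return GENE_NAME_RE.fullmatch(name) is not None


def is_valid_name_description(text: str) -> bool:
    """Return True if the text fits the length limit and character set."""
    if len(text) > MAX_NAME_DESCRIPTION_LENGTH:
        return False
    return NAME_DESCRIPTION_RE.fullmatch(text) is not None
```

Under this sketch, PDD1 and MYO13 pass, while pdd1 (lowercase) and PDD123 (three digits) are rejected.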


  • Headlines are limited to 240 characters, the standard limit imposed on gene descriptions at GenBank.
  • Headlines should describe the most important features of a gene. The different phrases of a Headline are typically separated by semicolons.
  • Punctuation marks that can lead to security problems are not allowed in the Headline.
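As an illustration of the Headline conventions above, the sketch below joins phrases with semicolons and enforces the 240-character limit. The function name `build_headline` is hypothetical. The check for disallowed punctuation is left out on purpose, because the page does not list which marks are rejected.

```python
from typing import Iterable

MAX_HEADLINE_LENGTH = 240


def build_headline(phrases: Iterable[str]) -> str:
    """Join non-empty phrases with '; ' and enforce the length limit."""
    cleaned = [phrase.strip() for phrase in phrases if phrase.strip()]
    headline = "; ".join(cleaned)
    if len(headline) > MAX_HEADLINE_LENGTH:
        raise ValueError(
            f"Headline is {len(headline)} characters; "
            f"the limit is {MAX_HEADLINE_LENGTH}"
        )
    return headline
```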


  • An Alias is an alternative published name for a gene. For example, both TTHERM_00630470 and TTHERM_00526730 have been published under the name TAP1. To avoid confusion, TTHERM_00630470 was renamed ARP1; TAP1 was added as an Alias to the gene.
  • Users must add a new Alias to the database before linking it to a gene.
  • Alias names can contain letters, numbers, periods, underscores, dashes, and forward slashes (a-z,0-9,.,_,-,/). Other punctuation marks can lead to security problems and are not allowed.
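The two Alias rules above (add the Alias first, then link it to a gene) can be pictured with a small in-memory model. This is only a sketch under stated assumptions: the `AliasRegistry` class and its methods are hypothetical and do not reflect how the TGD database is implemented, and uppercase letters are accepted because published aliases such as TAP1 use them.

```python
import re

# Letters, digits, periods, underscores, dashes, and forward slashes.
# Assumption: both uppercase and lowercase letters are accepted.
ALIAS_RE = re.compile(r"[A-Za-z0-9._/\-]+")


class AliasRegistry:
    """Toy model: an Alias must exist before it can be linked to a gene."""

    def __init__(self):
        self._aliases = set()
        self._links = {}  # gene ID -> set of linked aliases

    def add_alias(self, alias):
        if ALIAS_RE.fullmatch(alias) is None:
            raise ValueError(f"Alias contains disallowed characters: {alias!r}")
        self._aliases.add(alias)

    def link(self, alias, gene_id):
        if alias not in self._aliases:
            raise KeyError(f"Add the Alias {alias!r} before linking it")
        self._links.setdefault(gene_id, set()).add(alias)

    def aliases_for(self, gene_id):
        return sorted(self._links.get(gene_id, set()))
```

With this model, linking TAP1 to TTHERM_00630470 fails until `add_alias("TAP1")` has been called.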


  • The Gene Ontology (GO) is a list of terms used to annotate genes in all organisms. Select terms from three lists (the Biological Process, Molecular Function, and Cellular Component ontologies) to describe each gene. To search for terms, visit the GO search website: AmiGO! Your friend in the Gene Ontology.
  • Only the numerical portion of the GO ID should be entered. For example, "GO:0006388" should be entered as "6388".
  • Note that many terms are labeled either "OBSOLETE" or "DEPRECATED". Please avoid annotating to these terms. Better alternatives certainly exist.
  • To comply with the standards of the GO user community, evidence must be provided for each GO annotation you make. This includes an Evidence Code showing the technique used to garner the information, plus a reference (PubMed ID) for the paper describing the experiment.
  • GO annotations for genes cannot be supported by unpublished results at this time. All GO annotations must be accompanied by a PubMed ID.
  • Our database is relatively simple and does not support the use of Evidence Code Qualifiers (contributes_to, NOT, etc.) or From/With annotations. We will expand the database in the future. For now, please annotate to the proper evidence code, even though these options do not exist.
  • Please do not use the IC, ND, EXP, or IEA Evidence Codes. (IEA annotations are provided exclusively by TIGR/JCVI.)
  • An excellent introduction and guide to the process of GO annotation is available at the Gene Ontology website. The most useful point at which to start is their discussion of the GO Evidence Codes. This page features an Evidence Code Decision Tree.
  • If you have any questions about annotations you would like to make, feel free to email us. GO can get rather confusing, and we're here to help!
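The GO entry rules above can be sketched as a short pre-submission check. This is a hedged illustration, not the site's own code: the function names are hypothetical, and the check only rejects the four Evidence Codes named above rather than validating against the full GO list.

```python
import re

GO_ID_RE = re.compile(r"(?:GO:)?([0-9]{1,7})")
PUBMED_ID_RE = re.compile(r"[0-9]+")

# Evidence Codes that annotators are asked not to use.
DISALLOWED_EVIDENCE_CODES = {"IC", "ND", "EXP", "IEA"}


def go_id_to_entry(go_id: str) -> str:
    """Reduce a GO ID such as 'GO:0006388' to its numerical portion."""
    match = GO_ID_RE.fullmatch(go_id.strip())
    if match is None:
        raise ValueError(f"Not a GO ID: {go_id!r}")
    # int() drops the leading zeros, so '0006388' becomes 6388.
    return str(int(match.group(1)))


def check_annotation(go_id: str, evidence_code: str, pubmed_id: str):
    """Return (GO entry, Evidence Code, PubMed ID) or raise ValueError."""
    code = evidence_code.strip().upper()
    if code in DISALLOWED_EVIDENCE_CODES:
        raise ValueError(f"Evidence Code {code} should not be used")
    pmid = pubmed_id.strip()
    if PUBMED_ID_RE.fullmatch(pmid) is None:
        raise ValueError("Every GO annotation needs a PubMed ID")
    return go_id_to_entry(go_id), code, pmid
```

For example, `go_id_to_entry("GO:0006388")` returns `"6388"`, matching the entry format described above.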


  • The General Information section of the page displays free-text essays (called Paragraphs on our site) about one or more genes. Important details about the history, family, or structure of a gene may not be captured in other parts of the gene annotation, such as Description and Gene Ontology Annotations. This section of the page allows you to share and highlight these findings.
  • Each Paragraph has a limit of 5000 characters.
  • Some characters and punctuation, including those used to form hyperlinks and stylize text, have been disabled for security purposes.
  • Paragraphs can be written for many different reasons. Some common topics discussed in Paragraphs are:
    • Errors in a gene model
    • Gene family descriptions
    • Other genes identified by the same screening method
    • History of research on a gene
  • Each Paragraph can be linked to multiple genes. Changes to the Paragraph will be reflected on all Gene Pages.
  • A link to the Paragraphs written at TGD, and the Paragraph Number associated with each one, can be found in the Internal Links section of the left sidebar. This page also lists the genes linked to each paragraph.
  • Paragraph Order allows you to display Paragraphs on the Gene Page in a specific order (first on the page, second on the page, etc.). By default, enter "2". In the event of a tie, don't panic - both paragraphs will still be displayed.
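To make the Paragraph Order behavior concrete, the sketch below sorts Paragraphs by their order value and keeps tied Paragraphs together. The `Paragraph` class and `display_order` function are hypothetical. How TGD actually orders tied Paragraphs is not stated, so showing them in input order is an assumption.

```python
from dataclasses import dataclass

MAX_PARAGRAPH_LENGTH = 5000
DEFAULT_PARAGRAPH_ORDER = 2


@dataclass
class Paragraph:
    number: int  # the Paragraph Number shown on the site
    text: str
    order: int = DEFAULT_PARAGRAPH_ORDER


def display_order(paragraphs):
    """Return Paragraphs sorted by Paragraph Order; ties are all kept."""
    for paragraph in paragraphs:
        if len(paragraph.text) > MAX_PARAGRAPH_LENGTH:
            raise ValueError(
                f"Paragraph {paragraph.number} exceeds "
                f"{MAX_PARAGRAPH_LENGTH} characters"
            )
    # sorted() is stable, so tied Paragraphs stay in their input order
    # (an assumption about tie handling, not documented behavior).
    return sorted(paragraphs, key=lambda paragraph: paragraph.order)
```

Two Paragraphs that both use the default order of 2 are both returned, one after the other.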


  • Relevant papers indexed by PubMed are pre-loaded regularly by an automated program. Only pre-loaded papers can be linked to genes.
  • Please try to link all papers that mention the gene by name in the main text.
  • Please do not link every genome-scale (e.g. microarray) paper to every gene. We will display information from these types of papers in a convenient way.