
Welcome to SEED Tutorial

Assignment overview

Part I. Find a gene. Locate a specific gene/protein page in SEED using a protein sequence, ID, or gene name.

Part II. Find and explore relevant gene clusters. Functionally related genes tend to cluster on prokaryotic genomes. In most prokaryotes, 50% or more of the genes are clustered with related genes. We call this phenomenon "functional coupling". For any gene that you wish to study, chances are that it either occurs in a cluster or has a corresponding gene in another genome that does.

Part III. Explore a Subsystem - basics. Identify functional Subsystem(s) in SEED that your protein potentially belongs to, explore a Subsystem page in the browsing mode. Learn basic tools of subsystem visualization and analysis.

Part IV. Annotate a gene. Apply SEED tools (utilizing straightforward homology-based projections, as well as functional and genome context analysis) to reject, refine, or confirm the current functional annotation of a protein.

Part V. Explore a Subsystem - advanced. Navigate a Subsystem page in the advanced editing mode. Learn to edit existing Subsystems and to encode new ones.

Part VI. Extend a Subsystem (see arabidopsisSS.pdf).

Workshop evaluation form: evaluation.pdf

PART 0. Enter SEED using one of these URLs:

http://anno-1.nmpdr.org/public/FIG/index.cgi

http://anno-2.nmpdr.org/public/FIG/index.cgi

http://yersinia.uchicago.edu/public/FIG/index.cgi

Password: bohr.atom

You are now on the FIG main (or Search) page. Please familiarize yourself with it. You can return to this page from anywhere in SEED by clicking the FIG search link at the top left corner of every page.

You don't need to authenticate yourself to simply browse the database, but to annotate genes or encode Subsystems (Part IV and beyond), you'll need a username. Use the format master:FirstL, where "FirstL" is your first name followed by the first initial of your last name. You can use anything you wish, but do try to make it descriptive and unique. Type your username in the "User ID" window under the caption Searching for Genes or Functional Roles Using Text.

PART I. FIND A GENE

Every protein-encoding gene (PEG) in SEED has an individual web page containing a variety of data about the protein and the corresponding gene, tools for protein annotation and analysis, links to external resources, etc.

I.1. Use gene name, protein sequence, EC number, or ID (in GenBank, SwissProt, UniProt, other major databases) to look for the corresponding PEG page in SEED. You can look for a protein of your choice or pick one from the table below. Copy and paste its sequence or ID into an appropriate window on the FIG search page:

(i) If you chose to copy an ID, EC number, or annotation, paste it into the window Searching for Genes or Functional Roles Using Text and press the Search button. To limit your search to a specific genome, highlight that genome and click the Search genome selected below button.

(ii) If you chose to copy a sequence, paste it into the window Searching DNA or Protein Sequences (in a selected organism), scroll up the page, highlight the genome of your protein, scroll back down, and click the Search for matches button (check that Search Program is set to “blastp”).

#   TAIR ID    Other IDs     SS     Protein sequence (fragment) or annotation
1   At1g31860  uni|O82768    +/His  MAVSYNALAQSLARSSCFIPKPYSFRDTKLRSRSNVVFACNDNKNIALQAKVDNLLDRIKWDDKGLAVAIAQNVDTGAVLMQGFVNREALSTTISSRKATFFSRSRSTLWTKGETSNNFINILDVYVDCDRDSIIYLGTPDGPTCHTGEETCYYTSVFDQLNNDEASGNKLALTTLYSLE
2   At3g22425  tr|Q67YN9     +/His  Imidazoleglycerol-phosphate dehydratase (EC 4.2.1.19)
3   At4g14910  gi|18414338   +/His  MELLSSSPAQLLRPNLSSRALLPPRTSIASSHPPPPRFLVMNSQSQHRPSISCASPPPGDNGFPAITTASPIESARIGEVKRETKETNVSVKINLDGHGVSDSSTGIPFLDHMLDQLASHGLFDVHVRATGDTHIDDHHTNEDVALAIGTALLKALGERKGINRFGDFTAPLDEALIHVSLDLSGRPYLGYNLEIPTQRVGTYDTQLVEHFFQSLVNTSGMTLHIRQLAGKNSHHIIEATFKAFARALRQATESDPRRGGTIPSSKGVLSRS
4   At4g26900  sp|Q9SZ30     +/His  MEATAAPFSSIVSSRQNFSSSSSIRASSPASLFLSQKSIGNVNRKFKSPRSLSVRASSTSDSVVTLLDYGAGNVRSIRNALRHLGFSIKDVQTPGDILNADRLIFPGVGAFAPAMDVLNRTGMAEALCKYIENDRPFLGICLGLQLLFDSSEENGPVKGLGVIPGIVGRFDASAGIRVPHIGWNALQVGKDSEILDDVGNRHVYFVHSYRAIPSDENKDWISSTCNYGESFISSIRGNV
5   At5g10330  uni|Q9LFT5    +/His  MGVINVQGSPSFSIHSSESNLRKSRALKKPFCSIRNRVYCAQSSSAAVDESKNITMGDSFIRPHLRQLAAYQPILPFEVLSAQLGRKPEDIVKLDANENPYGPPPEVFEALGNMKFPYVYPDPQSRRLRDALAQDSGLESEYILVGCGADELIDLIMRCVLDPGEKIIDCPPTFSMYVFDAAVNGAGVIKVPRNPDFSLNVDRIAEVVELEKPKCIFLTSPNNPDGSIISEDDLLKILEMPILVVLDEAYIEFSGVESRMKWVKKYENLIVLRTFSKRAGLAGLRVGYGAFPLSIIEYLWRAKQPYNVSVAGEVAALAALSNGKYLEDVRDALVRERERLFGLLKEVPFLNPYPSYSNFILCEVTSGMDAKKLKEDLAKMGVMVRHYNSQELKGYVRVSAGKPEHTDVLMECLKFY
6   At4g34740  tr|Q9STG9     +/Pur  Amidophosphoribosyltransferase (EC 2.4.2.14)
7   At1g74260  uni|q9m8d3    +/Pur  MLLQRSSMSQLWGSVRMRTSRLSLNRTKAVSLRCSAQPNKPKAAVSTGSFVTADELPSLVEKPAAEVIHFYRVPLIQESANAELLKAVQTKISNQIVSLTTEQSFNIGLESKLKDEKLSVLKWILQETYEPENLGTDSFLERKKQEGLHAVIVEVGPRLSFTTAWSTNAVSICRACGLDEVTRLERSRRYLLFSKEPLLENQIKEFAAMVHDRMTECVYTQKLVSFETNVVPEEVKYVPVMEKGRKALEEINQEMGLAFDEQDLQYYTRLFREDIKRDPTNVELFDIAQSNSEHSRHWFFAGNMVIDGKPMDKSLMQIVKSTWEANRNNSVIGFKDNSSAIRGFLVNQLRPLLPGSVCLLDVSARDLDILFTAETHNFPCAVAPYPGAETGAGGRIRDTHATGRGSFVVASTSGYCVGNLNMEGSYAPWEDSSFQYPSNLASPLQILIDASNGASDYGNKFGEPMIQGYTRTFGMRLPSG
8   At1g31220  sp|P52422     +/Pur  MESRVLFSSQFNFPVNSPFKTRETSIAPLTPSRNVLSFSFRSPAERCAMRIVPLVKAASSTPQIVAEVDGSSHEPRRKKLAVFVSGGGSNFRKIHEGCSDGSVNGDVVLLVTNKKDCGGAEYARSNGIPVLVFPKAKREPSDGLSPSELVDVLRKYGVDFVLLAGYLKLIPVELVQAFPKRILNIHPALLPAFGGKGLYGIKVHKAVLESGARYSGPTIHFVNEEYDTGRILAQSAVRVIANDTPEELAKRVLHEEHKLYVEVVGAICEERIKWREDGVPLIQNKQNPDEYY
9   At1g03090  gi|17979456   -/Leu  Methylcrotonyl-CoA carboxylase biotin-containing subunit (EC 6.4.1.4)
10  At2g26800  tr|O81027     -/Leu  MQWNGVRRAHSIWCKRLTNNTHLHHPSIPVSHFFTMSSLEEPLSFDKLPSMSTMDRIQRFSSGACRPRDDVGMGHRWIEGRDCTTSNSCIDDDKSFAKESFPWRRHTRKLSEGEHMFRNISFAGRTSTVSGTLRESKSFKEQKYSTFSNENGTSHISNKISKGIPKFVKIVEVGPRDGLQNEKNIVPTSVKVELIQRLVSSGLPVVEATSFVSPKWVPQLADAKDVMDAVNTLDGARLPVLTPNLKGFQAAVSAGAKEVAIFASASESFSLSNINCTIEESLLRYRVVATAAKEHSVPVR
11  At4g34030  sp|Q9LDD8     -/Leu  MLRILGRRVVSASKELTSIQQWRIRPGTDSRPDPFRTFRGLQKGFCVGILPDGVDRNSEAFSSNSIAMEGILSELRSHIKKVLAGGGEEAVKRNRSRNKLLPRERIDRLLDPGSSFLELSQLAGHELYEEPLPSGGIITGIGPIHGRICMFMANDPTVKGGTYYPITIKKHLRAQEIAARCRLPCIYLVDSGGAYLPKQAEVFPDKENFGRVFYNESVMSSDGIPQIAIVLGSCTAGGAY
12  At5g08280  gi|15241573   +/Heme MDIASSSLSQAHKVVLTRQPSSRVNTCSLGSVSAIGFSLPQISSPALGKCRRKQSSSGFVKACVAVEQKTRTAIIRIGTRGSPLALAQAYETREKLKKKHPELVEDGAIHIEIIKTTGDKILSQPLADIGGKGLFTKEIDEALINGHIDIAVHSMKDVPTYLPEKTILPCNLPREDVRDAFICLTAATLAELPAGSVVGTASLRRKSQILHKYPALHVEENFRGNVQTRLSKLQGGKVQATLLALAGLKRLSMTENVASILSLDEMLPAVAQGAIGIACRTDDDKMATYLASLNHEETRL
13  At3g48730  uni|Q42522    +/Heme Glutamate-1-semialdehyde aminotransferase (EC 5.4.3.8)
14  At1g69740  sp|Q9SFH9     +/Heme MATTPIFNASCSFPSTRGIDCKSYIGLRSNVSKVSVASSRIATSQRRNLVVRASESGNGHAKKLGMSDAECEAAVAAGNVPEAPPVPPKPAAPVGTPIIKPLNLSRRPRRNRASPVTRAAFQETDISPANFVYPLFIHEGEEDTPIGAMPGCYRLGWRHGLVQEVAKARAVGVNSIVLFPKVPEALKNSTGDEAYNDNGLVPRTIRLLKDKYPDLIIYTDVALDPYSSDGHDGIVREDGVIMNDETVHQLCKQAVSQARAGADVVSPSDMMDGRVGAIRSALDAEGFQNVSIMSYTAKYASSFYGPFREALDSNPRFGDKKTYQMNPANYREALIEAREDEAEGADILLVKPGLPYLDII
15  At3g27740  uni|O24447    +/Pyr  MAMATRTLGFVLPTSLSSQPSFDRRGGGFRVSVIRCSTSPLTFPTSGVVEKPWTSYNARLVLEDGSIWPAKSFGAPGTRIAELVFNTSLTGYQEILTDPSYAGQFVLMTNPQIGNTGVNPDDEESGQCFLTGLVIRNLSISTSNWRCTKTLADYLTERDIMGVYDLDTRAITRRLREDGSLIGVLSTEQSKTDDELLQMSRSWDIVGIDLISDVSCKSPYEWVDKTNAEWDFNTNSRDGK
16  At1g29900  gi|18397283   +/Pyr  Carbamoyl-phosphate synthase large chain (EC 6.3.5.5)
17  At5g14760  tr|Q94AY1     -/NAD  MAAHVSTGNIHNFYLAGQVYRGQAFSWSSASTFMANPFKEPSWSSGVFKALKAERCGCYSRGISPISETSKPIRAVSVSSSTKYYDFTVIGSGVAGLRYALEVAKQGTVAVITKDEPHESNTNYAQGGVSAVLCPLDSVESHMRDTMVAGAHLCDEETVRVVCTEGPERIRELIAMGASFDHGEDGNLHLAREGGHSHCRIVHAADMTGREIERALLEAVLNDPNISVFKHHFAIDLLTSQDGLNTVCHGVDTLNIKTNEVVRFISKVTLLASGGAGHIYPSTTNPLVATGDGMAMAHRA
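The choice between the two search windows described in (i) and (ii) can be sketched as a small helper. This is only an illustration and not part of SEED: the function names, the 20-residue minimum, and the uppercase-only rule are assumptions made for the example.

```python
import re

# The 20 standard amino-acid one-letter codes (uppercase only, as in the table above).
AMINO_ACIDS = frozenset("ACDEFGHIKLMNPQRSTVWY")
# Four dot-separated fields such as 4.2.1.19; "-" stands for an unspecified field.
EC_NUMBER = re.compile(r"\d+(?:\.(?:\d+|-)){3}")

# Captions of the two search windows, as named in this tutorial.
TEXT_BOX = "Searching for Genes or Functional Roles Using Text"
SEQUENCE_BOX = "Searching DNA or Protein Sequences (in a selected organism)"


def classify_query(query: str, min_sequence_length: int = 20) -> str:
    """Guess whether a pasted query is a sequence, an EC number, or an ID/annotation.

    Heuristic (an assumption of this sketch): a sequence is a run of at least
    `min_sequence_length` uppercase amino-acid letters, ignoring whitespace.
    """
    compact = "".join(query.split())
    if len(compact) >= min_sequence_length and set(compact) <= AMINO_ACIDS:
        return "sequence"
    if EC_NUMBER.fullmatch(compact):
        return "ec_number"
    return "id_or_text"


def search_box_for(query: str) -> str:
    """Return the caption of the search window the query should be pasted into."""
    return SEQUENCE_BOX if classify_query(query) == "sequence" else TEXT_BOX
```

For example, `search_box_for("sp|Q9SZ30")` returns the text-search caption, while a pasted protein fragment from the table is routed to the sequence-search window.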

Both types of searches should return one or more PEG IDs matching your search criteria. A complete PEG ID in SEED looks something like this: fig|562.2.peg.1246, where “fig|562.2” is the genome ID and version, and “peg.1246” is the ID of a specific protein in this genome (the latter is often used in SEED as an abbreviated PEG ID).

Click on the PEG ID to follow the link to the corresponding gene/protein (PEG) page.
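The PEG ID layout described above can be split into its parts with a short parser. This is a minimal sketch based only on the example fig|562.2.peg.1246; the class and function names are hypothetical, and the pattern assumes the genome ID is always two dot-separated numbers.

```python
import re
from typing import NamedTuple


class PegId(NamedTuple):
    genome_id: str   # genome ID and version, e.g. "562.2"
    peg_number: int  # number of the protein within that genome, e.g. 1246

    @property
    def short(self) -> str:
        """Abbreviated PEG ID, e.g. "peg.1246"."""
        return f"peg.{self.peg_number}"


# Assumed pattern: fig|<number>.<number>.peg.<number>
_PEG_PATTERN = re.compile(r"fig\|(\d+\.\d+)\.peg\.(\d+)")


def parse_peg_id(text: str) -> PegId:
    """Split a complete PEG ID into its genome ID and PEG number."""
    match = _PEG_PATTERN.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"not a complete PEG ID: {text!r}")
    return PegId(match.group(1), int(match.group(2)))
```

Calling `parse_peg_id("fig|562.2.peg.1246")` yields the genome ID "562.2" and the abbreviated ID "peg.1246" via the `short` property.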

PART II. Find and explore relevant gene clusters

II.1. We will explore a protein page in detail later. Our strategy in this tutorial is to first show you how to find relevant clusters of genes, by which we mean clusters of functionally related genes that include either the gene you are "positioned on" or a corresponding gene in another organism. The table at the top of a PEG page describes the genes in the region of the chromosome surrounding the gene you are positioned on. The entry for the gene you are positioned on is always shown in green. Just below the table is a small graphical display of the region. The gene you are positioned on is shown here by a green arrow. Genes that are believed to be "functionally related" to it (based on the fact that they occur close to each other in a number of genomes) are shown as blue. Others are red.

Nuclear eukaryotic genes rarely cluster (contrary to the situation in prokaryotes). However, you can still detect related gene clusters – occurring in other genomes – that contain genes homologous to the one you are positioned on. The CL link to the left of the gene leads to the precomputed list of such “indirect” clusters. Try clicking on it. The clusters are sorted by the number of genes in each (Cluster size). PEG IDs in this table are of the close homologs of your query gene (the corresponding similarity score appears in the left-most column). They are linked to the corresponding PEG pages. Open several of them to explore different clusters. Note that even though all these additional clusters are centered around an ortholog of the gene you started with, some of the clusters contain completely different members.

II.2. Explore the Pins graphic display, which shows the relevant gene clusters in a number of genomes. On every PEG page, the pins and fc-sc columns contain evidence of clustering of the gene you are positioned on with other genes in its immediate neighborhood. The value in the fc-sc column indicates "the strength" of functional coupling and is based on the number of phylogenetically distant organisms in which such clustering is observed. Click on the Pins button just to the left of your gene (shaded green). In a separate window, you should see a portrayal of different versions of the same cluster as they occur in other genomes, “pinned” around homologs of your gene. The query gene is in red. Other homologous genes in the region are shown by arrows with matching colors and numbers. Genes not conserved within the region are colored gray. Mouse over each arrow for more details. Finally, if you click on the Commentary button, another window will pop up containing information about each of the colored sets of homologous genes.
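To make the idea behind the fc-sc value concrete, here is a toy calculation. It is not SEED's actual scoring method: the data layout, the 5000-bp window, and the use of a taxonomic group label as a stand-in for "phylogenetically distant" are all assumptions of this sketch. It counts the distinct groups in which homologs of two genes lie close together.

```python
from typing import Dict, Tuple

# genome name -> (taxonomic group label, {gene family: start coordinate})
# Both the layout and the group labels are hypothetical.
Genomes = Dict[str, Tuple[str, Dict[str, int]]]


def coupling_score(genomes: Genomes, family_a: str, family_b: str,
                   max_distance: int = 5000) -> int:
    """Count distinct taxonomic groups where both families occur within max_distance."""
    groups_with_cluster = set()
    for group, positions in genomes.values():
        if family_a in positions and family_b in positions:
            if abs(positions[family_a] - positions[family_b]) <= max_distance:
                # Several genomes from one group count only once.
                groups_with_cluster.add(group)
    return len(groups_with_cluster)
```

Counting groups rather than genomes reflects the point made above: clustering seen in many closely related genomes is weaker evidence than clustering seen in distant ones.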

PART III. Explore a Subsystem – the basics.

III.1 Find a Subsystem

There are several ways to find the Subsystem(s) containing a functional role (assignment) of the query protein:

(i) If a protein has been associated with one or more Subsystems, this will be indicated on the corresponding PEG page by:
-- a numerical entry (1, 2, etc.) in the SS column of the Context table, indicating the number of different Subsystems this PEG is connected to;
-- a link in the Subsystems in which this peg is present table under the Context table.

Note that activating this link opens a Subsystem page (a) in a simplified “read-only” mode and (b) with the Subsystem spreadsheet display limited to a small number of genomes in the immediate phylogenetic neighborhood of the organism whose PEG page you started from. To display all genomes connected to this Subsystem, highlight “Show all” in the drop-down menu and click the Show spreadsheet button. To open the same Subsystem in the regular unabbreviated mode, enter it from the FIG main search page.

(ii) If this is not the case, check whether any homologs of your protein have been included in a subsystem. Such proteins will have a numerical entry (1, 2, etc.) in the In Sub column of the Similarities table. Go to the respective PEG page (by clicking on its ID) and then follow the subsystem link as described in (i) above.

(iii) You may use the Locate PEGs in Subsystems section on the SEED Entry Page to search for a relevant subsystem using an EC number, functional assignment (if you are lucky), or protein ID (follow the instructions on the Entry Page).

(iv) Finally, you may browse the list of subsystems in SEED (or use your browser’s “find in page” functionality) for a potentially relevant term (e.g. NAD biosynthesis). Reach the list of subsystems by clicking the Work on Subsystem button on the FIG main search page.

III.2 Explore a Subsystem

(i) Browse a Subsystem (SS) page. It opens with a Table of Functional Roles constituting this SS. The roles are defined by the most standard descriptive names, for example enzyme names and the corresponding Enzyme Commission (EC) numbers, whenever they are available. Note that role names must exactly match gene annotations in the underlying database. Abbreviations of functional roles are used in the Subsystem Spreadsheet below and in SS diagrams.

(ii) Subsets of Roles table. The concept of subsets plays an important role in subsystem encoding and interpretation. Subsets usually represent the most compact units, such as multi-subunit complexes or variants of pathways. A star (*) in front of an abbreviated subset name causes all the functional roles grouped in it to collapse into a single column in the Subsystem spreadsheet, a useful feature for displaying synonymous functional roles or subunits of multi-subunit complexes.

(iii) The Subsystem spreadsheet is a table in which each column represents a functional role in the subsystem, each row represents a specific genome, and cells are populated with the proteins that implement specific functional roles in each organism. Protein IDs in the cells are linked to the corresponding PEG pages.

A small set of tools located immediately under the Subsets of Roles table allows the reduction of spreadsheet display to a selected sub-set of functional roles and/or to a selected group of organisms. The main Subsystem visualization/construction tools are located on SS page below the SS Spreadsheet. Try using the following:

(iv) A Subsystem diagram (a graphic representation of a pathway) is often helpful in analyzing an SS and assigning variant codes. These graphic maps (available for a number of SSs) can be accessed via a link above the SS Spreadsheet. Functional roles are shown by abbreviations in boxes. Key metabolites (precursors, products) and intermediates are shown by abbreviations or Roman numerals in circles (linked to the KEGG Compounds database). Diagrams can be highlighted to show the presence or absence of genes implementing each functional role in a specific organism by activating the Color links located in the Genome ID column of the SS Spreadsheet.

(v) The Notes section at the bottom of each SS page contains the annotator’s comments, lists open problems identified during SS encoding and analysis, and, most importantly, explains the variant codes identified in the SS.
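The spreadsheet layout from (iii) and the starred-subset collapsing from (ii) can be modeled with nested dictionaries. This is a sketch of the data structure only, not SEED's internal representation; the type alias, function names, and any role or genome names you pass in are hypothetical.

```python
from typing import Dict, Iterable, List

# genome (row) -> role abbreviation (column) -> PEG IDs in that cell
Spreadsheet = Dict[str, Dict[str, List[str]]]


def collapse_subset(sheet: Spreadsheet, subset_name: str,
                    roles: Iterable[str]) -> Spreadsheet:
    """Merge the columns listed in `roles` into one column named `subset_name`."""
    role_set = set(roles)
    collapsed: Spreadsheet = {}
    for genome, row in sheet.items():
        merged: List[str] = []
        new_row: Dict[str, List[str]] = {}
        for role, pegs in row.items():
            if role in role_set:
                merged.extend(pegs)        # pool the PEGs of every grouped role
            else:
                new_row[role] = list(pegs)  # other columns are copied unchanged
        new_row[subset_name] = merged
        collapsed[genome] = new_row
    return collapsed


def missing_roles(sheet: Spreadsheet, all_roles: Iterable[str]) -> Dict[str, List[str]]:
    """For each genome, list the roles whose cell is empty or absent."""
    role_list = list(all_roles)
    return {genome: [r for r in role_list if not row.get(r)]
            for genome, row in sheet.items()}
```

Empty cells found by `missing_roles` correspond to the gaps you would see when scanning a row of the spreadsheet; whether a gap means a missing gene or a pathway variant is what the variant codes in the Notes section explain.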

QUIZ: You have 10 minutes to explore a hypothetical protein in SEED. What can you tell about it?

Help with navigating PEG pages in SEED is available here: PEG page Help.

#  TAIR ID    FIG ID                  Annotation
1  At2g32480  fig|3702.1.peg.10319    Membrane metalloprotease
2  At1g24290  fig|3702.1.peg.2823     AAA-type ATPase family protein
3  At3g13180  fig|3702.1.peg.13410    RNA-binding Sun protein
4  At1g24490  fig|3702.1.peg.2842     Inner membrane ALBINO3-like protein 1, chloroplast precursor
5  At5g10910  fig|3702.1.peg.23225    S-adenosyl-methyltransferase mraW
6  At5g63290  fig|3702.1.peg.28377    Related to coproporphyrinogen III oxidase
7  At4g00110  fig|3702.1.peg.17763    Similar to nucleotide sugar epimerase
8  At2g39670  fig|3702.1.peg.11081    Radical SAM family enzyme
9  At3g18680  fig|3702.1.peg.14073    Similarity to uridylate kinase

PART IV. Gene annotation in SEED.

SEED provides a very rich annotation environment. Just a few of the many possible strategies for annotating a gene in SEED are described below. Choose one of these strategies to annotate your protein of interest or one of the proteins from the table below. To choose a proper annotation for your protein:

Before annotating, please check whether any close homologs of your protein in other organisms have already been connected to a Subsystem. If you believe that your protein belongs to the same Subsystem, we strongly encourage you to use the same annotation. However, if you consider that annotation completely wrong, the right thing to do is to contact the Subsystem curator and discuss the issue with him or her.

Annotating from a PEG page

1. Annotating using the Assignments for Essentially Identical Proteins table:

2. Annotating using To Make an Annotation link

3. Annotating using Similarities table

Annotating from a Commentary page

The unique advantage of this strategy over straightforward homology projections is that it considers chromosomal context: only close homologs in similar chromosomal clusters are assigned identical specific functions.
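The context-aware rule above can be illustrated with a toy filter. This is not how the Commentary page is implemented: the `Gene` record, the 0-to-1 similarity scale, and both thresholds are assumptions chosen for the example. A homolog receives the query's annotation only if it is similar enough and shares enough neighboring gene families.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List


@dataclass(frozen=True)
class Gene:
    peg_id: str
    similarity: float                 # similarity to the query, 0..1 (hypothetical scale)
    neighbor_families: FrozenSet[str]  # gene families found near this gene


def projection_targets(query_neighbors: FrozenSet[str],
                       candidates: Iterable[Gene],
                       min_similarity: float = 0.5,
                       min_shared: int = 2) -> List[str]:
    """Return PEG IDs of homologs that pass both the similarity and the context test."""
    targets = []
    for gene in candidates:
        shared = gene.neighbor_families & query_neighbors
        if gene.similarity >= min_similarity and len(shared) >= min_shared:
            targets.append(gene.peg_id)
    return targets
```

A close homolog sitting in an unrelated neighborhood is deliberately left out, which is the difference from a projection based on homology alone.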

Try annotating these proteins

Note that for this part of the tutorial you will have to enter SEED using your “master_____” ID. Before searching for one of these proteins, make sure you type your User ID under the Searching for Genes or Functional Roles Using Text caption.

#   TAIR ID or organism                FIG ID                  Suggested annotation strategy
1   At2g30200                          fig|3702.1.peg.10071    annotate from PEG page
2   At2g22230                          fig|3702.1.peg.9242     annotate from PEG page
3   At5g10160                          fig|3702.1.peg.23145    annotate from PEG page
4   At2g38040                          fig|3702.1.peg.10911    annotate from PEG page
5   At4g24830                          fig|3702.1.peg.20384    annotate from PEG page
6   At2g37500                          fig|3702.1.peg.10855    annotate from PEG page
7   At2g26080                          fig|3702.1.peg.9631     annotate from PEG page
8   At4g33010                          fig|3702.1.peg.21340    annotate from PEG page
9   At1g11860                          fig|3702.1.peg.1445     annotate from PEG page
10  At2g35370                          fig|3702.1.peg.10631    annotate from PEG page
11  At1g32470                          fig|3702.1.peg.3603     annotate from PEG page
12  At1g79050                          fig|3702.1.peg.7467     annotate from PEG page
13  At5g67100                          fig|3702.1.peg.28820    annotate from PEG page
14  At5g22110                          fig|3702.1.peg.24367    annotate from PEG page
15  At5g67100                          fig|3702.1.peg.28820    annotate from PEG page
16  At5g63960                          fig|3702.1.peg.28451    annotate from PEG page
17  At1g73250                          fig|3702.1.peg.6822     annotate from PEG page
18  At5g66280                          fig|3702.1.peg.28730    annotate from PEG page
1   Atropa belladonna                  fig|33113.1.peg.20      annotate from Commentary page
2   Cyanidium caldarium                fig|2771.1.peg.195      annotate from Commentary page
3   Chaetosphaeridium globosum         fig|96477.1.peg.63      annotate from Commentary page
4   Lotus corniculatus var. japonicus  fig|34305.1.peg.12      annotate from Commentary page
5   Spinacia oleracea                  fig|3562.1.peg.22       annotate from Commentary page
6   Pinus koraiensis                   fig|88728.1.peg.120     annotate from Commentary page
7   Cyanophora paradoxa                fig|34305.1.peg.12      annotate from Commentary page
8   Amborella trichopoda               fig|13333.1.peg.18      annotate from Commentary page
9   Marchantia polymorpha              fig|3197.1.peg.27       annotate from Commentary page
10  Nicotiana tabacum                  fig|4097.1.peg.18       annotate from Commentary page
11  Zea mays                           fig|4577.1.peg.8        annotate from Commentary page
12  Amborella trichopoda               fig|13333.1.peg.5       annotate from Commentary page
13  Oryza sativa                       fig|39947.1.peg.13945   annotate from Commentary page
14  Chlamydomonas reinhardtii          fig|3055.1.peg.69       annotate from Commentary page
15  Synechococcus elongatus PCC 7942   fig|1140.3.peg.351      annotate from Commentary page
16  Arabidopsis thaliana               fig|3702.1.peg.6        annotate from Commentary page

If you got this far, CONGRATULATIONS!

You are a certified SEED user!

