
Welcome to SEED Tutorial

Read this page and follow the links for everything you need to know and do during the tutorial.

Assignment overview

Part I. Find a gene. Locate the gene/protein page in SEED dedicated to the protein assigned to you. Familiarize yourself with the different types of data, tools, and links to external resources available on a gene/protein page.

Part II. Annotate a gene. Apply SEED tools (utilizing straightforward homology-based projections, as well as functional and genome context analysis) to reject, refine, or confirm current functional annotation of “your” protein.

Part III. Find a subsystem. Identify functional Subsystem(s) in SEED that your protein potentially belongs to.

Part IV. Explore a subsystem. Learn the main tools of subsystem visualization and analysis.

Part V. Expand a subsystem. Learn the tools for projecting a subsystem onto additional genomes. Project a Subsystem onto the microbial genome of your protein. Predict how this pathway (subsystem) is implemented in this particular organism.

Proceed to the list of "mystery" proteins, one of which you will be analysing during the tutorial.

From the list below, pick the protein whose number matches the number on your computer terminal. Use the User ID (master:Tutorial#) associated with it throughout the tutorial.

Terminal # | User ID | Organism | Protein ID | Protein Sequence (fragment) | Comments
0 | master:Tutorial0 | Burkholderia cepacia J2315 | none | MSTQQNRRFDALVFIGRFQPPHRGHLNVLKSALSRAERVCVLIGSTDKPRTIKDPFSFDERRQMLASLLDASERDRVTIAPLQDSTYNDGDWVRWVQDAVAVVLGDVAQRKVGLIGHEKDATSYYLRMFPQWELVDVDATEDISATEIRDQYFAERTNSFVQWAVPEPVFGWLERFRTQPEFAQLKSEAEFIAAYRKAWAAAPYPVTFVTVDAVVVHSGHILLVRRRSEPGRGLWALPGG |
1 | master:Tutorial1 | Nostoc punctiforme | gi|23124765 | MTTLPDLDYVYKQQSQQNQELNLSADDYSLLTDLYQLTMAACYTGEGIEQRRASFELSVRRLPEGFGYLIAMGLTQALEYLAKIRFSSAQIAALQATGIFAHAGDRFWSLLAEGKFTGDVWAVPEGTAVFANQPLLRVEAPLWQAQLVETYLLNTINYQTLIATKAARLRDIAGESATLLEFGTRRAFSPQGSLWAARAALAGGLDSTSNVLAALQLGQQPSGTMAHALVMALSAIEGTE |
2 | master:Tutorial2 | Synechocystis sp. PCC 6803 | gi|16331895 | MNTNLILDVDSYKVSHWLQYPPDTTAMYSYVESRGGRYPVTVFFGLQYILKRYLTQSIEPWMVEEANRLLTAHGLPFNYGGWRYIAEDLQGRLPVRIKAVPEGSVIPVHNVLMTVESTDPKVFWLVSWLETLLMRVWYPITVATQSWHLKQRIYQSLCRTADDPDGEINFKLHDFGARGVSSGESSGIGGLAHLVNFQGSDTVKALVYGQQYYNCPMAAYSIPAAEHSTITAWGREGEVLAYENMLTQFAKPGSVLAVVSDSYDLWNAIDHLWGDHLRAQVLDSGATVVIRPDSGDPVAIVAQTLERLEACFGSTLNSKGFRVLNAVRVIQGDGVDEESISAILEKTESLGFSTTNLAFGMGGALLQKVNRDTQKFAMKCSEVTVEDKAIPVYKDPVTDPGKTSKKGRLSLVKTDSGYGTVPTSSEDLLQVVYENGHLLQDQCLDAIRQRAWPLIRVNVPAS |
3 | master:Tutorial3 | Xanthomonas axonopodis pv. citri str. 306 | gi|21241444 | MATPYHYLVFIGRFEPFHNGHAAVARHALGKAKKLIVLIGSADTPRTIRNPWTVAERAVMIESALPGETARLLVRPLRDHLYNESLWIAEVQRQVAEAVHADGGTLDANIGLIGMDKDASSYYLREFPQWPLEDVQHTATLSATELRRYLFEAGDIGFHGGLLMLRGNVPAPVYDMLEAFRRNSPSYAQLVAEYRFIEQYRAAWKDAPYPPTFVTTDAVVVHSGHVLLVRRRAEPGKGLWALPGGFVGQEEGLLDCCLRELREETRLKLPVPVLKGSLRGRQVFDHPERSQRGRTITHAFHFEFPAGELPAVRGGDDADKARWIPLAEVMAMGPRLYEDHLHILEFFLGRG |
4 | master:Tutorial4 | Lactococcus lactis subsp. lactis Il1403 | gi|15673084 | MTLQDEIIKELGVKPVIDPKEEIRVSVDFLKDYLKKYPFIKSFVLGISGGQDSSLAGRLAQIAIEEMRQETADETYKFVAIRLPYGVQADEEDAQRALAFIQPDVSLTVNIKAAVEGQVAALNEAGIEVSDFNKGNIKARQRMITQYAVAGQYQGAVLGTDHAAENITGFFTKFGDGGADLLPLFRLNKRQGKALLAELGADPAIYEK |
5 | master:Tutorial5 | Listeria innocua Clip11262 | gi|16800146 | MTNLFQDDSLTLHTDLYQLNMMKAYFDDGLHERRSVFEVFFRDMPFDSGFVVFAGLERIINYMQNLRFTETDIAYLHDELGFDGPFLEYLRNFKFKGNILAAKEGEFVFKTEPILQVEASLAEAQLIETALLNIVNFQTLIATKAARIRSVIDDETFAEFGTRRAQEMDAAIWGTRAAYIGGCDSTSNVRAGKIFGIPVSGTMAHAMVQAYRDELEAFRSYAKTHFDSIFLVDTYDTLKSGVPNAIKVAKEMGDKINFIGIRLDSGDMAFLSKKARQMLDEAGFTEAKIFASSDLDEHTILSLKAQKAKIDSWGVGTKLITAYDQPALGAVYKMAAIADENDILQDSIKLSSNTEKVSTPGKKKVYRIITNEDGLKAEGDYIALADESLENVDKLTMFHPVHTYIMKTVENFTARELLVPIFKDGELVYDMPTLDEIKAYKEENLALLWDEYKRTVRPEQYPVDLSVKCWKNKMRNIEKVRKSVQLHSPVELDMPF |
6 | master:Tutorial6 | Methanococcoides burtonii DSM 6242 | none | MLKIGVFGCGAIGTELCKAIDSGHIEVELYAVYDRHEQSIINLKEQLKNTDPKVLEIVEMVKHVDLVVECASQQAVYDVVPTTLHAKCDVMVISVGAFADKKLLDTTFDIAKEYGCKIYFPSGAIVGLDGLKSASAASIYSVTLTTQKHPRSFEGAPYIVQNNIDLDSIKGKTVLFEGMA |
7 | master:Tutorial7 | Pseudomonas fluorescens PfO-1 | gi|48731534 | MACRPGVATMKVIVLTGPESTGKSWLAAGIQQQFGGLRVDEYVRWFIEQYPRDTCLADIPEIARGQLQWEDAARAQQPRLLILDTHLLSNILWSQTLFGDCPPWLETELLARHYDLHLLLSPEQIEWTDDGQRCQPDFSERLAFYQATRTWLETHHQPLQIIQGNWLERHQQAFDAVRALLAD |
8 | master:Tutorial8 | Psychrobacter sp. 273-4 | gi|41688874 | MYTSNLNNLILNSDSYKTSHWVQYPSGSEYLSSYIEARKGDYDVVFFGLQAFIKEYLSTPITHQDIDEAEMVIQAHGLTFNRAGWERLVDKHGGYLPLRIEAIPEGSIVPVSNVVCQVINTDPEFYWLPSYIETALLRAIWYPSTVASVSHYCKSIIRQALEKSADNTESLIFRLHDFGSRGASSQESVALGSLAHLVNFAGTDSMTALVAASRWYQMDKDMPAFSIPAAEHSTMTAWGRDGETAAFANMIEQFGGEGKSFSVVSDSYDLWNAIDNIWGGSLKDDVKNMGGTLVIRPDSGDPAKVVREALERLAVKFGTTVNRKGYKVLPDYVRIIQGDGISPQSLSKTIDVVMKAGFSA |
9 | master:Tutorial9 | Streptococcus mutans UA159 | gi|24378952 | MYKDDSLTLHTDLYQINMMQVYFNQGIHNKRAVFEVFFRKEPFANGYAVFAGLERMIAYLQGLSFSETDIAYLEELGYPADFVAYLKEFKLELSVKSAKEGDLVFANEPIVQIEGPLAQCQLVETAILNIVNFQTLIATKAARIRSVIEDEPLLEFGTRRAQEMDAAIWGTRAAVIGGANATSNVRAGKLFGIPVSGTHAHALVQAYGNDYDAFMAYAGTHKDCVFLVDTYDTLRLGVPAAIRVASELGDKINFLGVRIDSGDMAYLSKKVRKLLDEAGYPHAKIYASNDLDENTILNLK |
10 | master:Tutorial10 | Sulfolobus tokodaii str. 7 | gi|15922485 | MGLKVELAQIRPKLGDVKYNLEKHQEIISSSSADCIIFPELSLTGYILRDLVYEVYNESEKAIEKLSEENKCIIAGLVKEIRPGILRNTAAIIINHQINYIYKFYLPTYGLFEERRYFQPGDPKRDLKIFEYKGVKFGVIICEDAWHYEPIEALALLGADSIFIPAASPMRRLSTRLGIQDNWEALLKAHSIINGIWTIFVNNVGSQEEEFFWGGSMVVSPNGEVINRAKLFEEDIIITEINLDEVRKNRFFSSFREHNRDFHDVLRSL |
11 | master:Tutorial11 | Thermotoga maritima MSB8 | gi|15644391 | MTVLIIGMGNIGKKLVELGNFEKIYAYDRISKDIPGVVRLDEFQVPSDVSTVVECASPEAVKEYSLQILKNPVNYIIISTSAFADEVFRERFFSELKNSPARVFFPSGAIGGLDVLSSIKDFVKNVRIETIKPPKSLGLDLKGKTVVFEGSVEEASKLFPRNINVASTIGLIVGFEKVKVTIVADPAMDHNIHIVRISSAIGNYEFKIENIPSPENPKTSMLTVYSILRTLRNLESKIIF |
12 | master:Tutorial12 | Exiguobacterium sp. 255-15 | gi|46112931 | MRALIVIDYTVDFVADEGKLTCGKPGQTIEGRIASLMDEFSSEDYVVIANDIHEEGDTFHPETVLFPPHNIRGTHGRDLFGQVAEMARVADHVIDKTRYSAFAGTDLDLRLRERSIQEVHLVGVCTDICVLHTAVDAYNLGYKIVVHADAVASFNAAGHDWALTHFKQSIGADVVGE |
13 | master:Tutorial13 | Leuconostoc mesenteroides subsp. mesenteroides ATCC 8293 | gi|23023387 | MVEEPVWHRFWRYFNPVNIFKELLFGWKWAEGVLFLVLILLQLLVWTLGVVHNSTLDWFGLATGMMNIITVVLVAKGRITNYFWGLIYSVMYMPLAFQSQLFGEVALSAFWVVMQFVGVAAWLGFMKRDNLSEKSQDVV |
14 | master:Tutorial14 | Nitrosomonas europaea ATCC 19718 | gi|30249848 | MKIALAQINCTPGDLRGNQLKILHACRQAREAGADLVITPEMSLCGYLAEDWLLRREFVQACHQALTELTAQVYDVTLIVGHPHNMNGNLFNAVSAVRDGRLLATHCKQHLFSDRLQDERRYFSAGNSLCTFECSGILFGLMTGSDYRHAAHLQSLHAAGAQVLLAVDASPYSIDSQIDRYQILREGITQTGLPAVYINPVGGQDELVFDGASFAMDHSGKLVCQLPAFQEALALIAIHG |
15 | master:Tutorial15 | Clostridium thermocellum ATCC 27405 | gi|48859870 | MYNVGIYGGSFNPLHLGHIRCIIEAANQCKELHIIISCGVNRNEIPPRVRYRWIYQVTKHIGNVKIHFLEDDAVDKNAYSKEYWQEDAQKVKDMVGKPIDVVFCGSDYDENSFWKQCYPESELYIIKRNGISSTEIRKNPYAHWDSIPNVVREYYVKKVLLIGGESTGKSTLTINLANYYNTNYVEEVGREISMRSGTDMLMIPSDFTDILLTHKMKELEAAKQSNKVLFIDTDCLITRFYIDFLDDPQNDKERNKALADAISALNHYDLVFYLEPDVEFVQDGDRSEVIAANREKYGNQIKKLFDERGIKYISVSGNYHERFLRVTSEVDRMLGINRAE |
16 | master:Tutorial16 | Clostridium acetobutylicum ATCC 824 | gi|15895057 | MHSNNIILVYGGAFNPPSASHITLAKQLLNYTGAKKLMFVPVGNQYKKKELIPAYHRINMLQIACECNNRLEVNTTDVDFKRRLYTIETLEIIKKQNSDKDIYFIIGTDNLRDILNWKHWQRLLTEYKIIVMDRGEDTIFKVFKDIPILKKYKANLIQIPGLLVNNISSTLIRNNIRQDKTIEHLTIKKIIKYIKENNLYK |
17 | master:Tutorial16 | Vibrio parahaemolyticus RIMD 2210633 | gi|28900268 | MKKIAIFGSAFNPPSLGHKSVIESLSHFDLVLLEPSIAHAWGKNMLDYPIRCKLVDAFIKDMGLSNVQRSDLEQALYQPGQSVTTYALLEKIQEIYPTADITFVIGPDNFFKFAKFYKAEEITERWTVMACPEKVKIRSTDIRNALIEGKDISTYTTPTVSELLLNEGLYRETLSGK |
18 | master:Tutorial18 | Porphyromonas gingivalis W83 | gi|34539930 | MLTGLFFGSFNPMHIGHLALANYLTEYTPIGQLWFVPSPLNPLKNTQELLPYDLRCELIEQAIRKDIRFQVLRIEELLPSPHYTIRTLRALSMLYPHHRFALLIGADNWQSFDRWKDHHRLMAKYELIIYPRFGYEVDDTTLPTGCRYIHDAPRIEISSTQIRTSILEGKDLRYWLPLPESQDVIASALQSCLSPKR |
19 | master:Tutorial19 | Prochlorococcus marinus MED4 | gi|33862001 | MKFDTKNRIALFGTSADPPTIGHKKILEELSNIYSCVIAYASDNPKKKHKENIFFRNLLLKSLIKDINNPKIIFNQKISSQWAIESIEECQKNYPSSKVDFVIGSDLITEIFSWKNFDKIIHAVKLLIIKREGYPIESKTLKMLKINKVIFEISSLNIPNISSSMVRLNNNYSDLPESLIDIVKKNNLYKTIR |
20 | master:Tutorial20 | Thermus thermophilus HB27 | gi|46199841 | MSEVRHAVLQFRPEKSRLRESLARLRAHLEALRPHAPQVVVLPEAALTGYFLQGGVRELALTRHELLELLVGVYEKVGWEGVLDVVVGFYERDEGAYYNSAAYLELPHRVVHVHRKVFLPTYGVFDEERYLARGRRVEAFRTRFGRAALLICEDFWHSITATIAALDGAEVIYVPSASPARGFQGGYPENVARWRTLAQAVAAEHGLYVVVASLVGFEGGKGMSGGSLVVGPDGRILAEA |
21 | master:Tutorial21 | Cytophaga hutchinsonii | gi|48853426 | MLKPFNFKVWIDENRHLLKPPVGNKQVYVGNDDFIVMVVGGPNARKDFHYNEGEEFFYQVEGNIILKIIEDGKIVDVPIYEGDIFLLPARVPHSPQRGENTIGLVIERYRNDEELDGFMWYCEACNHKLYEEFVMVKDIVSQLPVVMNKFYSSKELCTCSNCGTVMEAPVKR |
22 | master:Tutorial22 | Staphylococcus aureus subsp. aureus N315 | gi|15924903 | MYQLEDDSLMLHNDLYQINMAESYWNDNIHEKMAVFDLYFRKMPFNSGYAVFNGLKRVIDFIEHFGFSESDLEYLKSIGYKDDFLSYLKDLKFTGSIRSMQEGELCFGNEPLLRVEAPLIQAQLIETILLNIVNFHTLITTKASRIRQIASNDKLMEFGTRRAQEIDAALWGARAAYIGG |
23 | master:Tutorial23 | Shigella flexneri 2a str. 301 | gi|16128719 | MDFFSVQNILVHIPIGAGGYDLSWIEAVGTIAGLLCIGLASLEKISNYFFGLINVTLFGIIFFQIQLYASLLLQVFFFAANIYGWYAWSRQTSQNEAELKIRWLPLPKALSWLAVCVVSIGLMTVFINPVFAFLTRVAVMIMQALGLQVVMPELQPDAFPFWDSCMMVLSIVAMILMTRKYVENWLLWVIINVISVVIFALQGVYAMSLEYIILTFIALNGSRMWINSARERGSRALSH |
24 | master:Tutorial24 | Pyrobaculum aerophilum str. IM2 | gi|18313835 | MCFNEVSTDSPVLMEWLEKYGIPVREDAEVFAVYGRDRDILRALRESDKVVVGISPPGLDVKLAALDLRELPSLTSIKCRAVEIPRLRAESPHGHVVGVNEIAIFPEKSATFLKYSLYVDGTFLFNDLSDGVLIATPLGSTAYALSAGGPIVDVRSRVIVIVPVNSAMGRKPYVIPQESV |

Detailed Assignment

The detailed assignment can be downloaded as a PDF by following this link: Assignment_full.pdf. A short version is available below.

PART 0. Enter SEED using one of the three URLs:

http://theseed.uchicago.edu/FIG/index.cgi if your terminal is numbered 1 through 7

http://neisseria.uchicago.edu/FIG/index.cgi if your terminal is numbered 8 through 14

http://shigella.nmpdr.org/FIG/index.cgi a spare, in case somebody crashes one of the two main servers

To be able to annotate genes (Part II and beyond), you need to authenticate yourself in the box “User ID” under the caption Searching for Genes or Functional Roles Using Text. Please type your user ID exactly as it appears in the Assignment table above for your Terminal number (master:Tutorial##). Make sure you use the same username throughout the tutorial.

PART I. FIND A GENE

Every protein-encoding gene (PEG) from every genome in SEED has an individual web page containing a variety of data about the protein and the corresponding gene, tools for protein annotation and analysis, links to external resources, etc.

I.1. From the list of proteins above, pick the one whose number matches the number on your computer terminal. Copy the protein sequence or ID (your choice) and paste it into the appropriate window on the main FIG search page:

(i) If you chose to copy an ID, paste it (just the number, omitting “gi|”) into the window Searching for Genes or Functional Roles Using Text, and press the Search button. To limit your search to the genome of your protein, scroll down the page, highlight this genome, scroll back up, and click the Search genome selected below button.

(ii) If you chose to copy a sequence, paste it into the window Searching DNA or Protein Sequences (in a selected organism), scroll up the page, highlight the genome of your protein, scroll back down, and click the Search for matches button (check that Search Program is set to “blastp”).

Both searches should generate a single PEG ID (or a list of IDs) matching your search criteria. A complete PEG ID in SEED looks something like this: fig|562.2.peg.1246, where “fig|562.2” is the genome ID and version and “peg.1246” is the ID of a specific protein in this genome.

I.2. Click on the PEG ID to follow the link to the corresponding PEG page.
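As a rough illustration of the ID layout just described, the following Python sketch splits a PEG ID into its genome part and its PEG number. The regular expression is an assumption inferred from the single example fig|562.2.peg.1246; it is not taken from the SEED code, and the names PegId and parse_peg_id are hypothetical.

```python
import re
from typing import NamedTuple


class PegId(NamedTuple):
    genome: str  # genome ID and version, e.g. "562.2"
    peg: int     # number of the protein-encoding gene, e.g. 1246


# Assumed pattern: "fig|" + digits "." digits + ".peg." + digits
_PEG_RE = re.compile(r"^fig\|(\d+\.\d+)\.peg\.(\d+)$")


def parse_peg_id(text: str) -> PegId:
    """Split an ID such as 'fig|562.2.peg.1246' into genome and PEG number."""
    match = _PEG_RE.match(text.strip())
    if match is None:
        raise ValueError(f"not a PEG ID of the assumed form: {text!r}")
    return PegId(genome=match.group(1), peg=int(match.group(2)))
```

A "gi|" number from the assignment table is not a PEG ID, so parse_peg_id rejects it with a ValueError.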

PART II. ANNOTATE A GENE

II.1. Browse the PEG page to learn the basic characteristics of your protein, the immediate genomic neighborhood of the corresponding gene, and its current annotation in SEED as well as in other major genomic databases. Check out the links to external resources.

II.2. Each PEG page begins with a table we call the context. It represents the region on the chromosome (or on a fragment of a chromosome, called a contig). The first column in this table has the label fid, which stands for feature ID. The start and end columns give the exact coordinates of the gene on the contig (not including the stop codon). The size is in bases. The strand is + or -, and the gap is the distance between two genes (genes that overlap have negative values for the gap). The next two columns, fc and neigh, are important. The genes with an FC link in the fc column appear to have some evidence supporting the hypothesis that they tend to co-occur with the gene you are positioned on (see the notes on Functional Coupling below). The neigh column is marked for genes that are known to play closely related functional roles (e.g., occurring in the same pathway); this evidence is not kept up to date, however.
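To make the size, strand, and gap columns concrete, here is a small Python sketch that derives them from start and end coordinates. It assumes that a gene on the minus strand is stored with start greater than end and that the gap is counted as the number of bases strictly between two neighbors; both are assumptions made for illustration, and SEED's own arithmetic may differ. The Feature class and the gaps function are hypothetical names.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Feature:
    fid: str    # feature ID
    start: int  # coordinate of the first base of the gene
    end: int    # coordinate of the last base of the gene

    @property
    def strand(self) -> str:
        # Assumption: start > end encodes the minus strand.
        return "+" if self.start <= self.end else "-"

    @property
    def left(self) -> int:
        return min(self.start, self.end)

    @property
    def right(self) -> int:
        return max(self.start, self.end)

    @property
    def size(self) -> int:
        return self.right - self.left + 1


def gaps(features: List[Feature]) -> List[Tuple[str, str, int]]:
    """Gap between each pair of neighbors; negative when they overlap."""
    ordered = sorted(features, key=lambda f: f.left)
    return [(a.fid, b.fid, b.left - a.right - 1)
            for a, b in zip(ordered, ordered[1:])]
```

With made-up coordinates 100..400, 450..900, and 1500..890, the second and third genes overlap by a few bases, so their gap comes out negative.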

II.3. Examine graphical depiction of the chromosomal region. The meanings of the colors are as follows: (i) you are positioned on the green gene, (ii) the red genes are apparently unrelated genes, and (iii) blue genes are genes that might be functionally related (there is some evidence based on co-occurrence close to the given gene in several genomes).

II.4. To generate a list of similarities containing instances of this gene and its close homologs in other genomic databases as well as in SEED, scroll down to the bottom of the PEG page and click the Similarities button. Analyze the resulting table of homologs. You will likely see multiple instances of the same proteins with annotations coming from several public archives (note the protein IDs characteristic of GenBank, UniProt, and KEGG).

II.5. To generate a non-redundant list of similarities containing only homologs of the query protein amenable to annotation in the SEED database (all protein IDs will have the form “fig___._peg___”), choose Just FIG IDs (all) from the drop-down menu, type 50 in the “max expand” box, and click the Similarities button. Analyze the resulting table of homologs. Feel free to consult Help with SEED similarities options, available on each PEG page, for a detailed explanation of this tool.

II.6. The Compare region tool allows you to explore the chromosomal neighborhoods of the closest homologs of your gene in other genomes. In addition to the genes in the genome you are examining, the output shows the corresponding regions in closely related genomes, “pinned” around homologs of your gene. The query gene is in red. Other homologous genes in the region are shown by arrows with matching colors and numbers. Genes not conserved within the region are colored gray. Mouse over each arrow for more details. You can expand the area shown or change the number of genomes included in the display by typing new parameters into the corresponding boxes and pressing the Reset Parameters button. Pressing the Commentary button opens a page listing groups of homologs in order of their abundance (#1 = present in all the genomes included in the display, #2 = present in most of the genomes, ... #15 = present in just a few genomes). This page can be used for annotation. Its unique advantage over straightforward homology projections is that it considers chromosomal context: only close homologs in similar chromosomal clusters are assigned identical specific functions. The table at the very bottom of the Commentary page explains the genome abbreviations and lets you pick specific genomes to be included in the display. Use the check-boxes in the show column on the left to select genomes of interest, then press the Picked Maps Only button at the bottom. A modified “compare region” display will be generated. If necessary, it can be fine-tuned further by changing the similarity threshold.

II.7. Functional Coupling. Functionally related genes tend to cluster on prokaryotic chromosomes. This fact is the basis for a number of techniques used to gain clues about the function of hypothetical proteins. The genes with an FC link in the fc column of the context table appear to have some evidence supporting the hypothesis that they tend to co-occur with the gene you are positioned on. Click on the FC link. This produces a visual depiction of the co-occurrences. Although it looks very similar to the output of the Compare region tool, please keep in mind that the two illustrate very different things. Compare region simply shows you the immediate neighborhoods of the closest homologs of your gene in other genomes, regardless of the presence or absence of any other genes in its vicinity, whereas the FC display includes only those genomes in which the two “functionally coupled” genes are co-localized or “clustered” on the chromosome.
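The abundance ordering used on the Commentary page can be pictured with a short Python sketch. It assumes each group of homologs is given as a label plus the set of genomes in which a member was found; the function name rank_groups and the tie-breaking by label are illustrative choices, not SEED behavior.

```python
from typing import Dict, List, Set, Tuple


def rank_groups(groups: Dict[str, Set[str]]) -> List[Tuple[int, str, int]]:
    """Number homolog groups from most to least widespread.

    groups maps a group label to the set of genomes containing a member.
    Returns (rank, label, genome_count) with rank 1 for the group present
    in the most genomes; ties are broken alphabetically by label.
    """
    ordered = sorted(groups.items(), key=lambda item: (-len(item[1]), item[0]))
    return [(rank, label, len(genomes))
            for rank, (label, genomes) in enumerate(ordered, start=1)]
```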
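The difference between the two displays can be sketched in Python. The function below keeps only those genomes in which homologs of both genes lie on the same contig within a chosen distance, which is the kind of filtering the FC display implies. The input format, the 5000-base default, and the name coupled_genomes are assumptions made for this example; the tutorial does not describe how SEED actually scores coupling.

```python
from typing import Dict, List, Tuple

# genome label -> (contig name, midpoint coordinate of the homolog)
Positions = Dict[str, Tuple[str, int]]


def coupled_genomes(pos_a: Positions, pos_b: Positions,
                    max_distance: int = 5000) -> List[str]:
    """Genomes where homologs of gene A and gene B are clustered.

    A genome qualifies only if it has homologs of BOTH genes, on the same
    contig, no more than max_distance bases apart. A Compare-region style
    listing, by contrast, would show every genome in pos_a.
    """
    hits = []
    for genome in sorted(pos_a.keys() & pos_b.keys()):
        contig_a, mid_a = pos_a[genome]
        contig_b, mid_b = pos_b[genome]
        if contig_a == contig_b and abs(mid_a - mid_b) <= max_distance:
            hits.append(genome)
    return hits
```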

II.8. The detailed analysis of your protein described above has probably provided some clues to its potential function. You can now annotate your PEG to reflect this knowledge. Keep in mind that for a PEG in SEED to be connected to a Subsystem (SS), its annotation must exactly match the name of the corresponding functional role as it appears in the table of Functional Roles in the SS. There are multiple ways to annotate PEGs in SEED. Choose one of the three strategies listed below to annotate your protein:

(i) Annotating from a PEG page:

(ii) Annotating using Similarities table:

(iii) Annotating from a Commentary page:

More details on annotating PEGs in SEED are available here: Assignment_annotation.pdf

Further help with navigating PEG pages in SEED is available here: PegPageHelp.

Part III. FIND A SUBSYSTEM

There are several ways to find the subsystem(s) that involve the functional role (assignment or annotation) of “your” protein:

(i) If your protein has already been included in one or more subsystems, you should see a link on its PEG page pointing to each subsystem: look under Subsystems in which this peg is present.

(ii) If this is not the case, check whether any homologs of your protein have been included in a subsystem. Such proteins in the Similarity tables (described in II.4-5) will have a numerical entry (1, 2, etc.) in the column In Sub, indicating the number of different Subsystems this PEG is a part of. Go to the respective PEG page (by clicking on its ID) and then follow the subsystem link as described in (i) above.

(iii) You may use the Locate PEGs in Subsystems section on the SEED Entry Page to search for a relevant subsystem using an EC#, a function name (if you are lucky), or a protein ID (follow the instructions on the Entry Page).

(iv) Finally, you may browse the list of subsystems in SEED (or use your browser’s “find in page” functionality) for a potentially relevant term (e.g. NAD biosynthesis). Reach the list of subsystems by clicking the Work on Subsystems button in the section Work on Subsystems Using New, Experimental Code of the SEED Entry page.

Part IV. EXPLORE A SUBSYSTEM

IV.1. Browse a Subsystem (SS) page. It opens with a Table of Functional Roles constituting this SS. The roles are defined by the most standard descriptive names, for example enzyme names and the corresponding Enzyme Classification (EC) numbers, whenever they are available. Note that role names must exactly match gene annotations in the underlying database. Abbreviations of functional roles are used in the Subsystem Spreadsheet below and in SS diagrams.

IV.2. Subsets of Roles table. The concept of sub-sets plays an important role in encoding and interpreting subsystems. Sub-sets usually represent the most compact units, such as multi-subunit complexes or variants of pathways. Examples of sub-sets are “Bacterial type” or “Eukaryotic type”. A star (*) in front of a sub-set's abbreviated name causes all the functional roles grouped in it to collapse into a single column in the Subsystem spreadsheet, a useful feature for displaying synonymous functional roles or subunits of multi-subunit complexes.

IV.3. The Subsystem spreadsheet is simply a table in which each column represents a functional role in the subsystem, each row represents a specific genome, and the cells are populated with the proteins that implement specific functional roles in each organism. Protein IDs in the cells are linked to the corresponding PEG pages.
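Because the match between an annotation and a role name has to be exact, even a stray space or a change of case breaks the link. The Python sketch below illustrates this with a plain string comparison plus a helper that flags near misses. The two role names are real enzyme names used purely as illustrative placeholders, not a copy of any SS role table, and find_role and near_misses are hypothetical helpers rather than SEED functions.

```python
from typing import List, Optional


def find_role(annotation: str, roles: List[str]) -> Optional[str]:
    """Return the role only if the annotation matches it character for character."""
    return annotation if annotation in roles else None


def _normalize(text: str) -> str:
    # Collapse runs of whitespace and ignore case.
    return " ".join(text.split()).casefold()


def near_misses(annotation: str, roles: List[str]) -> List[str]:
    """Roles that differ from the annotation only in spacing or case."""
    wanted = _normalize(annotation)
    return [role for role in roles
            if role != annotation and _normalize(role) == wanted]
```

An annotation that find_role rejects but near_misses reports is usually worth retyping by copying the role name directly from the table.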
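The effect of a starred sub-set can be sketched in Python as merging several columns of one spreadsheet row into one. The row layout (role abbreviation mapped to a list of PEG IDs), the placement of the merged column at the end, and the name collapse_subset are assumptions for illustration only.

```python
from typing import Dict, List, Set


def collapse_subset(row: Dict[str, List[str]], subset_name: str,
                    subset_roles: Set[str]) -> Dict[str, List[str]]:
    """Merge the columns listed in subset_roles into one '*name' column.

    row maps a role abbreviation to the PEG IDs found for that role in one
    genome. Roles outside the sub-set are copied unchanged.
    """
    collapsed: Dict[str, List[str]] = {}
    merged: List[str] = []
    for role, pegs in row.items():
        if role in subset_roles:
            merged.extend(pegs)
        else:
            collapsed[role] = list(pegs)
    collapsed["*" + subset_name] = merged
    return collapsed
```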

A small set of tools located immediately under the Subsets of Roles table allows the reduction of spreadsheet display to a selected sub-set of functional roles and/or to a selected group of organisms. Try using them.
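One way to picture the spreadsheet and these display filters is as a nested mapping in Python: genome, then role abbreviation, then the list of PEG IDs in that cell. The sketch below narrows such a mapping to chosen roles and genomes. The data layout and the name restrict are assumptions for illustration; they say nothing about how SEED stores subsystems internally.

```python
from typing import Dict, List, Optional, Set

# genome -> role abbreviation -> PEG IDs in that cell (empty list = empty cell)
Sheet = Dict[str, Dict[str, List[str]]]


def restrict(sheet: Sheet,
             roles: Optional[Set[str]] = None,
             genomes: Optional[Set[str]] = None) -> Sheet:
    """Return a copy of the sheet limited to the given roles and genomes.

    Passing None for either argument leaves that dimension unfiltered.
    """
    view: Sheet = {}
    for genome, row in sheet.items():
        if genomes is not None and genome not in genomes:
            continue
        view[genome] = {role: list(pegs) for role, pegs in row.items()
                        if roles is None or role in roles}
    return view
```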

The main Subsystem visualization/construction tools are located on SS page below the SS Spreadsheet. Try using the following:

(i) sorting. Select the option “by_phylo” and press the update spreadsheet button below. The organisms in the Spreadsheet will be rearranged according to their phylogeny. Selecting “by_pattern” arranges organisms according to the presence or absence of PEGs in the cells of the spreadsheet, a useful tool for analyzing variations in SS implementation in different organisms.

(ii) show clusters. Check the box near “show clusters” and press the update spreadsheet button. The cells containing proteins clustered on a chromosome will be highlighted by a matching color.

(iii) color rows by each organism’s attribute. Choose an attribute from the drop-down menu (e.g. “motile”) and click update spreadsheet. See what happens. A legend explaining the color usage will also appear.

(iv) color columns by each PEG’s attribute. Only one protein attribute, namely membership in PIR protein families, can currently be displayed graphically using this option.
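The idea behind the “by_pattern” sort can be sketched in Python: reduce each genome's row to a presence/absence vector over the roles and sort on that vector, so that organisms with the same pattern end up next to each other. The data layout (genome mapped to role mapped to PEG IDs), the descending order, and the function names are illustrative assumptions, not the SEED implementation.

```python
from typing import Dict, List, Tuple

Sheet = Dict[str, Dict[str, List[str]]]


def presence_pattern(row: Dict[str, List[str]], roles: List[str]) -> Tuple[int, ...]:
    """1 where the genome has at least one PEG for the role, else 0."""
    return tuple(1 if row.get(role) else 0 for role in roles)


def sort_by_pattern(sheet: Sheet, roles: List[str]) -> List[str]:
    """Genome labels ordered so that identical patterns are adjacent.

    Fuller patterns come first; genomes with the same pattern keep their
    original relative order because Python's sort is stable.
    """
    return sorted(sheet,
                  key=lambda genome: presence_pattern(sheet[genome], roles),
                  reverse=True)
```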

IV.4. The NOTES section at the bottom of each SS page contains the annotator’s comments, lists open problems identified during SS construction and analysis, and, most importantly, explains the variant codes, which are listed for each organism in the Variant code column of the SS spreadsheet. While defining a subsystem, annotators include a collection of functional roles broad enough to cover distinct variations in all relevant organisms. Each subset of functional roles that exists in at least one organism with an operational version of the subsystem constitutes a functional variant. Try sorting the SS spreadsheet “by_variant”.

IV.5. A Subsystem diagram (a graphic representation of a pathway) is often helpful in analyzing a SS and assigning variant codes. The graphic map of your SS can be accessed from the Assignment page on the SEED Wiki, from the SEED Forum, or by following this link: http://brucella.uchicago.edu/SubsystemForum/showthread.php?t=81. Open it in a new Tab or Window; you'll need it for the next step.
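As a starting point for thinking about variants, the Python sketch below groups genomes by the exact set of roles they contain. In SEED the variant codes are assigned by annotators, who also judge whether a version of the subsystem is operational; this grouping only proposes candidates and does not reproduce that judgment. The data layout and the name group_by_role_set are assumptions for illustration.

```python
from collections import defaultdict
from typing import Dict, FrozenSet, List

Sheet = Dict[str, Dict[str, List[str]]]


def group_by_role_set(sheet: Sheet) -> Dict[FrozenSet[str], List[str]]:
    """Map each distinct set of populated roles to the genomes that show it."""
    groups: Dict[FrozenSet[str], List[str]] = defaultdict(list)
    for genome, row in sheet.items():
        present = frozenset(role for role, pegs in row.items() if pegs)
        groups[present].append(genome)
    return dict(groups)
```

Each key of the result is one candidate variant; genomes sharing a key implement the subsystem with the same roles.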

Part V. EXPAND A SUBSYSTEM

V.1. To modify an existing Subsystem or to start a new one, you have to enter SEED under your User ID (as it appears on the Assignment page under your Terminal number). If you haven’t yet done so, type your User ID in the “Enter user” window under the Work on Subsystems Using New, Experimental Code caption and press the Work on Subsystems button.

V.2. A list of the SSs available in your version of SEED will be generated. Scroll down this page to the SS named NAD and NADP tutorial #, where # matches the number of your terminal. All these SSs are copies of the “NAD and NADP tutorial” mother-SS prepared beforehand for this tutorial. The procedure for “cloning” a SS while changing its ownership will be explained in class.

Note that in addition to “Export full” and “Export assignments”, several more functions are now available to you, including “reset”, “delete”, and “publish”. Since your User ID exactly matches the User ID the SS was created with, you are the rightful owner of this SS. Click on the SS name to open it.

V.3. Sort organisms in the spreadsheet by phylogeny. Activate “show clusters”.

V.4. Add a new genome to your SS:

Open the SS diagram (http://brucella.uchicago.edu/SubsystemForum/showthread.php?t=81), examine it, compare the added organism to its nearest relatives – try to predict which functional roles are missing from the spreadsheet in this genome.

V.5. Find candidate genes for the missing functional roles:

This activates automatic search for PEGs in an attempt to fill every empty cell in the spreadsheet row corresponding to the genome specified. In 1 to 8 minutes (depending on the work load on the SEED server) a Missing Entries table with candidate genes will be generated at the bottom of your SS page.

V.7. Carefully examine each candidate by opening the corresponding PEG page (follow the links in the PEG column of the Missing Entries table). It is convenient to open each PEG page in its own Tab or Window so as not to lose the SS page with the search results. Analyse close homologs of the candidate genes, their genome context, and the annotations of the candidate genes in other databases. Consider functional context as well: using the SS diagram, rationalize the addition of each candidate functional role in the context of NAD(P) metabolism in your specific genome.

V.8. Candidates that you found acceptable need to be re-annotated so that the gene functions exactly match the functional roles as they appear in the SS. The candidate PEGs will then be automatically connected to your SS spreadsheet.

Candidate genes can be annotated using Similarity table or Commentary page as described in Part II above. But the simplest way to do this now is by using Missing Entries table on your SS page:

V.9. Compare the NAD(P) biosynthesis in the added organism to that in other species, rationalize, try to assess a possible functional variant. Are there any missing genes or open problems?

If you got this far – CONGRATULATIONS!

You are a certified SEED user!

