P2pImplementation

Notes on the current (July 2005) implementation of the Peer to Peer exchange mechanism.

This is an attempt to capture the current architecture, with an eye toward cleaning it up.

See the P2pMap page for a cartoon of the pieces, and P2pOverview for a textual description of the processes depicted in the image.

1 The pieces.

1.1 Clearinghouse database

The peer-to-peer tools coordinate by rendezvous through a SQL database on the clearinghouse. It has the following tables.

1.1.1 Table seed_registration

Maintains information about the SEED instances that are registered with the P2P system.

CREATE TABLE seed_registration ( 
  seed_id text, 
  display_name text, 
  url text, 
  last_active int(11) default NULL 
)  

1.1.2 Table seed_query

Holds P2P requests. requestor_seed_id is the SEED id of the system requesting a response from the responder, responder_seed_id.

the_query is a pickled Python data structure containing the actual query. The P2P system neither knows nor cares about the internal structure of the request.

file_name is the file on the clearinghouse filesystem that contains the result of the request. It is written by the upload_fulfilled CGI on the clearinghouse.

CREATE TABLE seed_query ( 
  id int(11) NOT NULL auto_increment, 
  requestor_seed_id text, 
  responder_seed_id text, 
  the_query blob, 
  request_time int(11) default NULL, 
  query_status text, 
  file_name text, 
  message text, 
  PRIMARY KEY  (id) 
)  

1.1.3 Table news

CREATE TABLE news ( 
  news_id int(11) NOT NULL auto_increment, 
  add_date int(11) default NULL, 
  target text, 
  teaser text, 
  news_item text, 
  PRIMARY KEY  (news_id) 
) 

1.2 new_seed_update_page.py

Presents the toplevel client interface to the P2P system.

The user is presented with a list of SEED instances and a set of available operations.

The seed instance list is created by invoking SEED_registration_db.cgi with arguments

    function: "get_list" 
    group: group name 

It returns a tab-delimited list of lines

    seed_id 
    display name 
    seed URL 
    last activity time 
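
A minimal sketch of how a client might parse that response, assuming each line carries exactly the four fields above separated by tabs and that the last activity time is an integer (as in the seed_registration table). The names SeedEntry and parse_seed_list are hypothetical, not taken from the SEED code.

```python
from typing import List, NamedTuple, Optional


class SeedEntry(NamedTuple):
    seed_id: str
    display_name: str
    url: str
    last_active: Optional[int]  # seconds since the epoch, None if absent


def parse_seed_list(body: str) -> List[SeedEntry]:
    """Parse tab-delimited get_list output into SeedEntry records."""
    entries = []
    for line in body.splitlines():
        fields = line.split("\t")
        if len(fields) != 4:
            continue  # skip blank or malformed lines rather than guessing
        seed_id, display_name, url, last_active = fields
        stamp = int(last_active) if last_active.strip().isdigit() else None
        entries.append(SeedEntry(seed_id, display_name, url, stamp))
    return entries
```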

The generated form invokes new_p2p_packager.cgi with the following arguments.

    package_thing: Type of request being made. Currently one of 
         Lightweight_Code, Translation_Rules, Annotations, Assignments 
    source: seed ID of the SEED instance to be queried. 

1.3 new_p2p_packager.cgi

1.3.1 If argument `install` is set

Process the following arguments:

    install: a string containing "Install query %s" where %s is the query ID of the package to install 
      (this comes from a submit button where the query id is embedded in the label) 
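
Recovering the query id from the button label could look roughly like this. It is a sketch that assumes the id is a plain integer (the id column of seed_query is an int); the function name query_id_from_label is hypothetical.

```python
import re
from typing import Optional

# Matches the submit-button label described above, e.g. "Install query 42".
INSTALL_LABEL = re.compile(r"^Install query (\d+)$")


def query_id_from_label(label: str) -> Optional[int]:
    """Return the query id embedded in the label, or None if it does not match."""
    match = INSTALL_LABEL.match(label.strip())
    return int(match.group(1)) if match else None
```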

Invokes download_a_query.cgi on the clearinghouse with the following arguments:

    query_id: query id to download 

download_a_query.cgi returns a URL, which is then retrieved. The contents of that URL are unpickled and expected to be a list of dictionaries, of which only the first is examined. The following keys are retrieved from the request dict:

    file_name: Name of the file on the clearinghouse for this package. 
    name: Type of package this is (translation_rules, annotation, assignments, etc). 

A file url is constructed using the basename of the file_name value from the request. That file is retrieved, and the SEED routine install_<package type> is invoked. If the install succeeds, we update the query status to be "installed", otherwise it is updated as "install failed". The status is written back to the clearinghouse using the set_status.cgi script on the clearinghouse with the following arguments:

    query_id: query id to be updated 
    query_status: new status for the query 
    message: message to be written with the status 

The result of the install is presented to the user, and the script exits.
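
The install path above can be sketched as follows. This is an illustration only: the installers mapping stands in for the SEED install_<package type> routines, the download of the package file is left out, and the presence of a query_id key in the request dict is an assumption based on the download_a_query.cgi description. Unpickling is only reasonable here because the data comes from a trusted clearinghouse.

```python
import os
import pickle


def install_package(pickled_queries: bytes, installers: dict) -> tuple:
    """Return (query_id, query_status, message) for the first pickled query."""
    queries = pickle.loads(pickled_queries)  # trusted source assumed
    request = queries[0]  # only the first dict is examined
    query_id = request.get("query_id")
    # Only the basename of file_name is used to locate the package.
    package_file = os.path.basename(request["file_name"])
    installer = installers.get("install_" + request["name"])
    if installer is None:
        return query_id, "install failed", "no installer for " + request["name"]
    try:
        installer(package_file)  # hypothetical stand-in for the SEED routine
    except Exception as exc:
        return query_id, "install failed", str(exc)
    return query_id, "installed", ""
```

The returned triple corresponds to the query_id, query_status, and message arguments that would then be sent to set_status.cgi.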

1.3.2 If argument `delete` is set

The value of the delete argument is used as a query id, and the delete_pending.cgi script is invoked on the clearinghouse with the following arguments set:

    query_id: query to delete 

The browser is redirected to the new_seed_update_page page and the script exits.

1.3.3 Otherwise...

We process the following arguments:

    source: SEED id to take update from (actually seed_id <tab> name) 
    package_thing: type of package we are requesting 

Based on value of package_thing, we process differently.

1.3.3.1 Requesting assignments

    organisms: List of organisms to pull assignments from 
    user: User to pull assignments from 
    who:  User to ??? 
    date: Date after which assignments are taken 

If any of organisms, user, or date is missing, make_assignments_page() is invoked to return a form that allows the user to choose these items. An argument list is constructed with the following values:

    user: username 
    who: who name 
    date: date to pull assignments from 
    organisms: list of organisms 

1.3.3.2 Requesting annotations

    organisms: List of organisms to pull annotations from 
    who:  User to ??? 
    date: Date after which annotations are taken 

If any of organisms, who, or date is missing, make_annotations_page() is invoked to return a form that allows the user to choose these items. An argument list is constructed with the following values:

    who: who name 
    date: date to pull annotations from 
    organisms: list of organisms 
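
The "is anything missing?" check for the two request types might be expressed as below. This is a sketch: the REQUIRED table and missing_arguments are hypothetical names, and the package_thing spellings are taken from the list on the update page.

```python
# Arguments that must be present before a request can be submitted.
REQUIRED = {
    "Assignments": ("organisms", "user", "date"),
    "Annotations": ("organisms", "who", "date"),
}


def missing_arguments(package_thing: str, args: dict) -> list:
    """Return the required argument names that are absent or empty."""
    return [name for name in REQUIRED.get(package_thing, ()) if not args.get(name)]
```

A non-empty result would correspond to showing the selection form (make_assignments_page() or make_annotations_page()) instead of submitting the request.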

1.3.4 Common code

In any case, we add the following arguments:

    requestor_seed_id: This SEED's id 
    responder_seed_id: seed id we are requesting data from 
    source: seed id <tab> seed name 
    package_thing: type of package requested 

The upload_request.cgi script is then invoked on the clearinghouse, and the status printed to the user.

1.4 upload_request.cgi

Clearinghouse CGI. Takes the following arguments:

    package_thing: installation type 
    responder_seed_id: 
    requestor_seed_id: 
    source: 
    user: 
    who: 
    date: 
    organisms: 

Each argument's value above is stored in a Python dict (the "request"). The request is written to the seed_query table in the database with an initial status of "pending" and an empty filename.
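
A rough sketch of that insert, using Python's sqlite3 module as a stand-in for whatever SQL database the clearinghouse actually runs. The function names, the choice to record the current time in request_time, and the use of the argument names as dict keys are assumptions made for illustration.

```python
import pickle
import sqlite3
import time

REQUEST_KEYS = ("package_thing", "responder_seed_id", "requestor_seed_id",
                "source", "user", "who", "date", "organisms")


def create_schema(conn: sqlite3.Connection) -> None:
    """Create a sqlite approximation of the seed_query table."""
    conn.execute(
        "CREATE TABLE seed_query (id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " requestor_seed_id TEXT, responder_seed_id TEXT, the_query BLOB,"
        " request_time INTEGER, query_status TEXT, file_name TEXT, message TEXT)"
    )


def store_request(conn: sqlite3.Connection, form: dict) -> int:
    """Pickle the request and insert it as 'pending' with an empty file name."""
    request = {key: form.get(key) for key in REQUEST_KEYS}
    cursor = conn.execute(
        "INSERT INTO seed_query (requestor_seed_id, responder_seed_id, the_query,"
        " request_time, query_status, file_name) VALUES (?, ?, ?, ?, ?, ?)",
        (request["requestor_seed_id"], request["responder_seed_id"],
         pickle.dumps(request), int(time.time()), "pending", ""),
    )
    conn.commit()
    return cursor.lastrowid  # the new query id
```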

1.5 set_status.cgi

Clearinghouse CGI. Accepts arguments

    query_id: query id to update 
    query_status: new status value 
    message: message to tag query with 

and updates the status of the specified query.

1.6 download_a_query.cgi

Clearinghouse CGI. Accepts arguments

    query_id: query to download 

Writes a temp file with a pickled list of queries, where each query is a dict with keys

    query_id: the query id 
    query_status: current status 
    file_name: result filename 
    message: output message 

Returns the URL to the temp file.

1.7 delete_pending.cgi

Deletes a query. Arguments:

    query_id: query to delete 

1.8 seed_registration.py

Periodically invokes SEED_registration.cgi on the clearinghouse with arguments

    function: register 
    group: peer group (if found in seed_peer_group.cfg, otherwise None) 
    name: hostname 
    url: local FIG url 

1.9 www-unix/~disz/SEED_registration.cgi

Updates the SEED.reg flatfile (python pickle) with information.

1.10 new_seed_registration.py

Periodically invokes do_register, process_updates, and do_get_news.

1.10.1 SEED registration

do_register() invokes SEED_registration_db_news.cgi on the clearinghouse, with arguments

    seed_id = this seed's ID (a uuid) 
    function = register 
    group = peer group (if found in seed_peer_group.cfg, otherwise None) 
    name = hostname 
    url = local FIG url 

1.10.2 Update processing

process_updates() invokes download_for_responder.cgi on the clearinghouse, passing arguments

    seed_id = this seed's ID 
    status = "pending" 

If there are updates to process, download_for_responder returns a URL to a file containing a set of queries formatted as a python pickled data structure.

We download the query url, and extract the list of queries from the pickle.

An argument list is constructed from the request:

    package_thing: value of the `name` entry 
    user: value of the `user` entry if present 
    who: value of the `who` entry if present 
    date: value of the `date` entry if present 
    organisms: value of the `organisms` entry if present 

The getfilename.cgi script is invoked locally with these arguments. It returns either an error code or a URL. If it returns a URL, the contents of the URL are loaded, and that output uploaded to the clearinghouse using the upload_fulfilled CGI.
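
The argument construction and the handling of the getfilename.cgi reply can be sketched like this. Both function names are hypothetical, and the "Error:" prefix test assumes the error format described for getfilename.cgi.

```python
OPTIONAL_KEYS = ("user", "who", "date", "organisms")


def build_packaging_args(request: dict) -> dict:
    """Map a downloaded request dict onto getfilename.cgi arguments."""
    args = {"package_thing": request["name"]}
    for key in OPTIONAL_KEYS:
        if key in request:  # copied only if present
            args[key] = request[key]
    return args


def parse_getfilename_response(body: str) -> tuple:
    """Return (url, error); exactly one of the two is None."""
    body = body.strip()
    if body.startswith("Error:"):
        return None, body[len("Error:"):].strip()
    return body, None
```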

1.10.3 News handling

We determine the last news data item by reading the last_news file. Construct an argument list:

    function: "get_news" 
    last_news: last news value 
    seed_id: local SEED id 

Invoke the SEED_registration_db_news CGI on the clearinghouse. This returns a list of news items newer than last_news. Each new item is written to a set of files FIG/var/News/<id>.teaser, <id>.date, <id>.news. The last_news file is updated with the latest news item returned.
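
Writing the news files and advancing last_news might look roughly as follows. This is a sketch under stated assumptions: items arrive as already-parsed (news_id, date, teaser, text) tuples with integer ids, and store_news and its parameters are hypothetical names.

```python
import os


def store_news(news_dir: str, last_news_path: str, items) -> object:
    """Write <id>.teaser, <id>.date and <id>.news files; return the newest id."""
    latest = None
    os.makedirs(news_dir, exist_ok=True)
    for news_id, date, teaser, text in items:
        for suffix, value in (("teaser", teaser), ("date", date), ("news", text)):
            path = os.path.join(news_dir, "%s.%s" % (news_id, suffix))
            with open(path, "w") as handle:
                handle.write(str(value))
        latest = news_id if latest is None else max(latest, news_id)
    if latest is not None:
        # Only touch last_news when something new actually arrived.
        with open(last_news_path, "w") as handle:
            handle.write(str(latest))
    return latest
```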

1.11 www-unix/~disz/SEED/SEED_registration_db_news.cgi

Accepts the following arguments:

    function: "register", "get_list", "get_news" 

1.11.1 Function register

Accepts the following arguments:

    seed_id: ID of registering SEED 
    group:  
    url: 
    name: 

Updates seed_registration table with information for this SEED.

1.11.2 Function get_list

Returns a set of tab-delimited lines of data, one for each SEED in the seed_registration table:

    seed_id 
    display name 
    SEED url 
    last-active time (integer seconds since the epoch) 

1.11.3 Function get_news

Accepts the following arguments:

    seed_id: SEED id we are retrieving news for 
    last_news: id of last news item retrieved 

Queries the news table for all items with ID greater than last_news and where the news target is either "ALL" or seed_id. Return is a list of tab-delimited news items:

    news_id 
    date added to news 
    teaser line (URL quoted) 
    news item text (URL quoted) 
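
On the receiving side, decoding those lines could be sketched as below. It assumes the teaser and text were quoted with plain percent-encoding (so "+" is not treated as a space) and that the id and date are integers; parse_news is a hypothetical name.

```python
from urllib.parse import unquote


def parse_news(body: str) -> list:
    """Parse tab-delimited news lines into (news_id, date, teaser, text) tuples."""
    items = []
    for line in body.splitlines():
        fields = line.split("\t")
        if len(fields) != 4:
            continue  # skip blank or malformed lines
        news_id, add_date, teaser, text = fields
        items.append((int(news_id), int(add_date), unquote(teaser), unquote(text)))
    return items
```

URL quoting keeps tabs and newlines inside a news item from colliding with the field and line delimiters.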

1.12 www-unix/~disz/SEED/upload_fulfilled.cgi

Handles the uploading of a fulfilled request.

Accepts the following arguments:

    file: CGI file upload protocol file contents 
    query_id: query id we are uploading results for 

Writes the uploaded file contents to a file, and updates the seed_query table entry for query_id to set the status to 'fulfilled' and the filename to the new filename.

1.13 www-unix/~disz/SEED/download_for_responder.cgi

This clearinghouse CGI is responsible for forwarding any pending requests for a given SEED to that SEED. It accepts the following arguments:

    seed_id: seed id of requesting SEED 
    status: status of records we are interested in 

Executes the following query in the clearinghouse database:

    SELECT *  
    FROM seed_query 
    WHERE responder_seed_id = <seed_id> and query_status = <status> 

if a status is passed, or

    SELECT *  
    FROM seed_query 
    WHERE responder_seed_id = <seed_id> 

otherwise.
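
One way to build the two query variants with bound parameters rather than string interpolation is sketched here. The "?" placeholder is the style used by Python's sqlite3 module; other database drivers use different markers. The name responder_query is hypothetical.

```python
def responder_query(seed_id: str, status: str = None) -> tuple:
    """Return (sql, params) for the pending-request lookup described above."""
    sql = "SELECT * FROM seed_query WHERE responder_seed_id = ?"
    params = [seed_id]
    if status:
        # The status filter is added only when a status was passed.
        sql += " AND query_status = ?"
        params.append(status)
    return sql, tuple(params)
```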

Each row of the query result is used to construct a request dict. The request dict is initialized by unpickling the pickled query from the the_query column of the table. The query id, status, file_name, and message columns are then inserted into the request dict.

The list of request dicts is pickled into a temporary file, and the URL of that file is returned to the client.
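
The row-to-request conversion and the temporary pickle file can be sketched as follows. Rows are modelled as plain mappings keyed by column name, and the dict keys query_id, query_status, file_name, and message are borrowed from the download_a_query.cgi description; both are assumptions, as are the function names.

```python
import pickle
import tempfile


def rows_to_requests(rows) -> list:
    """Unpickle each row's query and attach the bookkeeping columns."""
    requests = []
    for row in rows:
        request = pickle.loads(row["the_query"])
        request.update(query_id=row["id"], query_status=row["query_status"],
                       file_name=row["file_name"], message=row["message"])
        requests.append(request)
    return requests


def write_requests(requests: list, directory: str = None) -> str:
    """Pickle the request list into a temp file and return its path."""
    with tempfile.NamedTemporaryFile("wb", suffix=".pickle", dir=directory,
                                     delete=False) as handle:
        pickle.dump(requests, handle)
        return handle.name
```

Mapping the returned path to a URL depends on the web server layout and is not shown.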

?? Why is the pickled list of request dicts not returned directly in the response?

1.14 getfilename.cgi

SEED CGI. Accepts the following arguments

    package_thing: type of package to process 
    user:  
    who: 
    date: 
    organisms: organism list 

Execute a packaging command on the SEED. The command is constructed as

    FIG/bin/package_<package type> <output_file> 

Additional arguments are appended for user, who, and date; these take the form

   argname=argvalue 

For example, user=master:BobO date=01/01/2005

The organism list is passed as the final set of arguments to the script. If all organisms are desired, none are passed.

The command is run, writing its output to the specified output file. If an error occurred, output of "Error: <error message>" is written. Otherwise the URL to the output file is returned.
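
Assembling that command as an argument list might look like this. It is a sketch: build_package_command is a hypothetical name, and the relative FIG/bin path is copied from the text above rather than resolved to a real install location.

```python
def build_package_command(package_type: str, output_file: str, user: str = None,
                          who: str = None, date: str = None, organisms=()) -> list:
    """Build the packaging command as a list of arguments."""
    command = ["FIG/bin/package_" + package_type, output_file]
    for name, value in (("user", user), ("who", who), ("date", date)):
        if value:
            command.append("%s=%s" % (name, value))  # argname=argvalue
    command.extend(organisms)  # an empty list means "all organisms"
    return command
```

For example, build_package_command("assignments", "/tmp/out", user="master:BobO", date="01/01/2005") yields a list that can be handed to subprocess.run without going through a shell, which sidesteps quoting problems in user-supplied values.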

