Ortholog-link Setup Instructions
Context
Starting with Pathway Tools release 9.5, the comparative genome
browser allows displaying orthologs between several organisms. To
make this work, the orthologs need to have been precomputed, and the
orthology relation needs to have been recorded by so-called ortholog-links.
Ortholog-links are one type of dblink; dblinks are stored in the
DBLINKS slot. However, there are two distinct ways of storing
ortholog-links: as DBLINKS on gene frames, or in a dedicated MySQL
server (which is advantageous when many organisms need to be compared,
but is more complicated to set up).
For most PGDBs served from http://biocyc.org/ , ortholog-links are
stored on a separate MySQL server, where they are accessible by the
BioCyc WWW server (and from SRI-internal development images).
For Pathway Tools users outside of SRI, the prerequisite is that
ortholog data has been precomputed by the user or is otherwise
available to the user, because SRI does not yet have a mechanism for
distributing ortholog data.
Ortholog-link Server Setup Instructions
This section describes how the MySQL server is loaded with the
ortholog-link data, which happens in two stages.
Stage 1. Dumping out Ortholog-link Flatfiles
One or several flatfiles need to be created, which contain the
ortholog-link data. These files will be loaded into MySQL in Stage 2.
The file format is simple, containing 5 tab-delimited columns. Each
ortholog-link is a link between one gene in one PGDB and another gene
in another PGDB. Each such link has to be mentioned in the total set
of flatfiles only once, because the retrieval query will query in both
directions. This cuts in half the number of links that have to be put
in flatfiles and stored in MySQL.
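Because each link should appear only once, a script that produces the flatfiles has to treat a link from gene A to gene B and the reverse link from B to A as the same record. The following is a minimal Python sketch of that de-duplication, not part of Pathway Tools: the function name dedupe_links and the tuple layout are illustrative assumptions, and the PValue is a made-up number.

```python
def dedupe_links(links):
    """Keep one record per unordered gene pair.

    Each link is a tuple (gene1, gene2, org1, org2, pvalue).
    This layout is an assumption made for this sketch.
    """
    seen = set()
    unique = []
    for gene1, gene2, org1, org2, pvalue in links:
        # A frozenset ignores order, so A->B and B->A produce the same key.
        key = frozenset([(org1, gene1), (org2, gene2)])
        if key in seen:
            continue
        seen.add(key)
        unique.append((gene1, gene2, org1, org2, pvalue))
    return unique


links = [
    ("CC0008", "CBU_0001", "CAULO", "CBUR227377", 1e-50),
    # The same link, listed in the opposite direction.
    ("CBU_0001", "CC0008", "CBUR227377", "CAULO", 1e-50),
]
unique_links = dedupe_links(links)
```

The first occurrence of each pair is kept, so the direction written to the flatfile is simply whichever one the input lists first.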
The 5 columns are: GeneID1, GeneID2, OrgID1, OrgID2, and
PValue. Both GeneID1 and GeneID2 are the frame IDs of gene
frames in their respective PGDBs and need to be unique within their
PGDBs. Both OrgID1 and OrgID2 are the unique IDs for the PGDBs.
PValue is a double-precision floating-point number containing the p-value
of the BLAST score. It is effectively optional, as Pathway Tools does not
currently use the PValue information for anything. An example line from a
flatfile looks like:
CC0008 CBU_0001 CAULO CBUR227377 .00000000000000000000000000000000000000000000000000000000000023
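The five-column layout can be written and checked with a few lines of Python. This is only a sketch: format_link and parse_link are hypothetical helper names, the PValue is passed through as text exactly as supplied, and the sketch assumes every line carries a numeric PValue even though the column is effectively optional for Pathway Tools.

```python
def format_link(gene1, gene2, org1, org2, pvalue):
    """Return one tab-delimited flatfile line for a single ortholog-link."""
    fields = [gene1, gene2, org1, org2, str(pvalue)]
    for field in fields:
        # A tab or newline inside a field would corrupt the column layout.
        if not field or "\t" in field or "\n" in field:
            raise ValueError(f"bad field: {field!r}")
    float(fields[4])  # raises ValueError if the PValue is not numeric
    return "\t".join(fields) + "\n"


def parse_link(line):
    """Split a flatfile line back into its five columns."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 5:
        raise ValueError(f"expected 5 columns, got {len(fields)}")
    gene1, gene2, org1, org2, pvalue = fields
    return gene1, gene2, org1, org2, float(pvalue)


# The gene and organism IDs come from the example line; the PValue is made up.
line = format_link("CC0008", "CBU_0001", "CAULO", "CBUR227377", "1e-50")
record = parse_link(line)
```

Running parse_link over every line of a flatfile before Stage 2 is a cheap way to catch a wrong column count before loading.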
Stage 2. Populating MySQL from the Ortholog-link Flatfiles
- Ensure that a MySQL server is running and that your account has
permission to create a table in a database and to load data
(see MySQL server details below).
- Create the "orthologs" database schema in your database:
mysql> create database orthologs;
NOTE: Whatever name you give the ortholog database,
set that name in ptools-init.dat via the Ortho-RDBMS-Database-Name
configuration directive.
- Start up a Pathway Tools image.
- Ensure that the ec::*ortholog-link-host* variable is set
correctly, pointing to the ortholog-link server. It is set by the
Ortho-RDBMS-Server-Hostname parameter in the ptools-init.dat file,
along with the related parameters listed under Pathway Tools
configuration.
- The ortholog-link data is stored in one SQL table called
Orthologs. If this table already exists and was used
for a prior version of the data, then it needs to be dropped
by running the following at the LISP prompt:
(connect-to-ortholog-link-db-if-needed)
(dbi.mysql:sql "DROP TABLE Orthologs" :db *ortholog-link-db*)
- Create the Orthologs table, populate it from the ortholog-link
flatfiles, and build the indices by running the following at the LISP prompt:
(init-ortho-link-db "/var/ortholog-link-flat-files/")
Replace the example path "/var/ortholog-link-flat-files/" with the directory where the flatfiles are located.
Depending on the number of PGDBs, this can take from tens of minutes to several hours to run to completion.
For the 189 PGDBs of the 9.5 release, this took 37 min., running on cumin. (kr:Nov-6-2005)
For 400 11.5 PGDBs, it took over 7 hrs., running on baharat. (kr:Oct-1-2007).
- The ortholog-link server should now be ready to use.
MySQL server details
- The mysql server usually runs as its own user (mysql). Ensure that:
- Your ortholog data files AND directories are accessible by the mysql user/group.
- Newer Linux distros (in particular Debian-based ones) use a security
feature called AppArmor that limits which files and directories services
like mysql can access. The AppArmor profile for mysql is usually stored in:
/etc/apparmor.d/usr.sbin.mysqld
Review and, if necessary, adjust the paths in that profile so that mysql
has access to your data files.
- Ensure you have enough free disk space for your ortholog data on the mysql server.
One of our MySQL servers once ran out of disk space while the indices were being
built. The problem is that it ended up hanging forever and never
returned any kind of error message. Running
df
should indicate whether a disk partition
is 100% full. Also, the MySQL logs, stored at
/var/log/mysql/, are likely to contain a disk space error
message. However, ordinary users do not have read permissions for
these logs.
- Ensure the user account you use to load the data has sufficient permissions to
load data. Currently, our mysql interface only supports server-side loading of
data files.
grant file on *.* to dbuserid@localhost identified by 'dbpassword';
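Two of the checks above, free disk space and file accessibility, can be scripted before starting a long load. The Python sketch below is a rough aid under stated assumptions: free_fraction and paths_not_open_to_others are hypothetical helper names, and testing the "other" permission bits is only a coarse proxy for "readable by the mysql user", since access may also be granted through group membership, depends on the parent directories, and can still be blocked by AppArmor.

```python
import os
import shutil
import stat


def free_fraction(path):
    """Return the fraction of free space on the partition holding path."""
    usage = shutil.disk_usage(path)
    return usage.free / usage.total


def paths_not_open_to_others(directory):
    """List directories and files under directory that users outside the
    owner and group cannot read (or, for directories, also enter)."""
    problems = []
    for root, _dirs, files in os.walk(directory):
        mode = os.stat(root).st_mode
        # A directory needs both the read and the search (execute) bit.
        if not (mode & stat.S_IROTH and mode & stat.S_IXOTH):
            problems.append(root)
        for name in files:
            path = os.path.join(root, name)
            if not os.stat(path).st_mode & stat.S_IROTH:
                problems.append(path)
    return problems
```

For example, paths_not_open_to_others("/var/ortholog-link-flat-files/") lists candidate permission problems in the flatfile directory, and free_fraction called on the MySQL data directory (its location depends on your installation) shows how close that partition is to full.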
Pathway Tools configuration
In order to make use of the MySQL database for ortholog queries, you must
modify a few Pathway Tools parameters stored in the ptools-init.dat
configuration file.
These are the parameters you need to configure:
- Ortho-RDBMS-Server-Port 3306 (default mysql port, ask your DBA if you're not sure)
- Ortho-RDBMS-Database-Name XXXXX (whatever name you called your ortholog database in stage 2, step #2 above).
- Ortho-RDBMS-Username XXXXX (username to access your mysql DB)
- Ortho-RDBMS-Password XXXXX (password you use to access your mysql DB)
- Get-Orthologs-From-SRI N (If you're behind a firewall, you'll want to set
this to "N"; otherwise, each ortholog query will also attempt to query SRI's
public ortholog database.)
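A quick way to confirm that none of these parameters was forgotten is to scan the configuration text for them. The sketch below is illustrative only: it assumes that each setting in ptools-init.dat sits on its own line as a name followed by whitespace and a value, and that lines starting with "#" are comments. Check both assumptions against your own file. The sample values are placeholders.

```python
REQUIRED = (
    "Ortho-RDBMS-Server-Hostname",
    "Ortho-RDBMS-Server-Port",
    "Ortho-RDBMS-Database-Name",
    "Ortho-RDBMS-Username",
    "Ortho-RDBMS-Password",
    "Get-Orthologs-From-SRI",
)


def read_settings(text):
    """Parse 'Name value' lines into a dict (assumed file layout)."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and assumed comment lines
        parts = line.split(None, 1)
        settings[parts[0]] = parts[1].strip() if len(parts) > 1 else ""
    return settings


def missing_parameters(text):
    """Return the required parameter names that are absent or empty."""
    settings = read_settings(text)
    return [name for name in REQUIRED if not settings.get(name)]


# Placeholder configuration text; the password line is left out on purpose.
sample = """\
Ortho-RDBMS-Server-Hostname localhost
Ortho-RDBMS-Server-Port 3306
Ortho-RDBMS-Database-Name orthologs
Ortho-RDBMS-Username dbuserid
Get-Orthologs-From-SRI N
"""
missing = missing_parameters(sample)
```

In this sample, missing contains only Ortho-RDBMS-Password, the one parameter the placeholder text omits.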
Please let us know if you run into trouble with any of this, and we
will help guide you through it. Very few of our users have
experimented with their own ortholog-links, so the setup is not very
user-friendly yet.