Manage the pending updates on BIOVIA Direct domain indexes through Oracle scheduled jobs

Background

BIOVIA Direct is an extension to the Oracle RDBMS which allows users to register and search molecules and reactions into the Oracle database. Direct provides an Oracle domain index to index chemical entities so that they can be searched efficiently and fast. 

The calculation of the domain index data is time consuming, and processing them at the time when the chemistry data is registered or updated considerably extends the time of the registration. BIOVIA Direct therefore takes the approach to calculate only partial domain index data in the first place when a record is inserted or updated. These partial index data ensure that the record is found in chemistry searches, however, all records that have only partial index data - the so called pending updates - must be individually checked for a correct match in the search. Depending on the number of pending updates, this can have a major impact on the search performance. Therefore, the missing index data of the pending updates must be calculated on a regular basis as part of the Direct domain index maintenance.

BIOVIA Direct provides the procedures 'mdlaux.updatependingfastsearch' and 'mdlaux.updatependinginversions' for this purpose. For details on these procedures please consult the BIOVIA Direct Administration Guide.  

The update of the pending updates can be initiated while the system is online. The solution below provides an example to create an Oracle scheduled job that will trigger the calculation of the pending updates on a daily basis.


Solution

The following command will create an Oracle scheduled job that runs the 'mdlaux.updatependingfastsearch' and 'mdlaux.updatependinginversions' commands for a particular domain index in a particular Oracle schema. For details on the DBMS_SCHEDULER package and its CREATE_JOB procedure please consult the Oracle documentation.

BEGIN
    DBMS_SCHEDULER.CREATE_JOB (
            job_name => '"CHEMISTRY"."UPDATEPENDING_MYMOLECULES"',
            job_type => 'PLSQL_BLOCK',
            job_action => ' mdlaux.updatependingfastsearch(''MYMOLECULES_MDLIX'',''LOGTABLE=MYMOLECULES_UPDATE_PENDING_LOG'');
                            mdlaux.updatependinginversions(''MYMOLECULES_MDLIX'',''LOGTABLE=MYMOLECULES_UPDATE_PENDING_LOG'');',
            number_of_arguments => 0,
            start_date => NULL,
            repeat_interval => 'FREQ=DAILY;BYTIME=030000',
            end_date => NULL,
            enabled => TRUE,
            auto_drop => FALSE,
            comments => 'This job calculates pending index data of the BIOVIA Direct domain index MYMOLECULES_MDLIX');
   
END;
/

Key:
CHEMISTRY - the Oracle account that owns the domain index

​​​​​​​​​​​​​​UPDATEPENDING_MYMOLECULES - an arbitrary name for the scheduled job

MYMOLECULES_MDLIX - the domain index name in the CHEMISTRY schema

MYMOLECULES_UPDATE_PENDING_LOG - an arbitrary name for a table that logs the outcome of the updatepending commands. The table will be created if it does not exist. If the table exists, the output will be appended. Add the TRUNCATELOGTABLE parameter if you want the logtable to be cleared before each run of an updatepending command. LOGTABLE and TRUNCATELOGTABLE are optional flags.

FREQ=DAILY;BYTIME=030000 - this term specifies the repeat interval of the job, in this case a daily execution at 3:00AM.

It is important to understand that Oracle scheduled jobs run in the context of the account which CREATED the job and not of the account which OWNS it. For instance, if SYSTEM creates the job in the schema CHEMISTRY, SYSTEM is the creator of the job but CHEMISTRY becomes the owner. However, if CHEMISTRY now runs this job, it will run in the SYSTEM account context. With that, the job will typically fail as the SYSTEM account is neither configured (and should not be configured) for the use of Direct, nor does it own the Direct domain index objects that the job is supposed to work on. You therefore must create the scheduled job from the Oracle account that owns the domain index. This requires the account to have - at least temporarily - the CREATE JOB system privilege.