Database Integrated Preassembly Tools

The database runs incremental preassembly on the raw statements to generate the preassembled (PA) statements. The code for this task is defined here, principally in DbPreassembler. This module also defines procedures for running these jobs on AWS.

Database Preassembly (indra_db.preassembly.preassemble_db)

This module defines a class that manages preassembly for a given list of statement types on the local machine.

exception indra_db.preassembly.preassemble_db.IndraDBPreassemblyError
exception indra_db.preassembly.preassemble_db.UserQuit
class indra_db.preassembly.preassemble_db.DbPreassembler(batch_size=10000, s3_cache=None, print_logs=False, stmt_type=None, yes_all=False, ontology=None)

Class used to manage the preassembly pipeline.

Parameters

batch_size (int) – The maximum number of statements to handle at a time. In general, a larger batch size is somewhat faster but requires much more memory.
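The trade-off described for batch_size can be illustrated with a minimal batching sketch. This is not the code used by DbPreassembler; the helper name iter_batches is hypothetical, and the sketch only shows why a larger batch means fewer passes but more items held in memory at once.

```python
from itertools import islice


def iter_batches(items, batch_size=10000):
    """Yield successive lists holding at most batch_size items each.

    Hypothetical helper for illustration; not part of indra_db.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be a positive integer")
    it = iter(items)
    while True:
        # Only one batch is materialized in memory at a time.
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch


# 25 stand-in "statements" handled 10 at a time.
batches = list(iter_batches(range(25), batch_size=10))
print([len(b) for b in batches])  # [10, 10, 5]
```

With batch_size=25 the same input would be handled in a single pass, at the cost of holding all 25 items at once.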

create_corpus(db, continuing=False)

Initialize the table of preassembled statements.

This method finds the set of unique knowledge represented in the table of raw statements and populates the table of preassembled statements (PAStatements/pa_statements), while maintaining links between the raw statements and their unique (PA) counterparts. It also finds the refinement/support relationships between unique statements and records them in the PASupportLinks/pa_support_links table.

For more detail on preassembly, see indra/preassembler/__init__.py
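The three outputs described above (unique statements, raw-to-unique links, and support links) can be sketched on toy data. Everything here is an assumption made for illustration: the RawStmt tuple, the match_key and refines helpers, and the rule that a statement with a residue refines an otherwise identical statement without one. The real pipeline relies on INDRA's preassembler and ontology and works against database tables, not in-memory dicts.

```python
from collections import namedtuple

# Toy stand-in for a row in the raw statements table (hypothetical schema).
RawStmt = namedtuple("RawStmt", "raw_id stmt_type subj obj residue")


def match_key(stmt):
    """Key identifying the unique knowledge a raw statement expresses."""
    return (stmt.stmt_type, stmt.subj, stmt.obj, stmt.residue)


def refines(specific, general):
    """Toy refinement rule: same statement, but with a residue filled in."""
    return (specific[:3] == general[:3]
            and general[3] is None and specific[3] is not None)


def build_corpus(raw_stmts):
    # Maps each unique key to the raw ids supporting it; this plays the
    # role of both the unique table and the raw-to-unique links.
    pa_statements = {}
    for stmt in raw_stmts:
        pa_statements.setdefault(match_key(stmt), []).append(stmt.raw_id)
    # Pairs (more specific key, more general key) between unique statements.
    support_links = [(s, g) for s in pa_statements for g in pa_statements
                     if refines(s, g)]
    return pa_statements, support_links


raw = [
    RawStmt(1, "Phosphorylation", "MAP2K1", "MAPK1", None),
    RawStmt(2, "Phosphorylation", "MAP2K1", "MAPK1", "T185"),
    RawStmt(3, "Phosphorylation", "MAP2K1", "MAPK1", None),
]
pa_statements, support_links = build_corpus(raw)
print(len(pa_statements))  # 2
print(len(support_links))  # 1
```

Raw statements 1 and 3 collapse into one unique statement, and the residue-specific statement is recorded as refining it.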

supplement_corpus(db, continuing=False)

Update the table of preassembled statements.

This method will take any new raw statements that have not yet been incorporated into the preassembled table, and use them to augment the preassembled table.

The resulting updated table is indistinguishable from the result you would get by re-running preassembly on all of the raw statements, old and new.
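The equivalence between an incremental update and a full re-run can be checked on toy data. This is a sketch under stated assumptions, not the logic of supplement_corpus: statements are plain tuples (raw_id, type, subject, object, residue), the function names are hypothetical, and the refinement rule is the same toy rule of "residue filled in refines residue missing".

```python
def key_of(stmt):
    """Everything except the raw id identifies the unique statement."""
    return stmt[1:]


def refines(specific, general):
    """Toy refinement rule: same statement, but with a residue filled in."""
    return (specific[:3] == general[:3]
            and general[3] is None and specific[3] is not None)


def preassemble_all(raw_stmts):
    """Full run: deduplicate everything, then compare all unique pairs."""
    pa = {}
    for stmt in raw_stmts:
        pa.setdefault(key_of(stmt), set()).add(stmt[0])
    links = {(s, g) for s in pa for g in pa if refines(s, g)}
    return pa, links


def supplement(pa, links, new_raw):
    """Incremental run: only keys that are new get compared for support."""
    pa = {k: set(v) for k, v in pa.items()}  # do not mutate the inputs
    links = set(links)
    new_keys = set()
    for stmt in new_raw:
        k = key_of(stmt)
        if k not in pa:
            new_keys.add(k)
        pa.setdefault(k, set()).add(stmt[0])
    for n in new_keys:
        for k in pa:  # covers both old keys and the other new keys
            if refines(n, k):
                links.add((n, k))
            if refines(k, n):
                links.add((k, n))
    return pa, links


old = [(1, "Phosphorylation", "MAP2K1", "MAPK1", None),
       (2, "Activation", "BRAF", "MAP2K1", None)]
new = [(3, "Phosphorylation", "MAP2K1", "MAPK1", "T185"),
       (4, "Phosphorylation", "MAP2K1", "MAPK1", None)]

incremental = supplement(*preassemble_all(old), new)
full = preassemble_all(old + new)
print(incremental == full)  # True
```

The incremental path skips pairs of already-known unique statements, since their links were found in an earlier run; that is the source of the savings in this sketch.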

indra_db.preassembly.preassemble_db.shash(s)

Get the shallow hash of a statement.

indra_db.preassembly.preassemble_db.make_graph(unique_stmts, match_key_maps)

Create a networkx graph of the statements and their links.

A Class to Manage and Monitor AWS Batch Jobs (indra_db.preassembly.submitter)

Allows a manager to monitor the Batch jobs, both to prevent runaway jobs and to smooth out job runs and submissions.
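The two goals named above, stopping runaway jobs and smoothing submissions, can be sketched with a toy in-memory manager. This does not reflect the submitter module's actual classes or any AWS API: ToyJobManager, max_active, and max_runtime are hypothetical names, and time is passed in explicitly so the behavior is deterministic.

```python
from collections import deque


class ToyJobManager:
    """Illustrative stand-in for a Batch job monitor (not indra_db code)."""

    def __init__(self, max_active=2, max_runtime=100):
        self.max_active = max_active    # cap on concurrently running jobs
        self.max_runtime = max_runtime  # a job running longer is "runaway"
        self.pending = deque()          # submitted but not yet started
        self.active = {}                # job name -> start time
        self.killed = []                # jobs stopped for running too long

    def submit(self, name):
        self.pending.append(name)

    def finish(self, name):
        self.active.pop(name, None)

    def poll(self, now):
        # Stop any job that has exceeded its allowed runtime.
        for name, started in list(self.active.items()):
            if now - started > self.max_runtime:
                del self.active[name]
                self.killed.append(name)
        # Start queued jobs only while there is spare capacity.
        while self.pending and len(self.active) < self.max_active:
            self.active[self.pending.popleft()] = now


mgr = ToyJobManager(max_active=2, max_runtime=100)
for job_name in ["a", "b", "c"]:
    mgr.submit(job_name)
mgr.poll(now=0)    # "a" and "b" start; "c" waits for a free slot
mgr.finish("a")
mgr.poll(now=50)   # "c" starts in the freed slot
mgr.poll(now=120)  # "b" has run for 120 > 100 and is stopped
print(sorted(mgr.active), mgr.killed)  # ['c'] ['b']
```

A real monitor would poll the job service for status and issue terminate requests instead of editing a dict, but the control flow would follow the same two steps on each poll.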