Database Schemas
This section defines the schemas for the principal and readonly databases, as well as some useful mixin classes.
Principal Database Schema (indra_db.schemas.principal_schema)
The Principal Schema
The Principal database is the core representation of our data, the ultimate authority on what we know. It is heavily optimized for the input and maintenance of our data.
- class indra_db.schemas.principal_schema.PrincipalSchema(Base)[source]
The Principal schema class organizes the table constructors.
The tables can be divided into various groups, with a clear order of creation for many of them.
Core Tables
First are the core tables representing our knowledge:
Statement Attribute Tables
Then there are the tables that represent attributes of statements. The set of tables is identical for the raw statements:
and the preassembled statements:
Curation Table
This table is where we record the curations submitted by ourselves and our users, which we use to improve our results.
Ancillary Tables
We also have several tables that we use to keep track of processing metadata, and some artifacts useful in that processing.
- text_ref()[source]
Represent a piece of text, as per its identifiers.
Each piece of text will be made available in different forms through different services, most commonly abstracts through pubmed and full text through pubmed central. However they are from the same paper, which has various different identifiers, such as pmids, pmcids, and dois.
We do our best to merge the different identifiers and for the most part each paper has exactly one text ref. Where that is not the case it is mostly impossible to automatically reconcile the different identifiers (this often has to do with inconsistent versioning of a paper and mixups over what is IDed).
Size: medium
Basic Columns
These are the core columns representing the different IDs we use to represent a paper.
id
integer PRIMARY KEY
: The primary key of the TextRef entry. Elsewhere this is often referred to as a “text ref ID” or “trid” for short.
pmid
varchar(20)
: The identifier from PubMed.
pmcid
varchar(20)
: The identifier from PubMed Central (e.g. “PMC12345”).
doi
varchar(100)
: The ideally universal identifier.
pii
varchar(250)
: The identifier used by Springer.
url
varchar UNIQUE
: For sources found exclusively online (e.g. Wikipedia), use their URL.
manuscript_id
varchar(100) UNIQUE
: The ID assigned to PMC author manuscripts.
Metadata Columns
In addition we also track some basic metadata about the entry and updates to the data in the table.
create_date
timestamp without time zone
: The date the record was added.
last_updated
timestamp without time zone
: The most recent time the record was edited.
pub_year
integer
: The year the article was published, based on the first report we find (in order of PubMed, PMC, then PMC Manuscripts).
Constraints
Postgres is extremely efficient at detecting conflicts, and we use this to help ensure our entries do not have any duplicates.
pmid-doi:
UNIQUE(pmid, doi)
pmid-pmcid:
UNIQUE(pmid, pmcid)
pmcid-doi:
UNIQUE(pmcid, doi)
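As a rough illustration of how these pairwise constraints reject duplicates, the sketch below recreates a reduced version of the table in an in-memory SQLite database and attempts to insert a conflicting row. SQLite is only a stand-in for Postgres, chosen so that the example is self-contained; the reduced column set, the helper name try_insert, and the sample identifiers are assumptions made for the example.

```python
import sqlite3

# In-memory stand-in for the real database (assumption: reduced column set).
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE text_ref (
        id INTEGER PRIMARY KEY,
        pmid VARCHAR(20),
        pmcid VARCHAR(20),
        doi VARCHAR(100),
        UNIQUE (pmid, doi),
        UNIQUE (pmid, pmcid),
        UNIQUE (pmcid, doi)
    )
    """
)


def try_insert(pmid, pmcid, doi):
    """Insert a row, returning False if a uniqueness constraint rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO text_ref (pmid, pmcid, doi) VALUES (?, ?, ?)",
                (pmid, pmcid, doi),
            )
        return True
    except sqlite3.IntegrityError:
        return False


first = try_insert("1234", "PMC5678", "10.1000/example")
# Same pmid and doi with a different pmcid violates UNIQUE(pmid, doi).
duplicate = try_insert("1234", "PMC9999", "10.1000/example")
# A row with entirely different identifiers is accepted.
other = try_insert("4321", "PMC8765", "10.1000/other")
```

Letting the database raise the conflict keeps the duplicate check in one place, rather than requiring a lookup before every insert.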
Lookup Columns
Some columns are hard to look up when they are in their native string format, so they are processed and broken down into integer parts, as far as possible.
pmid_num
integer
: The int-ified pmid, faster for lookup.
pmcid_num
integer
: The int-portion of the PMCID, so “PMC12345” would here be 12345.
pmcid_version
integer
: Although rarely used, occasionally a PMC ID will have a version, indicated by a dot, e.g. PMC12345.3, in which case the “3” would be stored in this column.
doi_ns
integer
: The DOI system works by assigning organizations (such as a journal) namespace IDs, and that organization is then responsible for maintaining a unique ID system internally. These namespaces are always numbers, and are stored here as such.
doi_id
varchar
: The custom ID given by the publishing organization.
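The breakdown into lookup columns might look something like the sketch below. The function names and regular expressions are hypothetical and are not taken from the package; in particular, the DOI pattern assumes the common form "10.<namespace>/<id>", which the real parsing may handle more carefully.

```python
import re

# Hypothetical patterns; the real parsing rules may be stricter or looser.
PMCID_PATTERN = re.compile(r"^PMC(\d+)(?:\.(\d+))?$")
DOI_PATTERN = re.compile(r"^10\.(\d+)/(.+)$")


def split_pmid(pmid):
    """Return the integer form of a PMID string, or None if not numeric."""
    if pmid is None or not pmid.isdecimal():
        return None
    return int(pmid)


def split_pmcid(pmcid):
    """Return (pmcid_num, pmcid_version) for strings like 'PMC12345.3'."""
    match = PMCID_PATTERN.match(pmcid or "")
    if match is None:
        return None, None
    num, version = match.groups()
    pmcid_version = int(version) if version is not None else None
    return int(num), pmcid_version


def split_doi(doi):
    """Return (doi_ns, doi_id), assuming DOIs look like '10.<namespace>/<id>'."""
    match = DOI_PATTERN.match(doi or "")
    if match is None:
        return None, None
    return int(match.group(1)), match.group(2)
```

Under these assumptions, split_pmcid("PMC12345.3") yields (12345, 3), matching the example given for pmcid_version.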
- mesh_ref_annotations()[source]
Represent the MeSH annotations of papers provided by PubMed.
Each abstract/entry in PubMed is accompanied by human-curated MeSH IDs indicating the topics of the paper. Each paper will generally have many IDs, so a separate table is used, linked to the text_ref table by an unconstrained PMID. This makes insertion of the data easier because the custom TRIDs need not be retrieved to dump the mesh refs.
Size: large
Columns
id
integer PRIMARY KEY
: The primary database-assigned ID of the row.
pmid_num
integer NOT NULL
: The int-ified pmid that is used to link entries in this table with those in the text_ref table.
mesh_num
integer NOT NULL
: The int-ified MeSH ID (with the prefix removed). The is_concept column indicates whether the prefix was D (false) or C (true).
qual_num
integer
: The qualifier number that is sometimes included with the annotation (prefix Q).
major_topic
boolean DEFAULT false
: The major topic flag indicates whether the ID describes a primary purpose of the paper.
is_concept
boolean DEFAULT false
: Indicates whether the prefix was C (true) or D (false).
Constraints
Postgres is extremely efficient at detecting conflicts, and we use this to help ensure our entries do not have any duplicates.
mesh-uniqueness:
UNIQUE(pmid_num, mesh_num, qual_num, is_concept)
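To make the column layout concrete, here is a minimal sketch of how one annotation could be turned into a row. The helper names and the one-letter-plus-digits pattern are assumptions for illustration and do not come from the package.

```python
import re

# Hypothetical pattern: a one-letter prefix (C, D or Q) followed by digits.
MESH_PATTERN = re.compile(r"^([CDQ])(\d+)$")


def split_mesh_id(mesh_id):
    """Return (prefix, number) for a MeSH ID such as 'D012345'."""
    match = MESH_PATTERN.match(mesh_id)
    if match is None:
        raise ValueError(f"Unrecognized MeSH ID: {mesh_id!r}")
    return match.group(1), int(match.group(2))


def annotation_row(pmid, mesh_id, qualifier_id=None, major_topic=False):
    """Build a dict with the columns of one annotation row."""
    prefix, mesh_num = split_mesh_id(mesh_id)
    if prefix not in ("C", "D"):
        raise ValueError("mesh_id must have a C or D prefix")
    qual_num = None
    if qualifier_id is not None:
        qual_prefix, qual_num = split_mesh_id(qualifier_id)
        if qual_prefix != "Q":
            raise ValueError("qualifier_id must have a Q prefix")
    return {
        "pmid_num": int(pmid),
        "mesh_num": mesh_num,
        "qual_num": qual_num,
        "major_topic": major_topic,
        "is_concept": prefix == "C",  # C prefix -> true, D prefix -> false
    }
```

Because the row is keyed by pmid_num rather than a text ref ID, nothing needs to be looked up in text_ref before building it.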
- mti_ref_annotaions_test()[source]
Represent the MeSH annotations of abstracts as inferred by MTI.
MTI is a machine learned model that attempts to predict MeSH annotations on new un-annotated abstracts after training on the existing annotations.
Size: medium
Columns
id
integer PRIMARY KEY
: The primary database-assigned ID of the row.
pmid_num
integer NOT NULL
: The int-ified pmid that is used to link entries in this table with those in the text_ref table.
mesh_num
integer NOT NULL
: The int-ified MeSH ID (with the prefix removed). The is_concept column indicates whether the prefix was D (false) or C (true).
qual_num
integer
: The qualifier number that is sometimes included with the annotation (prefix Q).
major_topic
boolean DEFAULT false
: The major topic flag indicates whether the ID describes a primary purpose of the paper.
is_concept
boolean DEFAULT false
: Indicates whether the prefix was C (true) or D (false).
Constraints
Postgres is extremely efficient at detecting conflicts, and we use this to help ensure our entries do not have any duplicates.
mesh-uniqueness:
UNIQUE(pmid_num, mesh_num, qual_num, is_concept)
- text_content()[source]
Represent the content of a text retrieved from a particular source.
For each paper as a logical entity, there are many places where you can acquire the actual article or parts of it. For example you can get an abstract from PubMed for most content, and for a minority subset you can get full text from PubMed Central, either their Open-Access corpus or their author’s Manuscripts.
Both the text itself and the metadata for the source of the text are represented in this table.
Size: large
Basic Columns
id
integer PRIMARY KEY
: The auto-generated primary key of the table. These are elsewhere called Text Content IDs, or TCIDs.
text_ref_id
integer NOT NULL
: A foreign-key constrained reference to the appropriate entry in the text_ref table.
source
varchar(250) NOT NULL
: The name of the source, e.g. “pubmed” or “pmc_oa”. The list of content names can be found in the class attributes in content managers.
format
varchar(250) NOT NULL
: The file format of the content, e.g. “XML” or “TEXT”.
text_type
varchar(250) NOT NULL
: The type of the text, e.g. “abstract” or “fulltext”.
preprint
boolean
: Indicate whether the content is from a preprint.
license
varchar
: Record the license that applies to the content.
content
bytea
: The raw compressed bytes of the content.
Metadata Columns
insert_data
timestamp without time zone
: The date the record was added.
last_updated
timestamp without time zone
: The most recent time the record was edited.
Constraints
Postgres is extremely efficient at detecting conflicts, and we use this to help ensure our entries do not have any duplicates.
content-uniqueness:
UNIQUE(text_ref_id, source, format, text_type)
- reading()[source]
Represent a reading of a piece of text.
We have multiple readers and of course many thousands of pieces of text content. Each entry in this table represents a given reader applied to a given piece of content.
As such, the primary ID is a hash constructed from the text content ID prepended with integers that are assigned to each reader-reader version pair. The function generate_reading_id implements the particular process used. The reader numbers are assigned in the readers global, and the reader version number is the index of the version listed for the given reader in the reader_versions dictionary in the same module.
Size: very large
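The ID scheme described above for this table can be sketched as follows. This is not the implementation of generate_reading_id: the function name sketch_reading_id, the registry contents, and the digit widths are all assumptions, chosen only to show how a reader number and a version index can be prepended to a text content ID while staying within a bigint.

```python
# Hypothetical reader registry; the real values live in the module that
# defines generate_reading_id and are not reproduced here.
readers = {"REACH": 1, "SPARSER": 2}
reader_versions = {
    "REACH": ["1.0", "1.1"],
    "SPARSER": ["a", "b", "c"],
}

# Assumed layout: two decimal digits for the version index and twelve
# for the text content ID. The actual digit widths may differ.
VERSION_FACTOR = 10**12
READER_FACTOR = 10**14


def sketch_reading_id(tcid, reader, reader_version):
    """Prepend reader and reader-version numbers to a text content ID."""
    if not 0 <= tcid < VERSION_FACTOR:
        raise ValueError("tcid does not fit in the assumed layout")
    reader_num = readers[reader]
    # The version number is the index of the version in the reader's list.
    version_num = reader_versions[reader].index(reader_version)
    return reader_num * READER_FACTOR + version_num * VERSION_FACTOR + tcid
```

With a layout like this, the same content read by two readers, or by two versions of one reader, always produces distinct IDs, and the ID can be computed before the row is inserted.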
Basic Columns
id
bigint PRIMARY KEY
: A hash ID constructed from a reader number, reader version number, and the text content ID of the content that was read.
text_content_id
integer NOT NULL
: A foreign-key constrained reference to the appropriate entry in the text_content table.
batch_id
integer NOT NULL
: A simple random integer (not unique) that is assigned to each batch of inserted readings. It is used in the moments after the insert to easily retrieve the content that was just added, potentially plus some extra.
reader
varchar(20) NOT NULL
: The name of the reader, e.g. “REACH” or “SPARSER”.
reader_version
varchar(20) NOT NULL
: The version of the reader, which may be any arbitrary string in principle. This allows each reader to define its own versioning scheme.
format
varchar(20) NOT NULL
: The file format of the reading result, e.g. “XML” or “JSON”.
bytes
bytea
: The raw compressed bytes of the reading result.
Metadata Columns
create_date
timestamp without time zone
: The date the record was added.
last_updated
timestamp without time zone
: The most recent time the record was edited.
Constraints
Postgres is extremely efficient at detecting conflicts, and we use this to help ensure our entries do not have any duplicates.
reading-uniqeness:
UNIQUE(text_content_id, reader, reader_version)
- db_info()[source]
Represent the provenance and metadata for an external knowledge base.
INDRA DB takes content not just from our own readings but also merges that with many pre-existing knowledge bases, many of them human curated. These knowledge bases are defined and managed by classes contained in knowledgebase_manager.
No real data is contained in this table, simply records of which knowledge bases have been added or updated, and when.
Size: very small
Basic Columns
id
integer PRIMARY KEY
: A database-assigned integer unique ID for each database entry. These are elsewhere referred to as db_info_ids or dbids.
db_name
varchar NOT NULL
: A short lowercase string that is used internally to identify the knowledge base, e.g. “pc” for Pathway Commons.
db_full_name
varchar NOT NULL
: The full name of the knowledge base, neatly formatted, e.g. “Pathway Commons”.
source_api
varchar NOT NULL
: The indra source API that was used to extract Statements from the knowledge base, e.g. “biopax”.
Metadata Columns
create_date
timestamp without time zone
: The date the record was added.
last_updated
timestamp without time zone
: The most recent time the record was edited.
- raw_statements()[source]
Represent Statements exactly as extracted by their source apis.
INDRA defines several source APIs for different file types from which we can extract INDRA Statements. The goal of these APIs is primarily to accurately convey the contents of the files, and minimal fixes are made at this stage (e.g. grounding is saved for preassembly).
Thus this table contains statements that are considered “messy” in two key ways:
they have a lot of repetition of information, and
they have whatever grounding the original source gave them.
However, these Statements also have the Evidence object JSON contained in their json column, and this Evidence information is NOT copied into the pa_statements table, which allows for flexible incremental updates. A “lateral join” on this table can be used to get the first N evidence associated with each PA Statement.
Size: very large
Basic Columns
id
integer PRIMARY KEY
: A database-assigned integer unique ID for each database entry. These are elsewhere referred to as “Statement ID”s, or “sid”s.
uuid
varchar UNIQUE NOT NULL
: A UUID generated when a Statement object is first created. This can be used for tracking particular objects through the code.
batch_id
integer NOT NULL
: A simple random integer (not unique) that is assigned to each batch of inserted Statements. It is used in the moments after the insert to easily retrieve the content that was just added, potentially plus some extra.
mk_hash
bigint NOT NULL
: A hash of the matches_key of a Statement. This should be unique for any statement containing the same information.
text_hash
bigint
: A hash of the evidence text, used to detect exact duplicate Statements (same information from the same exact source, right down to the text) that sometimes occur due to bugs.
source_hash
bigint NOT NULL
: A hash of the source information.
db_info_id
integer
: A foreign key into the db_info table, for those statements that come from knowledge bases.
reading_id
bigint
: A foreign key into the reading table, for those statements that come from a reading.
type
varchar(100) NOT NULL
: The type of the Statement, e.g. “Phosphorylation”.
indra_version
varchar(100) NOT NULL
: The version of INDRA that was used to generate this Statement, specifically as returned by indra.util.get_version.get_version().
json
bytea NOT NULL
: The bytes of the Statement JSON (including exactly one Evidence JSON).
Metadata Columns
create_date
timestamp without time zone
: The date the Statement was added.
Constraints
Postgres is extremely efficient at detecting conflicts, and we use this to help ensure our entries do not have any duplicates.
reading_raw_statement_uniqueness:
UNIQUE(mk_hash, text_hash, reading_id)
db_info_raw_statement_uniqueness:
UNIQUE(mk_hash, source_hash, db_info_id)
- raw_unique_links()[source]
Represent links between raw statements and preassembled statements.
Each preassembled statement is constructed from multiple raw statements, in general. This maps each pa_statement to the raw statements that were merged to form it. It is through this table that evidence can be gathered for pa_statements.
The astute reader may note that the raw_statements-to-pa_statement relationship is many-to-one, which can be represented simply using a foreign key in the “many” table, in this case raw_statements. This is not done because the pa_statement does not, in general, exist when the raw_statement is added to the database.
Constructed as it is, these links can be copied in bulk during preassembly, as opposed to having to modify as many as a million entries with a newly created foreign-key map.
Size: large
Basic Columns
id
integer PRIMARY KEY
: A database-assigned integer unique ID for each database entry.
raw_stmt_id
integer NOT NULL REFERENCES raw_statements(id)
: The Raw Statement ID foreign key to the raw_statements table.
pa_stmt_mk_hash
bigint NOT NULL REFERENCES pa_statements(mk_hash)
: The PA Statement matches-key hash foreign key to the pa_statements table.
Constraints
Postgres is extremely efficient at detecting conflicts, and we use this to help ensure our entries do not have any duplicates.
stmt-link-uniqueness:
UNIQUE(raw_stmt_id, pa_stmt_mk_hash)
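To illustrate how evidence can be gathered through this table, the sketch below builds reduced versions of raw_statements and raw_unique_links in an in-memory SQLite database and fetches at most N raw statements for each PA statement hash. SQLite is only a stand-in here, and the correlated subquery is a substitute for the Postgres lateral join mentioned in the raw_statements description; the reduced columns, the function name first_n_evidence, and the sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE raw_statements (id INTEGER PRIMARY KEY, json TEXT NOT NULL);
    CREATE TABLE raw_unique_links (
        id INTEGER PRIMARY KEY,
        raw_stmt_id INTEGER NOT NULL REFERENCES raw_statements(id),
        pa_stmt_mk_hash BIGINT NOT NULL,
        UNIQUE (raw_stmt_id, pa_stmt_mk_hash)
    );
    """
)
# Made-up sample data: three raw statements for hash 111, two for hash 222.
conn.executemany(
    "INSERT INTO raw_statements (id, json) VALUES (?, ?)",
    [(i, f'{{"evidence": "text {i}"}}') for i in range(1, 6)],
)
conn.executemany(
    "INSERT INTO raw_unique_links (raw_stmt_id, pa_stmt_mk_hash) VALUES (?, ?)",
    [(1, 111), (2, 111), (3, 111), (4, 222), (5, 222)],
)


def first_n_evidence(n):
    """Return {mk_hash: [raw statement ids]} with at most n ids per hash."""
    rows = conn.execute(
        """
        SELECT links.pa_stmt_mk_hash, stmts.id
        FROM raw_unique_links AS links
        JOIN raw_statements AS stmts ON stmts.id = links.raw_stmt_id
        WHERE links.raw_stmt_id IN (
            SELECT inner_links.raw_stmt_id
            FROM raw_unique_links AS inner_links
            WHERE inner_links.pa_stmt_mk_hash = links.pa_stmt_mk_hash
            ORDER BY inner_links.raw_stmt_id
            LIMIT ?
        )
        ORDER BY links.pa_stmt_mk_hash, stmts.id
        """,
        (n,),
    ).fetchall()
    result = {}
    for mk_hash, sid in rows:
        result.setdefault(mk_hash, []).append(sid)
    return result
```

Capping the evidence per hash in the query itself avoids pulling every raw statement for heavily supported PA statements.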
- pa_statements()[source]
Represent preassembled statements.
Preassembled Statements are generated from Raw Statements using INDRA’s preassembly tools. Specifically:
agents are grounded,
agent groundings are disambiguated (using adeft),
sites are fixed (using protmapper),
and finally, repeated information is consolidated, for example Phosphorylation(MEK(), ERK()) is represented only once in this corpus, with links to the many instances where that information was extracted, which are stored in the raw_statements table.
Each entry is linked back to the (in general multiple) raw statements it was derived from in the raw_unique_links table.
Size: medium large
Basic Columns
mk_hash
bigint PRIMARY KEY
: A hash of the statement matches key, which is unique for the _knowledge_ of the Statement.
matches_key
varchar NOT NULL
: The matches-key that was hashed.
uuid
varchar UNIQUE NOT NULL
: A UUID generated when a Statement object is first created. This can be used for tracking particular objects through the code. The UUID is distinct from any of the raw statement UUIDs that compose this Statement.
type
varchar(100) NOT NULL
: The type of the Statement, e.g. “Phosphorylation”.
indra_version
varchar(100) NOT NULL
: The version of INDRA that was used to generate this Statement, specifically as returned by indra.util.get_version.get_version().
json
bytea NOT NULL
: The bytes of the Statement JSON (including exactly one Evidence JSON).
Metadata Columns
create_date
timestamp without time zone
: The date the Statement was added.
- pa_support_links()[source]
Represent the links of support calculated during preassembly.
In INDRA, we look for cases where more specific Statements may lend support to more general Statements, and potentially vice versa, to better gauge whether an extraction is reliable.
Size: large
Basic Columns
id
integer PRIMARY KEY
: A database-assigned integer unique ID for each database entry.
supporting_mk_hash
bigint NOT NULL REFERENCES pa_statements(mk_hash)
: A foreign key to the PA Statement that is giving the support (that is, the more specific Statement).
supported_mk_hash
bigint NOT NULL REFERENCES pa_statements(mk_hash)
: A foreign key to the PA Statement that is given the support (that is, the more generic Statement).
Constraints
Postgres is extremely efficient at detecting conflicts, and we use this to help ensure our entries do not have any duplicates.
pa_support_links_link_uniqueness:
UNIQUE(supporting_mk_hash, supported_mk_hash)
- curations()[source]
Represent the curations of our content.
At various points in our APIs and UIs it is possible to curate the content we have extracted, recording whether it is an accurate extraction from the source text, and if not the reason why.
Size: small
Basic Columns
id
integer PRIMARY KEY
: A database-assigned integer unique ID for each database entry.
pa_hash
bigint REFERENCES pa_statements(mk_hash)
: A reference into the pa_statements table to the pa statement whose evidence was curated.
source_hash
bigint
: A hash that represents the source of this Statement (e.g. reader and piece of content).
tag
varchar
: A text code indicating the type of error curated. The domain of these strings is regulated in code elsewhere.
text
varchar
: A free-form text description by the curator of what they think went wrong (or right).
curator
varchar NOT NULL
: The identity of the curator. This has elsewhere been standardized to be their email.
auth_id
varchar
: [deprecated]
source
varchar
: A string indicating where this curation originated, e.g. “DB REST API” for the INDRA Database REST service.
ip
inet
: The IP address from which the curation was submitted.
date
timestamp without time zone
: The date the curation was added.
pa_json
jsonb
: The preassembled Statement JSON that was curated.
ev_json
jsonb
: The Evidence JSON that was curated (including the text).
Readonly Database Schema (indra_db.schemas.readonly_schema)
Defines the get_schema function for the readonly database, which is used by external services to access the Statement knowledge we acquire.
- class indra_db.schemas.readonly_schema.ReadonlySchema(Base)[source]
Schema for the Readonly database.
We use a readonly database to allow fast and efficient load of data, and to add a layer of separation between the processes of updating the content of the database and accessing the content of the database. However, it is not practical to have the views created through sqlalchemy: instead they are generated and updated manually (or by other non-sqlalchemy scripts).
Before building these tables, the belief table must already have been loaded into the readonly database.
The following views must be built in this specific order (temp):
(pa_meta)
agent_interaction
Note that the order of views below is determined not by the above order but by constraints imposed by use-case.
Meta Tables
Any table that has “meta” in the name is intended as a primary lookup table. This means it will have both the data indicated in the name of the table, such as (agent) “text”, (agent) “name”, or “source”, and a collection of columns with metadata essential for sorting and grouping of hashes:
Sorting:
belief
ev_count
agent_count
Grouping:
type_num
activity
is_active
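As a rough sketch of how the sorting and grouping columns might be used, the example below queries a reduced stand-in for name_meta held in an in-memory SQLite database. The use of SQLite, the assumption that db_id holds the agent name in this table, the reduced column set, the function names, and the sample rows are all illustrative assumptions rather than a description of the real queries.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE name_meta (
        db_id VARCHAR, type_num SMALLINT, mk_hash BIGINT,
        ev_count INTEGER, belief REAL
    )
    """
)
# Made-up rows: (agent name, statement type number, hash, evidence, belief).
conn.executemany(
    "INSERT INTO name_meta VALUES (?, ?, ?, ?, ?)",
    [
        ("MEK", 1, 101, 12, 0.95),
        ("MEK", 1, 102, 3, 0.70),
        ("MEK", 2, 103, 40, 0.99),
        ("ERK", 1, 101, 12, 0.95),
    ],
)


def hashes_for_name(name):
    """Hashes of statements with the given agent name, most evidence first."""
    rows = conn.execute(
        "SELECT mk_hash FROM name_meta WHERE db_id = ? ORDER BY ev_count DESC",
        (name,),
    )
    return [mk_hash for (mk_hash,) in rows]


def counts_by_type(name):
    """Number of matching hashes for each statement type number."""
    rows = conn.execute(
        "SELECT type_num, COUNT(*) FROM name_meta WHERE db_id = ? "
        "GROUP BY type_num ORDER BY type_num",
        (name,),
    )
    return dict(rows.fetchall())
```

Keeping ev_count and type_num on the lookup table itself means both queries can be answered without joining back to the statement tables.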
Temporary Tables
There are some intermediate results that it is worthwhile to calculate and store for future table construction. Some of these were once permanent tables that are no longer used for their own sake, and it was simply easier to delete them after their derivatives were completed. In other cases the temporary tables are more principled: created because many future tables draw on them and using a “with” clause for each one would be impractical.
Whatever the reason, deleting the temporary tables greatly reduces the size of the readonly database. Such tables are marked with “(temp)” at the beginning of their doc string.
- belief()[source]
The belief of preassembled statements, keyed by hash.
Columns
mk_hash
bigint
belief
real
Indices
mk_hash
- evidence_counts()[source]
The evidence counts of pa statements, keyed by hash.
Columns
mk_hash
bigint
ev_count
integer
Indices
mk_hash
- reading_ref_link()[source]
The source metadata for readings, keyed by reading ID.
Columns
trid
integer
pmid
varchar(20)
pmid_num
integer
pmcid
varchar(20)
pmcid_num
integer
pmcid_version
integer
doi
varchar(100)
doi_ns
integer
doi_id
varchar
pii
varchar(250)
url
varchar(250)
manuscript_id
varchar(100)
tcid
integer
source
varchar(250)
rid
integer
reader
varchar(20)
Indices
rid
pmid
pmid_num
pmcid
pmcid_num
doi
doi_ns
doi_id
manuscript_id
tcid
trid
- fast_raw_pa_link()[source]
Join of PA JSONs and Raw JSONs for faster lookup.
Columns
id
integer
raw_json
bytea
reading_id
bigint
db_info_id
integer
mk_hash
bigint
pa_json
bytea
type_num
smallint
src
varchar
Indices
mk_hash
reading_id
db_info_id
src
- pa_agent_counts()[source]
The number of agents for each Statement, keyed by hash.
Columns
mk_hash
bigint
agent_count
integer
Indices
mk_hash
- raw_stmt_src()[source]
The source (e.g. reach, pc) of each raw statement, keyed by SID.
Columns
sid
integer
src
varchar
Indices
sid
src
- pa_stmt_src()[source]
(temp) The number of evidence from each source for a PA Statement.
This table is constructed by forming a column for every source short name present in the raw_stmt_src table.
Columns
mk_hash
bigint
…one column for each source…
integer
Indices
mk_hash
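The one-column-per-source layout amounts to a pivot, which the following sketch illustrates in plain Python. The function name, the input shape of (mk_hash, src) pairs, and the choice to fill missing sources with 0 are assumptions; the real table is built in the database and may represent absent sources differently.

```python
from collections import Counter, defaultdict


def pivot_source_counts(pairs):
    """Turn (mk_hash, src) pairs into rows with one count column per source."""
    counts = defaultdict(Counter)
    sources = set()
    for mk_hash, src in pairs:
        counts[mk_hash][src] += 1
        sources.add(src)
    # One column per source short name seen anywhere in the input.
    columns = sorted(sources)
    rows = [
        {"mk_hash": mk_hash, **{src: counts[mk_hash][src] for src in columns}}
        for mk_hash in sorted(counts)
    ]
    return columns, rows


columns, rows = pivot_source_counts(
    [(111, "reach"), (111, "reach"), (111, "pc"), (222, "sparser")]
)
```

Because the set of columns depends on which sources appear in the data, the table layout cannot be fixed in advance, which is consistent with the placeholder column in the listing above.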
- pa_ref_link()[source]
(temp) A quick-lookup from mk_hash to basic text ref data.
Columns
mk_hash
bigint
trid
integer
pmid_num
varchar
pmcid_num
varchar
source
varchar
reader
varchar
Indices
mk_hash
trid
pmid_num
- mesh_terms()[source]
(temp) All mesh annotations with D prefix, keyed by PMID int.
Columns
mesh_num
integer
pmid_num
integer
Indices
pmid_num
- mesh_concepts()[source]
(temp) All mesh annotations with C prefix, keyed by PMID int.
Columns
mesh_num
integer
pmid_num
integer
Indices
pmid_num
- hash_pmid_counts()[source]
(temp) The number of pmids for each PA Statement, keyed by hash.
Columns
mk_hash
bigint
pmid_count
integer
Indices
mk_hash
- mesh_term_ref_counts()[source]
The D-type mesh IDs with pmid and ref counts, keyed by hash and mesh.
Columns
mk_hash
bigint
mesh_num
integer
ref_count
integer
pmid_count
integer
Indices
mesh_num
mk_hash
- mesh_concept_ref_counts()[source]
The C-type mesh IDs with pmid and ref counts, keyed by hash and mesh.
Columns
mk_hash
bigint
mesh_num
integer
ref_count
integer
pmid_count
integer
Indices
mesh_num
mk_hash
- raw_stmt_mesh_terms()[source]
The D-type mesh number raw statement ID mapping.
Columns
sid
integer
mesh_num
integer
Indices
sid
mesh_num
- raw_stmt_mesh_concepts()[source]
The C-type mesh number raw statement ID mapping.
Columns
sid
integer
mesh_num
integer
Indices
sid
mesh_num
- pa_meta()[source]
(temp) The metadata most valuable for querying PA Statements.
This table is used to generate the more scope-limited name_meta, text_meta, and other_meta. The reason is that NAME and TEXT (in particular) agent groundings are vastly overrepresented.
Columns
ag_id
integer
ag_num
integer
db_name
varchar
db_id
varchar
role_num
smallint
type_num
smallint
mk_hash
bigint
ev_count
integer
belief
real
activity
varchar
is_active
boolean
agent_count
integer
is_complex_dup
boolean
Indices
db_name
mk_hash
- source_meta()[source]
All the source-related metadata condensed using JSONB, keyed by hash.
Columns
mk_hash
bigint
ev_count
integer
belief
real
num_srcs
integer
src_json
json
only_src
varchar
has_rd
boolean
has_db
boolean
type_num
smallint
activity
varchar
is_active
boolean
agent_count
integer
Indices
mk_hash
only_src
activity
type_num
num_srcs
- text_meta()[source]
The metadata most valuable for querying PA Statements by agent TEXT.
This table is generated from pa_meta, because TEXT is extremely overrepresented among agent groundings. Removing these and NAMEs from the “OTHER” narrows the search very rapidly, and for the larger sets of NAME and TEXT removes an index-search.
Columns
ag_id
integer
ag_num
integer
db_id
varchar
role_num
smallint
type_num
smallint
mk_hash
bigint
ev_count
integer
belief
real
activity
varchar
is_active
boolean
agent_count
integer
is_complex_dup
boolean
Indices
mk_hash
db_id
type_num
activity
- name_meta()[source]
The metadata most valuable for querying PA Statements by agent NAME.
This table is generated from pa_meta, because NAME is overrepresented among agent groundings. Removing these and TEXTs from the “OTHER” narrows the search very rapidly, and for the larger sets of NAME and TEXT removes an index-search.
Columns
ag_id
integer
ag_num
integer
db_id
varchar
role_num
smallint
type_num
smallint
mk_hash
bigint
ev_count
integer
belief
real
activity
varchar
is_active
boolean
agent_count
integer
is_complex_dup
boolean
Indices
mk_hash
db_id
type_num
activity
- other_meta()[source]
The metadata most valuable for querying PA Statements.
This table is a copy of pa_meta with the rows for NAME and TEXT agent groundings removed.
Columns
ag_id
integer
ag_num
integer
db_name
varchar
db_id
varchar
role_num
smallint
type_num
smallint
mk_hash
bigint
ev_count
integer
belief
real
activity
varchar
is_active
boolean
agent_count
integer
is_complex_dup
boolean
Indices
mk_hash
db_name
db_id
type_num
activity
- mesh_term_meta()[source]
A lookup for hashes by D-type mesh IDs.
Columns
mk_hash
bigint
mesh_num
integer
tr_count
integer
ev_count
integer
belief
real
type_num
smallint
activity
varchar
is_active
boolean
agent_count
integer
Indices
mk_hash
type_num
activity
- mesh_concept_meta()[source]
A lookup for hashes by C-type mesh IDs.
Columns
mk_hash
bigint
mesh_num
integer
tr_count
integer
ev_count
integer
belief
real
type_num
smallint
activity
varchar
is_active
boolean
agent_count
integer
Indices
mk_hash
type_num
activity
- agent_interactions()[source]
Agent and type data in simple JSONs for rapid lookup, keyed by hash.
This table is used for retrieving interactions, agent pairs, and relations (any kind of return that is more generic than full Statements).
Columns
mk_hash
bigint
ev_count
integer
belief
real
type_num
smallint
activity
varchar
is_active
boolean
agent_count
integer
agent_json
jsonb
src_json
jsonb
is_complex_dup
boolean
Indices
mk_hash
agent_json
type_num
Class Mix-ins (indra_db.schemas.mixins)
This defines class mixins that are used to add general features to SQLAlchemy table objects via multiple inheritance.
- class indra_db.schemas.mixins.IndraDBTableMetaClass(*args, **kwargs)[source]
This serves as a meta class for all tables, allowing str to be useful.
In particular, this makes it so that the string gives a representation of the SQL table, including columns.
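The general idea of a metaclass that gives a class a useful string form can be sketched as below. This is a generic illustration and not the package's implementation: the Column class is a minimal stand-in rather than SQLAlchemy's, and the names TableMeta and TextRef, along with the output format, are made up for the example.

```python
class Column:
    """Minimal stand-in for a column definition (not SQLAlchemy's Column)."""

    def __init__(self, sql_type):
        self.sql_type = sql_type


class TableMeta(type):
    """Hypothetical metaclass that makes str(TableClass) describe the table."""

    def __str__(cls):
        # Class attributes keep their definition order in vars(cls).
        columns = [
            f"{name} {attr.sql_type}"
            for name, attr in vars(cls).items()
            if isinstance(attr, Column)
        ]
        return f"{cls.__tablename__}({', '.join(columns)})"


class TextRef(metaclass=TableMeta):
    __tablename__ = "text_ref"
    id = Column("integer")
    pmid = Column("varchar(20)")


description = str(TextRef)
```

Defining __str__ on the metaclass, rather than on the class, is what changes the string form of the class object itself instead of its instances.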
- class indra_db.schemas.mixins.IndraDBRefTable[source]
Define an API and methods for a table of text references.
- classmethod pmid_in(pmid_list, filter_ids=False)[source]
Get sqlalchemy clauses for entries IN a list of pmids.
- classmethod pmid_notin(pmid_list, filter_ids=False)[source]
Get sqlalchemy clauses for entries NOT IN a list of pmids.
- classmethod pmcid_in(pmcid_list, filter_ids=False)[source]
Get the sqlalchemy clauses for entries IN a list of pmcids.
- classmethod pmcid_notin(pmcid_list, filter_ids=False)[source]
Get the sqlalchemy clause for entries NOT IN a list of pmcids.
- classmethod doi_in(doi_list, filter_ids=False)[source]
Get clause for looking up entities IN a list of dois.
- classmethod doi_notin(doi_list, filter_ids=False)[source]
Get clause for looking up entities NOT IN a list of dois.
- classmethod has_ref(id_type, id_list, filter_ids=False)[source]
Get clause for entries IN the given ID list.
Indexes (indra_db.schemas.indexes)
This defines the classes needed to create and maintain indices in the database. The rest of that infrastructure is included in the IndraDBTable class mixin definition.