Compliance with ethical regulations
Human clinical data on adverse drug reactions were obtained from publicly available resources that contain results in aggregated form only. These resources do not provide individually-identifiable healthcare information under the Health Insurance Portability and Accountability Act of 1996 (HIPAA). As such, no institutional review board approval was required to use these resources.
In vitro safety pharmacology assays
Compounds were obtained from the Novartis Institutes of Biomedical Research (NIBR) compound library and tested in a panel of in vitro biochemical and cell-based assays at Eurofins and/or NIBR in concentration-response (eight concentrations, half-log dilutions starting at 10 or 30 µM). Assay formats varied from radioligand binding, to isolated protein, and cellular assays. Normalized concentration-response curves were fitted using a four-parameter logistic equation performed using software developed internally (Helios). The equation used is for a one-site sigmoidal dose response curve: Y = A + ((B-A)/(1 + ((X/C)^D))), where A = min, B = max, C = IC50, D = slope. By default, min is fixed at 0, whereas max is not fixed.
When drugs had no significant biological activity at the highest concentration tested, the AC50 was reported with qualifier >; for example, an AC50 is reported with qualifier > and AC50 value 30 when a compound exhibits no significant activity at concentrations up to 30 µM. Where curve fitting produces an AC50 value below the highest concentration tested, activity is reported with qualifier =.
Mapping chemicals to DrugCentral structure IDs
Multiple chemical substances were tested in the SPD assays. Different substances include distinct lots sourced from chemical vendors or synthesized internally of a given drug, different salt forms, and/or metabolites of the parent drug. In preparation of the dataset released with this study, SMILES representation of substances were de-salted and converted to InChI keys using RDKit’s MolToInchi function (https://rdkit.org/; accessed 09/22/2021; RDKit version 2021_03_5).
The DrugCentral PostgreSQL database dated 09/18/2020 was downloaded (https://drugcentral.org/download; accessed 09/22/2021). The InChI key consists of three parts separated by hyphens, of 14, 10, and 1 character(s), respectively. These correspond to the connectivity information (or graph; 14 characters), remaining layers (10 characters), and protonation state (1 character). For each NIBR structure, matches to DrugCentral were attempted at multiple levels of decreasing stringency: (1) perfect matches: the InChI key obtained on the DrugCentral SMILES matches the key from a NIBR SMILES, (2) match without the protonation part of the key, (3) match using only the graph part of the key, but require a name or synonym match, and finally (4) try to match on name or synonym, and review structures. Level 4 matches are common with complex drugs such as natural products, where drawing errors occurred during the registration of substances. Matches involving names and/or synonyms compare those from DrugCentral to names assigned as part of the NIBR substance registration. Note that DrugCentral sometimes includes both parent drugs and metabolites, both of which were used in the matching process.
Dataset summarization
Substances sharing the same InChI key tested in multiple assay runs were summarized into a single numeric AC50 for a given InChI vs. assay pair. When one or more AC50 values had qualifier =, the geometric mean was computed and reported with qualifier =; N summarized indicates the number of averaged AC50 values, and N total indicates the total number of AC50 values for the InChI vs. assay pair, including AC50 values with qualifier > excluded from the geometric mean computation. In the absence of any AC50 value with qualifier =, the largest value among those with qualifier > was retained. For instance, the AC50 values of >1 and >30 µM are summarized as follows: qualifier >, numeric AC50 value 30, N summarized 2, and N total 2.
Defining assay groups
The SPD database was created over several years, with some targets having multiple assay protocols employed. Each protocol was assigned a unique identifier. Changes in protocol often result in the creation of a new assay, designated with a new identifier, even when both assays effectively measure the same biochemical event. Examples include changes in radioligand, measurement technology (e.g., filter binding vs. TR-FRET), outsourcing to a contract research organization (CRO), etc. To maximize power for detecting statistically significant relationships and simplify the analysis, we defined assay groups that combine assays resulting in concordant AC50 values when comparing results on the same drug substances.
Concordance analysis was performed by evaluating the agreement of assay results where at least ten compounds were tested in each pair of assays having the same target and mode (e.g., both binding, agonist, or antagonist assays). Qualitative agreement (<10 and ≥10 µM) was assessed by calculating the sensitivity of each assay for detecting actives from the other, with a minimum of 0.5 for both assays (i.e., # active in both assays / # active in assay 1 ≥ 0.5 and # active in both assays / # active in assay 2 ≥ 0.5, and Pearson R ≥ 0.7 calculated on 10 or more log AC50 values, where both results had qualifier =. Assay pairs with insufficient overlapping test compounds were not merged. Viewing assays as nodes and concordant assay pairs as edges, all nodes within a connected graph were merged, even if some assay pairs within the group fell below our concordance cutoff (or lacked sufficient overlapping pairs to assess). To increase the number of overlapping test compounds for a pair of assays, this analysis was performed on all results available for the assays, including proprietary compounds not included in the supplement. Supplementary Data 1 is provided at the level of both individual and grouped assays.
Within each assay group, the assay supplying the largest proportion of results was designated as the preferred assay. When results were available for the preferred vs. other group assays, results from the preferred assay were used in downstream analyses.
Integration of external activity results
Activity data from DrugCentral, including annotation of targets as drugs’ mechanism of action (MOA), were obtained from the act_data_full table for humans and mammals (rat, mouse, cow, guinea pig, rabbit, pig, sheep, dog, chicken, and monkey). Drugs from the SPD were mapped using the DrugCentral structure ID as described above. For each DrugCentral target ID, the Swissprot identifier from the target_component was mapped to Entrez gene IDs using the gene2accession file from Entrez gene or the Uniprot ID mapping tool (https://www.uniprot.org/uploadlists/; accessed 09/23/2021) in the absence of an Entrez match. Finally, any non-human Entrez gene IDs were mapped to the human ortholog using a compilation of associations from Ensembl, Homologene, RGD, and MGI. For SPD, assays using non-human proteins were represented with the human Entrez gene ID. Each drug vs. assay pair from SPD was annotated with the median AC50 for all DrugCentral activity records with the same structure ID and human gene ID. This matching did not consider activity mode (inhibitor, antagonist, etc. – action_type in act_data_full), because it was undefined in most cases.
When defining on-target activity, the DrugCentral act_type variable defining a drug’s mode of action at the target was compared to the assay mode. For functional assays, only drugs annotated as having the same mode as the assay were retained (e.g., agonist drugs for agonist assays). For GPCR and nuclear receptor assays having binding or inhibition modes, agonist drugs were removed. Antagonist drugs were retained because binding and functional antagonist readouts are correlated on our panels.
ChEMBL version 27 was downloaded as a PostgreSQL database. To increase the number of matches while allowing variation in structural representations (e.g., ignoring chirality, structure drawing errors, etc.), ChEMBL compound identifiers provided by DrugCentral were supplemented with those matching the SPD InChi keys identified using the multi-criteria match described above. ChEMBL targets identified with Uniprot accession numbers were mapped to SPD assays as described above. Activity types and units suggestive of multiple concentration testing were converted to AC50 values in µM units (pM, nM, µM, mM, and M); results with units of ng/mL, pg/mL, and ug/mL were converted to molarities using the drugfree base RDKit molecular weight. ChEMBL results suggestive of single concentration testing (units of %) were considered separately. Multiple ChEMBL drug vs. target values were summarized as the median, separately for AC50 and single concentration results.
Clarivate Cortellis (https://clarivate.com/cortellis/solutions/pre-clinical-intelligence-analytics/; accessed 8/3/2021) and Excelra GOSTAR (https://gostardb.com/gostar/; accessed 8/3/2021) are subscription-based resources similar to ChEMBL and DrugCentral. The same processing applied to ChEMBL was used to identify median AC50 values for SPD drug vs. assay pairs.
Selecting representative SPD result for each drug vs. assay group pair
Multiple InChi keys are available for certain drugs, and multiple assays within assay groups. To simplify the analysis, we denoted one result as a representative among all available for a given prescribed (parent) drug vs. assay group pair. For a given drug vs. assay group pair, the algorithm selects assay results for the active metabolite over any collected for the parent drug (e.g., take activity results for enalaprilat when describing the activity of enalapril). The algorithm favors high quality structural matches between DrugCentral and SPD and selects from the preferred assay within the assay group. Detailed logic is available in the Jupyter notebook make_all_prescribable_drug_activity_dataset.ipynb. The selected activity records have column representative_result_drug_assay_group = TRUE in Supplementary Data 1.
Comparison of activity results to ChEMBL
To model assay and target characteristics that may affect concordance between SPD and literature-reported activity results, we assembled a dataset matching SPD activity results with each individual ChEMBL activity reported for a given drug and target pair (i.e., not averaging results for a given drug and target across publications). Only pairs where the activity qualifier was = in both sources were retained. SPD assay results were annotated with assay characteristics reported in Supplementary Data 3, and ChEMBL results with the ChEMBL assay type (e.g., B for binding or F for functional), assay format using the Bioassay Ontology terminology, and standard_type variables from the ChEMBL activity database table. Results were excluded when they were represented by fewer than 500 pairs in the dataset of ca 22 000 SPD vs. ChEMBL result pairs: ABCB11 (SPD-Event = incorporation assay), SCN5A and CACNA1C (SPD Readout = electroanalytical readout), ACHE (SPD Readout = absorbance readout), SPD Protein Class = Protease, ChEMBL bao_format of organism, mitochondrion, microsome, cell-free (blood assays), ChEMBL standard_type not one of Ki, Kd, IC50, EC50, ChEMBL assay_type of T or A. Several annotations were merged: SPD Mode of Binding and inhibition, ChEMBL bao_format = subcellular format (mostly brain synaptosomes) with tissue-based format. The final dataset consisted of 21 596 matched SPD vs. individual ChEMBL activity results involving the same drug substance and target (including ortholog matches across species). Activities were converted to pAC50 values (negative log10 of AC50 in molar units), and the absolute difference was calculated. The activity difference was modeled using continuous variables of SPD pAC50, ChEMBL pAC50, a binary variable same_species (1 = yes, 0 = no), and multi-level categorical variables SPD Protein Class (e.g., GPCR), SPD-Event (e.g., protein binding assay), SPD Format (e.g., in vitro assay with cellular components), SPD Mode (e.g., Binding), SPD Readout (e.g., radioactivity), ChEMBL assay_type (e.g., = B for binding assays), ChEMBL bao_format (e.g., assay format), ChEMBL standard_type (e.g., IC50). All categorical variables were converted to binary dummy variables with the R library fastDummies. A penalized lasso regression model was fit using the R library glmnet to identify variables that were associated with activity differences. Variables discussed in the text were present in models up to and including the 1se model (model with the fewest variables within 1 standard error of the best model) and had an absolute coefficient greater or equal to 0.1. The modeled dataset is provided in Supplementary Data 10.
Human drug exposure (Cmax) and plasma protein binding
The maximal drug exposure (Cmax) at the highest approved dosage was curated from the primary literature for 487 drugs. To broaden the dataset, we extracted human Cmax values from Pharmapendium (https://pharmapendium.com; accessed 10/26/2021). These are often from heterogeneous sources, and include results from lower doses, metabolites, pediatric studies, Cmax at steady state, etc. Selecting a representative value could be achieved by calculating the median, average, 3rd quartile, or 90th percentile. We employed the 487 manually curated results as a reference set and found maximal correlation when Pharmapendium values were summarized as the 3rd quartile. This supplied a further 451 drug Cmax results. Finally, we used values reported in ref. 15 for a further 146 drugs, yielding a dataset of 1084 marketed drug Cmax values (Supplementary Data 2).
To calculate free Cmax values, we employed a similar approach for compiling plasma protein binding (PPB) %: primary literature curation (572 drugs), Pharmapendium (332), DrugCentral (90), and Smit et al. (52), providing a dataset of 1045 plasma protein binding. For Pharmapendium, results reported as albumin or glycoprotein binding were excluded because correlation vs. our curated dataset was low; the 3rd quartile provided the highest concordance vs. curation, as observed for Cmax.
Cmax(free) was determined as the product (100-PPB%)× Cmax(tot) for 940 drugs having both parameters available.
ADR annotation using FAERS and SIDER
Annotation of drugs as being positive or negative for MedDRA-coded ADRs was performed using two sources. In the absence of freely available high-quality manual curation of ADRs and their frequency from labels for all FDA-approved drugs64, surrogate approaches must be employed. The FDA adverse drug reaction reporting system (FAERS) is often used in pharmacovigilance research to detect ADRs. In this work, we used FAERS data from DrugCentral28 without any further post-processing. ADRs with a likelihood ratio test (LRT) ≥5 times the drug-specific threshold value were deemed positive for a given drug14; otherwise, the drug was labeled as negative for the ADR. Results obtained using an LRT cutoff of 2 are broadly similar (Supplementary Notes).
The SIDER database provides annotation of drug ADRs obtained from text mining applied to drug labels16. SIDER uses drug annotation from STITCH, a sister database (http://stitch.embl.de/download/chemicals.v5.0.tsv.gz; accessed 09/23/2021). SMILES from STICH were converted to InChi keys and matched to DrugCentral structures using the multi-criteria matched described above. The mapping is provided in Supplementary Data 11.
Drug ADRs were obtained from the file meddra_all_se.tsv.gz available from the SIDER website (http://sideeffects.embl.de; accessed 09/24/2021). Drugs were mapped to DrugCentral using the stereo_id using Supplementary Data 11. SIDER ADRs were labeled using UMLS CUIs; they were mapped to MedDRA preferred term (PT) codes using the UMLS REST API (https://documentation.uts.nlm.nih.gov/rest/home.html; accessed 09/24/2021). A version of meddra_all_se.tsv, using DrugCentral struct_id and MedDRA PT codes is provided as a supplementary dataset with the Jupyter notebooks (final_sider_map_to_drugcentral_meddra.txt).
Creating ADR training sets
For each ADR, positive drugs (causing the ADR) and negative drugs (not causing the ADR) must be defined. To study the association between ADRs and assays, ADR terms from SIDER and FAERS reported as MedDRA preferred terms (PTs) were mapped to MedDRA high-level terms (HT) and high-level group terms (HG). The mappings were obtained using the UMLS API. Mapping to higher terms results in more drugs labeled as positive (i.e., higher power to detect an effect), but potentially combining PTs with distinct target (assay) risk factors. We therefore modeled relationships at three levels: PT, HT, and HG.
Depending on the level of MedDRA terms, different strategies were employed for defining ADR-negative drugs (i.e., drugs that do not cause a given ADR). For HGs, any drug not positive for one of the underlying PT terms was considered a negative. For HTs, any drug not positive for the term and not positive for a sister HT under the current HG was considered a negative. For PTs, any drug not positive for the term and not positive for a sister PT under the current HT was considered negative. This strategy was used to avoid labeling as negative a drug that produces a similar ADR to the one under study. For example, negatives for PT 0044066 (Torsade de pointes) would exclude drugs that are positive for PT 10047302 (Ventricular tachycardia), because both PTs share the HT 10047283 (Ventricular arrhythmias and cardiac arrest). The Jupyter notebook make_ADR_training_sets.ipynb was used to automate the definition of positive and negative drugs for each PT, HT, and HG term.
Because drugs sharing an active metabolite may have different ADR annotations (e.g., betamethasone dipropionate vs. betamethasone valerate), each was treated separately. Annotation of drugs with a selected activity record from SPD for each assay was performed using the multi-criteria approach described above, using the Jupyter notebook make_ADR_vs_activity_dataset.ipynb. It should be noted that this differs from make_all_prescribable_drug_activity_dataset.ipynb, where we selected a single representative drug form (betamethasone) among all those tested.
Univariate ADR vs. assay association
To establish the strength of association between drugs’ status for ADRs (Boolean) and assay activity measure (continuous measures of AC50, free and total margin), the Kruskal–Wallis (KW) test and ROC AUC computation was performed for each ADR vs. assay pair; KW p values were not adjusted for multiple hypothesis testing. ROC AUC values were calculated with the sklearn.metrics.roc_auc_score function using each of the three activity measures as predicted values, and the ADR class (positive = 2, negative = 1) as actual values.
Activity measures of AC50, total margin, and free margin frequently have qualifier >, indicating that measured AC50 was estimated to exceed the highest concentration tested in the assay. The maximum tested concentrations of 10 and 30 µM were employed for most assays. To calculate a rank-based association test between assays and ADRs, it was necessary to select an AC50 cutoff and replace all values in excess with the cutoff value (truncating). Values with qualifier > but AC50 below the cutoff were excluded. For AC50 values, the numeric distribution for qualifier = and > were largely non-overlapping, the natural cutoff is 10 or 30 µM depending on the assay, and few values needed to be truncated or excluded. Because drug total and free Cmax vary over a wide range, safety margin distributions overlap significantly for qualifier = and >. This makes the selection of cutoff more difficult: too low and one loses the ability to distinguish ranks for drugs with a safety margin above the cutoff, but too high and one must exclude from analysis many values with qualifier > below the threshold (and hence a loss of power). We performed tests using cutoffs of 10 and 30 µM for AC50, 2 and 10 for total margin, and 10 and 100 for free margin.
For assessing the statistical significance of literature-reported assay vs. ADR associations (see below; Supplementary Data 12), we retained the threshold giving the smallest (most significant) KW p value. For systematic analysis of all assay vs. ADR pairs, we selected a single threshold per assay in order to minimize the number of tests performed (i.e., increasing the false discovery rate). For each assay and activity measure (AC50, total margin, free margin), the cutoff providing the largest total number of assay vs. ADR associations with ROC AUC ≥0.7 and KW p value ≤1e-06 was selected and used for all further analyses. The higher cutoffs (30/10/100 for AC50/total margin/free margin) were generally selected (37 vs. 23 assays for AC50, 28 vs. 14 assays for total margin, and 35 vs. 5 assays for free margin cutoff).
Excluded from analysis were all combinations of assays, activity measures (AC50, free margin, total margin), and ADRs not meeting the following criteria: ten or more positive drugs (i.e., drugs with the ADR), 50 or more negative drugs and ten or more non-qualified activity values. Univariate analyses were performed using the Jupyter notebook calc_ADR_vs_assay_score.ipynb.
Multivariate modeling of ADRs
Multiple assays and activity measures (AC50, total margin, free margin) may show significant association with a given ADR. Because assay activity is often correlated across related targets (e.g., targets with similar binding pockets), variable selection strategies can be used to select a smaller number of assay and activity measures among all those which met our univariate threshold (KW p value ≤1e-06 and ROC AUC ≥0.7). We used L1-penalized (Lasso) logistic regression to model each ADR outcome (positive or negative) using the subset of assay + activity measures selected by univariate analysis. AC50 and margins were log10 transformed prior to modeling.
When modeling ADR outcomes with multiple assays, missing activity values occur when some drugs were not tested in all assays. For each ADR, we required individual assays to reach 70% or greater coverage compared to the assay with maximal drug count; missing values were imputed by using the median.
For each dataset (ADR class as dependent variable and assay activities as independent variables), a sequence of penalties “C” was generated. The average and standard error of ROC AUC at each penalty were determined using 50 trials of leave 20%-out cross-validation. The smallest “C” (or largest penalty) producing a model with ROC AUC within 1 standard error of the best model was selected. This led to the creation of models with few variables, to discern the principal contributors to ADR risk. Variables with zero coefficients have no significant role in explaining the odds of being positive for a given ADR, after accounting for the contributions of variables with non-zero coefficients.
Because small values of AC50 or margins indicate activity in an assay, variables with large negative coefficients in the logistic regression model represent assays for which increasing activity results in higher odds of being positive for a given ADR. For coefficients ≥−0.08, the variable was tagged as not in the model. This corresponds to interpreting a 10-fold decrease in AC50 or margin being associated with a smaller than 20% increase in odds ratio of observing the ADR. There were occurrences of coefficients >0.1, i.e., indicating decreased risk of the ADR for activity in the assay. These were almost exclusively in models where an expected negative association was present for another activity parameter of the same assay (i.e., AC50 had a large negative coefficient and free margin had a small positive coefficient). These were considered excluded from the model (i.e., coefficient of 0 in Supplementary Data 7). Multivariate analyses were performed with the Jupyter notebook build_ADR_vs_assay_model.ipynb.
Literature-reported target vs. ADR associations
Target-ADR relationships, as published in three key reviews, were obtained from the supplementary material in ref. 15. Because they did not provide the direction of association (target activation vs. inhibition), we reviewed associations from the three publications. Smit et al. mapped terminology from the reviews to MedDRA preferred terms (PT). In reviewing their results, we added some missing associations and refined mappings to MedDRA codes. These are denoted as pre-analysis supplemental terms in Supplementary Data 12.
Because the selection of MedDRA PTs from the literature reviews may differ from their representation in SIDER or FAERS, we examined the frequency of each literature PT code in SIDER and FAERS. Some terms with suspiciously low frequency triggered searches for better terminology. For instance, “Intestinal transit time decreased” is a valid MedDRA PT used in Lynch et al., however, it is not used in SIDER or FAERS. However, both “Gastrointestinal disorder” and “Diarrhea” were identified as substitutes. These additional mappings were added to the Literature vs. MedDRA PT term mapping.
For each combination of MedDRA code and target from the literature, we examined the significance of the association in the SPD. Each association was tested across all SPD assays for the target (median 1, range 1–6), two sources (SIDER, FAERS), three activity measures (total margin, free margin, AC50), and two activity cutoffs for denoting active vs. inactive results (total margin 10 vs. 2, free margin 100 vs. 10, AC50 30 vs. 10 µM). As such, 12 to 72 tests (median of 12) were conducted per literature association, and we classified as “marginally significant” those associations with KW p value between 0.05 and 0.001 (nominal p value of 0.001 with 72 tests yields a Bonferroni-corrected p value ~0.07). We only tested associations having at least 10 positive drugs and 50 negative drugs with available assay results; ADRs with counts below these thresholds were typically rare (e.g., death) and were classified as “not tested”.
Failure to achieve significance might be due to a poor selection of MedDRA PT for the ADR. For associations classified as marginal, not significant, or not tested, we repeated the statistical testing described above for each PT that shares a given HT with the literature-derived term. For example, Smit et al. mapped “urinary contraction” (given in the Bowes review as an ADR for CHRM3 activation) to the term “Bladder spasm” (MedDRA 10048994). Our dataset only contained six drugs annotated with this ADR (classification “not tested”). However, several MedDRA PTs sharing the same HT met our criteria of KW p value ≤0.001 and ROC AUC ≥0.6. Ordered by increasing p value (most significant first), these include “Urinary retention” (10046555), “Urinary hesitation” (10046542), “Micturition disorder” (10027561), and “Strangury” (10042170). The selection algorithm sorted all related terms by p value, accumulating the number of tests performed, and stopping at first satisfying the above criteria. In Supplementary Data 6, the association of CHRM3 activation with “urinary contraction” (literature ID 364) contains the p value and ROC AUC for “Urinary retention” and is classified as highly significant (p ≤ 1e-06), with distance 1 (i.e., the significant PT and starting PT are connected via 1 intermediate in the network, via the shared HT).
When no PT terms sharing a given HT were identified, terms sharing a high-level group term (HG) were examined. These are encoded as distance 2 in Supplementary Data 6 (i.e., the starting and significant terms are linked via two intermediates: HT then HG. For both the HT and HG expansion, we used a two-step process: first identifying possible related terms, then manually reviewing and confirming them. This is especially important for distance 2 relations where very broad (and sometimes opposite) effects are grouped at the HG level. Only 14 target-ADR pairs were found significant via a shared HG term (distance = 2), but not distance 0 or 1; 11 of these used the term “Ileus paralytic” (10021333) ascribed to “constipation”, “gastrointestinal motility decreased”, “gastrointestinal transit decreased”) reported in the three reviews.
The Jupyter notebook calc_lit_AE_vs_assay_score.ipynb was used to perform this analysis.
The final set of literature-derived target vs. ADR annotations, including those obtained from Smit et al. and our additions via the common HT and HG terms, are provided in Supplementary Data 6. Tabulation, as significant, marginally significant, not significant, or not tested in results, uses the strongest association for any of the individual MedDRA PTs regardless of their source.
Developmental ADRs from Lynch et al.9 are included in Supplementary Data 6 for completeness but were excluded from result summaries described throughout. Because these are rare effects, only three of these associations meet the criteria of 10 positives and 50 negatives in SIDER and/or FAERS.
Assay and ADR characteristics vs. validation of target-ADR pairs
Establishing statistical significance of a given target vs. ADR pair requires having a sufficient number of drugs that are positive and negative for the ADR, and a sufficient number of drugs with measurable activity in the assay. We selected minimal but arbitrary requirements of 10 positives, 50 negatives, and 10 or more non-qualified activity values to test the association. Failure to find significance may simply reflect the limited power of the dataset, i.e., the above cutoffs were not set high enough.
Because the majority of significant literature-reported associations were supported by the AC50 activity measure (rather than free or total margin), we repeated the selection process described above to identify the most significant MedDRA code, source (SIDER or FAERS), assay, and activity cutoff for the AC50 activity measure only. This avoided combining AC50 and margin-based activity measures having different scales, and for which standard percentile values would be non-comparable (free margin = 1 and AC50 = 1 µM are not comparable).
Literature-reported target vs. ADR pairs were classified as significant (222 pairs) or non-significant (497 pairs), using the criteria KW p value ≤0.001 and ROC AUC ≥0.6. Since this analysis used only the AC50 activity measure, there are fewer significant associations compared to Supplementary Data 6 (which used AC50, free and total margin). To identify families of ADRs more or less likely to be significant, MedDRA PTs were mapped to system organ classes (SOCs), and each literature association was annotated as assigned to (1) or not (0) a given SOC. Some PTs map to multiple SOCs (e.g., “Metabolism and nutrition disorders”, “Endocrine disorders”). Several summary statistics that capture the proportion of drugs with potent activity in the assay were selected: percentile values (2.5, 5, 10, 25, or 50th; e.g., assays with many potent drug activities will be represented with smaller percentile values), count of AC50 values less than or equal to 100, 500, or 1 μM, and count of drugs with assay results that are positive or negative for an ADR. The dataset used for this modeling is provided in Supplementary Data 13.
Lasso-penalized logistic regression modeling was performed using the R package Glmnet using default parameters [glmnet(x,y,family = “binomial”)] and the AUC metric for cross-validation [cv.glmnet(x,y,family = “binomial”, type.measure = “auc”)].
Unpublished assay results vs. known drug ADRs
Unpublished SPD activity results, i.e., drug-target pairs not reported in the sources we considered, were compared to known drug ADRs to determine if these unpublished activities might explain the ADRs. The following criteria were applied: (1) the drug has AC50 < 5 µM at a given target that is not reported in ChEMBL, DrugCentral, or the subscription resources, (2) the drug is known to cause a given ADR according to SIDER and/or FAERS, (3) literature-reported target vs. ADR relationship is significant in SPD (p ≤ 0.001) on one of more of AC50, free or total margin, (4) the drug’s activity on one or more of those significant measures is in the top three quartiles among all drugs active in the assay and having the ADR, (5) the drug’s on-target activity is not associated with this ADR, (6) any known off-target activities associated with the ADR have significantly lower potency compared to the new activity (≥10-fold). The final criteria avoid flagging unpublished weak activities unlikely to make significant additive contributions to the ADR risk.
Data analysis
IPython version 7.29 was used to run the provided Jupyter notebooks. Other analyses were performed with R 4.2.1. Plots were prepared in R, Excel (Office 365), and TIBCO Spotfire 11.8. Concentration-response curves were produced with GraphPad Prism 9.5.1. Final figures were assembled with Adobe Illustrator 27.6.1.
Reporting summary
Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.

