From COVID-19 vaccines to anticancer drugs and gene therapies for rare diseases, drug discovery is a meticulous process with research and therapy-specific milestones. A well-established drug discovery pipeline can help pave the path to market.
What is the Drug Discovery Pipeline?
Understanding the Drug Discovery Pipeline
A drug discovery pipeline involves several stages, including preclinical and clinical studies, submission to regulatory authorities and post-market surveillance.
A typical pipeline starts with target identification, which involves characterizing the genomic and biochemical abnormalities underlying the disease. The clinical significance of a potential target must be validated by investigating the therapeutic value of correcting the target activity.
After establishing the target, researchers start with a hit discovery process, screening libraries with hundreds of thousands of compounds. Here, assay development assists with characterizing the compounds' impact on the target(s) and disease phenotypes while revealing potential off-target interactions. Top-ranking compounds are modified to optimize potency and derisk adverse interactions in the hit-to-lead phase.
The hits are continuously optimized until a lead compound is selected to proceed to preclinical development, which comprises animal studies to delineate efficacy, safety, pharmacokinetics (PK, what the body does to the drug) and pharmacodynamics (PD, what the drug does to the body).
After preclinical data collection, a clinical trial design is established and submitted with an Investigational New Drug (IND) application. Once approved, clinical trials are conducted in the following sequence:
- Phase 1: Involves healthy volunteers and assesses safety and PK/PD profiles.
- Phase 2: Recruits patients to evaluate efficacy.
- Phase 3: Extends the phase 2 trials to a larger patient population and provides a more comprehensive efficacy and safety assessment.
After clinical development, a New Drug Application (NDA) comprising the preclinical and clinical reports is submitted to the regulatory authorities.
Even after approval, the pharmaceutical company remains in two-way communication with the regulatory authorities. At this stage, phase 4 studies and post-marketing surveillance are used to analyze the drug’s long-term safety and efficacy.
When companies adhere to the standard drug discovery pipeline format, they can deliver life-saving drugs to the market in a timely and effective manner. The pipeline helps ensure that patients benefit from novel drugs while the risk of adverse events is kept as low as possible.
How is Target Identification Conducted?
The Role of Biological Targets in Drug Discovery
A biological target is a gene, protein or enzyme that functions abnormally in a diseased state but can be restored to a healthy phenotype when treated with a drug candidate.
Given the immense time and financial resources required for drug discovery, development and clinical studies, the target must be correctly identified and validated.
Methods for Identifying Drug Targets
Target identification operates at the genomic and proteomic levels. On a genomic scale, sequences in a genome can be probed with enhancers or suppressors to identify the gene or set of genes conferring the greatest therapeutic benefit. However, considering post-transcriptional and post-translational modifications from gene to protein, proteomics-based target discovery can be more informative when targeting structural and functional protein changes between healthy, diseased and drug-treated states.
Both methods can be significantly accelerated by utilizing high-throughput screening (HTS), which involves testing many molecules for their impact on the target’s activity. HTS workflows comprise automated equipment, which maximizes reproducibility and output.
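One common way to put a number on the reproducibility of a screening plate is the Z'-factor, which compares the spread of the positive and negative control wells with the distance between their means. The article does not name this metric, so treat the following as a minimal sketch of one widely used option; the function name and the control readings are hypothetical.

```python
from statistics import mean, stdev


def z_prime(positive_controls, negative_controls):
    """Z'-factor = 1 - 3 * (sd_pos + sd_neg) / |mean_pos - mean_neg|."""
    separation = abs(mean(positive_controls) - mean(negative_controls))
    if separation == 0:
        raise ValueError("control means are identical; Z' is undefined")
    spread = 3 * (stdev(positive_controls) + stdev(negative_controls))
    return 1 - spread / separation


# Hypothetical plate-reader signals, for illustration only.
pos = [98.0, 101.0, 100.0, 99.0, 102.0]
neg = [10.0, 12.0, 11.0, 9.0, 8.0]
print(f"Z' = {z_prime(pos, neg):.3f}")
```

Values close to 1 indicate well-separated, low-noise controls. A threshold of roughly 0.5 is often used as a rule of thumb for an assay that is robust enough for screening, although the acceptable cutoff depends on the assay.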
Challenges in Target Identification
Several challenges lie between a target and its drug treatment. Firstly, diseases involve multiple pathways, making it challenging to pinpoint a single target. Even when a potential target is found, it may not be druggable because of its structural properties that shield potential drug-binding sites. Cell lines and animal models do not sufficiently recapitulate human physiology, which may lead to identifying false targets or target-drug interactions. Finally, the analysis of large transcriptomics and proteomics datasets remains a challenge in understanding disease mechanisms.
The Importance of Assay Development in Drug Discovery
A key milestone in a drug discovery pipeline is the in vitro efficacy and safety assessment of drug candidates using human cell and tissue models.
Types of Assays Used in Drug Discovery
Biochemical assays inspect the direct binding between the drug and its target to determine binding affinity and the drug's subsequent activating or inhibitory effects.
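To illustrate how a measured binding affinity translates into target engagement, the sketch below assumes the simplest case: a single binding site at equilibrium, where the fraction of target bound is [L] / (Kd + [L]), with Kd the dissociation constant. Real assays may need more elaborate models; the function name and the concentrations are hypothetical.

```python
def fraction_bound(ligand_conc, kd):
    """Fraction of target occupied under single-site equilibrium binding.

    ligand_conc and kd must share the same concentration unit (e.g. nM).
    """
    if ligand_conc < 0 or kd <= 0:
        raise ValueError("ligand_conc must be >= 0 and kd must be > 0")
    return ligand_conc / (kd + ligand_conc)


# Illustrative values only: a compound with Kd = 10 nM.
for conc_nm in (1.0, 10.0, 90.0):
    print(f"{conc_nm:>5} nM -> {fraction_bound(conc_nm, 10.0):.2f} bound")
```

Under this model, half of the target is occupied when the ligand concentration equals Kd, which is why a lower Kd corresponds to a higher binding affinity.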
Cell-based or in vitro assays test the drug molecule on cell cultures to understand how the drug influences cell viability, cell membrane integrity, proliferation, morphological features, gene expression, ion channel function (for neuronal and cardiac cell models) and metabolic activity.
Computation-based or in silico assays are performed via computer simulation software to predict binding scores, drug-target interactions, the free energy of binding, and the holistic impact of the drug-target interaction on the entire biological network.
How Assays Help Further the Drug Discovery Pipeline
The high rate of drugs failing in clinical trials underscores the importance of assay development in documenting various aspects of the drug's mechanism of action. These assays help establish that the drug molecule is worth the years of investment while guiding researchers on how to optimize drug performance. From this perspective, assay development is key to time- and cost-efficient drug discovery pipelines.
How Do Databases Support Drug Discovery Pipelines?
Types of Databases in Drug Discovery
Several databases play a crucial role in drug discovery by storing vast amounts of data that can assist at every stage of the pipeline. Researchers can use these databases to identify biologically relevant targets, optimize lead compounds, and streamline preclinical and clinical studies, with the aim of balancing maximum efficacy against minimal adverse effects.
They can be categorized as follows:
- Biological databases contain genomic, proteomic, and pathway-related information for target identification. Examples include GenBank, UniProt, Kyoto Encyclopedia of Genes and Genomes (KEGG) and Reactome.
- Chemical databases store information on chemical structures, properties, and bioactivities to assist in hit discovery and optimization. Examples include PubChem, ChEMBL and ZINC.
- Drug databases, such as DrugBank and BindingDB, provide insights into many known drugs, their targets and mechanisms of action.
- Toxicity and ADME databases provide information about toxicity and ADME (Absorption, Distribution, Metabolism, and Excretion) properties to help predict safety and drug metabolism. SwissADME and ToxCast are commonly used for this purpose.
- Clinical databases store clinical trial information, drug responses and disease-related data. For example, ClinicalTrials.gov lists ongoing and completed clinical trials.
The Future of Databases in Drug Discovery
Novel databases are emerging to foster personalized medicine and AI-driven drug discovery.
UK Biobank is a patient-powered platform encompassing a broad range of objective (e.g., data on patient samples) and subjective (e.g., patient lifestyle information) information. With the contribution of many UK-based participants, the database bolsters a deeper understanding of diseases on an individual level.
Another initiative is the Chan Zuckerberg Initiative’s CELLxGENE database, which hosts a large pool of single-cell transcriptomics data. Researchers can visualize, compare and interpret large-scale single-cell RNA sequencing (scRNA-seq) datasets for more accurate target identification and interpretation of disease mechanisms and drug responses.
The diverse and high-quality datasets from these databases are invaluable input for AI-driven drug discovery platforms, which use them to identify novel drug targets and biomarkers, predict drug toxicity and guide drug design. For example, UK Biobank was used in research to identify targets that highly influence lipid levels in heart failure.¹ Meanwhile, single-cell RNA-seq datasets of different cancer subtypes were compiled and uploaded to CELLxGENE to aid the construction of predictive models of cancer cell responses to immune checkpoint inhibitors.²
What are the Key Steps in Clinical Development and Approval?
Significant Milestones of Phase 1-4 Clinical Trials
Clinical trials are conducted in four phases to evaluate safety, efficacy, benefit-risk profile, and longitudinal effects.
Phase 1 trials employ 20-100 healthy volunteers to uncover safety, side effects, and maximum tolerated dose.
Phase 2 tests drug efficacy and safety on 100-500 patients with the target condition. Here, optimal therapeutic dose and short-term side effects are determined.
Phase 3 is conducted over multiple sites with 1000-5000 patients to confirm long-term efficacy and safety and record rare adverse events. Upon successful completion, a New Drug Application (NDA) or Biologics License Application (BLA) can be submitted to the FDA; other regulators, such as the EMA, have their own equivalent marketing applications.
Phase 4 involves post-market surveillance of the millions of patients using the drug worldwide. It can provide insight into the long-term effects while creating room for repurposing for new indications.
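The four phases can be captured in a small lookup structure, which is handy when tagging trial records or building a pipeline dashboard. This is only an illustrative sketch: the class and field names are hypothetical, and the enrollment ranges are the approximate figures quoted above, not regulatory limits.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class TrialPhase:
    number: int
    participants: str
    enrollment: Optional[Tuple[int, int]]  # (min, max); None where no range is given
    goal: str


PHASES = [
    TrialPhase(1, "healthy volunteers", (20, 100),
               "safety, side effects, maximum tolerated dose"),
    TrialPhase(2, "patients with the target condition", (100, 500),
               "efficacy, optimal therapeutic dose, short-term side effects"),
    TrialPhase(3, "patients across multiple sites", (1000, 5000),
               "confirm efficacy and safety, record rare adverse events"),
    TrialPhase(4, "patients using the marketed drug", None,
               "post-market surveillance of long-term effects"),
]


def next_phase(current: int) -> Optional[TrialPhase]:
    """Return the phase that follows `current`, or None after Phase 4."""
    for phase in PHASES:
        if phase.number == current + 1:
            return phase
    return None


for phase in PHASES:
    size = "-".join(map(str, phase.enrollment)) if phase.enrollment else "not specified"
    print(f"Phase {phase.number}: {phase.participants} ({size}) -> {phase.goal}")
```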
Below are brief explanations of the applications submitted to regulatory agencies over the course of a drug discovery pipeline.
Investigational New Drug (IND) applications are submitted to the FDA to allow the pharmaceutical company to commence clinical trials. The application must include preclinical data, manufacturing information, and proposed clinical trial protocols.
A Biologics License Application (BLA) is submitted to the FDA to approve vaccines, monoclonal antibodies, gene therapies and recombinant proteins for marketing. It is a comprehensive report that includes product characterization, manufacturing information, quality control, preclinical and clinical data and risk management plans.
A New Drug Application (NDA) is similar to a BLA in content but is submitted to obtain FDA approval for pharmaceutical drugs (small molecules).
FAQs
What are the stages of drug discovery?
The drug discovery process includes target identification, hit discovery, lead optimization, preclinical testing and clinical trials (Phases 1–3). If successful, the drug undergoes regulatory approval (e.g., NDA/BLA submission) before post-marketing surveillance (Phase 4).
What is the role of artificial intelligence in drug discovery?
AI accelerates drug discovery by predicting drug-target interactions, analyzing biological data and optimizing molecule design. It improves efficiency in virtual screening, biomarker discovery and clinical trial optimization.
What are the main factors influencing the success rate of new drugs?
Key factors include drug efficacy, safety, target selection, trial design, regulatory submissions and market competition.
How long does the drug discovery process typically take?
On average, it takes 10–15 years from discovery to approval, with high costs (~$1–2 billion) and a low success rate (~10%) from Phase 1 to market.
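The arithmetic behind a ~10% success rate can be made concrete with a short sketch. It uses only the approximate figure quoted above; the function names are hypothetical, and the choice of four sequential steps (Phases 1 to 3 plus regulatory review) is an assumption made for illustration.

```python
def candidates_per_approval(success_rate):
    """Expected number of Phase 1 entrants needed for one approved drug."""
    if not 0 < success_rate <= 1:
        raise ValueError("success_rate must be in (0, 1]")
    return 1 / success_rate


def average_step_rate(overall_rate, steps):
    """Per-step success rate that, applied `steps` times, gives overall_rate."""
    if not 0 < overall_rate <= 1 or steps < 1:
        raise ValueError("overall_rate must be in (0, 1] and steps >= 1")
    return overall_rate ** (1 / steps)


overall = 0.10  # approximate Phase 1-to-market success rate quoted above
print(f"Candidates per approval: {candidates_per_approval(overall):.0f}")
print(f"Average per-step success over 4 steps: {average_step_rate(overall, 4):.0%}")
```

In practice, success rates differ from phase to phase and by therapeutic area, so the per-step average is only a way of showing how several moderately likely steps compound into a low overall rate.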
References
1. Xiao J, Ji J, Zhang N, Yang X, Chen K, Chen L, et al. Association of genetically predicted lipid traits and lipid-modifying targets with heart failure. Eur J Prev Cardiol 2023;30(4):358-366.
2. Gondal MN, Cieslik M, Chinnaiyan AM. Integrated cancer cell-specific single-cell RNA-seq datasets of immune checkpoint blockade-treated patients. Sci Data 2025;12(1):139.