Phenomenex

Duration: 23:13 Min

Chromatographic Determination of mRNA Critical Quality Attributes

Transcript

0:01    My name is Ramesh Indarkanti. I'm a Biologics Business Development Manager with Phenomenex.
0:05    Thank you for coming. mRNA is an important drug modality, as was seen with the recent
0:12    COVID vaccine development, where rapid development and deployment were possible
0:17    because of its many unique properties. Also, last week, the Nobel Prize in Physiology or Medicine
0:23    was awarded for the development and discovery of mRNA vaccines. So, like any other drug modality
0:31    out there, we have to understand some of the critical quality attributes of mRNA for it to
0:37    be useful as a drug. And for today's presentation, we'll focus on some of the chromatographic methods
0:43    by themselves or coupled with mass spectrometry to understand the critical quality attributes of
0:49    the mRNA molecule. Most of this work was carried out by Roxanna, our application scientist, and
0:55    I have the opportunity to present it to you guys here.
0:59    So, this is a brief overview of the presentation. We'll start with an introduction to
1:04    the critical quality attributes of mRNA drugs and vaccines, solely focusing on the mRNA molecule
1:11    itself and not the LNP component. Then we'll look in more depth at the 5' cap
1:18    characterization and efficiency, which is important for mRNA's efficacy. Then we look at the ways we
1:24    can use enzymatic sequencing as well as mass spectrometry to understand the primary structural
1:30    integrity of mRNA. Then we want to talk about the poly(A) length distribution and heterogeneity,
1:35    which are important for the lifetime of the mRNA in the cells of the organism. And
1:43    then finally, we'll look at mRNA aggregation as a way to establish the drug substance
1:51    product quality. So, here we're looking at the mRNA critical quality attributes, and
1:59    as you can see here, mRNA contains a 5' cap, which is a highly methylated chemical structure,
2:06    as we'll see in the coming slides. And the cap itself determines the mRNA's efficacy, the
2:11    translational efficiency, because this is where the translation factors bind and express
2:17    the protein. So, the amount of cap strongly influences
2:24    the mRNA's expression. And also, it differentiates the host endogenous mRNA from those of the
2:31    pathogens. So, one of the early discoveries was the importance of having the cap on synthetic
2:36    mRNAs to make them useful as vaccines as well as therapies. The cap is followed by the
2:43    5' UTR, or untranslated region, which contains several regulatory
2:49    elements, followed by the open reading frame or coding sequence, as well as the 3' UTR,
2:57    or untranslated region. And we need to understand the sequence integrity of this in order to
3:03    establish the sequence integrity of the translated protein as well. Finally, there is a poly(A) tail
3:09    here that is important in mRNA translocation, and the mRNA lifetime is also
3:17    heavily influenced by the length of the poly(A) tail itself. So, we'll look at methods to understand
3:23    each one of these critical quality attributes of the mRNA. And here we are looking at the
3:30    overview of the workflow that's used in this present set of experiments here,
3:34    which starts with heat-denaturing the mRNA in the presence of urea as a denaturant,
3:40    and then digesting it down to smaller oligonucleotides that are much more amenable to mass spectrometry analysis
3:47    using RNase 4. And since RNase 4 cleaves between the U and the A or U and G,
3:54    and leaves the 3' phosphate, we are incorporating a T4 polynucleotide kinase that removes the 3'
4:00    phosphate, as well as 2'-3' cyclic phosphates that are formed during the digestion process.
4:05    Now, this results in a much more simplified hydroxylated pool, as opposed to having a
4:11    phosphate and a cyclic phosphate combination that generates a much more complex pool of
4:17    shorter oligos. Now, we're going to subject this to LC-MS/MS. First, we're going to
4:23    use good chromatography on our ion-pair reversed-phase system using the Biozen Oligo column,
4:28    as you'll see, and then couple that to a high-resolution mass spectrometer,
4:32    the SCIEX ZenoTOF 7600 instrument. So, this is the comprehensive overview of the workflow.
4:39    Now, let's take a little bit of a closer look at the type of enzymes and how we would choose
4:45    various nucleases for the mRNA characterization workflow. If I were to use human RNase 4,
4:54    which cleaves between U and A and between U and G, that is, 3' to a U that is followed by A or G, then,
5:02    in the case of the EGFP mRNA example that we'll be looking at in this study, I actually end up generating a
5:10    nice, decent-sized oligonucleotide, 18 nucleotides long without the cap and about
5:17    19 nucleotides long with the cap, which is perfectly useful for mass spectrometry-based sequencing
5:23    and quantification. But on the other hand, if I were to use RNase T1, which cleaves 3'
5:29    to G residues, I'll end up generating really short oligonucleotide fragments that are not as
5:36    useful for sequencing or quantification. At the other extreme, if I were to
5:40    use E. coli MazF, which cleaves 5' to ACA triplets, I'll end up generating about
5:47    a 90-nucleotide oligonucleotide, which makes it very difficult to do sequencing using mass spectrometry.
5:55    So, all in all, it's really important to choose the right type of nuclease based on your
6:01    understanding of the mRNA. In the present study, we are going to be using human RNase 4 in
6:06    combination with T4 polynucleotide kinase to remove the phosphates formed on the 3' end.
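The nuclease comparison above can be illustrated with a small in-silico digestion. This is only a minimal sketch, not the presenter's software: the toy sequence and the names `CLEAVAGE_RULES` and `digest` are hypothetical, and the cleavage rules are simplified from the specificities stated in the talk (RNase 4 between U and A or G, RNase T1 3' to G, MazF 5' to ACA). It assumes complete digestion with no missed cleavages.

```python
import re

# Simplified cleavage rules written as zero-width regular expressions.
# Each pattern matches the position between two nucleotides where the enzyme cuts.
CLEAVAGE_RULES = {
    "RNase 4": r"(?<=U)(?=[AG])",  # between U and A, or between U and G
    "RNase T1": r"(?<=G)",         # 3' to every G
    "MazF": r"(?=ACA)",            # 5' to every ACA triplet
}


def digest(sequence: str, enzyme: str) -> list[str]:
    """Return the fragments from a complete digestion of `sequence`."""
    rule = CLEAVAGE_RULES[enzyme]
    cut_sites = [match.start() for match in re.finditer(rule, sequence)]
    bounds = [0] + cut_sites + [len(sequence)]
    # Skip empty fragments that arise when a cut site falls at either end.
    return [sequence[a:b] for a, b in zip(bounds, bounds[1:]) if a < b]


if __name__ == "__main__":
    toy_mrna = "GGGAUACAUGCCUAGGACAUUGC"  # hypothetical sequence, not EGFP
    for enzyme in CLEAVAGE_RULES:
        fragments = digest(toy_mrna, enzyme)
        lengths = [len(fragment) for fragment in fragments]
        print(f"{enzyme:9s} fragment lengths: {lengths}")
```

Running the same comparison on a real construct would show which enzyme yields 5'-terminal and poly(A) fragments in a length range that suits MS/MS sequencing.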
6:15    Here is the sequence. I mean, mRNAs can be complex and can come in a variety of lengths,
6:20    but for the present study, we are using the EGFP, or enhanced green fluorescent protein, sequence.
6:26    It's about 908 nucleotides long with a mass of about 294,000 Daltons. And as you can see,
6:34    if you were to use human RNase 4, which cleaves between U and G as well as U and
6:42    A residues, you'll generate a nice 5'-terminal fragment with a cap on it that will allow you to
6:48    do MS/MS-based sequencing, as well as do good quantification using mass spectrometry
6:54    and chromatography. On the poly(A) tail side, you can cleave between this U and
7:02    A, and you'll end up with an idea of the poly(A) tail length, which you can analyze by HPLC-mass
7:10    spectrometry as well. Now, obviously, good chromatography and choosing the appropriate
7:18    column for the type of analyte you're working with are going to be very important. And
7:23    oligonucleotide HPLC column requirements can be very different from those for proteins and small
7:28    molecules. In this regard, Phenomenex offers our Biozen Oligo HPLC column, which is
7:35    based on our core-shell technology that has a solid impermeable inner core and a porous outer
7:41    shell. And the porous outer shell is the one that's responsible for separation. So this is a C18
7:46    column that incorporates a hybrid particle technology that offers extreme pH stability.
7:52    It comes in our BioTi titanium hardware to reduce sample loss and non-specific binding.
7:58    And it's stable up to pH 12, which is very important when you're working with oligonucleotides.
8:04    In addition, some of the offerings out there for the oligonucleotide columns are based on
8:10    fully porous particles. And fully porous particles, since they have a longer diffusion path, result in
8:16    greater band broadening. The core-shell particles, on the other hand, have shorter
8:24    diffusion paths and, as a result, give you higher efficiency as well as higher resolution.
8:32    Now, we'll be looking at three different things. One is the 5' cap. The second is the
8:38    sequence integrity. And the third one is the poly(A) distribution. And since these three
8:43    studies require three different types of mass spectrometry-based experiments, we've chosen the
8:50    ZenoTOF 7600 here. For the cap, it offers MRM-based quantification,
8:58    along with high-resolution measurements, which enable accurate quantification.
9:03    And for sequencing, it offers data-dependent acquisition that will allow us to
9:10    get good sequence coverage for the mRNA. And for the poly(A) length distribution,
9:18    we have the accurate mass measurements that will give us information about the
9:23    poly(A) heterogeneity itself. Now, let's get a little bit deeper into the
9:29    mRNA cap characterization. Like I mentioned before, the mRNA 5' cap is very important to ensure
9:36    accurate translation of mRNA, as well as efficacy. And it also differentiates the
9:42    host endogenous nucleic acids from those of the pathogens. Since nucleic acids from viruses and
9:48    bacteria don't have the 5' cap, that's how our immune system can differentiate those from the
9:53    endogenous mRNA molecules. And the 5' cap comprises N7-methylguanosine that is linked
10:02    via this triphosphate linkage, a 5'-to-5' triphosphate linkage, to the first nucleotide of
10:08    the mRNA. And in some cases, there could be a free hydroxyl at the 2' position of the first nucleotide. We
10:15    call that cap 0. And in cases where there is a methylation at the 2' position, we call that cap 1.
10:21    And in the present study, we'll be focusing on the cap 0, which is part of the EGFP mRNA
10:28    we use in these experiments. Now, let's take a closer look at how we calculate the percent cap
10:36    efficiency. So this is the structure here. If you are using human RNase 4, which,
10:43    as I mentioned before, cleaves between the U and G residues, and since we are also using T4 PNK,
10:48    that's the T4 polynucleotide kinase, which removes the phosphate, what we end up with is about 18 or
10:54    19 nucleotides long, depending on whether it's uncapped or capped. And you can also end up
11:00    with various other degradants as shown here, and these are the accurate masses. So we'll incorporate
11:06    a combination of MRM and accurate mass measurements to understand the levels of the capped versus uncapped
11:14    species present in the samples. And we're going to use this formula here for calculating the
11:19    capping efficiency in the mRNA samples. Now, a good mass spectrometry analysis
11:29    starts with good chromatography, and that's what we're seeing here. Running these samples
11:34    in MRM mode on our Biozen Oligo column, we can get a nice separation between the cap 0 species,
11:40    which is actually capped, and the no-cap species. Remember, cap 0 is the cap that is not methylated
11:46    at the 2' position. So, the no-cap species is eluting around 41.5 minutes, and the
11:53    capped species is eluting around 45 minutes. Robust chromatography is important,
12:00    and we can see that across the replicates we have very consistent retention times,
12:06    giving us confidence in the robustness of the method. And also, when looking at the
12:11    peak areas for the capped versus uncapped species across various replicates, we also have very
12:15    consistent results, and that gives us confidence in the robustness of our method as well.
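The capping-efficiency formula shown on the slide is not reproduced in the transcript. The sketch below assumes the commonly used ratio of the capped peak area to the sum of the capped and uncapped peak areas; the function names and the replicate peak areas are hypothetical, and the calculation ignores other degradants and any difference in MS response between the two species.

```python
from statistics import mean, stdev


def capping_efficiency(capped_area: float, uncapped_area: float) -> float:
    """Percent of 5' fragments that carry the cap, from two peak areas."""
    total = capped_area + uncapped_area
    if total <= 0:
        raise ValueError("peak areas must sum to a positive value")
    return 100.0 * capped_area / total


def percent_rsd(values: list[float]) -> float:
    """Relative standard deviation in percent, used as a repeatability check."""
    return 100.0 * stdev(values) / mean(values)


if __name__ == "__main__":
    # Hypothetical (capped, uncapped) MRM peak areas for three replicate injections.
    replicates = [(8.52e5, 1.46e5), (8.47e5, 1.49e5), (8.58e5, 1.44e5)]
    efficiencies = [capping_efficiency(c, u) for c, u in replicates]
    print("per-replicate capping efficiency (%):", [round(e, 1) for e in efficiencies])
    print(f"mean = {mean(efficiencies):.1f} %, RSD = {percent_rsd(efficiencies):.2f} %")
```

A low RSD across replicates is one simple way to put a number on the consistency of peak areas described above.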
12:22    Here, we are looking at the accurate mass data for the no-cap sequence
12:30    that's generated from the RNase 4 digestion. This is the deconvoluted spectral data. In other words,
12:36    it's neutral mass data, and you can see nice isotopic resolution, even for an 18-mer
12:43    oligonucleotide here, which is going to be very important for
12:48    a more precise understanding of the sequence as well. On the right, we are seeing
12:54    the CID spectral data, and these blue L's represent the fragments that are generated,
13:00    the five-prime fragments that are generated due to the CID in the collision cell, and the red
13:07    L's indicate the three-prime fragments. So, if you look at either the
13:12    red or the blue alone, that is, the three-prime fragments or the five-prime fragments by
13:16    themselves, they don't give us complete sequence information, but if you were to combine these
13:21    two data sets, you actually get complete sequence coverage. So, we're not only able to
13:27    quantify the mRNA, but are also able to use the CID capabilities of this instrument to
13:34    get a complete sequence of this five-prime capped oligo. Going back here,
13:42    we estimated the uncapped oligo to be about 14 percent. Along the same lines,
13:50    we're looking at the m7G-capped oligo here, that's a 19-mer oligo, and we also get nice,
13:57    complete sequence coverage for this particular sequence, so it gives us confidence in our
14:02    results. And using the MRM experiments, we estimate the amount of the
14:09    capped oligo to be about 85 percent. Let's move on to looking at the sequence mapping
14:18    information, that's for the sequence coverage, because we need to establish the primary
14:22    structure identity, which gives us confidence in the sequence of the protein that is being
14:27    expressed. One of the things to keep in mind is that if you use nucleases,
14:32    since mRNA oligos contain only four nucleotides, A, U, G, and C,
14:39    you can end up with multiple sequence variants that have the same exact mass, so
14:44    it's important to have good chromatography to be able to separate these sequence variants in
14:49    order to establish their identity and get good sequence coverage. Here we are looking at the
14:56    RNase 4-digested mRNA run on our Biozen Oligo column using an ion-pair reversed-phase
15:02    chromatography method, using hexafluoroisopropanol and isopropylamine as mobile phase modifiers in a
15:09    water-acetonitrile gradient system. And you know, if you're a chromatographer, you can really
15:14    appreciate the quality of the data we are getting here. The peaks are nice, sharp,
15:20    and well separated, and even for a complex mixture like this, you're getting a good
15:25    distribution of all these peaks. In the later part of the chromatogram here,
15:30    you see the poly(A) tail: all these little bumps, as well as
15:36    the big peak here, are the poly(A) tail, as we will see in the later slides.
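The point that digestion products built from only four nucleotides can share the same exact mass follows from the fact that the neutral mass depends on base composition, not base order. The sketch below is an illustration under stated assumptions: the fragment list and the function names are hypothetical, the residue masses are rounded monoisotopic values, and the fragments are assumed to carry 5'-OH and 3'-OH ends, as expected after the T4 polynucleotide kinase step.

```python
from collections import defaultdict

# Monoisotopic masses (Da) of nucleotide residues inside an RNA chain
# (nucleoside monophosphate minus water), rounded to four decimals.
RESIDUE_MASS = {"A": 329.0525, "C": 305.0413, "G": 345.0474, "U": 306.0253}
WATER = 18.0106
HPO3 = 79.9663


def neutral_mass(sequence: str) -> float:
    """Monoisotopic neutral mass of a linear oligo with 5'-OH and 3'-OH ends."""
    # Sum in sorted order so every sequence isomer gives an identical float.
    residues = sum(RESIDUE_MASS[base] for base in sorted(sequence))
    # n residues carry n phosphates; a 5'-OH/3'-OH oligo has only n - 1.
    return residues + WATER - HPO3


def group_isomers(fragments: list[str]) -> dict[str, list[str]]:
    """Group fragments by base composition; each group shares one exact mass."""
    groups = defaultdict(list)
    for fragment in fragments:
        groups["".join(sorted(fragment))].append(fragment)
    return dict(groups)


if __name__ == "__main__":
    fragments = ["GACU", "AGCU", "CAGU", "GGAU"]  # hypothetical digestion products
    for composition, members in group_isomers(fragments).items():
        print(f"{composition}: {neutral_mass(composition):.4f} Da -> {members}")
```

Any group with more than one member cannot be told apart by intact mass, which is why chromatographic separation followed by MS/MS is needed for those fragments.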
15:40    Now, if we subject this to data-dependent acquisition on the
15:45    SCIEX ZenoTOF 7600 mass spectrometer, we can get the CID spectral data, which will allow us to
15:53    establish the sequence of all of this. And from this present experiment using the RNase 4
16:00    nuclease, we are able to get about 96% coverage for this. And obviously, I'm not
16:06    including the poly(A) tail in this 96% number. That will be discussed later separately, and that's a
16:12    separate set of experiments. And as I mentioned before, short oligonucleotides can have slightly
16:18    different sequences but the same exact mass, in which case, on the mass spectrometer,
16:23    they're indistinguishable by mass alone. There, it becomes important to
16:28    chromatographically separate these short sequences and then use the mass spectrometry to get their sequence
16:34    identity. And what we're seeing here is three oligonucleotides that have the same exact mass,
16:39    but have different base positions. We call them sequence isomers. And these sequence isomers
16:47    are nicely separated on our Biozen Oligo column, and we also have very consistent retention times
16:53    across replicates as well, giving us confidence in the robustness of the method. So, taking a
16:58    little bit of a deeper look at the data we've shown in the previous slide, we can see that
17:04    we have peaks one to three that have the same exact mass and are
17:09    indistinguishable by mass alone. But having separated them chromatographically, we pick
17:14    the doubly charged negative ions for each one of these peaks and subject them to CID
17:20    fragmentation, and we can get the complete sequence information for each one of these. So, thereby, we
17:26    are improving overall sequence coverage of our mRNA using this nuclease digestion, and we are able to
17:32    get 96% sequence coverage. Now, with that, you know, little bit of data on the sequence coverage,
17:41    let's move on to the poly(A) tail length distribution and heterogeneity itself. Like I said,
17:47    the poly(A) tail is very important in enhancing the life of the mRNA itself, and also in its
17:55    cellular translocation. So, it's an important attribute to measure in your drug substance as
18:03    well as drug product. Now, for this, we're still using our RNase 4, which cleaves between this U
18:12    and A and generates a long sequence, and there are various degradants and different
18:19    poly(A) tail lengths. Taking a closer look at the later part of the chromatogram that we saw a
18:23    few slides ago, you see a lot of bumps here, which we'll zoom into in the next
18:29    slide, that are actually coming from the different poly(A) chain lengths. And this big
18:38    peak has multiple poly(A) chains as well. And if you were to take a look at the MS data, it looks
18:43    extremely complex, but if you were to do a deconvolution on the spectral m over z data,
18:49    and convert this to the mass domain, we can see nice even spacing that corresponds to
18:55    an adenosine nucleotide giving us confidence that this is the poly(A) tail. Here, we're looking at
19:01    the power of the chromatography itself to separate all these various poly(A) tail lengths, and we can
19:08    nicely separate up to 61 for the poly(A) tail length. But as you get to longer and longer
19:15    oligos, the difference between the N-1 and full lengths is small, and as a result, the separation
19:21    becomes more difficult. But nonetheless, we can still use the mass spectrometry to deconvolute this and
19:26    get additional information. So, chromatographically, we are able to separate up to
19:31    61 nucleotides in this. And if we were to take this big peak here and perform deconvolution,
19:38    we see that the peak spacing is equal to that of an adenosine, telling us that this is, again,
19:44    the poly(A) tail.
20:48    And we were able to detect a poly(A) tail length of up to 18 nucleotides in this recent study.
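The spacing check described above, where neighboring deconvoluted peaks differ by one adenosine, can be sketched as follows. The mass list is hypothetical, the function names are illustrative, and the 0.5 Da tolerance is an assumption; 329.21 Da is the average mass of one adenosine monophosphate residue in the chain.

```python
ADENOSINE_RESIDUE_AVG = 329.21  # average mass (Da) added per adenosine residue


def ladder_spacings(masses: list[float]) -> list[float]:
    """Differences between neighboring deconvoluted masses, lowest to highest."""
    ordered = sorted(masses)
    return [later - earlier for earlier, later in zip(ordered, ordered[1:])]


def is_adenosine_ladder(masses: list[float], tolerance: float = 0.5) -> bool:
    """True when every neighboring pair differs by one adenosine residue."""
    spacings = ladder_spacings(masses)
    return bool(spacings) and all(
        abs(gap - ADENOSINE_RESIDUE_AVG) <= tolerance for gap in spacings
    )


if __name__ == "__main__":
    # Hypothetical deconvoluted masses for five consecutive poly(A) tail lengths.
    observed = [19000.0 + n * ADENOSINE_RESIDUE_AVG for n in range(5)]
    print([round(gap, 2) for gap in ladder_spacings(observed)])
    print("consistent with a poly(A) ladder:", is_adenosine_ladder(observed))
```

This strict version fails if one tail length is missing from the series, since the gap then spans two residues; a real analysis would likely allow integer multiples of the spacing.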
20:53    Now, let's finally take a look at an additional quality attribute, which is the aggregation of
20:58    mRNA itself, which, according to USP guidelines, is a product quality attribute one needs to establish in
21:05    your mRNA samples. Now, obviously, size exclusion chromatography is well-suited for separating
21:13    mRNA and its aggregates, since the dimers, the trimers, and so on are going to be
21:17    two times, three times, and so on, the molecular weight of the monomer, and so size
21:23    exclusion is going to be a very useful way to separate them. And for this application,
21:28    we have used our Biozen dSEC-7 size exclusion column, which is a 700 angstrom pore size column,
21:34    and it incorporates our BioTi titanium hardware, and it comes in various lengths and dimensions,
21:42    depending on your requirements. Here, we are looking at the EGFP mRNA that is separated on
21:48    the Biozen dSEC-7 HPLC column. And as you can see here, the first peak is coming most likely
21:56    from the aggregate, and the really tall peak is coming from the monomer. But how do we know this
22:01    is the aggregate? If you take this sample and heat it, the levels of the aggregate
22:06    go down, whereas the levels of the monomer go up, suggesting that a hydrogen-bonding
22:12    type of aggregation is happening, and heating it to 70 degrees is actually decreasing the aggregate
22:18    levels. Now, it's great that we were looking at the UV data in the previous slide,
22:24    but what if I want additional information? So, you can couple your dSEC-7 column to something
22:29    called a multi-angle light scattering detector, or MALS, that can give you molecular weight
22:35    information as well. So, from this, we can see that the major peak is the monomer,
22:41    and the larger species are the aggregates, the dimer, trimer, tetramer, and so on,
22:49    giving us the confidence that we are detecting the aggregation accurately.
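Assigning peaks from molecular weight, as described above, amounts to dividing the measured mass by the monomer mass and rounding to the nearest whole number. The sketch below is illustrative only: the measured masses are hypothetical, the function names are not from any vendor software, and the monomer mass is the approximate 294,000 Da quoted earlier for the EGFP mRNA.

```python
MONOMER_MASS_DA = 294_000.0  # approximate EGFP mRNA mass quoted in the talk

STATE_NAMES = {1: "monomer", 2: "dimer", 3: "trimer", 4: "tetramer"}


def oligomer_state(measured_mass_da: float,
                   monomer_mass_da: float = MONOMER_MASS_DA) -> int:
    """Nearest whole number of monomer units for a measured molecular weight."""
    if measured_mass_da <= 0 or monomer_mass_da <= 0:
        raise ValueError("masses must be positive")
    return max(1, round(measured_mass_da / monomer_mass_da))


def describe(measured_mass_da: float) -> str:
    """Human-readable name for the oligomeric state of a peak."""
    units = oligomer_state(measured_mass_da)
    return STATE_NAMES.get(units, f"{units}-mer")


if __name__ == "__main__":
    # Hypothetical MALS molecular weights (Da), listed in elution order.
    for mass in (1_180_000.0, 880_000.0, 590_000.0, 300_000.0):
        print(f"{mass:>11,.0f} Da -> {describe(mass)}")
```

Larger species elute first in size exclusion, so the list above runs from the highest aggregate down to the monomer.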
22:54    In summary, I hope I was able to convince you that the Phenomenex solutions for mRNA characterization
23:01    and critical quality attribute determination encompassing the oligo sequence mapping,
23:06    5'-cap efficiency, poly(A) tail length distribution, as well as aggregate determination
23:13    can be very helpful in your day-to-day work. Thank you for your time.