Phenomenex
Duration: 23:13 Min
Chromatographic Determination of mRNA Critical Quality Attributes
Transcript
| 0:01 | My name is Ramesh Indarkanti. I'm a Biologics Business Development Manager with Phenomenex. |
| 0:05 | Thank you for coming. mRNA is an important drug modality, as was seen with the recent |
| 0:12 | COVID vaccine development, where rapid development and deployment were possible |
| 0:17 | because of its many unique properties. Also, last week, the Nobel Prize in Physiology or Medicine |
| 0:23 | was awarded for the development and discovery of mRNA vaccines. So, like any other drug modality |
| 0:31 | out there, we have to understand some of the critical quality attributes of mRNA for it to |
| 0:37 | be useful as a drug. And for today's presentation, we'll focus on some of the chromatographic methods |
| 0:43 | by themselves or coupled with mass spectrometry to understand the critical quality attributes of |
| 0:49 | the mRNA molecule. Most of this work was carried out by Roxanna, our application scientist, and |
| 0:55 | I have the opportunity to present it to you guys here. |
| 0:59 | So, this is the brief overview of the presentation here. We'll just start with an introduction of |
| 1:04 | the critical quality attributes of mRNA drugs and vaccines, solely focusing on the mRNA molecule |
| 1:11 | itself and not the LNP component. Then we move on to a more in-depth look at 5' cap |
| 1:18 | characterization and efficiency, which is important for the mRNA's efficacy. Then we look at the ways we |
| 1:24 | can use enzymatic sequencing as well as mass spectrometry to understand the primary structural |
| 1:30 | integrity of mRNA. Then we want to talk about the poly(A) length distribution and heterogeneity, |
| 1:35 | which are important for the lifetime of the mRNA in the cells of the organism. |
| 1:43 | Then, finally, we look at mRNA aggregation as a way to establish drug substance and drug |
| 1:51 | product quality. So, here we're looking at the mRNA critical quality attributes, and |
| 1:59 | as you can see here, mRNA contains a 5' cap, which is a highly methylated chemical structure, |
| 2:06 | as we'll see in the coming slides. The cap itself determines the mRNA's efficacy, that is, its |
| 2:11 | translational efficiency, because this is where the translation factors bind to express |
| 2:17 | the protein. So, the amount of cap strongly influences |
| 2:24 | the mRNA's expression. It also differentiates the host's endogenous mRNA from that of |
| 2:31 | pathogens. So, one of the early discoveries was the importance of having the cap on synthetic |
| 2:36 | mRNAs to make them useful as vaccines as well as therapies. The cap is followed by the |
| 2:43 | 5' UTR, or untranslated region, which contains several regulatory |
| 2:49 | elements, followed by the open reading frame, or coding sequence, and then the 3' UTR, |
| 2:57 | or untranslated region. We need to understand the sequence integrity of these regions in order to |
| 3:03 | establish the sequence integrity of the translated protein as well. Finally, there is a poly(A) tail |
| 3:09 | here that is important in mRNA translocation, and the mRNA's lifetime is also |
| 3:17 | heavily influenced by the length of the poly(A) tail itself. So, we'll look at methods to understand |
| 3:23 | each one of these critical quality attributes of the mRNA. And here we are looking at the |
| 3:30 | overview of the workflow that's used in this present set of experiments here, |
| 3:34 | which starts with heat-denaturing the mRNA in the presence of urea as a denaturant, |
| 3:40 | then digesting it down to smaller oligonucleotides that are much more amenable to mass spectrometry analysis |
| 3:47 | using RNase 4. Since RNase 4 cleaves between U and A or U and G |
| 3:54 | and leaves a 3' phosphate, we incorporate a T4 polynucleotide kinase that removes the 3' |
| 4:00 | phosphates, as well as the 2',3'-cyclic phosphates that are formed during the digestion process. |
| 4:05 | This results in a much simpler pool of 3'-hydroxyl oligos, as opposed to a |
| 4:11 | combination of phosphates and cyclic phosphates that generates a much more complex pool of |
| 4:17 | shorter oligos. Now, we're going to subject this to LC-MS/MS. First, we're going to |
| 4:23 | use good chromatography on our ion-pair reversed-phase system using the Biozen Oligo column, |
| 4:28 | as you'll see, and then couple that to a high-resolution mass spectrometer, |
| 4:32 | the SCIEX ZenoTOF 7600 instrument. So, this is the comprehensive overview of the workflow. |
| 4:39 | Now, let's take a little bit of a closer look at the type of enzymes and how we would choose |
| 4:45 | various nucleases for the mRNA characterization workflow. If I were to use human RNase 4, |
| 4:54 | which cleaves between U and A and between U and G, that is, 3' to a U followed by A or G, then, |
| 5:02 | in the case of the EGFP mRNA example that we'll be looking at in this study, I end up generating a |
| 5:10 | nice, decent-sized oligonucleotide, 18 nucleotides long without the cap and about |
| 5:17 | 19 nucleotides long with the cap, which is perfectly useful for mass spectrometry-based sequencing |
| 5:23 | and quantification. On the other hand, if I were to use RNase T1, which cleaves 3' |
| 5:29 | to G residues, I'd end up generating really short oligonucleotide fragments that are not as |
| 5:36 | useful for sequencing and quantification. At the other extreme, if I were to |
| 5:40 | use E. coli MazF, which cleaves 5' to ACA triplets, I'd end up generating about |
| 5:47 | a 90-nucleotide oligonucleotide, which makes it very difficult to do sequencing using mass spectrometry. |
| 5:55 | So, all in all, it's really important to choose the right type of nuclease based on your |
| 6:01 | understanding of the mRNA. In the present study, we are going to be using human RNase 4 in |
| 6:06 | combination with T4 polynucleotide kinase to remove the phosphates formed on the 3' end. |
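The cleavage rule described above (cut 3' to a U when the next base is A or G) can be tried out in silico before committing to a nuclease. The following is only a minimal sketch: the function name `digest_rnase4` and the toy sequence are hypothetical illustrations, not the EGFP mRNA or any software used in the talk, and real digests also produce missed cleavages that this ignores.

```python
def digest_rnase4(seq: str) -> list[str]:
    """Split an RNA sequence after every U that is followed by A or G."""
    fragments = []
    start = 0
    for i in range(len(seq) - 1):
        # Cleavage site: between seq[i] == "U" and seq[i + 1] in {"A", "G"}.
        if seq[i] == "U" and seq[i + 1] in "AG":
            fragments.append(seq[start:i + 1])
            start = i + 1
    fragments.append(seq[start:])  # whatever remains is the 3'-terminal fragment
    return fragments


# Hypothetical toy sequence, used only to show the rule in action.
toy = "GGGAAUAAGAGUGCCACCAUGG"
fragments = digest_rnase4(toy)
for fragment in fragments:
    print(len(fragment), fragment)
```

Swapping the condition for a different rule (for example, cutting after every G to mimic RNase T1) gives a quick feel for how fragment lengths change with the choice of nuclease.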
| 6:15 | Here is the sequence. mRNAs can be complex and can come in a variety of lengths, |
| 6:20 | but for the present study, we are using the EGFP (enhanced green fluorescent protein) sequence. |
| 6:26 | It's about 908 nucleotides long with a mass of about 294,000 Daltons. As you can see, |
| 6:34 | if you use human RNase 4, which cleaves between U and G as well as U and |
| 6:42 | A residues, you'll generate a nice 5'-terminal fragment with a cap on it that will allow you to |
| 6:48 | do MS/MS-based sequencing, as well as good quantification using mass spectrometry |
| 6:54 | and chromatography. On the poly(A) tail side, cleavage between this U and |
| 7:02 | A gives you the poly(A) tail, whose length you can analyze by HPLC-mass |
| 7:10 | spectrometry as well. Now, obviously, good chromatography and choosing the appropriate |
| 7:18 | column for the type of analyte you're working with is going to be very important. |
| 7:23 | Oligonucleotide HPLC column requirements can be very different from those of proteins and small |
| 7:28 | molecules. In this regard, Phenomenex offers our Biozen Oligo HPLC column, which is |
| 7:35 | based on our core-shell technology that has a solid, impermeable inner core and a porous outer |
| 7:41 | shell. The porous outer shell is the one that's responsible for separation. This is a C18 |
| 7:46 | column that incorporates a hybrid particle technology that offers extreme pH stability. |
| 7:52 | It comes in our BioTi titanium hardware to reduce sample loss and non-specific binding. |
| 7:58 | And it's stable up to pH 12, which is very important when you're working with oligonucleotides. |
| 8:04 | In addition, some of the offerings out there for oligonucleotide columns are based on |
| 8:10 | fully porous particles. Fully porous particles, since they have a longer diffusion path, result in |
| 8:16 | greater band broadening. The core-shell particles, on the other hand, have shorter |
| 8:24 | diffusion paths and, as a result, give you higher efficiency as well as higher resolution. |
| 8:32 | Now, we'll be looking at three different things. One is the 5' cap. The second is the |
| 8:38 | sequence integrity. And the third one is the poly(A) distribution. Since these three |
| 8:43 | studies require three different types of mass spectrometry-based experiments, we've chosen the |
| 8:50 | ZenoTOF 7600 here. For the cap, it offers MRM-based quantification, |
| 8:58 | along with high-resolution measurements, which allows accurate quantification. |
| 9:03 | For sequencing, it offers data-dependent acquisition that will allow us to |
| 9:10 | get good sequence coverage for the mRNA. And for the poly(A) length distribution, |
| 9:18 | we have the accurate mass measurements that will give us information about the |
| 9:23 | poly(A) heterogeneity itself. Now, let's get a little bit deeper into the |
| 9:29 | mRNA cap characterization. As I mentioned before, the mRNA 5' cap is very important to ensure |
| 9:36 | accurate translation of the mRNA, as well as its efficacy. It also differentiates the |
| 9:42 | host's endogenous nucleic acids from those of pathogens. Since nucleic acids from viruses and |
| 9:48 | bacteria don't have the 5' cap, that's how our immune system can differentiate those from the |
| 9:53 | endogenous mRNA molecules. The 5' cap comprises N7-methylguanosine that is linked |
| 10:02 | via this 5'-to-5' triphosphate linkage to the first nucleotide of |
| 10:08 | the mRNA. In some cases, there is a free hydroxyl at the 2' position of the first nucleotide. We |
| 10:15 | call that cap 0. In cases where the 2' position is methylated, we call that cap 1. |
| 10:21 | In the present study, we'll be focusing on cap 0, which is part of the EGFP mRNA |
| 10:28 | we use in these experiments. Now, let's take a closer look at how we calculate the percent capping |
| 10:36 | efficiency. So this is the structure here. If you use human RNase 4, |
| 10:43 | as I mentioned before, it cleaves between the U and G residues. Since we are using T4 PNK, |
| 10:48 | that's the T4 polynucleotide kinase, which removes the phosphate, what we end up with is an oligo about 18 or |
| 10:54 | 19 nucleotides long, depending on whether it's uncapped or capped. You can also end up |
| 11:00 | with various other degradants, as shown here, and these are the accurate masses. So we'll use |
| 11:06 | a combination of MRM and accurate mass measurements to understand the levels of capped versus uncapped |
| 11:14 | species present in the samples here. And we're going to use this formula here for calculating the |
| 11:19 | capping efficiency in the mRNA samples. Now, a good mass spectrometry analysis |
| 11:29 | starts with good chromatography, and that's what we're seeing here. Running these samples |
| 11:34 | in MRM mode on our Biozen Oligo column, we get a nice separation between the cap 0 species, |
| 11:40 | which is actually capped, and the no-cap species. Remember, cap 0 is the cap without |
| 11:46 | the 2' methylation. The no-cap species is eluting around 41.5 minutes, and |
| 11:53 | the capped mRNA fragment is eluting around 45 minutes. Good, robust chromatography is important, |
| 12:00 | and we can see that across the replicates we have very consistent retention times, |
| 12:06 | giving us confidence in the robustness of the method. Also, when looking at the |
| 12:11 | peak areas for the capped versus uncapped species across the replicates, we have very |
| 12:15 | consistent results, and that gives us confidence in the robustness of our method as well. |
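The capping-efficiency formula shown on the slide is not captured in the transcript. The sketch below assumes a commonly used peak-area ratio, capped area divided by the sum of capped and uncapped areas, which may differ from the slide (for instance, if other degradants are included in the denominator). The function name and the replicate peak areas are hypothetical and chosen only for illustration; they are not data from the talk.

```python
from statistics import mean, stdev


def capping_efficiency(capped_area: float, uncapped_area: float) -> float:
    """Percent capping efficiency from MRM peak areas (assumed area-ratio definition)."""
    total = capped_area + uncapped_area
    if total <= 0:
        raise ValueError("total peak area must be positive")
    return 100.0 * capped_area / total


# Hypothetical (capped, uncapped) peak areas for three replicate injections.
replicate_areas = [(8600.0, 1400.0), (8500.0, 1500.0), (8550.0, 1450.0)]

efficiencies = [capping_efficiency(c, u) for c, u in replicate_areas]
mean_efficiency = mean(efficiencies)
# Relative standard deviation across replicates, a simple robustness check.
rsd_percent = 100.0 * stdev(efficiencies) / mean_efficiency

print(f"capping efficiency: {mean_efficiency:.1f}% (RSD {rsd_percent:.2f}%)")
```

Reporting the replicate RSD alongside the mean mirrors the point made above about consistent peak areas across replicates.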
| 12:22 | Here, we are looking at the accurate mass data of the no-cap sequence |
| 12:30 | that's generated from the RNase 4 digestion. This is the deconvoluted spectral data. In other words, |
| 12:36 | it's neutral mass data, and you can see nice isotopic resolution, even for an 18-mer |
| 12:43 | oligonucleotide here, which is going to be very important |
| 12:48 | to have a more precise understanding of the sequence as well. On the right, we are seeing |
| 12:54 | the CID spectral data. These blue L's represent |
| 13:00 | the 5' fragments that are generated by CID in the collision cell, and the red |
| 13:07 | L's indicate the 3' fragments. So, either the |
| 13:12 | red or the blue by themselves, that is, the 5' fragments or the 3' fragments by |
| 13:16 | themselves, don't give us complete sequence information, but if you combine these |
| 13:21 | two data sets, you actually get complete sequence coverage. So, we're not only able to |
| 13:27 | quantify the mRNA, but are also able to use the CID capabilities of this instrument to |
| 13:34 | get a complete sequence of this 5'-terminal oligo. Going back here, |
| 13:42 | we estimated the uncapped oligo to be about 14 percent. Along the same lines, |
| 13:50 | we're looking at the m7G-capped oligo here, which is a 19-mer oligo, and we also get nice, |
| 13:57 | complete sequence coverage for this particular sequence, so it gives us confidence in our |
| 14:02 | results. Using the MRM experiments, we estimate the amount of the |
| 14:09 | capped oligo to be about 85 percent. Let's move on to looking at the sequence mapping |
| 14:18 | information, that is, the sequence coverage, because we need to establish the primary |
| 14:22 | structure identity, which gives us confidence in the sequence of the protein that is being |
| 14:27 | expressed. One of the things to keep in mind when you use nucleases is that, |
| 14:32 | since mRNA oligos contain only four nucleotides, A, U, G, and C, |
| 14:39 | we can end up with multiple sequence variants that have the same exact mass, so |
| 14:44 | it's important to have good chromatography to be able to separate these sequence variants in |
| 14:49 | order to establish their identity and get good sequence coverage. Here we are looking at the |
| 14:56 | RNase 4-digested mRNA that's run on our Biozen Oligo column using an ion-pair reversed-phase |
| 15:02 | chromatography method, with hexafluoroisopropanol and isopropylamine as mobile phase modifiers in a |
| 15:09 | water-acetonitrile gradient system. If you're a chromatographer, you can really |
| 15:14 | appreciate the quality of the data we are getting here: nice, |
| 15:20 | sharp, well-separated peaks, and even for a complex mixture like this, you're getting a good |
| 15:25 | distribution of all these peaks. In the later part of the chromatogram here, |
| 15:30 | you see the poly(A) tail: all these little bumps, as well as |
| 15:36 | the big peak here, are the poly(A) tail, as we will see in the later slides. |
| 15:40 | Now, if you subject this to data-dependent acquisition on the |
| 15:45 | SCIEX ZenoTOF 7600 mass spectrometer, we can get the CID spectral data, which will allow us to |
| 15:53 | establish the sequence of all of these. From this present experiment using the RNase 4 |
| 16:00 | nuclease, we are able to get about 96% coverage. Obviously, I'm not |
| 16:06 | including the poly(A) tail in this 96% number. That will be discussed separately later, as that's a |
| 16:12 | separate set of experiments. As I mentioned before, short oligonucleotides can have slightly |
| 16:18 | different sequences but the same exact mass, in which case, on the mass spectrometer, |
| 16:23 | they're indistinguishable by mass alone. There, it becomes important to |
| 16:28 | chromatographically separate these short sequences and then use mass spectrometry to get their sequence |
| 16:34 | identity. What we're seeing here is three oligonucleotides that have the same exact mass |
| 16:39 | but different base positions. We call them sequence isomers. These sequence isomers |
| 16:47 | are nicely separated on our Biozen Oligo column, and we also have very consistent retention times |
| 16:53 | across replicates, giving us confidence in the robustness of the method. So, taking a |
| 16:58 | deeper look at the data we've shown in the previous slide, we can see |
| 17:04 | that peaks one to three have the same exact mass and are |
| 17:09 | indistinguishable by mass alone. But after separating them chromatographically, we pick |
| 17:14 | the doubly charged negative ions for each one of these peaks and subject them to CID |
| 17:20 | fragmentation, and we can get the complete sequence information for each one of these. Thereby, we |
| 17:26 | are improving the overall sequence coverage of our mRNA using this nuclease digestion, and we are able to |
| 17:32 | get 96% sequence coverage. Now, with that little bit of data on the sequence coverage, |
| 17:41 | let's move on to the poly(A) tail length distribution and heterogeneity itself. As I said, |
| 17:47 | the poly(A) tail is very important in enhancing the lifetime of the mRNA itself, and also in its |
| 17:55 | cellular translocation. So, it's an important attribute to measure in your drug substance as |
| 18:03 | well as drug product. For this, we're still using our RNase 4, which is cleaving between this U |
| 18:12 | and A and generating a long sequence, and there are various degradants and different |
| 18:19 | poly(A) tail lengths. Taking a closer look at the later part of the chromatogram that we saw a |
| 18:23 | few slides ago, you see a lot of bumps here, which we'll zoom into in the next |
| 18:29 | slide, that are actually coming from the different poly(A) chain lengths. This big |
| 18:38 | peak has multiple poly(A) chains as well. If you take a look at the MS data, it looks |
| 18:43 | extremely complex, but if you do a deconvolution on the spectral m/z data |
| 18:49 | and convert it to the mass domain, we can see nice, even spacing that corresponds to |
| 18:55 | an adenosine nucleotide, giving us confidence that this is the poly(A) tail. Here, we're looking at |
| 19:01 | the power of the chromatography itself to separate all these various poly(A) tail lengths, and we can |
| 19:08 | nicely separate poly(A) tail lengths up to 61. But as you get to longer and longer |
| 19:15 | oligos, the difference between the N-1 and full-length species is small, and as a result, the separation |
| 19:21 | becomes more difficult. Nonetheless, we can still use mass spectrometry to deconvolute this and |
| 19:26 | get additional information. So, chromatographically, we are able to separate up to |
| 19:31 | 61 nucleotides here. If we take this big peak here and perform deconvolution, |
| 19:38 | we see that the peak spacing is equal to that of an adenosine, telling us that this is, again, |
| 19:44 | the poly(A) tail. And we were able to detect a poly(A) tail length of up to 18 nucleotides in |
| 19:51 | this recent study. |
| 20:53 | Now, let's finally take a look at an additional quality attribute, which is the aggregation of |
| 20:58 | mRNA itself, which, according to USP guidelines, is a product quality attribute one needs to establish in |
| 21:05 | your mRNA samples. Now, obviously, size exclusion chromatography is well-suited for separating |
| 21:13 | mRNA and its aggregates, since the dimers, the trimers, and so on are going to be |
| 21:17 | two times, three times, and so on, the molecular weight of the monomer, so size |
| 21:23 | exclusion is going to be a very useful way to separate these. For this application, |
| 21:28 | we have used our Biozen dSEC-7 size exclusion column, which is a 700 angstrom pore size column. |
| 21:34 | It incorporates our BioTi titanium hardware, and it comes in various lengths and dimensions, |
| 21:42 | depending on your requirements. Here, we are looking at the EGFP mRNA that is separated on |
| 21:48 | the Biozen dSEC-7 HPLC column. As you can see here, the first peak is most likely coming |
| 21:56 | from the aggregate, and the really tall peak is coming from the monomer. But how do we know this |
| 22:01 | is aggregate? If you take this sample and heat it, the levels of the aggregate |
| 22:06 | go down, whereas the levels of the monomer go up, suggesting that a hydrogen bonding |
| 22:12 | type of aggregation is happening, and heating it to 70 degrees is decreasing the aggregate |
| 22:18 | levels. It's great that we were looking at the UV data in the previous slide, |
| 22:24 | but what if I want additional information? You can couple your dSEC-7 column to something |
| 22:29 | called a multi-angle light scattering detector, or MALS, that can give you molecular weight |
| 22:35 | information as well. From this, we can see that the major peak is the monomer, |
| 22:41 | and the larger species are the aggregates, the dimer, trimer, tetramer, and so on, |
| 22:49 | giving us confidence that we are detecting the aggregation accurately. |
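Because aggregates are expected at integer multiples of the monomer mass, a MALS-derived molecular weight can be mapped to an oligomer state with simple arithmetic. This is a rough sketch only: the monomer mass is the approximate 294,000 Da quoted earlier for the EGFP mRNA, while the function name and the measured values are hypothetical and not results from the talk.

```python
MONOMER_MASS_DA = 294_000  # approximate EGFP mRNA mass quoted earlier in the talk


def oligomer_state(measured_mass_da: float,
                   monomer_mass_da: float = MONOMER_MASS_DA) -> int:
    """Nearest integer multiple of the monomer mass (1 = monomer, 2 = dimer, ...)."""
    if measured_mass_da <= 0:
        raise ValueError("measured mass must be positive")
    return max(1, round(measured_mass_da / monomer_mass_da))


# Hypothetical MALS molecular weights (Da) for three SEC peaks.
measured = [300_000, 590_000, 880_000]
states = [oligomer_state(m) for m in measured]
print(states)
```

In practice MALS masses carry measurement uncertainty, so a tolerance check around each integer multiple would be a sensible addition before assigning a peak.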
| 22:54 | In summary, I hope I was able to convince you that the Phenomenex solutions for mRNA characterization |
| 23:01 | and critical quality attribute determination, encompassing oligo sequence mapping, |
| 23:06 | 5' cap efficiency, poly(A) tail length distribution, as well as aggregate determination, |
| 23:13 | can be very helpful in your day-to-day work. Thank you for your time. |