Phenomenex
Duration: 23:13 Min
Chromatographic Determination of mRNA Critical Quality Attributes
Transcript
0:01 | My name is Ramesh Indarkanti. I'm a Biologics Business Development Manager with Phenomenex. |
0:05 | Thank you for coming. mRNA is an important drug modality, as was seen with the recent |
0:12 | COVID vaccine development, where rapid development and deployment were possible |
0:17 | because of its many unique properties. Also, last week, the Nobel Prize in Physiology or Medicine |
0:23 | was awarded for the development and discovery of mRNA vaccines. So, like any other drug modality |
0:31 | out there, we have to understand some of the critical quality attributes of mRNA for it to |
0:37 | be useful as a drug. And for today's presentation, we'll focus on some of the chromatographic methods |
0:43 | by themselves or coupled with mass spectrometry to understand the critical quality attributes of |
0:49 | the mRNA molecule. Most of this work was carried out by Roxanna, our application scientist, and |
0:55 | I have the opportunity to present it to you guys here. |
0:59 | So, this is a brief overview of the presentation. We'll start with an introduction to |
1:04 | the critical quality attributes of mRNA drugs and vaccines, focusing solely on the mRNA molecule |
1:11 | itself and not the LNP component. Then we'll move on to a more in-depth look at 5' cap |
1:18 | characterization and capping efficiency, which is important for the mRNA's efficacy. Then we'll look at the ways we |
1:24 | can use enzymatic sequencing as well as mass spectrometry to understand the primary structural |
1:30 | integrity of mRNA. Then we'll talk about the poly(A) length distribution and heterogeneity, |
1:35 | which are important for the lifetime of the mRNA in the cells of the organism. |
1:43 | Finally, we'll look at mRNA aggregation as a way to establish drug substance and drug |
1:51 | product quality. So, here we're looking at the mRNA critical quality attributes, and |
1:59 | as you can see here, mRNA contains a 5' cap, which is a highly methylated chemical structure, |
2:06 | as we'll see in the coming slides. The cap itself determines the mRNA's efficacy, that is, its |
2:11 | translational efficiency, because this is where the translation factors bind to express |
2:17 | the protein. So, the amount of cap strongly influences |
2:24 | the mRNA's expression. It also differentiates the host's endogenous mRNA from that of |
2:31 | pathogens. So, one of the early discoveries was the importance of having the cap on synthetic |
2:36 | mRNAs to make them useful as vaccines as well as therapies. The cap is followed by the |
2:43 | 5' UTR, or untranslated region, which contains several regulatory |
2:49 | elements, followed by the open reading frame, or coding sequence, and then the 3' UTR, |
2:57 | or untranslated region. And we need to understand the sequence integrity of this in order to |
3:03 | establish the sequence integrity of the translated protein as well. Finally, there is a poly(A) tail |
3:09 | here that is important in mRNA translocation, and the lifetime of the mRNA is also |
3:17 | heavily influenced by the length of the poly(A) tail itself. So, we'll look at methods to understand |
3:23 | each one of these critical quality attributes of the mRNA. And here we are looking at the |
3:30 | overview of the workflow used in the present set of experiments, |
3:34 | which starts with heat-denaturing the mRNA in the presence of urea as a denaturant, |
3:40 | then digesting it down to smaller oligonucleotides that are much more amenable to mass spectrometry analysis |
3:47 | using RNase 4. Since RNase 4 cleaves between U and A or U and G |
3:54 | and leaves a 3' phosphate, we are incorporating a T4 polynucleotide kinase that removes the 3' |
4:00 | phosphate, as well as the 2',3'-cyclic phosphates that are formed during the digestion process. |
4:05 | Now, this results in a much simpler pool of hydroxyl-terminated oligos, as opposed to having a |
4:11 | phosphate and cyclic phosphate combination that generates a much more complex pool of |
4:17 | shorter oligos. Now, we're going to subject this to LC-MS/MS. First, we're going to |
4:23 | use good chromatography on our ion-pair reversed-phase system using the Biozen Oligo column, |
4:28 | as you'll see, and then couple that to a high-resolution mass spectrometer, |
4:32 | the SCIEX ZenoTOF 7600 instrument. So, this is the comprehensive overview of the workflow. |
4:39 | Now, let's take a little bit of a closer look at the type of enzymes and how we would choose |
4:45 | various nucleases for the mRNA characterization workflow. If I were to use human RNase 4, |
4:54 | which cleaves between U and A and between U and G, that is, 3' to a U that is followed by A or G, then, |
5:02 | in the case of the EGFP mRNA example that we'll be looking at in this study, I end up generating a |
5:10 | nice, decent-sized oligonucleotide, 18 nucleotides long without the cap and about |
5:17 | 19 nucleotides long with the cap, which is well suited to mass spectrometry-based sequencing |
5:23 | and quantification. On the other hand, if I were to use RNase T1, which cleaves on the 3' side |
5:29 | of G residues, I'd end up generating really short oligonucleotide fragments that are not as |
5:36 | useful for sequencing or quantification. At the other extreme, if I were to |
5:40 | use E. coli MazF, which cleaves 5' to ACA triplets, I'd end up generating about |
5:47 | a 90-nucleotide oligonucleotide, which makes it very difficult to do sequencing using mass spectrometry. |
5:55 | So, all in all, it's really important to choose the right type of nuclease based on your |
6:01 | understanding of the mRNA. In the present study, we are going to be using human RNase 4 in |
6:06 | combination with T4 polynucleotide kinase to remove the phosphates formed on the 3' end. |
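The cleavage rules described above can be illustrated with a small in silico digest. This is a minimal sketch, not the workflow used in the study: the function names and the short example sequence are hypothetical, and the rules are simplified to the ones stated in the talk (RNase 4 cuts 3' of a U that is followed by A or G; RNase T1 cuts 3' of every G). End-group chemistry (phosphate, cyclic phosphate, hydroxyl) is ignored.

```python
def digest(seq, cuts_after):
    """Split an RNA sequence wherever cuts_after(seq, i) says to cut after position i."""
    seq = seq.upper()
    fragments, start = [], 0
    # The last position is never a cut site: there is nothing after it to release.
    for i in range(len(seq) - 1):
        if cuts_after(seq, i):
            fragments.append(seq[start:i + 1])
            start = i + 1
    fragments.append(seq[start:])
    return fragments


def rnase4_rule(seq, i):
    """Simplified RNase 4 rule from the talk: cut after U when the next base is A or G."""
    return seq[i] == "U" and seq[i + 1] in "AG"


def rnase_t1_rule(seq, i):
    """Simplified RNase T1 rule from the talk: cut after every G."""
    return seq[i] == "G"


# Hypothetical sequence for illustration only (not the EGFP mRNA).
example = "GGGAUACCUGAAUCC"
rnase4_fragments = digest(example, rnase4_rule)
t1_fragments = digest(example, rnase_t1_rule)
print("RNase 4:", rnase4_fragments)
print("RNase T1:", t1_fragments)
```

On this toy sequence the RNase 4 rule gives three fragments, while the T1 rule gives five, three of them single nucleotides, which is the kind of trade-off in fragment length the nuclease comparison is about.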
6:15 | Here is the sequence. mRNAs can be complex and can come in a variety of lengths, |
6:20 | but for the present study, we are using the EGFP (enhanced green fluorescent protein) sequence. |
6:26 | It's about 908 nucleotides long with a mass of about 294,000 Daltons. And as you can see, |
6:34 | if you were to use human RNase 4, which cleaves between U and G as well as U and |
6:42 | A residues, you'll generate a nice 5'-terminal fragment with a cap on it that will allow you to |
6:48 | do MS/MS-based sequencing, as well as do good quantification using mass spectrometry |
6:54 | and chromatography. On the poly(A) tail side, you get cleavage between this U and |
7:02 | A, and you end up with a fragment that gives you an idea of the poly(A) tail length, which you can analyze by HPLC and mass |
7:10 | spectrometry as well. Now, obviously, good chromatography and choosing the appropriate |
7:18 | column for the type of analyte you're working with is going to be very important. And |
7:23 | oligonucleotide HPLC column requirements can be very different from those of proteins and small |
7:28 | molecules. In this regard, Phenomenex offers our Biozen Oligo HPLC column, which is |
7:35 | based on our core-shell technology, with a solid, impermeable inner core and a porous outer |
7:41 | shell. The porous outer shell is the part that's responsible for separation. So this is a C18 |
7:46 | column that incorporates a hybrid particle technology that offers extreme pH stability. |
7:52 | It comes in our BioTi titanium hardware to reduce sample loss and non-specific binding. |
7:58 | And it's stable up to pH 12, which is very important when you're working with oligonucleotides. |
8:04 | In addition, some of the offerings out there for the oligonucleotide columns are based on |
8:10 | fully porous particles. And fully porous particles, since they have a longer diffusion path, result in |
8:16 | greater band broadening. Core-shell particles, on the other hand, have shorter |
8:24 | diffusion paths and, as a result, give you higher efficiency as well as higher resolution. |
8:32 | Now, we'll be looking at three different things. One is the 5' cap. The second is the |
8:38 | sequence integrity. And the third one is the poly(A) distribution. And since these three |
8:43 | studies require three different types of mass spectrometry-based experiments, we've chosen the |
8:50 | ZenoTOF 7600 here. For the cap, it offers MRM-based quantification, |
8:58 | along with high-resolution measurements, which enables accurate quantification. |
9:03 | For sequencing, it offers data-dependent acquisition that will allow us to |
9:10 | get good sequence coverage for the mRNA. And for the poly(A) length distribution, |
9:18 | we have the accurate mass measurements that will give us information about the |
9:23 | poly(A) heterogeneity itself. Now, let's get a little bit deeper into the |
9:29 | mRNA cap characterization. Like I mentioned before, the mRNA 5' cap is very important to ensure |
9:36 | accurate translation of mRNA, as well as efficacy. And it also differentiates the |
9:42 | host's endogenous nucleic acids from those of pathogens. Since nucleic acids from viruses and |
9:48 | bacteria don't have the 5' cap, that's how our immune system can differentiate those from the |
9:53 | endogenous mRNA molecules. The 5' cap comprises N7-methylguanosine that is linked |
10:02 | via this triphosphate linkage, a 5'-to-5' triphosphate linkage, to the first nucleotide of |
10:08 | the mRNA. In some cases, there is a free hydroxyl at the 2' position of the first nucleotide. We |
10:15 | call that cap 0. And in cases where the 2' position is methylated, we call that cap 1. |
10:21 | And in the present study, we'll be focusing on the cap 0, which is part of the EGFP mRNA |
10:28 | we use in these experiments. Now, let's take a closer look at how we calculate the percent cap |
10:36 | efficiency. So this is the structure here. Human RNase 4, |
10:43 | like I mentioned before, cleaves between the U and G residues. Since we are using T4 PNK, |
10:48 | that's T4 polynucleotide kinase, which removes the phosphate, what we end up with is an oligo 18 or |
10:54 | 19 nucleotides long, depending on whether it's uncapped or capped. And you can also end up |
11:00 | with various other degradants as shown here, and these are the accurate masses. So we'll use |
11:06 | a combination of MRM and accurate mass measurements to understand the levels of capped versus uncapped |
11:14 | species present in the samples here. And we're going to use this formula here for calculating the |
11:19 | capping efficiency in the mRNA samples. Now, a good analysis, mass spectrometry analysis, |
11:29 | starts with good chromatography, and that's what we're seeing here. Running these samples |
11:34 | in MRM mode on our bioanalytical column, we can get a nice separation between the cap 0 species, |
11:40 | which is actually capped, and the no-cap species. Remember, cap 0 is the cap that is not methylated |
11:46 | at the 2' position. The no-cap species is eluting around 41.5 minutes, and |
11:53 | the capped species is eluting around 45 minutes. Robust chromatography is important, |
12:00 | and we can see that across the replicates we have very consistent retention times, |
12:06 | giving us confidence in the robustness of the method. Also, when looking at the |
12:11 | peak areas for the capped versus uncapped species across replicates, we have very |
12:15 | consistent results, and that gives us confidence in the robustness of our method as well. |
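The capping efficiency formula shown on the slide is not captured in this transcript. A commonly used definition, assumed here, is the capped peak area divided by the sum of capped and uncapped peak areas, times 100. The sketch below applies that assumed definition and computes a percent RSD across replicates as one way to express the consistency mentioned above. The peak areas are made-up illustration values, not data from the study, and other degradants are ignored.

```python
from statistics import mean, stdev


def capping_efficiency(capped_area, uncapped_area):
    """Percent capping efficiency under the assumed definition capped / (capped + uncapped) * 100."""
    total = capped_area + uncapped_area
    if total <= 0:
        raise ValueError("total peak area must be positive")
    return 100.0 * capped_area / total


def percent_rsd(values):
    """Relative standard deviation (sample standard deviation / mean) in percent."""
    return 100.0 * stdev(values) / mean(values)


# Hypothetical (capped, uncapped) MRM peak areas for three replicate injections.
replicates = [(8600.0, 1400.0), (8500.0, 1500.0), (8700.0, 1300.0)]
efficiencies = [capping_efficiency(capped, uncapped) for capped, uncapped in replicates]

print("Capping efficiency per replicate (%):", efficiencies)
print("Mean (%):", mean(efficiencies))
print("RSD (%):", round(percent_rsd(efficiencies), 2))
```

If degradant species are also quantified, they would need to be added to the denominator; that choice depends on the formula actually used in the method.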
12:22 | Here, we are looking at the accurate mass data of the no-cap sequence |
12:30 | that's generated from the RNase 4 digestion. This is the deconvoluted spectral data. In other words, |
12:36 | it's neutral mass data, and you can see nice isotopic resolution, even for an 18-nucleotide |
12:43 | oligonucleotide here, which is going to be very important |
12:48 | for a more precise understanding of the sequence as well. On the right, we are seeing |
12:54 | the CID spectral data, and these blue L's represent the fragments that are generated, |
13:00 | the five-prime fragments that are generated due to the CID in the collision cell, and the red |
13:07 | L's indicate the three-prime fragments. So, if you look at either the |
13:12 | red or the blue by themselves, that is, the five-prime fragments or the three-prime fragments |
13:16 | alone, they don't give us complete sequence information, but if you combine these |
13:21 | two data sets, you actually get complete sequence coverage. So, we're not only able to |
13:27 | quantify the mRNA, but also are able to use the CID capabilities of this instrument to |
13:34 | get a complete sequence of this five-prime capped oligo. Going back here, |
13:42 | we estimated the uncapped oligo to be about 14 percent. Along the same lines, |
13:50 | here we're looking at the m7G-capped oligo, which is a 19-mer, and we also get |
13:57 | complete sequence coverage for this particular sequence, which gives us confidence in our |
14:02 | results. Using the MRM experiments, we estimate the amount of the |
14:09 | capped oligo to be about 85 percent. Let's move on to the sequence mapping |
14:18 | information, that is, the sequence coverage, because we need to establish the primary |
14:22 | structure identity, which gives us confidence in the sequence of the protein being |
14:27 | expressed. One thing to keep in mind when you use nucleases is that, |
14:32 | since mRNA oligos contain only four nucleotides, A, U, G, and C, |
14:39 | we can end up with multiple sequence variants that have the exact same mass, so |
14:44 | it's important to have good chromatography to be able to separate these sequence variants in |
14:49 | order to establish their identity and get good sequence coverage. Here we are looking at the |
14:56 | RNase 4-digested mRNA run on our Biozen Oligo column using an ion-pair reversed-phase |
15:02 | chromatography method, with hexafluoroisopropanol and isopropylamine as mobile phase modifiers in a |
15:09 | water-acetonitrile gradient system. And, if you're a chromatographer, you can really |
15:14 | appreciate the quality of the data we are getting here: nice, |
15:20 | sharp, well-separated peaks, and even for a complex mixture like this, you're getting a good |
15:25 | distribution of all these peaks. In the later part of the chromatogram here, |
15:30 | you see the poly(A) tail: all these little bumps, as well as |
15:36 | the big peak here, are the poly(A) tail, as we will see in the later slides. |
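The point made before this chromatogram, that digestion products built from the same four nucleotides can share an exact mass, comes down to base composition: fragments with the same counts of A, C, G, and U (and the same end groups) are isobaric regardless of base order. The following is a minimal sketch of how such groups could be flagged in a fragment list. The function names and the fragment list are hypothetical, and identical end-group chemistry is assumed for all fragments.

```python
from collections import Counter, defaultdict


def composition_key(seq):
    """Return the (A, C, G, U) base counts of a fragment; equal keys imply equal mass."""
    counts = Counter(seq.upper())
    return tuple(counts.get(base, 0) for base in "ACGU")


def group_sequence_isomers(fragments):
    """Group fragments that share a base composition but differ in base order."""
    groups = defaultdict(list)
    for fragment in fragments:
        groups[composition_key(fragment)].append(fragment)
    # Only groups with more than one member need chromatographic separation.
    return [members for members in groups.values() if len(members) > 1]


# Hypothetical digest fragments for illustration only.
fragments = ["ACCU", "CACU", "CCAU", "GAAU"]
print(group_sequence_isomers(fragments))
```

Fragments flagged this way cannot be told apart by intact mass alone, so they rely on retention time and MS/MS fragmentation for identification.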
15:40 | Now, if we subject this to data-dependent acquisition on the |
15:45 | SCIEX ZenoTOF 7600 mass spectrometer, we can get the CID spectral data, which will allow us to |
15:53 | establish the sequence of all of these. And from the present experiment using the RNase 4 |
16:00 | nuclease, we are able to get about 96% coverage. And obviously, I'm not |
16:06 | including the poly(A) tail in this 96% number. That will be discussed later separately, and that's a |
16:12 | separate set of experiments. As I mentioned before, short oligonucleotides can have slightly |
16:18 | different sequences but the exact same mass, in which case, in the mass spectrometer, |
16:23 | they're indistinguishable. There, it becomes important to |
16:28 | chromatographically separate these short sequences and then use mass spectrometry to get their sequence |
16:34 | identity. What we're seeing here is three oligonucleotides that have the exact same mass |
16:39 | but different base positions. We call them sequence isomers. And these sequence isomers |
16:47 | are nicely separated on our Biozen Oligo column, and we also have very consistent retention times |
16:53 | across replicates as well, giving us confidence in the robustness of the method. So, taking a |
16:58 | deeper look at the data shown in the previous slide, |
17:04 | we have peaks one to three that have the exact same mass and are |
17:09 | indistinguishable by mass spectrometry alone. But after separating them chromatographically, we pick |
17:14 | the doubly charged negative ions for each one of these peaks and subject them to CID |
17:20 | fragmentation, and we can get the complete sequence information for each one. Thereby, we |
17:26 | are improving the overall sequence coverage of our mRNA using this nuclease digestion, and we are able to |
17:32 | get 96% sequence coverage. Now, with that bit of data on the sequence coverage, |
17:41 | let's move on to the poly(A) tail length distribution and heterogeneity itself. Like I said, |
17:47 | the poly(A) tail is very important in extending the lifetime of the mRNA itself, and also in its |
17:55 | cellular translocation. So, it's an important attribute to measure in your drug substance as |
18:03 | well as drug product. Now, for this, we're still using our RNase 4, which cuts between this U |
18:12 | and A, generating a long sequence, and there are various degradants and varying |
18:19 | poly(A) tail lengths. Taking a closer look at the later part of the chromatogram that we saw a |
18:23 | few slides ago, you see a lot of bumps here, which we'll zoom into in the next |
18:29 | slide, that are actually coming from the different poly(A) chain lengths. And this big |
18:38 | peak has multiple poly(A) chains as well. And if you were to take a look at the MS data, it looks |
18:43 | extremely complex, but if you do a deconvolution of the spectral m/z data |
18:49 | and convert it to the mass domain, we can see nice, even spacing that corresponds to |
18:55 | an adenosine nucleotide, giving us confidence that this is the |
20:00 | poly(A) tail. Here, we're looking at the power of the chromatography itself to separate |
20:05 | all these various poly(A) tail lengths, and we can nicely resolve poly(A) |
20:12 | tail lengths of up to 61 nucleotides. But as you get to longer and longer oligos, the difference between the |
20:17 | N-1 and full-length species is small, and as a result, the separation becomes more difficult. But |
20:22 | nonetheless, we can still use mass spectrometry to deconvolute this and get additional information. |
20:28 | So, chromatographically, we are able to separate up to 61 nucleotides here. |
20:33 | And if we were to take this big peak here and perform deconvolution, we see that the |
20:39 | peak spacing is equal to that of an adenosine, telling us that this is, again, the poly(A) tail. |
20:48 | And we were able to detect a poly(A) tail length of up to 18 nucleotides in this recent study. |
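The spacing argument above can be sketched as a simple check on a list of deconvoluted masses: consecutive peaks in a poly(A) ladder should differ by the mass of one adenosine monophosphate residue, about 329.21 Da as an average mass. This is a minimal illustration, not the deconvolution software's method: the function names, the tolerance, and the example masses are assumptions made for the sketch.

```python
# Average mass of one adenosine monophosphate residue (AMP minus water), in Da.
AMP_RESIDUE_AVG_MASS = 329.21


def spacing_matches_adenosine(masses, tolerance=0.5):
    """Return True if every consecutive mass difference equals one AMP residue within tolerance."""
    ordered = sorted(masses)
    deltas = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    return bool(deltas) and all(
        abs(delta - AMP_RESIDUE_AVG_MASS) <= tolerance for delta in deltas
    )


def tail_length_difference(mass_a, mass_b):
    """Estimate how many adenosines separate two deconvoluted masses."""
    return round(abs(mass_b - mass_a) / AMP_RESIDUE_AVG_MASS)


# Hypothetical deconvoluted masses (Da) forming a four-member poly(A) ladder.
base_mass = 20000.0
ladder = [base_mass + n * AMP_RESIDUE_AVG_MASS for n in range(4)]

print(spacing_matches_adenosine(ladder))
print(tail_length_difference(ladder[0], ladder[-1]))
```

A spacing that instead matched another nucleotide residue (for example cytidine, roughly 305 Da) would fail this check, which is what makes the even adenosine spacing a useful identity test for the tail.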
20:53 | Now, let's finally take a look at an additional quality attribute, which is the aggregation of |
20:58 | mRNA itself, which, according to USP guidelines, is a product quality attribute one needs to establish in |
21:05 | your mRNA samples. Now, obviously, size exclusion chromatography is well-suited for separating |
21:13 | mRNA and its aggregates, since the dimers, trimers, and so on are going to be |
21:17 | two times, three times, and so on, the molecular weight of the monomer, so size |
21:23 | exclusion is going to be a very useful way to separate them. And for this application, |
21:28 | we have used our Biozen dSEC-7 size exclusion column, which is a 700 angstrom pore size column, |
21:34 | and it incorporates our BioTi titanium hardware, and it comes in various lengths and dimensions, |
21:42 | depending on your requirements. Here, we are looking at the EGFP mRNA separated on |
21:48 | the Biozen dSEC-7 HPLC column. And as you can see here, the first peak is coming most likely |
21:56 | from the aggregate, and the really tall peak is coming from the monomer. But how do we know this |
22:01 | is aggregate? So, if you take this sample and heat it, the levels of the aggregate |
22:06 | go down, whereas the levels of the monomer go up, suggesting that a hydrogen bonding |
22:12 | type of aggregation is happening, and heating it to 70 degrees is actually decreasing the aggregate |
22:18 | levels. It's great that we were looking at the UV data in the previous slide, |
22:24 | but what if I want additional information? You can couple your dSEC-7 column to |
22:29 | a multi-angle light scattering detector, or MALS, that can give you molecular weight |
22:35 | information as well. So, from this, we can see that the first major peak is the monomer, |
22:41 | and the larger species are the aggregates: the dimer, trimer, tetramer, and so on, |
22:49 | giving us the confidence that we are detecting the aggregation accurately. |
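The reasoning that aggregates appear at integer multiples of the monomer mass can be sketched as a small assignment step on MALS-derived molecular weights. This is a hedged illustration only: the function name, the 15% acceptance window, and the measured values are hypothetical, and the monomer mass is the approximate 294,000 Da figure quoted earlier for the EGFP mRNA.

```python
# Approximate EGFP mRNA monomer mass quoted in the talk, in Da.
MONOMER_MASS_DA = 294_000


def oligomer_state(measured_mass_da, monomer_mass_da=MONOMER_MASS_DA, max_rel_error=0.15):
    """Return the nearest integer oligomer number, or None if the mass fits no multiple well."""
    n = max(1, round(measured_mass_da / monomer_mass_da))
    expected = n * monomer_mass_da
    rel_error = abs(measured_mass_da - expected) / expected
    return n if rel_error <= max_rel_error else None


# Hypothetical MALS molecular weights (Da) for three SEC peaks.
for mass in (300_000, 580_000, 440_000):
    print(mass, "->", oligomer_state(mass))
```

A peak whose measured mass falls between two multiples, like the last value here, is left unassigned rather than forced into the nearest state.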
22:54 | In summary, I hope I was able to convince you that the Phenomenex solutions for mRNA characterization |
23:01 | and critical quality attribute determination encompassing the oligo sequence mapping, |
23:06 | 5'-cap efficiency, poly(A) tail length distribution, as well as aggregate determination |
23:13 | can be very helpful in your day-to-day work. Thank you for your time. |