Daan J. A. Crommelin, Robert D. Sindelar and Bernd Meibohm (eds.), Pharmaceutical Biotechnology: Fundamentals and Applications, 4th ed., 2013. doi:10.1007/978-1-4614-6486-0_2
© Springer Science+Business Media New York 2013

2. Biophysical and Biochemical Analysis of Recombinant Proteins

Tsutomu Arakawa (1) and John S. Philo (2)
(1)
Department of Protein Chemistry, Alliance Protein Laboratories, 3957 Corte Cancion, Thousand Oaks, CA 91360, USA
(2)
Department of Biophysical Chemistry, Alliance Protein Laboratories, 3957 Corte Cancion, Thousand Oaks, CA 91360, USA
 
 
Abstract
For a recombinant protein to become a human therapeutic, its biophysical and biochemical characteristics must be well understood. These properties serve as a basis for comparison of lot-to-lot reproducibility; for establishing the range of conditions to stabilize the protein during production, storage, and shipping; and for identifying characteristics useful for monitoring stability during long-term storage.

Introduction

A number of techniques can be used to determine the biophysical properties of proteins and to examine their biochemical and biological integrity. Where possible, the results of these experiments are compared with those obtained using naturally occurring proteins in order to be confident that the recombinant protein has the desired characteristics of the naturally occurring one.

Protein Structure

Primary Structure

Most proteins which are developed for therapy perform specific functions by interacting with other small and large molecules, e.g., cell-surface receptors, binding proteins, nucleic acids, carbohydrates, and lipids. The functional properties of proteins are derived from their folding into distinct three-dimensional structures. Each protein fold is based on its specific polypeptide sequence in which different amino acids are connected through peptide bonds in a specific way. This alignment of the 20 amino acids, called a primary sequence, has in general all the information necessary for folding into a distinct tertiary structure comprising different secondary structures such as α-helices and β-sheets (see below). Because the 20 amino acids possess different side chains, polypeptides with widely diverse properties are obtained.
All of the 20 amino acids consist of a Cα carbon to which an amino group, a carboxyl group, a hydrogen, and a side chain bind in L configuration (Fig. 2.1). These amino acids are joined by condensation to yield a peptide bond consisting of a carboxyl group of an amino acid joined with the amino group of the next amino acid (Fig. 2.2).
Figure 2.1 ■ Structure of L-amino acids.
Figure 2.2 ■ Structure of peptide bond.
The condensation gives an amide group, NH, at the N-terminal side of Cα and a carbonyl group, C=O, at the C-terminal side. These groups, as well as the amino acyl side chains, play important roles in protein folding. Due to their ability to form hydrogen bonds, they make major energetic contributions to the formation of two important secondary structures, α-helix and β-sheet. The peptide bonds between various amino acids are essentially equivalent, however, so they do not determine which part of a sequence should form an α-helix or β-sheet. Sequence-dependent secondary structure formation is determined by the side chains.
The 20 amino acids commonly found in proteins are shown in Fig. 2.3. They are described by their full names and three- and one-letter codes. Their side chains are structurally different in such a way that at neutral pH, aspartic and glutamic acid are negatively charged and lysine and arginine are positively charged. Histidine is positively charged to an extent that depends on the pH. At pH 7.0, on average, about half of the histidine side chains are positively charged. Tyrosine and cysteine are protonated and uncharged at neutral pH, but become negatively charged above pH 10 and 8, respectively.
Figure 2.3 ■ (a and b) Structure of 20 amino acids.
Polar amino acids consist of serine, threonine, asparagine, and glutamine, as well as cysteine, while nonpolar amino acids consist of alanine, valine, phenylalanine, proline, methionine, leucine, and isoleucine. Glycine behaves neutrally while cystine, the oxidized form of cysteine, is characterized as hydrophobic. Although tyrosine and tryptophan often enter into polar interactions, they are better characterized as nonpolar, or hydrophobic, as described later.
These 20 amino acids are incorporated into a unique sequence based on the genetic code, as the example in Fig. 2.4 shows. This is an amino acid sequence of granulocyte-colony-stimulating factor (G-CSF), which selectively regulates proliferation and maturation of neutrophils. Although the exact properties of this protein depend on the location of each amino acid and hence the location of each side chain in the three-dimensional structure, the average properties can be estimated simply from the amino acid composition, as shown in Table 2.1, i.e., a list of the total number of each type of amino acid contained in this protein molecule.
Figure 2.4 ■ Amino acid sequence of granulocyte-colony-stimulating factor.
Using the pKa values of these side chains and one amino and carboxyl terminus, one can calculate total charges (positive plus negative charges) and net charges (positive minus negative charges) of a protein as a function of pH, i.e., a titration curve. Since cysteine can be oxidized to form a disulfide bond or can be in a free form, accurate calculation above pH 8 requires knowledge of the status of cysteinyl residues in the protein. The titration curve thus obtained is only an approximation, since some charged residues may be buried and the effective pKa values depend on the local environment of each residue. Nevertheless, the calculated titration curve gives a first approximation of the overall charged state of a protein at a given pH and hence its solution property. Other molecular parameters, such as isoelectric point (pI, where the net charge of a protein becomes zero), molecular weight, extinction coefficient, partial specific volume, and hydrophobicity, can also be estimated from the amino acid composition, as shown in Table 2.1.
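The titration-curve calculation described above can be sketched in a few lines. The following is a minimal illustration, not the method used to produce Table 2.1: the pKa values are generic textbook-style assumptions, the group counts are taken from the G-CSF composition in Table 2.1, and the number of free cysteines (`free_cys`) is an assumed parameter. Because the assumed pKa values differ from whatever set was used for Table 2.1, the computed pI and charge at pH 7 should be expected to agree only approximately with the tabulated values.

```python
# Assumed pKa values for ionizable groups (illustrative; effective values in a
# real protein depend on the local environment of each residue).
PKA_POSITIVE = {"N_term": 8.0, "His": 6.0, "Lys": 10.5, "Arg": 12.5}
PKA_NEGATIVE = {"C_term": 3.1, "Asp": 3.9, "Glu": 4.1, "Cys": 8.3, "Tyr": 10.1}


def gcsf_groups(free_cys=1):
    """Ionizable groups of G-CSF from Table 2.1, plus one of each terminus.

    free_cys is an assumption: cysteines in disulfide bonds do not ionize.
    """
    return {"N_term": 1, "C_term": 1, "Asp": 4, "Glu": 9, "His": 5,
            "Lys": 4, "Arg": 5, "Tyr": 3, "Cys": free_cys}


def charges(ph, groups):
    """Return (positive, negative) charge magnitudes at the given pH."""
    # Basic groups carry +1 when protonated (Henderson-Hasselbalch).
    pos = sum(n / (1.0 + 10 ** (ph - PKA_POSITIVE[g]))
              for g, n in groups.items() if g in PKA_POSITIVE)
    # Acidic groups carry -1 when deprotonated.
    neg = sum(n / (1.0 + 10 ** (PKA_NEGATIVE[g] - ph))
              for g, n in groups.items() if g in PKA_NEGATIVE)
    return pos, neg


def net_charge(ph, groups):
    """Positive minus negative charges."""
    pos, neg = charges(ph, groups)
    return pos - neg


def total_charge(ph, groups):
    """Positive plus negative charges."""
    pos, neg = charges(ph, groups)
    return pos + neg


def isoelectric_point(groups, lo=0.0, hi=14.0, tol=1e-4):
    """Bisection on the net charge, which decreases monotonically with pH."""
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if net_charge(mid, groups) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


if __name__ == "__main__":
    groups = gcsf_groups()
    for ph in range(2, 13):
        print(ph, round(net_charge(ph, groups), 2), round(total_charge(ph, groups), 2))
    print("estimated pI:", round(isoelectric_point(groups), 2))
```

Changing `free_cys` shows why the status of the cysteinyl residues matters above pH 8: each additional free thiol adds up to one negative charge in that range.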
The primary structure of a protein, i.e., the sequence of the 20 amino acids, can lead to the three-dimensional structure because the amino acids have diverse physical properties. First, each type of amino acid has the tendency to be more preferentially incorporated into certain secondary structures. The frequencies with which each amino acid is found in α-helix, β-sheet, and β-turn, secondary structures that are discussed later in this chapter, can be calculated as an average over a number of proteins whose three-dimensional structures have been solved. These frequencies are listed in Table 2.2. The β-turn has a distinct configuration consisting of four sequential amino acids and there is a strong preference for specific amino acids in these four positions. For example, asparagine has an overall high frequency of occurrence in a β-turn and is most frequently observed in the first and third position of a β-turn. This characteristic of asparagine is consistent with its side chain being a potential site of N-linked glycosylation. Effects of glycosylation on the biological and physicochemical properties of proteins are extremely important. However, their contribution to structure is not readily predictable based on the amino acid composition.
Table 2.1 ■ Amino acid composition and structural parameters of granulocyte-colony-stimulating factor.
Parameter                       Value
Molecular weight                18,673
Total number of amino acids     174
1 μg                            53.5 picomoles
Molar extinction coefficient    15,820
1 A (280)                       1.18 mg/ml
Isoelectric point               5.86
Charge at pH 7                  −3.39
Amino acid    Number    % By weight    % By frequency
A Ala         19        7.23           10.92
C Cys         5         2.76           2.87
D Asp         4         2.47           2.30
E Glu         9         6.22           5.17
F Phe         6         4.73           3.45
G Gly         14        4.28           8.05
H His         5         3.67           2.87
I Ile         4         2.42           2.30
K Lys         4         2.75           2.30
L Leu         33        20.00          18.97
M Met         3         2.11           1.72
N Asn         0         0.00           0.00
P Pro         13        6.76           7.47
Q Gln         17        11.66          9.77
R Arg         5         4.18           2.87
S Ser         14        6.53           8.05
T Thr         7         3.79           4.02
V Val         7         3.71           4.02
W Trp         2         1.99           1.15
Y Tyr         3         2.62           1.72
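Several entries in Table 2.1 follow directly from the molecular weight, the extinction coefficient, and the residue counts. The short sketch below reproduces that arithmetic; it assumes the extinction coefficient is expressed in M⁻¹ cm⁻¹ and that absorbance is measured with a 1 cm path length, neither of which is stated in the table.

```python
MW = 18673.0            # g/mol, from Table 2.1
EPSILON_280 = 15820.0   # molar extinction coefficient, assumed units M^-1 cm^-1
N_RESIDUES = 174        # total number of amino acids, from Table 2.1


def picomoles_per_microgram(mw):
    """Amount of protein (pmol) contained in 1 microgram."""
    # 1 ug = 1e-6 g; moles = grams / (g/mol); 1 mol = 1e12 pmol
    return 1e-6 / mw * 1e12


def mg_per_ml_at_unit_absorbance(mw, epsilon, path_cm=1.0):
    """Concentration (mg/ml) giving an absorbance of 1 at 280 nm."""
    # Beer-Lambert law: A = epsilon * c * l, so c = 1 / (epsilon * l) mol/L
    # at A = 1; multiplying by g/mol gives g/L, which equals mg/ml.
    return mw / (epsilon * path_cm)


def percent_by_frequency(count, total=N_RESIDUES):
    """Share of residues of one type, as a percentage of all residues."""
    return 100.0 * count / total


if __name__ == "__main__":
    print(round(picomoles_per_microgram(MW), 1))
    print(round(mg_per_ml_at_unit_absorbance(MW, EPSILON_280), 2))
    print(round(percent_by_frequency(33), 2))  # leucine
```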
Table 2.2 ■ Frequency of occurrence of 20 amino acids in α-helix, β-sheet, and β-turn.
α-Helix      β-Sheet      β-Turn       β-Turn position 1    β-Turn position 2    β-Turn position 3    β-Turn position 4
Glu  1.51    Val  1.70    Asn  1.56    Asn  0.161           Pro  0.301           Asn  0.191           Trp  0.167
Met  1.45    Ile  1.60    Gly  1.56    Cys  0.149           Ser  0.139           Gly  0.190           Gly  0.152
Ala  1.42    Tyr  1.47    Pro  1.52    Asp  0.147           Lys  0.115           Asp  0.179           Cys  0.128
Leu  1.21    Phe  1.38    Asp  1.46    His  0.140           Asp  0.110           Ser  0.125           Tyr  0.125
Lys  1.16    Trp  1.37    Ser  1.43    Ser  0.120           Thr  0.108           Cys  0.117           Ser  0.106
Phe  1.13    Leu  1.30    Cys  1.19    Pro  0.102           Arg  0.106           Tyr  0.114           Gln  0.098
Gln  1.11    Cys  1.19    Tyr  1.14    Gly  0.102           Gln  0.098           Arg  0.099           Lys  0.095
Trp  1.08    Thr  1.19    Lys  1.01    Thr  0.086           Gly  0.085           His  0.093           Asn  0.091
Ile  1.08    Gln  1.10    Gln  0.98    Tyr  0.082           Asn  0.083           Glu  0.077           Arg  0.085
Val  1.06    Met  1.05    Thr  0.96    Trp  0.077           Met  0.082           Lys  0.072           Asp  0.081
Asp  1.01    Arg  0.93    Trp  0.96    Gln  0.074           Ala  0.076           Tyr  0.065           Thr  0.079
His  1.00    Asn  0.89    Arg  0.95    Arg  0.070           Tyr  0.065           Phe  0.065           Leu  0.070
Arg  0.98    His  0.87    His  0.95    Met  0.068           Glu  0.060           Trp  0.064           Pro  0.068
Thr  0.83    Ala  0.83    Glu  0.74    Val  0.062           Cys  0.053           Gln  0.037           Phe  0.065
Ser  0.77    Ser  0.75    Ala  0.66    Leu  0.061           Val  0.048           Leu  0.036           Glu  0.064
Cys  0.70    Gly  0.75    Met  0.60    Ala  0.060           His  0.047           Ala  0.035           Ala  0.058
Tyr  0.69    Lys  0.74    Phe  0.60    Phe  0.059           Phe  0.041           Pro  0.034           Ile  0.056
Asn  0.67    Pro  0.55    Leu  0.59    Glu  0.056           Ile  0.034           Val  0.028           Met  0.055
Pro  0.57    Asp  0.54    Val  0.50    Lys  0.055           Leu  0.025           Met  0.014           His  0.054
Gly  0.57    Glu  0.37    Ile  0.47    Ile  0.043           Trp  0.013           Ile  0.013           Val  0.053
Taken and edited from Chou PY, Fasman GD (1978) Empirical predictions of protein conformation. Ann Rev Biochem 47: 251–276 with permission from Annual Reviews, Inc.
Based on these frequencies, one can predict which type of secondary structure a particular polypeptide segment is likely to form. A number of methods have been developed to predict the secondary structure from the primary sequence of a protein, as illustrated in Fig. 2.5a. Using G-CSF (Fig. 2.5b) as an example, regions of α-helix, β-sheets, turns, hydrophilicity, and antigen sites can be suggested.
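To make the idea of frequency-based prediction concrete, the sketch below averages the α-helix and β-sheet frequencies of Table 2.2 over a sliding window and labels each window by the larger average. This is a deliberately simplified illustration and not the full published prediction procedure or the DNASTAR algorithm: the window length of 6, the threshold of 1.0, and the example sequences are all assumptions chosen for illustration.

```python
# Frequencies from Table 2.2, keyed by one-letter amino acid code.
HELIX = {"E": 1.51, "M": 1.45, "A": 1.42, "L": 1.21, "K": 1.16, "F": 1.13,
         "Q": 1.11, "W": 1.08, "I": 1.08, "V": 1.06, "D": 1.01, "H": 1.00,
         "R": 0.98, "T": 0.83, "S": 0.77, "C": 0.70, "Y": 0.69, "N": 0.67,
         "P": 0.57, "G": 0.57}
SHEET = {"V": 1.70, "I": 1.60, "Y": 1.47, "F": 1.38, "W": 1.37, "L": 1.30,
         "C": 1.19, "T": 1.19, "Q": 1.10, "M": 1.05, "R": 0.93, "N": 0.89,
         "H": 0.87, "A": 0.83, "S": 0.75, "G": 0.75, "K": 0.74, "P": 0.55,
         "D": 0.54, "E": 0.37}


def window_means(seq, table, window=6):
    """Mean frequency for every window of consecutive residues."""
    return [sum(table[aa] for aa in seq[i:i + window]) / window
            for i in range(len(seq) - window + 1)]


def label_windows(seq, window=6, threshold=1.0):
    """Label each window 'H' (helix-like), 'E' (sheet-like), or '-'.

    The threshold and the tie-breaking rule are illustrative assumptions.
    """
    labels = []
    for helix, sheet in zip(window_means(seq, HELIX, window),
                            window_means(seq, SHEET, window)):
        if helix > threshold and helix >= sheet:
            labels.append("H")
        elif sheet > threshold:
            labels.append("E")
        else:
            labels.append("-")
    return "".join(labels)


if __name__ == "__main__":
    # Hypothetical test sequences, not taken from any protein in this chapter.
    for seq in ("EAMLKEAMLK", "VIYVIYVIY", "GPGPGPGP"):
        print(seq, label_windows(seq))
```

Segments rich in Glu, Ala, Met, and Leu score as helix-like, segments rich in Val, Ile, and Tyr score as sheet-like, and Gly/Pro-rich segments score as neither, in line with the frequencies listed in Table 2.2.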
Figure 2.5 ■ (a) Predicted secondary structure of granulocyte-colony-stimulating factor. Obtained using a program “DNASTAR” (DNASTAR Inc., Madison, WI). (b) Secondary structure of Filgrastim (recombinant G-CSF). Filgrastim is a 175-amino acid polypeptide. Its four antiparallel alpha helices (A, B, C, and D) and short 3-to-10 type helix (3₁₀) form a helical bundle. The two biologically active sites (α and αL) are remote from modifications at the N-terminus of the α-helix and the sugar chain attached to loops C–D. Note: Filgrastim is not glycosylated; the sugar chain is included to illustrate its location in endogenous G-CSF.
Another property of amino acids that affects protein folding is the hydrophobicity of their side chains. Although nonpolar amino acids are basically hydrophobic, it is important to know how hydrophobic they are. This property has been determined by measuring the partition coefficient or solubility of amino acids in water and organic solvents and normalizing such parameters relative to glycine. Relative to the side chain of glycine, a single hydrogen, such normalization shows how strongly the side chains of nonpolar amino acids prefer the organic phase to the aqueous phase. A representation of such measurements is shown in Table 2.3. The values indicate that the free energy increases as the side chains of tryptophan and tyrosine are transferred from an organic solvent to water and that such transfer is thermodynamically unfavorable. Although it is unclear how comparable the hydrophobic property is between an organic solvent and the interior of protein molecules, the hydrophobic side chains favor clustering together, resulting in a core structure with properties similar to an organic solvent. These hydrophobic characteristics of nonpolar amino acids and hydrophilic characteristics of polar amino acids generate a partition of amino acyl residues into a hydrophobic core and hydrophilic surface, resulting in overall folding.
Table 2.3 ■ Hydrophobicity scale: transfer free energies of amino acid side chains from organic solvent to water.
Amino acid side chain      Cal/mol
Tryptophan                 3,400
Norleucine                 2,600
Phenylalanine              2,500
Tyrosine                   2,300
Dihydroxyphenylalanine     1,800
Leucine                    1,800
Valine                     1,500
Methionine                 1,300
Histidine                  500
Alanine                    500
Threonine                  400
Serine                     −300
Taken from Nozaki Y, Tanford C (1971) The solubility of amino acids and two glycine peptides in aqueous ethanol and dioxane solutions. Establishment of a hydrophobicity scale. J Biol Chem 246:2211–2217 with permission from American Society of Biological Chemists
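One way to get a feel for the magnitudes in Table 2.3 is to convert each transfer free energy into the corresponding equilibrium preference for the organic phase, using K = exp(ΔG/RT). The sketch below does this for a few side chains. It assumes a temperature of 25 °C and ideal dilute behavior, and it treats the result only as a rough relative preference (normalized to glycine, as in the table), not as a measured partition coefficient.

```python
import math

R_CAL = 1.987      # gas constant in cal/(mol*K)
T_KELVIN = 298.15  # assumed temperature (25 degrees C)

# Transfer free energies (organic solvent -> water) from Table 2.3, cal/mol.
TRANSFER_CAL_PER_MOL = {
    "Tryptophan": 3400, "Phenylalanine": 2500, "Tyrosine": 2300,
    "Leucine": 1800, "Valine": 1500, "Alanine": 500, "Serine": -300,
}


def organic_phase_preference(delta_g_cal, temperature=T_KELVIN):
    """Relative preference for the organic phase, K = exp(dG / RT).

    K > 1 means the side chain favors the organic phase over water.
    """
    return math.exp(delta_g_cal / (R_CAL * temperature))


def cal_to_kj(delta_g_cal):
    """Convert cal/mol to kJ/mol (1 cal = 4.184 J)."""
    return delta_g_cal * 4.184 / 1000.0


if __name__ == "__main__":
    for name, dg in TRANSFER_CAL_PER_MOL.items():
        print(name, round(cal_to_kj(dg), 2), round(organic_phase_preference(dg), 1))
```

Under these assumptions a transfer free energy of a few thousand cal/mol corresponds to a preference for the organic phase of two orders of magnitude, whereas the negative value for serine corresponds to a preference for water.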

Secondary Structure

α-Helix

Immediately evident in the primary structure of a protein is that each amino acid is linked by a peptide bond. The amide, NH, is a hydrogen donor and the carbonyl, C=O, is a hydrogen acceptor, and they can form a stable hydrogen bond when they are positioned in an appropriate configuration of the polypeptide chain. Such structures of the polypeptide chain are called secondary structure. Two main structures, α-helix and β-sheet, accommodate such stable hydrogen bonds. In an α-helix, the main chain forms a right-handed helix, because only the L-form of amino acids occurs in proteins, and makes one turn per 3.6 residues. The overall length of α-helices can vary widely. Figure 2.6 shows an example of a short α-helix. In this case, the C=O group of residue 1 forms a hydrogen bond with the NH group of residue 5, and the C=O group of residue 2 forms a hydrogen bond with the NH group of residue 6. Thus, at the start of an α-helix, four amide groups are always free, and at the end of an α-helix, four carbonyl groups are also free. As a result, both ends of an α-helix are highly polar.
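The hydrogen-bonding pattern just described, in which the C=O of residue i pairs with the NH of residue i + 4, can be enumerated directly. The sketch below lists the pairs for an idealized helix of a given length and reports which backbone groups remain unpaired; the function names are illustrative, and the helix is assumed to be perfectly regular from its first to its last residue.

```python
def helix_hbonds(n_residues):
    """Backbone hydrogen bonds in an idealized alpha-helix.

    Returns (i, i + 4) pairs, numbered from 1: the C=O of residue i accepts
    a hydrogen bond from the NH of residue i + 4.
    """
    return [(i, i + 4) for i in range(1, n_residues - 3)]


def unpaired_groups(n_residues):
    """Residues whose NH (donor) or C=O (acceptor) has no helical partner."""
    bonds = helix_hbonds(n_residues)
    acceptors = {co for co, _ in bonds}
    donors = {nh for _, nh in bonds}
    residues = range(1, n_residues + 1)
    free_nh = [r for r in residues if r not in donors]
    free_co = [r for r in residues if r not in acceptors]
    return free_nh, free_co


if __name__ == "__main__":
    print(helix_hbonds(12))
    print(unpaired_groups(12))
```

For any helix of at least four residues, the first four NH groups and the last four C=O groups are left without a partner, which is why both ends of the helix are polar.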
Figure 2.6 ■ Schematic illustration of the structure of α-helix.
Moreover, all the hydrogen bonds are aligned along the helical axis. Since both peptide NH and C=O groups have electric dipole moments pointing in the same direction, they add up to a substantial dipole moment along the entire α-helix, with the negative partial charge at the C-terminal side and the positive partial charge at the N-terminal side.
The side chains project outward from the α-helix. This projection means that all the side chains surround the outer surface of an α-helix and interact both with each other and with side chains of other regions which come in contact with these side chains. These interactions, so-called long-range interactions, can stabilize the α-helical structure and help it to act as a folding unit. Often an α-helix serves as a building block for the three-dimensional structure of globular proteins by bringing hydrophobic side chains to one side of a helix and hydrophilic side chains to the opposite side of the same helix. Distribution of side chains along the α-helical axis can be viewed using the helical wheel. Since one turn in an α-helix is 3.6 residues long, each residue can be plotted every 360°/3.6 = 100° around a circle (viewed from the top of α-helix), as shown in Fig. 2.7. Such a plot shows the projection of the position of the residues onto a plane perpendicular to the helical axis. One of the predicted helices in erythropoietin is shown in Fig. 2.7, using an open circle for hydrophobic side chains and an open rectangle for hydrophilic side chains. It becomes immediately obvious that one side of the α-helix is highly hydrophobic, suggesting that this side forms an internal core, while the other side is relatively hydrophilic and is hence most likely exposed to the surface. Since many biologically important proteins function by interacting with other macromolecules, the information obtained from the helical wheel is extremely useful. For example, mutations of amino acids in the solvent-exposed side may lead to identification of regions responsible for biological activity, while mutations in the internal core may lead to altered protein stability.
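The helical-wheel construction can be reproduced numerically by placing residue i at an angle of i × 100° and asking how tightly the hydrophobic residues cluster on one face. The sketch below is a simplified illustration: the clustering score is the length of the mean unit vector of the hydrophobic positions (0 for an even spread, 1 for perfect alignment), which is an assumed stand-in for more elaborate measures, and the two sequences are hypothetical, not the erythropoietin segment of Fig. 2.7.

```python
import math

DEGREES_PER_RESIDUE = 100  # 360 degrees / 3.6 residues per turn

# Residues treated as hydrophobic, following the classification in the text
# (nonpolar residues plus tyrosine and tryptophan).
HYDROPHOBIC = set("AVFPMLIWY")


def wheel_angles(seq):
    """Angular position (degrees) of each residue on the helical wheel."""
    return [(i * DEGREES_PER_RESIDUE) % 360 for i in range(len(seq))]


def hydrophobic_face_score(seq):
    """Length of the mean unit vector of the hydrophobic residue positions.

    Values near 1 indicate a single hydrophobic face; values near 0 indicate
    hydrophobic residues spread evenly around the helix.
    """
    angles = [math.radians(a) for a, aa in zip(wheel_angles(seq), seq)
              if aa in HYDROPHOBIC]
    if not angles:
        return 0.0
    x = sum(math.cos(a) for a in angles)
    y = sum(math.sin(a) for a in angles)
    return math.hypot(x, y) / len(angles)


if __name__ == "__main__":
    amphipathic = "LKKLLKKLLKKLKKLLKK"  # hypothetical sequence
    alternating = "LKLKLKLKLKLKLKLKLK"  # hypothetical sequence
    print(wheel_angles(amphipathic))
    print(round(hydrophobic_face_score(amphipathic), 2))
    print(round(hydrophobic_face_score(alternating), 2))
```

Because 18 residues correspond to exactly five turns, residue 19 falls on the same wheel position as residue 1, which is why helical wheels are commonly drawn for stretches of 18 residues such as the one in Fig. 2.7.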
Figure 2.7 ■ Helical wheel analysis of erythropoietin sequence, from His94 to Ala111 (Elliott S, personal communication, 1990).

β-Sheet

The second major secondary structural element found in proteins is the β-sheet. In contrast to the α-helix, which is built up from a continuous region with a peptide hydrogen bond linking every fourth amino acid, the β-sheet is formed by peptide hydrogen bonds between different regions of the polypeptide that may be far apart in sequence. β-strands can interact with each other in one of the two ways shown in Fig. 2.8, i.e., either parallel or antiparallel. In a parallel β-sheet, each strand is oriented in the same direction with peptide hydrogen bonds formed between the strands, while in antiparallel β-sheets, the polypeptide sequences are oriented in the opposite direction. In both structures, the C=O and NH groups project into opposite sides of the polypeptide chain, and hence, a β-strand can interact from either side of that particular chain to form peptide hydrogen bonds with adjacent strands. Thus, more than two β-strands can contact each other either in a parallel or in an antiparallel manner, or even in combination. Such clustering can result in all the β-strands lying in a plane as a sheet. The β-strands which are at the edges of the sheet have unpaired alternating C=O and NH groups.
Figure 2.8 ■ Schematic illustration of the structure of antiparallel (left side) and parallel (right side) β-sheet. Arrow indicates the direction of amino acid sequence from the N-terminus to C-terminus.
Side chains project perpendicularly to this plane in opposite directions and can interact with other side chains within the same β-sheet or with other regions of the molecule, or may be exposed to the solvent.
In almost all known protein structures, β-strands have a right-handed twist. This way, the β-strands adapt into widely different conformations. Depending on how they are twisted, not all the side chains in the same strand or in different strands necessarily project in the same direction.

Loops and Turns

Loops and turns form more or less linear structures and interact with each other to form a folded three-dimensional structure. They are comprised of an amino acid sequence which is usually hydrophilic and exposed to the solvent. These regions consist of β-turns (reverse turns), short hairpin loops, and long loops. Many hairpin loops are formed to connect two antiparallel β-strands.
As shown in Fig. 2.5a, the amino acid sequences which form β-turns are relatively easy to predict, since turns must be present periodically to fold a linear sequence into a globular structure. Amino acids found most frequently in the β-turn are usually not found in α-helical or β-sheet structures. Thus, proline and glycine represent the least-observed amino acids in these typical secondary structures. However, proline has an extremely high frequency of occurrence at the second position in the β-turn, while glycine has a high preference at the third and fourth position of a β-turn.
Although loops are not as predictable as β-turns, amino acids with high frequency for β-turns also can form a long loop. Even though difficult to predict, loops are an important secondary structure, since they form a highly solvent-exposed region of the protein molecules and allow the protein to fold onto itself.

Tertiary Structure

Combination of the various secondary structures in a protein results in its three-dimensional structure. Many proteins fold into a fairly compact, globular structure.
The folding of a protein molecule into a distinct three-dimensional structure determines its function. Enzyme activity requires the exact coordination of catalytically important residues in the three-dimensional space. Binding of antibody to antigen and binding of growth factors and cytokines to their receptors all require a distinct, specific surface for high-affinity binding. These interactions do not occur if the tertiary structures of antibodies, growth factors, and cytokines are altered.
A unique tertiary structure of a protein can often result in the assembly of the protein into a distinct quaternary structure consisting of a fixed stoichiometry of protein chains within the complex. Assembly can occur between the same proteins or between different polypeptide chains. Each molecule in the complex is called a subunit. Actin and tubulin self-associate into F-actin and microtubules, respectively, while hemoglobin is a tetramer consisting of two α- and two β-subunits. Among the cytokines and growth factors, interferon-γ is a homodimer, while platelet-derived growth factor is a homodimer of either A or B chains or a heterodimer of the A and B chain. The formation of a quaternary structure occurs via non-covalent interactions or through disulfide bonds between the subunits.

Forces

Interactions occurring between chemical groups in proteins are responsible for formation of their specific secondary, tertiary, and quaternary structures. Either repulsive or attractive interactions can occur between different groups. Repulsive interactions consist of steric hindrance and electrostatic effects. Like charges repel each other and bulky side chains, although they do not repel each other, cannot occupy the same space. Folding is also against the natural tendency to move toward randomness, i.e., increasing entropy. Folding leads to a fixed position of each atom and hence a decrease in entropy. For folding to occur, this decrease in entropy, as well as the repulsive interactions, must be overcome by attractive interactions, i.e., hydrophobic interactions, hydrogen bonds, electrostatic attraction, and van der Waals interactions. Hydration of proteins, discussed in the next section, also plays an important role in protein folding.
These interactions are all relatively weak and can be easily broken and formed. Hence, each folded protein structure arises from a fine balance between these repulsive and attractive interactions. The stability of the folded structure is a fundamental concern in developing protein therapeutics.

Hydrophobic Interactions

The hydrophobic interaction reflects both a summation of the van der Waals attractive forces among nonpolar groups in the protein interior and the change in the surrounding water structure that would be necessary to accommodate these groups if they became exposed. The transfer of nonpolar groups from the interior to the surface requires a large decrease in entropy, so that hydrophobic interactions are essentially entropically driven. The resulting large positive free energy change prevents the transfer of nonpolar groups from the largely sheltered interior to the more solvent-exposed exterior of the protein molecule. Thus, nonpolar groups preferentially reside in the protein interior, while the more polar groups are exposed to the surface and surrounding environment. The partitioning of different amino acyl residues between the inside and outside of a protein correlates well with the hydration energy of their side chains, that is, their relative affinity for water.

Hydrogen Bonds

The hydrogen bond is ionic in character since it depends strongly on the sharing of a proton between two electronegative atoms (generally oxygen and nitrogen atoms). Hydrogen bonds may form either between a protein atom and a water molecule or exclusively as protein intramolecular hydrogen bonds. Intramolecular interactions can have significantly more favorable free energies (because of entropic considerations) than intermolecular hydrogen bonds, so the contribution of all hydrogen bonds in the protein molecule to the stability of protein structures can be substantial. In addition, when the hydrogen bonds occur in the interior of protein molecules, the bonds become stronger due to the hydrophobic environment.

Electrostatic Interactions

Electrostatic interactions occur between any two charged groups. According to Coulomb’s law, if the charges are of the same sign, the interaction is repulsive with an increase in energy, but if they are opposite in sign, it is attractive, with a lowering of energy. Electrostatic interactions are strongly dependent upon distance, according to Coulomb’s law, and inversely related to the dielectric constant of the medium. Electrostatic interactions are much stronger in the interior of the protein molecule because of a lower dielectric constant. The numerous charged groups present on protein molecules can provide overall stability by the electrostatic attraction of opposite charges, for example, between negatively charged carboxyl groups and positively charged amino groups. However, the net effects of all possible pairs of charged groups must be considered. Thus, the free energy derived from electrostatic interactions is actually a property of the whole structure, not just of any single amino acid residue or cluster.
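The dependence on distance and dielectric constant can be illustrated with a direct evaluation of Coulomb's law for two point charges. In the sketch below, the conversion constant of about 332 kcal·Å/(mol·e²) is the standard value for charges in units of the elementary charge and distances in ångströms; the dielectric constants used for water (80) and for the protein interior (4) and the 4 Å separation are illustrative assumptions, since the effective dielectric constant inside a protein is not a single well-defined number.

```python
COULOMB_KCAL_ANGSTROM = 332.06  # kcal*angstrom/(mol*e^2)


def coulomb_energy(q1, q2, distance_angstrom, dielectric):
    """Interaction energy (kcal/mol) between two point charges.

    q1 and q2 are in units of the elementary charge. Negative values are
    attractive, positive values are repulsive.
    """
    return COULOMB_KCAL_ANGSTROM * q1 * q2 / (dielectric * distance_angstrom)


if __name__ == "__main__":
    # A carboxylate/ammonium-like pair (-1 and +1) at an assumed 4 angstroms.
    in_water = coulomb_energy(+1, -1, 4.0, dielectric=80.0)
    in_interior = coulomb_energy(+1, -1, 4.0, dielectric=4.0)
    print(round(in_water, 2), round(in_interior, 2))
    # Like charges at the same distance repel.
    print(round(coulomb_energy(+1, +1, 4.0, dielectric=80.0), 2))
```

With these assumed values the same ion pair interacts twenty times more strongly in the low-dielectric environment than in water, which is the point made above about the protein interior.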

Van der Waals Interactions

Weak van der Waals interactions exist between atoms (except the bare proton), whether they are polar or nonpolar. They arise from net attractive interactions between permanent dipoles and/or induced (temporary and fluctuating) dipoles. However, when two atoms approach each other too closely, the repulsion between their electron clouds becomes strong and counterbalances the attractive forces.

Hydration

Water molecules are bound to proteins internally and externally. Some water molecules occasionally occupy small internal cavities in the protein structure and are hydrogen bonded to peptide bonds and side chains of the protein and often to a prosthetic group, or cofactor, within the protein. The protein surface is large and consists of a mosaic of polar and nonpolar amino acids, and it binds a large number of water molecules, i.e., it is hydrated, from the surrounding environment. As described in the previous section, water molecules trapped in the interior of protein molecules are bound more tightly to hydrogen-bonding donors and acceptors because of a lower dielectric constant.
Solvent around the protein surface clearly has a general role in hydrating peptide and side chains but might be expected to be rather mobile and nonspecific in its interactions. Well-ordered water molecules can make significant contributions to protein stability. One water molecule can hydrogen bond to two groups distant in the primary structure on a protein molecule, acting as a bridge between these groups. Such a water molecule may be highly restricted in motion and can contribute to the stability, at least locally, of the protein, since such tight binding may exist only when these groups assume the proper configuration to accommodate a water molecule that is present only in the native state of the protein. Such hydration can also decrease the flexibility of the groups involved.
There is also evidence for solvation over hydrophobic groups on the protein surface. So-called hydrophobic hydration occurs because of the unfavorable nature of the interaction between water molecules and hydrophobic surfaces, resulting in the clustering of water molecules. Since this clustering is energetically unfavorable, such hydrophobic hydration does not contribute to the protein stability. However, this hydrophobic hydration facilitates hydrophobic interaction. This unfavorable hydration is diminished as the various hydrophobic groups come in contact either intramolecularly or intermolecularly, leading to the folding of intrachain structures or to protein-protein interactions.
Both the loosely and strongly bound water molecules can have an important impact, not only on protein stability but also on protein function. For example, certain enzymes function in nonaqueous solvent provided that a small amount of water, just enough to cover the protein surface, is present. Bound water can modulate the dynamics of surface groups, and such dynamics may be critical for enzyme function. Dried enzymes are, in general, inactive and become active after they absorb 0.2 g water per g protein. This amount of water is only sufficient to cover surface polar groups, yet may give sufficient flexibility for function.
Evidence that water bound to protein molecules has a different property from bulk water can be demonstrated by the presence of non-freezable water. Thus, when a protein solution is cooled below −40 °C, a fraction of water, ~0.3 g water/g protein, does not freeze and can be detected by high-resolution NMR. Several other techniques also detect a similar amount of bound water. This unfreezable water reflects the unique property of bound water that prevents it from adopting an ice structure.
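The hydration levels quoted above are easier to picture when expressed as water molecules per protein molecule. The sketch below performs that conversion; the use of G-CSF (molecular weight from Table 2.1) as the example protein is an assumption made only for illustration, since the text does not tie these hydration values to a particular protein.

```python
WATER_MW = 18.015  # g/mol


def waters_per_protein(g_water_per_g_protein, protein_mw):
    """Number of water molecules per protein molecule at a given hydration."""
    # grams of water bound per mole of protein, divided by g/mol of water
    return g_water_per_g_protein * protein_mw / WATER_MW


if __name__ == "__main__":
    gcsf_mw = 18673.0  # from Table 2.1, used here as an illustrative example
    print(round(waters_per_protein(0.2, gcsf_mw)))  # hydration restoring enzyme activity
    print(round(waters_per_protein(0.3, gcsf_mw)))  # non-freezable water
```

For a protein of this size, both hydration levels correspond to a few hundred water molecules per protein molecule.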
Under physiological conditions or in test tubes, proteins are immersed in aqueous solutions containing not only water but also other solution components, e.g., salts, metals, amino acids, sugars, and many other minor components. These components also interact with the protein surface and affect protein folding and stability. For example, sugars and amino acids are known to enhance folding and stability of proteins, as described below.

Protein Folding

Proteins become functional only when they assume a distinct tertiary structure. Many physiologically and therapeutically important proteins present their surface for recognition by interacting with molecules such as substrates, receptors, signaling proteins, and cell-surface adhesion macromolecules. When recombinant proteins are produced in Escherichia coli, they often form inclusion bodies into which they are deposited as insoluble proteins. Formation of such insoluble states does not naturally occur in cells where they are normally synthesized and transported. Therefore, an in vitro process is required to refold insoluble recombinant proteins into their native, physiologically active state. This is usually accomplished by solubilizing the insoluble proteins with detergents or denaturants, followed by the purification and removal of these reagents concurrent with refolding the proteins (see Chap. 3).
Unfolded states of proteins are usually highly stable and soluble in the presence of denaturing agents. Once the proteins are folded correctly, they are also relatively stable. During the transition from the unfolded form to the native state, the protein must go through a multitude of other transition states in which it is not fully folded, and denaturants or solubilizing agents are at low concentrations or even absent.
The refolding of proteins can be achieved in various ways. The dilution of proteins at high denaturant concentration into aqueous buffer will decrease both denaturant and protein concentration simultaneously. The addition of an aqueous buffer to a protein-denaturant solution also causes a decrease in concentrations of both denaturant and protein. The difference in these procedures is that, in the first case, both denaturant and protein concentrations are the lowest at the beginning of dilution and gradually increase as the process continues. In the second case, both denaturant and protein concentrations are highest at the beginning of dilution and gradually decrease as the dilution proceeds. Dialysis or the diafiltration of protein in the denaturant against an aqueous buffer resembles the second case, since the denaturant concentration decreases as the procedure continues. In this case, however, the protein concentration remains unchanged. Refolding can also be achieved by first binding the protein in denaturants to a solid phase, i.e., to a column matrix, and then equilibrating it with an aqueous buffer. In this case, protein concentrations are not well defined. Each procedure has advantages and disadvantages and may be applicable for one protein, but not to another.
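The difference between the two dilution procedures can be seen by following the concentration in the refolding vessel as material is added. The sketch below assumes ideal mixing and additive volumes, and the stock concentration and volumes are hypothetical numbers chosen for illustration. Since denaturant and protein are diluted by the same factor, the same functions apply to either component.

```python
def dilute_into_buffer(stock_conc, buffer_volume, added_stock_volume):
    """Case 1: protein-denaturant stock is added to a vessel of buffer.

    Concentration in the vessel starts at zero and rises as stock is added.
    """
    return stock_conc * added_stock_volume / (buffer_volume + added_stock_volume)


def add_buffer_to_stock(stock_conc, stock_volume, added_buffer_volume):
    """Case 2: buffer is added to a vessel of protein-denaturant stock.

    Concentration in the vessel starts at the stock value and falls.
    """
    return stock_conc * stock_volume / (stock_volume + added_buffer_volume)


if __name__ == "__main__":
    stock = 6.0  # hypothetical denaturant concentration (M) in the stock
    # Ten equal steps toward the same end point: 1 ml stock plus 9 ml buffer.
    for step in range(11):
        fraction = step / 10.0
        case1 = dilute_into_buffer(stock, 9.0, 1.0 * fraction)
        case2 = add_buffer_to_stock(stock, 1.0, 9.0 * fraction)
        print(step, round(case1, 3), round(case2, 3))
```

Both procedures end at the same final composition, but the protein passes through very different intermediate conditions on the way. Dialysis or diafiltration would correspond to the falling denaturant profile of the second case combined with a constant protein concentration.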
If proteins in the native state have disulfide bonds, cysteines must be correctly oxidized. Such oxidation may be done in various ways, e.g., air oxidation, glutathione-catalyzed disulfide exchange, or mixed-disulfide formation followed by reduction and oxidation or by disulfide reshuffling.
Protein folding has been a topic of intensive research since Anfinsen’s demonstration that ribonuclease can be refolded in vitro from the fully reduced and denatured state. This folding can be achieved only if the amino acid sequence itself contains all information necessary for folding into the native structure. This is the case, at least partially, for many proteins. However, many other proteins do not refold in a simple one-step process. Rather, they refold via various intermediates which are relatively compact and possess varying degrees of secondary structure, but which lack a rigid tertiary structure. Intrachain interactions of these preformed secondary structures eventually lead to the native state. However, the absence of a rigid structure in these preformed secondary structures can also expose a cluster of hydrophobic groups to those of other polypeptide chains, rather than to their own polypeptide segments, resulting in intermolecular aggregation. High efficiency in the recovery of native protein depends to a large extent on how well this aggregation of intermediate forms is minimized. The use of chaperones or polyethylene glycol has been found quite effective for this purpose. Chaperones are proteins that aid in the proper folding of other proteins by stabilizing intermediates in the folding process, whereas polyethylene glycol serves to solvate the protein during folding and diminishes interchain aggregation events.
Protein folding is often facilitated by cosolvents, such as polyethylene glycol. As described above, proteins are functional and highly hydrated in aqueous solutions. True physiological solutions, however, contain not only water but also various ions and low- and high-molecular-weight solutes, often at very high concentrations. These ions and other solutes play a critical role in maintaining the functional structure of the proteins. When isolated from their natural environment, the protein molecules may lose these stabilizing factors and hence must be stabilized by certain compounds, often at high concentrations. These solutes are also used in vitro to assist in protein folding and to help stabilize proteins during large-scale purification and production as well as for long-term storage. Such solutes are often called cosolvents when used at high concentrations, since at such high concentrations they also serve as a solvent along with water molecules. These solutes encompass sugars, amino acids, inorganic and organic salts, and polyols. They may not strongly bind to proteins, but instead typically interact weakly with the protein surface to provide significant stabilizing energy without interfering with their functional structure.
When recombinant proteins are expressed in eukaryotic cells and secreted into media, the proteins are generally folded into the native conformation. If the proteins have sites for N-linked or O-linked glycosylation, they undergo varying degrees of glycosylation depending on the host cells used and level of expression. For many glycoproteins, glycosylation is not essential for folding, since they can be refolded into the native conformation without carbohydrates, nor is glycosylation often necessary for receptor binding and hence biological activity. However, glycosylation can alter important biological and physicochemical properties of proteins, such as pharmacokinetics, solubility, and stability.

Techniques Specifically Suitable for Characterizing Protein Folding

Conventional spectroscopic techniques used to obtain information on the folded structure of proteins are circular dichroism (CD), fluorescence, and Fourier transform infrared spectroscopies (FTIR). CD and FTIR are widely used to estimate the secondary structure of proteins. The α-helical content of a protein can be readily estimated by CD in the far-UV region (180–260 nm) and by FTIR. FTIR signals from loop structures, however, occasionally overlap with those arising from an α-helix. The β-sheet gives weak CD signals, which are variable in peak positions and intensities due to twists of interacting β-strands, making far-UV CD unreliable for evaluation of these structures. On the other hand, FTIR can reliably estimate the β-structure content as well as distinguish between parallel and antiparallel forms.
CD in the near-UV region (250–340 nm) reflects the environment of aromatic amino acids, i.e., tryptophan, tyrosine, and phenylalanine, as well as that of disulfide structures. Fluorescence spectroscopy yields information on the environment of tyrosine and tryptophan residues. CD and fluorescence signals in many cases are drastically altered upon refolding and hence can be used to follow the formation of the tertiary structure of a protein.
None of these techniques can give the folded structure at the atomic level, i.e., they give no information on the exact location of each amino acyl residue in the three-dimensional structure of the protein. This information can only be determined by X-ray crystallography or NMR. However, CD, FTIR, and fluorescence spectroscopic methods are fast, require lower protein concentrations than either NMR or X-ray crystallography, and are amenable to the examination of the protein under widely different conditions. When a naturally occurring form of the protein is available, these techniques, in particular near-UV CD and fluorescence spectroscopies, can quickly address whether the refolded protein assumes the native folded structure.
Temperature dependence of these spectroscopic properties also provides information about protein folding. Since the folded structures of proteins are built upon cooperative interactions of many side chains and peptide bonds in a protein molecule, elimination of one interaction by heat can cause cooperative elimination of other interactions, leading to the unfolding of protein molecules. Thus, many proteins undergo a cooperative thermal transition over a narrow temperature range. Conversely, if the proteins are not fully folded, they may undergo noncooperative thermal transitions as observed by a gradual signal change over a wider range of temperature.
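The contrast between a sharp, cooperative transition and a gradual one can be illustrated with the standard two-state model. The sketch below is a simplification under stated assumptions: unfolding is treated as a two-state equilibrium, the free energy of unfolding is approximated as ΔH(1 − T/Tm) with the heat capacity change neglected, and a small unfolding enthalpy is used as a stand-in for a poorly cooperative transition. The function names, the melting temperature, and the enthalpy values are hypothetical.

```python
import math

R = 8.314  # gas constant, J/(mol K)


def fraction_unfolded(temp_k, tm_k, delta_h_j_mol):
    """Two-state fraction unfolded, with dG_U(T) = dH * (1 - T/Tm) (dCp neglected)."""
    dg_u = delta_h_j_mol * (1.0 - temp_k / tm_k)
    k_unfold = math.exp(-dg_u / (R * temp_k))
    return k_unfold / (1.0 + k_unfold)


def transition_width(tm_k, delta_h_j_mol, low=0.1, high=0.9, step=0.01):
    """Temperature span (K) over which the unfolded fraction rises from low to high."""
    t_low = t_high = None
    temp = tm_k - 60.0
    while temp <= tm_k + 60.0:
        f_u = fraction_unfolded(temp, tm_k, delta_h_j_mol)
        if t_low is None and f_u >= low:
            t_low = temp
        if f_u >= high:
            t_high = temp
            break
        temp += step
    if t_low is None or t_high is None:
        raise ValueError("transition not covered by the scanned temperature range")
    return t_high - t_low


if __name__ == "__main__":
    tm = 333.15  # hypothetical melting temperature (60 degrees C)
    for dh in (100e3, 300e3, 500e3):  # hypothetical unfolding enthalpies, J/mol
        print(f"dH = {dh / 1e3:.0f} kJ/mol: 10-90 % width = {transition_width(tm, dh):.1f} K")
```

In this model, a larger unfolding enthalpy, which reflects more interactions breaking together, gives a narrower transition; a small enthalpy spreads the same signal change over a much wider temperature range.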
Such a cooperative structure transition can also be examined by differential scanning calorimetry. When the structure unfolds, it requires heat. Such heat absorption can be determined using this highly sensitive calorimetry technique.
Hydrodynamic properties of proteins change greatly upon folding, going from elongated and expanded structures to compact globular ones. Sedimentation velocity and size-exclusion chromatography (see section “Analytical Techniques”) are two frequently used techniques for the evaluation of hydrodynamic properties, although the latter is much more accessible. The sedimentation coefficient (how fast a molecule migrates in a centrifugal field) is a function of both the molecular weight and hydrodynamic size of the proteins, while elution position in size-exclusion chromatography (how fast it migrates through pores) depends only on the hydrodynamic size (see Chap.​ 3). In both methods, comparison of the sedimentation coefficient or elution position with that of a globular protein with an identical molecular weight (or upon appropriate molecular-weight normalization) gives information on how compactly the protein is folded.
For oligomeric proteins, the determination of molecular weight of the associated states and acquisition of the quaternary structure can be used to assess the folded structure. For strong interactions, specific protein association requires that intersubunit contact surfaces perfectly match each other. Such an associated structure, if obtained by covalent bonding, may be determined simply by sodium dodecyl sulfate-polyacrylamide gel electrophoresis. If protein association involves non-covalent interactions, sedimentation equilibrium or light scattering experiments can assess this phenomenon. Although these techniques have been used for many decades with some difficulty, emerging technologies in analytical ultracentrifugation and laser light scattering, and appropriate software for analyzing the results, have greatly facilitated their general use, as described in detail below.
Two fundamentally different light scattering techniques can be used in characterizing recombinant proteins. “Static” light scattering measures the intensity of the scattered light. “Dynamic” light scattering measures the fluctuations in the scattered light intensity as molecules diffuse in and out of a very small scattering region (Brownian motion).
Static light scattering is often used online in conjunction with size-exclusion chromatography (SEC). The scattering signal is proportional to the product of the molecular mass and the weight concentration. Dividing this signal by one proportional to the concentration, such as that obtained from a UV absorbance or refractive index detector, then gives a direct and absolute measure of the mass of each peak eluting from the column, independent of molecular conformation and elution position. This SEC-static scattering combination allows rapid identification of whether the native state of a protein is a monomer or an oligomer and of the stoichiometry of multi-protein complexes. It is also very useful in identifying the mass of aggregates which may be present and thus is useful for evaluating protein stability.
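The ratio calculation described above can be sketched as follows. This is a minimal illustration, not a description of any instrument's software: the function names are hypothetical, the detector readings are invented, and the single calibration constant assumes that the standard and the sample respond identically per unit concentration in both detectors.

```python
def calibration_constant(ls_signal, conc_signal, known_mass_kda):
    """Instrument constant from a standard of known mass: M = k * LS / conc."""
    return known_mass_kda * conc_signal / ls_signal


def peak_mass_kda(ls_signal, conc_signal, k_cal):
    """Mass of an eluting peak from the ratio of scattering to concentration signal."""
    return k_cal * ls_signal / conc_signal


def subunits(mass_kda, monomer_mass_kda):
    """Nearest whole number of monomers in the peak."""
    return round(mass_kda / monomer_mass_kda)


if __name__ == "__main__":
    # Hypothetical readings: a 66.4 kDa monomeric standard, then an unknown peak.
    k = calibration_constant(ls_signal=2.0, conc_signal=1.0, known_mass_kda=66.4)
    mass = peak_mass_kda(ls_signal=3.0, conc_signal=0.5, k_cal=k)
    print(f"peak mass = {mass:.1f} kDa, about {subunits(mass, 66.4)} subunits")
```

Because the mass comes from the signal ratio rather than from the elution position, the result does not depend on the shape of the molecule or on interactions with the column.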
Dynamic light scattering (DLS) measures the diffusion rate of the molecules, which can be translated into the Stokes radius, a measure of hydrodynamic size. Although the Stokes radius is strongly correlated with molecular mass, it is also strongly influenced by molecular shape (conformation), and thus, DLS is far less accurate than static scattering for measuring molecular mass. The great strength of DLS is its ability to cover a very wide size range in one measurement and to detect very small amounts of large aggregates (<0.01 % by weight). Other important advantages over static scattering with SEC are a wide choice of buffer conditions and no potential loss of species through sticking to a column.
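The conversion from diffusion rate to Stokes radius uses the Stokes-Einstein relation, R = kT/(6πηD). The short sketch below applies it; the function name and the example diffusion coefficient are hypothetical, and the default viscosity is that of water at 20 °C, so a different buffer or temperature would need its own value.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant, J/K


def stokes_radius_m(diffusion_m2_s, temp_k=293.15, viscosity_pa_s=1.002e-3):
    """Stokes-Einstein relation: R = k_B * T / (6 * pi * eta * D)."""
    return K_B * temp_k / (6.0 * math.pi * viscosity_pa_s * diffusion_m2_s)


if __name__ == "__main__":
    d = 6.1e-11  # hypothetical diffusion coefficient, m^2/s
    print(f"Stokes radius = {stokes_radius_m(d) * 1e9:.2f} nm")
    # A species diffusing ten times more slowly has a tenfold larger radius.
    print(f"Slow species  = {stokes_radius_m(d / 10) * 1e9:.1f} nm")
```

The radius scales inversely with the diffusion coefficient, which is why slowly diffusing aggregates stand out as large species in a DLS size distribution.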
An analytical ultracentrifuge incorporates an optical system and special rotors and cells in a high-speed centrifuge to permit measurement of the concentration of a sample versus position within a spinning centrifuge cell. There are two primary strategies: analyzing either the sedimentation velocity or the sedimentation equilibrium. When analyzing the sedimentation velocity, the rotor is spun at very high speed, so the protein sample will completely sediment and form a pellet. The rate at which the protein pellets is measured by the optical system to derive the sedimentation coefficient, which depends on both mass and molecular conformation. When more than one species is present (e.g., a monomer plus a covalent dimer degradation product), a separation is achieved based on the relative sedimentation coefficient of each species.
Proteins form not only small oligomers that can be measured by the above techniques but also much larger aggregates, called subvisible and visible particles. As their size approaches that of a virus, they become highly immunogenic (cf. Chap. 6), and hence determination of their size and amount becomes critical for developing pharmaceutical protein products. Such determination requires imaging of the particles, because hydrodynamic techniques such as dynamic light scattering and sedimentation velocity have neither the sensitivity nor the resolution needed for such large aggregates with a heterogeneous size distribution. Normally such particles are present in minute quantities, yet they can cause serious immunogenic responses due to their large size.
Because the sedimentation coefficient is sensitive to molecular conformation and can be measured with high precision (~0.5 %), sedimentation velocity can detect even fairly subtle differences in conformation. This ability can be used, for example, to confirm that a recombinant protein has the same conformation as the natural wild-type protein or to detect small changes in structure with changes in the pH or salt concentration that may be too subtle to detect by other techniques, such as CD or differential scanning calorimetry.
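The dependence of the sedimentation coefficient on both mass and conformation follows from the Svedberg relation, s = M(1 − v̄ρ)/(N_A f), with the frictional coefficient f = 6πηR for an equivalent sphere of Stokes radius R. The sketch below evaluates it; the function name and all input values (a 66.4 kDa protein, a partial specific volume of 0.73 mL/g, water at 20 °C, and the two radii) are hypothetical choices for illustration.

```python
import math

N_A = 6.02214076e23  # Avogadro constant, 1/mol


def sedimentation_coefficient_s(mass_kg_mol, stokes_radius_m,
                                vbar_m3_kg=0.73e-3, density_kg_m3=998.2,
                                viscosity_pa_s=1.002e-3):
    """Svedberg relation: s = M * (1 - vbar * rho) / (N_A * 6 * pi * eta * R)."""
    buoyant_mass = mass_kg_mol * (1.0 - vbar_m3_kg * density_kg_m3)
    friction = 6.0 * math.pi * viscosity_pa_s * stokes_radius_m
    return buoyant_mass / (N_A * friction)


if __name__ == "__main__":
    svedberg = 1e-13  # one Svedberg unit in seconds
    compact = sedimentation_coefficient_s(66.4, 3.5e-9)
    expanded = sedimentation_coefficient_s(66.4, 7.0e-9)  # same mass, unfolded
    print(f"compact:  {compact / svedberg:.2f} S")
    print(f"expanded: {expanded / svedberg:.2f} S")
```

Two species of identical mass sediment at different rates if their hydrodynamic sizes differ, which is the basis for using sedimentation velocity to compare conformations.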
In sedimentation equilibrium, a much lower rotor speed and milder centrifugal force is used than for sedimentation velocity. The protein still accumulates toward the outside of the rotor, but no pellet is formed. This concentration gradient across the cell is continuously opposed by diffusion, which tries to restore a uniform concentration. After spinning for a long time (usually 12–36 h), an equilibrium is reached where sedimentation and diffusion are balanced and the distribution of protein no longer changes with time. At sedimentation equilibrium, the concentration distribution depends only on the molecular mass and is independent of molecular shape. Thus, self-association for the formation of dimers or higher oligomers (whether reversible or irreversible) is readily detected, as are binding interactions between different proteins. For reversible association, it is possible to determine the strength of the binding interaction by measuring samples over a wide range of protein concentrations.
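For a single ideal species, the equilibrium distribution is c(r) = c(r0)·exp[M(1 − v̄ρ)ω²(r² − r0²)/(2RT)], so a plot of ln c against r² is a straight line whose slope gives the molecular mass with no shape term. The sketch below simulates such a profile and recovers the mass by a least-squares fit. It is an idealized illustration: the function names, rotor speed, radii, and solution properties are hypothetical, and real data analysis must also handle noise, baseline offsets, nonideality, and mixtures of species.

```python
import math

R = 8.314  # gas constant, J/(mol K)


def _slope_per_mass(rpm, temp_k, vbar_m3_kg, density_kg_m3):
    """d(ln c)/d(r^2) per unit molar mass: (1 - vbar*rho) * omega^2 / (2RT)."""
    omega = rpm * 2.0 * math.pi / 60.0
    return (1.0 - vbar_m3_kg * density_kg_m3) * omega ** 2 / (2.0 * R * temp_k)


def equilibrium_profile(radii_m, mass_kg_mol, rpm, c_ref=1.0, temp_k=293.15,
                        vbar_m3_kg=0.73e-3, density_kg_m3=998.2):
    """Concentration at each radius for one ideal species at equilibrium."""
    factor = mass_kg_mol * _slope_per_mass(rpm, temp_k, vbar_m3_kg, density_kg_m3)
    r_ref = radii_m[0]
    return [c_ref * math.exp(factor * (r ** 2 - r_ref ** 2)) for r in radii_m]


def mass_from_profile(radii_m, concentrations, rpm, temp_k=293.15,
                      vbar_m3_kg=0.73e-3, density_kg_m3=998.2):
    """Molar mass (kg/mol) from the least-squares slope of ln(c) versus r^2."""
    x = [r ** 2 for r in radii_m]
    y = [math.log(c) for c in concentrations]
    x_mean = sum(x) / len(x)
    y_mean = sum(y) / len(y)
    slope = (sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
             / sum((xi - x_mean) ** 2 for xi in x))
    return slope / _slope_per_mass(rpm, temp_k, vbar_m3_kg, density_kg_m3)


if __name__ == "__main__":
    radii = [0.069 + 0.0001 * i for i in range(31)]  # 6.9 to 7.2 cm
    profile = equilibrium_profile(radii, mass_kg_mol=66.4, rpm=10000)
    print(f"recovered mass = {mass_from_profile(radii, profile, rpm=10000):.1f} kg/mol")
```

A dimer of the same protein would give twice the slope at the same rotor speed, which is how self-association shows up in this type of experiment.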
In biotechnology applications, sedimentation equilibrium is often used as the “gold standard” for confirming that a recombinant protein has the expected molecular mass and biologically active state of oligomerization in solution. It can also be used to determine the average amount of glycosylation or conjugation of moieties such as polyethylene glycol. The measurement of binding affinities for receptor-cytokine, antigen-antibody, or other interactions can also sometimes serve as a functional characterization of recombinant proteins (although some of these interactions are too strong to be measured by this method).
Site-specific chemical modification and proteolytic digestion are also powerful techniques for studying the folding of proteins. The extent of chemical modification or proteolytic digestion depends on whether the specific sites are exposed to the solvent or are buried in the interior of the protein molecules and are thus inaccessible to these modifications. For example, trypsin cleaves peptide bonds on the C-terminal side of basic residues. Although most proteins contain several basic residues, brief exposure of the native protein to trypsin usually generates only a few peptides, as cleavage occurs only at the accessible basic residues, whereas the same treatment can generate many more peptides when done on the denatured (unfolded) protein, since all the basic residues are now accessible (see also peptide mapping in section “Mass Spectrometry”).
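The trypsin example can be mimicked with a small in silico digest. This sketch is illustrative only: the sequence and the set of solvent-accessible positions are invented, the function name is hypothetical, and the rule is simplified to "cleave after every accessible Lys or Arg" (the commonly applied exception for a following proline is ignored).

```python
def tryptic_peptides(sequence, accessible=None):
    """Cleave on the C-terminal side of Lys (K) and Arg (R).

    accessible: optional set of 0-based positions that the protease can reach.
    None means every basic residue is accessible (denatured protein).
    """
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        is_site = residue in "KR" and i < len(sequence) - 1
        if is_site and (accessible is None or i in accessible):
            peptides.append(sequence[start:i + 1])
            start = i + 1
    peptides.append(sequence[start:])
    return peptides


if __name__ == "__main__":
    seq = "MAKTLLRGESKWVARDEKLP"   # hypothetical sequence with five basic residues
    exposed = {6, 14}              # hypothetical: only Arg-7 and Arg-15 are exposed
    print("native:   ", tryptic_peptides(seq, exposed))
    print("denatured:", tryptic_peptides(seq))
```

With only two of the five basic residues accessible, the "native" digest yields three peptides, whereas the fully accessible "denatured" chain yields six.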

Protein Stability

Although freshly isolated proteins may be folded into a distinct three-dimensional structure, this folded structure is not necessarily retained indefinitely in aqueous solution. The reason is that proteins are neither chemically nor physically stable. The protein surface is chemically highly heterogeneous and contains reactive groups. Long-term exposure of these groups to environmental stresses causes various chemical alterations. Many proteins, including growth factors and cytokines, have cysteine residues. If some of them are in a free or sulfhydryl form, they may undergo oxidation and disulfide exchange. Oxidation can also occur on methionyl residues. Hydrolysis can occur on peptide bonds and on amides of asparagine and glutamine residues. Other chemical modifications can occur on peptide bonds, tryptophan, tyrosine, and amino and carboxyl groups. Table 2.4 lists both a number of reactions that can occur during purification and storage of proteins and methods that can be used to detect such changes.
Table 2.4 ■ 
Common reactions affecting stability of proteins.
Reaction (residues involved) | Physical property affected | Method of analysis
Oxidation: Cys (disulfide, intrachain or interchain); Met, Trp, Tyr | Hydrophobicity, size | RP-HPLC, SDS-PAGE, size-exclusion chromatography, and mass spectrometry
Peptide bond hydrolysis | Size | Size-exclusion chromatography, SDS-PAGE
N to O migration (Ser, Thr) | Hydrophobicity, chemistry | RP-HPLC, inactive in Edman reaction
α-carboxy to β-carboxy migration (Asp, Asn) | Hydrophobicity, chemistry | RP-HPLC, inactive in Edman reaction
Deamidation (Asn, Gln) | Charge | Ion-exchange chromatography
Acylation (α-amino group, ε-amino group) | Charge | Ion-exchange chromatography, mass spectrometry
Esterification/carboxylation (Glu, Asp, C-terminal) | Charge | Ion-exchange chromatography, mass spectrometry
Secondary structure changes | Hydrophobicity | RP-HPLC
 | Size | Size-exclusion chromatography
 | Sec/tert structure | CD
 | Sec/tert structure | FTIR
 | Aggregation | Light scattering
 | Sec/tert structure, aggregation | Analytical ultracentrifugation
Physical stability of a protein is expressed as the difference in free energy, ΔG_U, between the native and denatured states. Thus, protein molecules are in equilibrium between the above two states. As long as this unfolding is reversible and ΔG_U is positive, it does not matter how small the ΔG_U is. In many cases, this reversibility does not hold. This is often seen when ΔG_U is decreased by heating. Most proteins denature upon heating and subsequent aggregation of the denatured molecules results in irreversible denaturation. Thus, unfolding is made irreversible by aggregation:
 $$ \text{Native state} \stackrel{\Delta G_{\text{U}}}{\Longleftrightarrow} \text{Denatured state} \stackrel{k}{\Longrightarrow} \text{Aggregated state} $$
Therefore, any stress that decreases ΔG_U and increases k will cause the accumulation of irreversibly inactivated forms of the protein. Such stresses may include chemical modifications as described above and physical parameters, such as pH, ionic strength, protein concentration, and temperature. Development of a suitable formulation that prolongs the shelf life of a recombinant protein is essential when it is to be used as a human therapeutic.
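How the unfolding free energy and the aggregation rate constant jointly govern the loss of native protein can be made concrete with a deliberately simple kinetic sketch. It rests on assumptions that are not part of the text: the native-denatured equilibrium is taken to be fast compared with aggregation, and aggregation is treated as first order in the denatured form with rate constant k (real aggregation is frequently of higher order). Under these assumptions the equilibrium constant is K = exp(−ΔG_U/RT), the denatured fraction is K/(1 + K), and the aggregated fraction after time t is 1 − exp(−k·f_D·t). All names and numbers are hypothetical.

```python
import math

R = 8.314  # gas constant, J/(mol K)


def fraction_denatured(dg_u_j_mol, temp_k):
    """Equilibrium fraction in the denatured state for a given unfolding free energy."""
    k_eq = math.exp(-dg_u_j_mol / (R * temp_k))
    return k_eq / (1.0 + k_eq)


def fraction_aggregated(dg_u_j_mol, k_agg_per_s, time_s, temp_k=298.15):
    """Aggregated fraction, assuming fast pre-equilibrium and first-order aggregation."""
    f_d = fraction_denatured(dg_u_j_mol, temp_k)
    return -math.expm1(-k_agg_per_s * f_d * time_s)


if __name__ == "__main__":
    year = 365.0 * 24 * 3600
    for dg in (40e3, 35e3, 30e3):  # hypothetical unfolding free energies, J/mol
        lost = fraction_aggregated(dg, k_agg_per_s=1e-3, time_s=year)
        print(f"dG_U = {dg / 1e3:.0f} kJ/mol: {100 * lost:.1f} % aggregated after one year")
```

In this model, a modest decrease in ΔG_U raises the denatured population exponentially, so the aggregated fraction after a fixed storage time grows steeply even when k is unchanged.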
The use of protein stabilizing agents to enhance storage stability of proteins has become customary. These compounds affect protein stability by increasing ΔG_U. They may, however, also increase k, and hence their net effect on long-term storage may vary among proteins and with the storage conditions.
When unfolding is irreversible due to aggregation, minimizing the irreversible step should increase the stability, and often, this may be attained by the addition of mild detergents. Before the detergent type and concentration are selected, however, the effects of the detergent on ΔG_U must be carefully evaluated.
Another approach for enhancing storage stability of proteins is to lyophilize, or freeze-dry, the proteins (see Chap. 4). Lyophilization can minimize the aggregation step during storage, since both chemical modification and aggregation are reduced in the absence of water. The effects of the lyophilization process itself on ΔG_U and k are not fully understood, and hence such a process must be optimized for each protein therapeutic.

Analytical Techniques

In one of the previous sections, “Techniques Specifically Suitable for Characterizing Protein Folding,” a number of (spectroscopic) techniques were mentioned that can be used specifically to monitor protein folding. These were CD, FTIR, fluorescence spectroscopy, and differential scanning calorimetry (DSC). Moreover, analytical ultracentrifugation and light scattering techniques were discussed in more detail. In this section, other techniques will be discussed.

Blotting Techniques

Blotting methods form an important niche in biotechnology. They are used to detect very low levels of unique molecules in a milieu of proteins, nucleic acids, and other cellular components. They can detect aggregates or breakdown products occurring during long-term storage and they can be used to detect components from the host cells used in producing recombinant proteins.
Biomolecules are transferred to a membrane (“blotting”), and this membrane is then probed with specific reagents to identify the molecule of interest. Membranes used in protein blots are made of a variety of materials including nitrocellulose, nylon, and polyvinylidene difluoride (PVDF), all of which avidly bind protein.
Liquid samples can be analyzed by methods called dot blots or slot blots. A solution containing the biomolecule of interest is filtered through a membrane which captures the biomolecule. The difference between a dot blot and a slot blot is that the former uses a circular or disk format, while the latter is a rectangular configuration. The latter method allows for a more precise quantification of the desired biomolecule by scanning methods and relating the integrated results to that obtained with known amounts of material.
Often, the sample is subjected to some type of fractionation, such as polyacrylamide gel electrophoresis, prior to the blotting step. An early technique, Southern blotting, named after the discoverer, E.M. Southern, is used to detect DNA fragments. When this procedure was adapted to RNA fragments and to proteins, other compass coordinates were chosen as labels for these procedures, i.e., northern blots for RNA and western blots for proteins. Western blots involve the use of labeled antibodies to detect specific proteins.

Transfer of Proteins

Following polyacrylamide gel electrophoresis, the transfer of proteins from the gel to the membrane can be accomplished in a number of ways. Originally, blotting was achieved by capillary action. In this commonly used method, the membrane is placed between the gel and absorbent paper. Fluid from the gel is drawn toward the absorbent paper and the protein is captured by the intervening membrane. A blot, or impression, of the protein within the gel is thus made.
The transfer of proteins to the membrane can occur under the influence of an electric field, as well. The electric field is applied perpendicularly to the original field used in separation so that the maximum distance the protein needs to migrate is only the thickness of the gel, and hence, the transfer of proteins can occur very rapidly. This latter method is called electroblotting.

Detection Systems

Once the transfer has occurred, the next step is to identify the presence of the desired protein. In addition to various colorimetric staining methods, the blots can be probed with reagents specific for certain proteins, as for example, antibodies to a protein of interest. This technique is called immunoblotting. In the biotechnology field, immunoblotting is used as an identity test for the product of interest. An antibody that recognizes the desired protein is used in this instance. Secondly, immunoblotting is sometimes used to show the absence of host proteins. In this instance, the antibodies are raised against proteins of the organism in which the recombinant protein has been expressed. This latter method can attest to the purity of the desired protein.
Table 2.5 lists major steps needed for the blotting procedure to be successful. Once the transfer of proteins is completed, residual protein binding sites on the membrane need to be blocked so that antibodies used for detection react only at the location of the target molecule, or antigen, and not at some nonspecific location. After blocking, the specific antibody is incubated with the membrane.
Table 2.5 ■ 
Major steps in blotting proteins to membranes.
1. Transfer protein to membrane, e.g., by electroblotting
2. Block residual protein binding sites on membrane with extraneous proteins such as milk proteins
3. Treat membrane with antibody which recognizes the protein of interest. If this antibody is labeled with a detecting group, then go to step 5
4. Incubate membrane with secondary antibody which recognizes primary antibody used in step 3. This antibody is labeled with a detecting group
5. Treat the membrane with suitable reagents to locate the site of membrane attachment of the labeled antibody in step 3 or step 4
The antibody reacts with a specific protein on the membrane only at the location of that protein because of its specific interaction with its antigen. When immunoblotting techniques are used, methods are still needed to recognize the location of the interaction of the antibody with its specific protein. A number of procedures can be used to detect this complex (see Table 2.6).
Table 2.6 ■ 
Detection methods used in blotting techniques.
1. Antibodies are labeled with radioactive markers such as 125I
2. Antibodies are linked to an enzyme such as horseradish peroxidase (HRP) or alkaline phosphatase (AP). On incubation with substrate, an insoluble colored product is formed at the location of the antibody. Alternatively, the location of the antibody can be detected using a substrate which yields a chemiluminescent product, an image of which is made on photographic film
3. Antibody is labeled with biotin. Streptavidin or avidin is added to strongly bind to the biotin. Each streptavidin molecule has four binding sites. The remaining binding sites can combine with other biotin molecules which are covalently linked to HRP or to AP
The antibody itself can be labeled with a radioactive marker such as 125I, and the probed membrane is then placed in direct contact with X-ray film. After exposure of the film to the membrane for a suitable period, the film is developed and a photographic negative is made of the location of radioactivity on the membrane. Alternatively, the antibody can be linked to an enzyme which, upon the addition of appropriate reagents, catalyzes a color or light reaction at the site of the antibody. These procedures entail purifying the antibody and specifically labeling it. More often, “secondary” antibodies are used. The primary antibody is the one which recognizes the protein of interest. The secondary antibody is then an antibody that specifically recognizes the primary antibody. Quite commonly, the primary antibody is raised in rabbits. The secondary antibody may then be an antibody raised in another animal, such as goat, which recognizes rabbit antibodies. Since this secondary antibody recognizes rabbit antibodies in general, it can be used as a generic reagent to detect rabbit antibodies raised against a number of different proteins of interest. Thus, the primary antibody specifically recognizes and complexes a unique protein, and the secondary antibody, suitably labeled, is used for detection (see also section “ELISA” and Fig. 2.10).
The secondary antibody can be labeled with a radioactive or enzymatic marker group and used to detect several different primary antibodies. Thus, rather than purifying a number of different primary antibodies, only one secondary antibody needs to be purified and labeled for recognition of all the primary antibodies. Because of their wide use, many common secondary antibodies are commercially available in kits containing the detection system and follow routine, straightforward procedures.
In addition to antibodies raised against the amino acyl constituents of proteins, specific antibodies can be used which recognize unique posttranslational components in proteins, such as phosphotyrosyl residues, which are important during signal transduction, and carbohydrate moieties of glycoproteins.
Figure 2.9 illustrates a number of detection methods that can be used on immunoblots. The primary antibody, or if convenient, the secondary antibody, can have an appropriate label for detection. They may be labeled with a radioactive tag as mentioned previously. Secondly, these antibodies can be coupled with an enzyme such as horseradish peroxidase (HRP) or alkaline phosphatase (AP). Substrate is added and is converted to an insoluble, colored product at the site of the protein-primary antibody-secondary antibody-HRP complex. An alternative substrate can be used which yields a chemiluminescent product. A chemical reaction leads to the production of light which can expose photographic or X-ray film. The chromogenic and chemiluminescent detection systems have sensitivities comparable to those of radioactive methods. The former detection methods are displacing the latter, since problems associated with handling radioactive material and radioactive waste solutions are eliminated.
Figure 2.9 ■ 
Common immunoblotting detection systems used to detect antigens, Ag, on membranes. Abbreviations used: Ab antibody, E enzyme, such as horseradish peroxidase or alkaline phosphatase, S substrate, P product, either colored and insoluble or chemiluminescent, B biotin, Sa streptavidin.
As illustrated in Fig. 2.9, streptavidin, or alternatively avidin, and biotin can play an important role in detecting proteins on immunoblots. This is because biotin forms very tight complexes with streptavidin and avidin. Secondly, these proteins are multimeric and contain four binding sites for biotin. When biotin is covalently linked to proteins such as antibodies and enzymes, streptavidin binds to the covalently bound biotin, thus recognizing the site on the membrane where the protein of interest is located.

Immunoassays

ELISA

Enzyme-linked immunosorbent assay (ELISA) provides a means to quantitatively measure extremely small amounts of proteins in biological fluids and serves as a tool for analyzing specific proteins during purification. This procedure takes advantage of the observation that plastic surfaces are able to adsorb low but detectable amounts of proteins. This is a solid-phase assay. Therefore, antibodies against a certain desired protein are allowed to adsorb to the surface of microtitration plates. Each plate may contain up to 96 wells so that multiple samples can be assayed. After incubating the antibodies in the wells of the plate for a specific period of time, excess antibody is removed and residual protein binding sites on the plastic are blocked by incubation with an inert protein. Several microtitration plates can be prepared at one time since the antibodies coating the plates retain their binding capacity for an extended period. During the ELISA, sample solution containing the protein of interest is incubated in the wells and the protein (Ag) is captured by the antibodies coating the well surface. Excess sample is removed and other antibodies which now have an enzyme (E) linked to them are added to react with the bound antigen.
The format described above is called a sandwich assay since the antigen of interest is located between the antibody on the titer well surface and the antibody containing the linked enzyme. Figure 2.10 illustrates a number of formats that can be used in an ELISA. A suitable substrate is added and the enzyme linked to the antibody-antigen-antibody well complex converts this compound to a colored product. The amount of product obtained is proportional to the enzyme adsorbed in the well of the plate. A standard curve can be prepared if known concentrations of antigen are tested in this system and the amount of antigen in unknown samples can be estimated from this standard curve. A number of enzymes can be used in ELISAs. However, the most common ones are horseradish peroxidase and alkaline phosphatase. A variety of substrates for each enzyme are available which yield colored products when catalyzed by the linked enzyme. Absorbance of the colored product solutions is measured on plate readers, instruments which rapidly measure the absorbance in all 96 wells of the microtitration plate, and data processing can be automated for rapid throughput of information. Note that detection approaches partly parallel those discussed in the section on “Blotting.” The above ELISA format is only one of many different methods. For example, the microtitration wells may be coated directly with the antigen rather than having a specific antibody attached to the surface. Quantitation is made by comparison with known quantities of antigen used to coat individual wells.
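Reading an unknown sample off a standard curve can be sketched as below. This is a minimal illustration using straight-line interpolation between neighboring standards; the function name and the standard values are hypothetical, and in practice a fitted curve model and blank correction are often used instead.

```python
# Hypothetical standards: (antigen concentration in ng/mL, measured absorbance).
STANDARDS = [(0.0, 0.05), (10.0, 0.25), (20.0, 0.45), (40.0, 0.80), (80.0, 1.30)]


def concentration_from_absorbance(absorbance, standards):
    """Interpolate linearly between the two standards that bracket the reading.

    standards: (concentration, absorbance) pairs whose absorbance values
    increase strictly with concentration.
    """
    points = sorted(standards, key=lambda pair: pair[1])
    for (c_lo, a_lo), (c_hi, a_hi) in zip(points, points[1:]):
        if a_lo <= absorbance <= a_hi:
            return c_lo + (absorbance - a_lo) * (c_hi - c_lo) / (a_hi - a_lo)
    raise ValueError("absorbance lies outside the range of the standard curve")


if __name__ == "__main__":
    reading = 0.35        # hypothetical absorbance of a diluted unknown sample
    dilution_factor = 5   # hypothetical dilution applied before the assay
    in_well = concentration_from_absorbance(reading, STANDARDS)
    print(f"in well: {in_well:.1f} ng/mL, original sample: {in_well * dilution_factor:.1f} ng/mL")
```

Readings outside the range covered by the standards are rejected rather than extrapolated, since the proportionality between product and antigen holds only over the calibrated range.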
Figure 2.10 ■ 
Examples of several formats for ELISA in which the specific antibody is adsorbed to the surface of a microtitration plate. See Fig. 2.9 for abbreviations used. The antibody is represented by the Y-type structure. The product P is colored and the amount generated is measured with a spectrometer or plate reader.
Another approach, this time subsequent to the binding of antigen either directly to the surface or to an antibody on the surface, is to use an antibody specific to the antibody binding the protein antigen, that is, a secondary antibody. This latter, secondary, antibody contains the linked enzyme used for detection. As already discussed in the section on “Blotting,” the advantage to this approach is that such antibodies can be obtained in high purity and with the desired enzyme linked to them from commercial sources. Thus, a single source of enzyme-linked antibody can be used in assays for different protein antigens. Should a sandwich assay be used, then antibodies from different species need to be used for each side of the sandwich. A possible scenario is that rabbit antibodies are used to coat the microtitration wells; mouse antibodies, possibly a monoclonal antibody, are used to complex with the antigen; and then, a goat anti-mouse immunoglobulin containing linked HRP or AP is used for detection purposes.
As with immunoblots discussed above, streptavidin or avidin can be used in these assays if biotin is covalently linked to the antibodies and enzymes (Fig. 2.10).
If a radioactive label is used in place of the enzyme in the above procedure, then the assay is a solid-phase radioimmunoassay (RIA). Assays are moving away from the use of radioisotopes, because of problems with safety and disposal of radioactive waste and since nonradioactive assays have comparable sensitivities.
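The standard-curve step described above can be illustrated with a short sketch. The absorbance readings and concentrations below are invented for illustration, and simple linear interpolation between neighboring standards is assumed; practical assays often fit a sigmoidal curve instead.

```python
from bisect import bisect_right

# Hypothetical standards: (antigen concentration in ng/mL, absorbance).
STANDARDS = [(0.0, 0.05), (1.0, 0.15), (2.0, 0.25), (4.0, 0.45), (8.0, 0.85)]

def concentration_from_absorbance(absorbance, standards=STANDARDS):
    """Estimate antigen concentration by linear interpolation between standards."""
    concs = [c for c, _ in standards]
    signals = [a for _, a in standards]
    if not signals[0] <= absorbance <= signals[-1]:
        raise ValueError("absorbance outside the range of the standard curve")
    # Index of the first standard whose signal exceeds the reading.
    i = min(bisect_right(signals, absorbance), len(signals) - 1)
    a_lo, a_hi = signals[i - 1], signals[i]
    c_lo, c_hi = concs[i - 1], concs[i]
    return c_lo + (absorbance - a_lo) * (c_hi - c_lo) / (a_hi - a_lo)

print(round(concentration_from_absorbance(0.35), 2))
```

Readings outside the range of the standards raise an error rather than being extrapolated, which mirrors the usual practice of diluting and re-assaying such samples.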

Electrophoresis

Analytical methodologies for measuring protein properties stem from those used in their purification. The major difference is that systems used for analysis have a higher resolving power and lower detection limit than those used in purification. The two major methods for analysis have their bases in chromatographic or electrophoretic techniques.

Polyacrylamide Gel Electrophoresis

One of the earliest methods for analysis of proteins is polyacrylamide gel electrophoresis (PAGE). In this assay, proteins, being amphoteric molecules with both positive and negative charge groups in their primary structure, are separated according to their net electrical charge. A second factor which is responsible for the separation is the mass of the protein. Thus, one can consider more precisely that the charge to mass ratio of proteins determines how they are separated in an electrical field. The charge of the protein can be controlled by the pH of the solution in which the protein is separated. The farther away the protein is from its pI value, that is, the pH at which it has a net charge of zero, the greater is the net charge and hence the greater is its charge to mass ratio. Therefore, the direction and speed of migration of the protein depend on the pH of the gel. If the pH of the gel is above its pI value, then the protein is negatively charged and hence migrates toward the anode. The higher the pH of the gel, the faster the migration. This type of electrophoresis is called native gel electrophoresis.
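The dependence of net charge on pH can be sketched with the Henderson-Hasselbalch relation. The pKa values and the residue counts below are generic assumptions chosen for illustration, not measured values for any particular protein; in a folded protein the actual pKa values shift with the local environment.

```python
# Approximate pKa values (assumed, textbook-style numbers).
PKA_POSITIVE = {"N_term": 8.0, "Lys": 10.5, "Arg": 12.5, "His": 6.0}
PKA_NEGATIVE = {"C_term": 3.1, "Asp": 3.9, "Glu": 4.1, "Cys": 8.3, "Tyr": 10.1}

def net_charge(counts, ph):
    """Henderson-Hasselbalch estimate of the net charge at a given pH."""
    charge = 0.0
    for group, n in counts.items():
        if group in PKA_POSITIVE:
            # Fraction of basic groups that are protonated (+1 each).
            charge += n / (1.0 + 10 ** (ph - PKA_POSITIVE[group]))
        elif group in PKA_NEGATIVE:
            # Fraction of acidic groups that are deprotonated (-1 each).
            charge -= n / (1.0 + 10 ** (PKA_NEGATIVE[group] - ph))
        else:
            raise KeyError(f"no pKa listed for {group!r}")
    return charge

def migration_direction(counts, gel_ph):
    """Negatively charged proteins move toward the anode, positive ones toward the cathode."""
    q = net_charge(counts, gel_ph)
    return "anode" if q < 0 else "cathode" if q > 0 else "none"

# Hypothetical composition of ionizable groups.
EXAMPLE_PROTEIN = {"N_term": 1, "C_term": 1, "Lys": 6, "Arg": 4, "His": 2, "Asp": 8, "Glu": 9}
for ph in (3.0, 8.8):
    print(ph, round(net_charge(EXAMPLE_PROTEIN, ph), 1), migration_direction(EXAMPLE_PROTEIN, ph))
```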
The major component of polyacrylamide gels is water. However, they provide a flexible support so that after a protein has been subjected to an electrical field for an appropriate period of time, it provides a matrix to hold the proteins in place until they can be detected with suitable reagents. By adjusting the amount of acrylamide that is used in these gels, one can control the migration of material within the gel. The more acrylamide, the more hindrance for the protein to migrate in an electrical field.
The addition of a detergent, sodium dodecyl sulfate (SDS), to the electrophoretic separation system allows for the separation to take place primarily as a function of the size of the protein. Dodecyl sulfate ions form complexes with proteins, resulting in an unfolding of the proteins, and the amount of detergent that is complexed is proportional to the mass of the protein. The larger the protein, the more detergent is complexed. Dodecyl sulfate is a negatively charged ion. When proteins are in a solution of SDS, the intrinsic charge of the protein is overwhelmed by that of the dodecyl sulfate complexed with it, so that the proteins take on a net negative charge proportional to their mass.
Polyacrylamide gel electrophoresis in the presence of sodium dodecyl sulfate is commonly known as SDS-PAGE. All the proteins take on a net negative charge, with larger proteins binding more SDS but with the charge to mass ratio being fairly constant among the proteins. An example of SDS-PAGE is shown in Fig. 2.11. Here, SDS-PAGE is used to monitor expression of G-CSF receptor (panel a) and of G-CSF (panel b) in different culture media.
Figure 2.11 ■ 
SDS-PAGE of G-CSF receptor, about 35 kDa (panel a), and G-CSF, about 20 kDa (panel b). These proteins are expressed in different culture media (lanes 1–9). Positions of molecular weight standards are given on the left side. The bands are developed with antibody against G-CSF receptor (panel a) or G-CSF (panel b) after blotting.
Since all proteins have essentially the same charge to mass ratio, how can separation occur? This is done by controlling the concentration of acrylamide in the path of proteins migrating in an electrical field. The greater the acrylamide concentration, the more difficult it is for large protein molecules to migrate relative to smaller protein molecules. This is sometimes thought of as a sieving effect, since the greater the acrylamide concentration, the smaller the pore size within the polyacrylamide gel. Indeed, if the acrylamide concentration is sufficiently high, some high-molecular-weight proteins may not migrate at all within the gel. Since in SDS-PAGE the proteins are denatured, their hydrodynamic size, and hence the degree of retardation by the sieving effects, is directly related to their mass. Proteins containing disulfide bonds will have a much more compact structure and higher mobility for their mass unless the disulfides are reduced prior to electrophoresis.
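Because migration in SDS-PAGE is related to mass, the molecular weight standards run alongside the samples can be used to estimate an apparent molecular weight. A common approach, sketched below, is to regress log10(molecular weight) on relative mobility. The marker masses and mobilities are invented for illustration, and the linear relation is an assumption that holds only over a limited range for a given acrylamide concentration.

```python
import math

def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, my - slope * mx

# Hypothetical standards: (molecular weight in kDa, relative mobility Rf).
MARKERS = [(97.0, 0.15), (66.0, 0.27), (45.0, 0.40), (31.0, 0.52), (21.5, 0.64), (14.4, 0.77)]

def apparent_mw(rf, markers=MARKERS):
    """Apparent molecular weight (kDa) of a band from its relative mobility."""
    slope, intercept = fit_line([r for _, r in markers],
                                [math.log10(m) for m, _ in markers])
    return 10 ** (slope * rf + intercept)

print(round(apparent_mw(0.40), 1))
```

As noted above, proteins with unreduced disulfide bonds migrate anomalously, so an estimate of this kind is only an apparent value.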
As described above, native gel electrophoresis and SDS-PAGE are quite different in terms of the mechanism of protein separation. In native gel electrophoresis, the proteins are in the native state and migrate according to their own charges. Thus, this electrophoresis can be used to characterize proteins in the native state. In SDS-PAGE, proteins are unfolded and migrate based on their molecular mass. As an intermediate case, Blue native electrophoresis has been developed, in which proteins are bound by Coomassie blue, a dye that is also used to stain protein bands. This dye is believed to bind to the hydrophobic surface of the proteins and to add negative charges to the proteins. The dye-bound proteins are still in the native state and migrate based on their net charges, which depend on the intrinsic charges of the proteins and the amount of negatively charged dye bound. This is particularly useful for analyzing membrane proteins, which tend to aggregate in the absence of detergents. The dye prevents the proteins from aggregating by binding to their hydrophobic surface.

Isoelectric Focusing (IEF)

Another method to separate proteins based on their electrophoretic properties is to take advantage of their isoelectric point. In a first run, a pH gradient is established within the gel using a mixture of small-molecular-weight ampholytes with varying pI values. The high pH conditions are established at the site of the cathode. Then, the protein is applied to the gel, e.g., at the site where the pH is 7. In the electrical field, the protein will migrate until it reaches the pH on the gel where its net charge is zero. If the protein were to migrate away from this pH value, it would gain a charge and migrate back toward its pI, leading to a focusing effect.
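The pH at which a protein focuses can be estimated numerically as the pH where the calculated net charge crosses zero. The sketch below uses bisection together with assumed, generic pKa values; both the pKa table and the example compositions are hypothetical.

```python
# Assumed pKa values (illustrative only). Groups not listed here are ignored.
BASIC = {"N_term": 8.0, "Lys": 10.5, "Arg": 12.5, "His": 6.0}
ACIDIC = {"C_term": 3.1, "Asp": 3.9, "Glu": 4.1}

def charge_at(ph, counts):
    """Net charge from protonated basic groups minus deprotonated acidic groups."""
    pos = sum(n / (1 + 10 ** (ph - BASIC[g])) for g, n in counts.items() if g in BASIC)
    neg = sum(n / (1 + 10 ** (ACIDIC[g] - ph)) for g, n in counts.items() if g in ACIDIC)
    return pos - neg

def isoelectric_point(counts, lo=0.0, hi=14.0, tol=1e-4):
    """Bisection on pH; the net charge falls monotonically as the pH rises."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if charge_at(mid, counts) > 0:
            lo = mid   # still positively charged: the pI lies at higher pH
        else:
            hi = mid
    return (lo + hi) / 2

# Hypothetical acidic and basic proteins.
ACIDIC_PROTEIN = {"N_term": 1, "C_term": 1, "Lys": 2, "Asp": 6, "Glu": 6}
BASIC_PROTEIN = {"N_term": 1, "C_term": 1, "Lys": 10, "Arg": 5, "Asp": 2}
print(round(isoelectric_point(ACIDIC_PROTEIN), 2), round(isoelectric_point(BASIC_PROTEIN), 2))
```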

2-Dimensional Gel Electrophoresis

The above methods can be combined into a procedure called 2-D gel electrophoresis. Proteins are first fractionated by isoelectric focusing based upon their pI values. They are then subjected to SDS-PAGE perpendicular to the first dimension and fractionated based on the molecular weights of proteins. SDS-PAGE cannot be performed before isoelectric focusing, since once SDS binds to and denatures the proteins, they no longer migrate based on their pI values.

Detection of Proteins Within Polyacrylamide Gels

Although the polyacrylamide gels provide a flexible support for the proteins, with time, the proteins will diffuse and spread within the gel. Consequently, the usual practice is to fix the proteins or trap them at the location where they migrated to. This is accomplished by placing the gels in a fixing solution in which the proteins become insoluble.
There are many methods for staining proteins in gels, but the two most common and well-studied methods are either staining with Coomassie blue or by a method using silver. The latter method is used if increased sensitivity is required. The principle of developing the Coomassie blue stain is the hydrophobic interaction of a dye with the protein. Thus, the gel takes on a color wherever a protein is located. Using standard amounts of proteins, the amount of protein or contaminant may be estimated. Quantification using the silver staining method is less precise. However, due to the increased sensitivity of this method, very low levels of contaminants can be detected. These fixing and staining procedures denature the proteins. Hence, proteins separated under native conditions, as in native or Blue native gel electrophoresis, will be denatured. To maintain the native state, the gels can be stained with copper or other metal ions.

Capillary Electrophoresis

With recent advances in instrumentation and technology, capillary electrophoresis has gained an increased presence in the analysis of recombinant proteins. Rather than having a matrix, as in polyacrylamide gel electrophoresis through which the proteins migrate, they are free in solution in an electric field within the confines of a capillary tube with a diameter of 25–50 μm. The capillary tube passes through an ultraviolet light or fluorescence detector that measures the presence of proteins migrating in the electric field. The movement of one protein relative to another is a function of the molecular mass and the net charge on the protein. The latter can be influenced by pH and analytes in the solution. This technique has only partially gained acceptance for routine analysis, because of difficulties in reproducibility of the capillaries and in validating this system. Nevertheless, it is a powerful analytical tool for the characterization of recombinant proteins during process development and in stability studies.

Chromatography

Chromatography techniques are used extensively in biotechnology not only in protein purification procedures (see Chap.​ 3) but also in assessing the integrity of the product. Routine procedures are highly automated so that comparisons of similar samples can be made. An analytical system consists of an autosampler which will take a known amount (usually a known volume) of material for analysis and automatically places it in the solution stream headed toward a separation column used to fractionate the sample. Another part of this system is a pump module which provides a reproducible flow rate. In addition, the pumping system can provide a gradient which changes properties of the solution such as pH, ionic strength, and hydrophobicity. A detection system (or possibly multiple detectors in series) is located at the outlet of the column. This measures the relative amount of protein exiting the column. Coupled to the detector is a data acquisition system which takes the signal from the detector and integrates it into a value related to the amount of material (see Fig. 2.12). When the protein appears, the signal begins to increase, and as the protein passes through the detector, the signal subsequently decreases. The area under the peak of the signal is proportional to the amount of material which has passed through the detector. By analyzing known amounts of protein, an area versus amount of protein plot can be generated and this may be used to estimate the amount of this protein in the sample under other circumstances. Another benefit of this integrated chromatography system is that low levels of components which appear over time can be estimated relative to the major desired protein being analyzed. This is a particularly useful function when the long-term stability of the product is under evaluation.
Figure 2.12 ■ 
Components of a typical chromatography station. The pump combines solvents one and two in appropriate ratios to generate a pH, salt concentration, or hydrophobic gradient. Proteins that are fractionated on the column pass through a detector which measures their occurrence. Information from the detector is used to generate chromatograms and the relative amount of each component.
Chromatographic systems offer a multitude of different strategies for successfully separating protein mixtures and for quantifying individual protein components (see Chap.​ 3). The following describes some of these strategies.
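The integration and calibration performed by the data acquisition system can be sketched as follows. This is a minimal illustration that assumes a flat baseline, trapezoidal integration, and a detector response that is linear through the origin; the function names are hypothetical and real software applies more elaborate baseline and peak-detection logic.

```python
def peak_area(times, signal, baseline=0.0):
    """Trapezoidal area under a detector trace after subtracting a flat baseline."""
    area = 0.0
    for i in range(1, len(times)):
        h0 = signal[i - 1] - baseline
        h1 = signal[i] - baseline
        area += 0.5 * (h0 + h1) * (times[i] - times[i - 1])
    return area

def response_factor(known_amounts, areas):
    """Slope of a through-origin least-squares fit of peak area versus amount."""
    num = sum(a * m for a, m in zip(areas, known_amounts))
    den = sum(m * m for m in known_amounts)
    return num / den

def amount_from_area(area, factor):
    """Estimate the amount of protein in an unknown from its peak area."""
    return area / factor

# Invented example: a triangular peak and three calibration injections.
area = peak_area([0, 1, 2, 3, 4], [0, 2, 4, 2, 0])
factor = response_factor([1.0, 2.0, 4.0], [10.0, 20.0, 40.0])
print(area, factor, amount_from_area(area, factor))
```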

Size-Exclusion Chromatography

As the name implies, this procedure separates proteins based on their size or molecular weight or shape. The matrix consists of very fine beads containing cavities and pores accessible to molecules of a certain size or smaller, but inaccessible to larger molecules. The principle of this technique is the distribution of molecules between the volume of solution within the beads and the volume of solution surrounding the beads. Small molecules have access to a larger volume than do large molecules. As solution flows through the column, molecules can diffuse back and forth, depending upon their size, in and out of the pores of the beads. Smaller molecules can reside within the pores for a finite period of time whereas larger molecules, unable to enter these spaces, continue along in the fluid stream. Intermediate-sized molecules spend an intermediate amount of time within the pores. They can be fractionated from large molecules that cannot access the matrix space at all and from small molecules that have free access to this volume and spend most of the time within the beads. Protein molecules can distribute between the volume within these beads and the excluded volume based on the mass and shape of the molecule. This distribution is based on the relative concentration of the protein in the beads versus the excluded volume.
Size-exclusion chromatography can be used to estimate the mass of proteins by calibrating the column with a series of globular proteins of known mass. However, the separation depends on molecular shape (conformation) as well as mass, and highly elongated proteins, proteins containing flexible, disordered regions, and glycoproteins will often appear to have masses as much as two to three times the true value. Other proteins may interact weakly with the column matrix and be retarded, thereby appearing to have a smaller mass. Thus, sedimentation or light scattering methods are preferred for accurate mass measurement (see section “Techniques Specifically Suitable for Characterizing Protein Folding”). Over time, proteins can undergo a number of changes that affect their mass. A peptide bond within the protein can hydrolyze, yielding two smaller polypeptide chains. More commonly, size-exclusion chromatography is used to assess aggregated forms of the protein. Figure 2.13 shows an example of this. The peak at 22 min represents the native protein. The peak at 15 min is aggregated protein and that at 28 min depicts degraded protein, yielding smaller polypeptide chains. Aggregation can occur when a protein molecule unfolds to a slight extent and exposes surfaces that are attracted to complementary surfaces on adjacent molecules. This interaction can lead to dimerization or doubling of molecular weight or to higher-molecular-weight oligomers. From the chromatographic profile, the mechanism of aggregation can often be inferred. If dimers, trimers, tetramers, etc., are observed, then aggregation occurs by stepwise interaction of a monomer with a dimer, trimer, etc. If dimers, tetramers, octamers, etc., are observed, then aggregates can interact with each other. Sometimes, only monomers and high-molecular-weight aggregates are observed, suggesting that intermediate species are kinetically short-lived and that protein molecules susceptible to aggregation combine into very large-molecular-weight complexes.
Figure 2.13 ■ 
Size-exclusion chromatography of a recombinant protein which on storage yields aggregates and smaller peptides.
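Two simple calculations are commonly applied to a size-exclusion profile such as this one. The sketch below assumes the conventional partition coefficient K_av = (Ve - V0)/(Vt - V0), where Ve is the elution volume, V0 the void volume, and Vt the total column volume (symbols not used in the text above), and it assumes that all species give the same detector response per unit mass. The peak areas are invented.

```python
def partition_coefficient(ve, v0, vt):
    """K_av: 0 for a fully excluded species, 1 for a fully included one."""
    if not v0 < vt:
        raise ValueError("void volume must be smaller than total column volume")
    return (ve - v0) / (vt - v0)

def percent_composition(peak_areas):
    """Convert integrated peak areas into percentages of the total area."""
    total = sum(peak_areas.values())
    return {name: 100.0 * area / total for name, area in peak_areas.items()}

# Hypothetical integrated areas for the three peaks of a stored sample.
areas = {"aggregate": 5.0, "native": 90.0, "fragments": 5.0}
print(percent_composition(areas))
print(partition_coefficient(ve=16.0, v0=8.0, vt=24.0))
```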

Reversed-Phase High-Performance Liquid Chromatography

Reversed-phase high-performance liquid chromatography (RP-HPLC) takes advantage of the hydrophobic properties of proteins. The functional groups on the column matrix contain from one to up to 18 carbon atoms in a hydrocarbon chain. The longer this chain, the more hydrophobic is the matrix. The hydrophobic patches of proteins interact with the hydrophobic chromatographic matrix. Proteins are then eluted from the matrix by increasing the hydrophobic nature of the solvent passing through the column. Acetonitrile is a common solvent used, although other organic solvents such as ethanol also may be employed. The solvent is made acidic by the addition of trifluoroacetic acid, since proteins have increased solubility at pH values further removed from their pI. A gradient with increasing concentration of hydrophobic solvent is passed through the column. Different proteins have different hydrophobicities and are eluted from the column depending on the “hydrophobic potential” of the solvent.
This technique can be very powerful. It may detect the addition of a single oxygen atom to the protein, as when a methionyl residue is oxidized, or the hydrolysis of an amide moiety on a glutaminyl or asparaginyl residue (deamidation). Disulfide bond formation or shuffling also changes the hydrophobic characteristic of the protein. Hence, RP-HPLC can be used not only to assess the homogeneity of the protein but also to follow degradation pathways occurring during long-term storage.
Reversed-phase chromatography of proteolytic digests of recombinant proteins may serve to identify this protein. Enzymatic digestion yields unique peptides that elute at different retention times or at different organic solvent concentrations. Moreover, the map, or chromatogram, of peptides arising from enzymatic digestion of one protein is quite different from the map obtained from another protein. Several different proteases, such as trypsin, chymotrypsin, and other endoproteinases, are used for these identity tests (see below under “Mass Spectrometry”).

Hydrophobic Interaction Chromatography

A companion to RP-HPLC is hydrophobic interaction chromatography (HIC), although in principle, this latter method is normal-phase chromatography, i.e., here an aqueous solvent system rather than an organic one is used to fractionate proteins. The hydrophobic characteristics of the solution are modulated by inorganic salt concentrations. Ammonium sulfate and sodium chloride are often used since these compounds are highly soluble in water. In the presence of high salt concentrations (up to several molar), proteins are attracted to hydrophobic surfaces on the matrix of resins used in this technique. As the salt concentration decreases, proteins have less affinity for the matrix and eventually elute from the column. This method lacks the resolving power of RP-HPLC, but is a more gentle method, since low pH values or organic solvents as used in RP-HPLC can be detrimental to some proteins.

Ion-Exchange Chromatography

This technique takes advantage of the electrical charge properties of proteins. Some of the amino acyl residues are negatively charged and others are positively charged. The net charge of the protein can be modulated by the pH of its environment relative to the pI value of the protein. At a pH value lower than the pI, the protein has a net positive charge, whereas at a pH value greater than the pI, the protein has a net negative charge. Opposites attract in ion-exchange chromatography. The resins in this procedure can contain functional groups with positive or negative charges. Thus, positively charged proteins bind to negatively charged matrices and negatively charged proteins bind to positively charged matrices. Proteins are displaced from the resin by increasing the concentration of a salt, e.g., sodium chloride. Proteins with different net charges can be separated from one another during elution with an increasing salt gradient. The choice of charged resin and elution conditions depends on the protein of interest.
In lieu of changing the ionic strength of the solution, proteins can be eluted by changing the pH of the medium, i.e., with the use of a pH gradient. This method is called chromatofocusing and proteins are separated based on their pI values. When the solvent pH reaches the pI value of a specific protein, the protein has a zero net charge and is no longer attracted to the charged matrix and hence is eluted.
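The relation between pH, pI, and the type of resin that binds a protein can be expressed as a simple rule. The function below is a hypothetical helper written for illustration; the margin of 0.5 pH units is an assumed rule of thumb, not a value taken from the text.

```python
def exchanger_for(pi, buffer_ph, margin=0.5):
    """Suggest a resin type from the sign of the expected net charge."""
    if buffer_ph <= pi - margin:
        # Below its pI the protein is positively charged.
        return "cation exchanger (negatively charged resin)"
    if buffer_ph >= pi + margin:
        # Above its pI the protein is negatively charged.
        return "anion exchanger (positively charged resin)"
    return "weak binding expected: pH too close to pI"

print(exchanger_for(pi=6.0, buffer_ph=8.0))
print(exchanger_for(pi=9.0, buffer_ph=6.0))
```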

Other Chromatographic Techniques

Other functional groups may be attached to chromatographic matrices to take advantage of unique properties of certain proteins. These affinity methodologies, however, are more often used in the manufacturing process than in analytical techniques (see Chap.​ 3). For example, conventional affinity purification schemes of antibodies use protein A or G columns. Protein A or G specifically binds antibodies. Antibodies consist of variable regions and constant regions (see Chap.​ 7). The variable regions are antigen specific and hence vary in sequence from one antibody to another, while the constant regions are common to each subgroup of antibodies. The constant region binds to protein A or G.
Mixed-mode chromatography uses columns having both hydrophobic and charged groups, i.e., combination of ion-exchange and hydrophobic interaction chromatography. Mixed-mode columns confer protein binding under conditions at which protein binding normally does not occur. For example, protein binding to an ion-exchange column requires low ionic strength. Under identical conditions, mixed-mode columns can bind proteins through both ionic and hydrophobic interactions.

Bioassays

Paramount to the development of a protein therapeutic is to have an assay that identifies its biological function. Chromatographic and electrophoretic methodologies can address the homogeneity of a biotherapeutic and be useful in investigating stability parameters. However, it is also necessary to ascertain whether the protein has acceptable bioactivity. Bioactivity can be determined either in vivo, i.e., by administering the protein to an animal and ascertaining some change within its body (function), or in vitro. Bioassays in vitro monitor the response of a specific receptor or microbiological or tissue cell line when the therapeutic protein is added to the system. An example of an in vitro bioassay is the increase in DNA synthesis in the presence of the therapeutic protein as measured by the incorporation of radioactively labeled thymidine. The protein factor binds to receptors on the cell surface, which triggers secondary messengers to send signals to the cell nucleus to synthesize DNA. The binding of the protein factor to the cell surface is dependent upon the amount of factor present. Figure 2.14 presents a dose–response curve of thymidine incorporation as a function of concentration of the factor. At low concentrations, the amount of factor is too low to trigger a response. As the concentration increases, the incorporation of thymidine occurs, and at higher concentrations, the amount of thymidine incorporation ceases to increase as DNA synthesis is occurring at the maximum rate. A standard curve can be obtained using known quantities of the protein factor. Comparison of other solutions containing unknown amounts of the factor with this standard curve will then yield quantitative estimates of the factor concentration. Through experience during the development of the protein therapeutic, a value is obtained for a fully functional protein. Subsequent comparisons to this value can be used to ascertain any loss in activity during stability studies or changes in activity when amino acyl residues of the protein are modified.
Figure 2.14 ■ 
An in vitro bioassay showing a mitogenic response in which radioactive thymidine is incorporated into DNA in the presence of an increasing amount of a protein factor.
Other in vitro bioassays can measure changes in cell number or production of another protein factor in response to the stimulation of cells by the protein therapeutic. The amount of the secondary protein produced can be estimated by using an ELISA.
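A sigmoidal dose-response curve of the kind shown in Fig. 2.14 is often described with a four-parameter logistic (Hill) function. The sketch below assumes that model; the parameter names (bottom, top, ec50, hill) and the numbers in the example are illustrative and are not taken from the figure.

```python
def response(conc, bottom, top, ec50, hill=1.0):
    """Four-parameter logistic (Hill) dose-response curve."""
    if conc <= 0:
        return bottom
    return bottom + (top - bottom) / (1.0 + (ec50 / conc) ** hill)

def concentration(signal, bottom, top, ec50, hill=1.0):
    """Invert the curve to estimate factor concentration from a measured response."""
    if not bottom < signal < top:
        raise ValueError("response is outside the quantifiable part of the curve")
    return ec50 * ((signal - bottom) / (top - signal)) ** (1.0 / hill)

# Hypothetical curve: counts per minute of incorporated thymidine.
params = dict(bottom=100.0, top=1100.0, ec50=2.0, hill=1.5)
measured = response(5.0, **params)
print(round(measured, 1), round(concentration(measured, **params), 3))
```

Responses on the lower and upper plateaus cannot be inverted, which is why unknown samples are usually diluted into the steep part of the curve.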

Mass Spectrometry

Recent advances in the measurement of the molecular masses of proteins have made this technique an important analytical tool. While this method was used in the past to analyze small volatile molecules, the molecular weights of highly charged proteins with masses of over 100 kilodaltons (kDa) can now be accurately determined.
Because of the precision of this method, posttranslational modifications such as acetylation or glycosylation can be inferred. The masses of new protein forms that arise during stability studies provide information on the nature of these forms. For example, an increase in mass of 16 Da suggests that an oxygen atom has been added to the protein, as happens when a methionyl residue is oxidized to a methionyl sulfoxide residue. The molecular mass of peptides obtained after proteolytic digestion and separation by HPLC indicates from which region of the primary structure they are derived. Such an HPLC chromatogram is called a “peptide map.” An example is shown in Fig. 2.15. This is obtained by digesting a protein with pepsin and by subsequently separating the digested peptides by reversed-phase HPLC. This highly characteristic pattern for a protein is called a “protein fingerprint.” Peaks are identified by elution times on HPLC. If peptides have molecular masses differing from those expected from the primary sequence, the nature of the modification to that peptide can be inferred. Moreover, molecular mass estimates can be made for peptides obtained from unfractionated proteolytic digests. Molecular masses that differ from expected values indicate that a part of the protein molecule has been altered, that glycosylation or another modification has been altered, or that the protein under investigation still contains contaminants.
Figure 2.15 ■ 
Peptide map of pepsin digest of recombinant human β-secretase. Each peptide is labeled by elution time in HPLC.
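The reasoning from a mass difference to a likely modification, such as the 16 Da increase on methionine oxidation, can be sketched as a lookup. The table below lists monoisotopic mass changes for a few common modifications; the peptide masses in the example and the tolerance are invented, and the tolerance would need to be far wider for an intact protein than for a peptide.

```python
# Monoisotopic mass changes (Da) for a few common modifications.
MASS_SHIFTS = {
    "deamidation (Asn/Gln)": 0.984,
    "oxidation (e.g., Met sulfoxide)": 15.995,
    "acetylation": 42.011,
    "phosphorylation": 79.966,
}

def candidate_modifications(observed, expected, tolerance=0.02):
    """List modifications whose mass change matches observed minus expected."""
    delta = observed - expected
    return [name for name, shift in MASS_SHIFTS.items()
            if abs(delta - shift) <= tolerance]

# Hypothetical peptide: expected 1500.755 Da, observed 1516.750 Da.
print(candidate_modifications(1516.750, 1500.755))
```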
Another way that mass spectrometry can be used as an analytical tool is in the sequencing of peptides. A recurring structure, the peptide bond, in peptides tends to yield fragments of the mature peptide which differ stepwise by an amino acyl residue. The difference in mass between two fragments indicates the amino acid removed from one fragment to generate the other. Except for leucine and isoleucine, each amino acid has a different mass and hence a sequence can be read from the mass spectrum. Stepwise removal can occur from either the amino terminus or carboxy terminus.
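Reading a sequence from a ladder of fragments amounts to matching successive mass differences against the known residue masses. The sketch below uses monoisotopic residue masses; the fragment ladder and the tolerance are invented for illustration, and leucine and isoleucine are reported together because their masses are identical.

```python
# Monoisotopic residue masses (Da). Leu and Ile are identical; Lys and Gln differ
# by only about 0.036 Da and need a tight tolerance to be told apart.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276, "V": 99.06841,
    "T": 101.04768, "C": 103.00919, "L/I": 113.08406, "N": 114.04293,
    "D": 115.02694, "Q": 128.05858, "K": 128.09496, "E": 129.04259,
    "M": 131.04049, "H": 137.05891, "F": 147.06841, "R": 156.10111,
    "Y": 163.06333, "W": 186.07931,
}

def read_ladder(fragment_masses, tolerance=0.01):
    """Translate successive mass differences in a fragment ladder into residues."""
    ordered = sorted(fragment_masses)
    sequence = []
    for lighter, heavier in zip(ordered, ordered[1:]):
        diff = heavier - lighter
        matches = [aa for aa, m in RESIDUE_MASS.items() if abs(diff - m) <= tolerance]
        # "?" marks a difference that matches no residue, or more than one.
        sequence.append(matches[0] if len(matches) == 1 else "?")
    return sequence

# Hypothetical ladder built by adding Gly, Ala, and Leu/Ile to a 200.0 Da fragment.
ladder = [200.0, 257.02146, 328.05857, 441.14263]
print(read_ladder(ladder))
```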
By changing three basic components of the mass spectrometer, the ion source, the analyzer, and the detector, different types of measurement may be undertaken. Typical ion sources which volatilize the proteins are electrospray ionization, fast atom bombardment, and liquid secondary ion. Common analyzers include quadrupole, magnetic sector, and time-of-flight instruments. The function of the analyzer is to separate the ionized biomolecules based on their mass-to-charge ratio. The detector measures a current whenever impinged upon by charged particles. Electrospray ionization (ESI) and matrix-assisted laser desorption/ionization (MALDI) are two sources that can generate high-molecular-weight volatile proteins. In the former method, droplets are generated by spraying or nebulizing the protein solution into the source of the mass spectrometer. As the solvent evaporates, the protein remains behind in the gas phase and passes through the analyzer to the detector. In MALDI, proteins are mixed with a matrix which vaporizes when exposed to laser light, thus carrying the protein into the gas phase. An example of MALDI-mass analysis is shown in Fig. 2.16, indicating the singly charged ion (116,118 Da) and the doubly charged ion (58,036.2) for a purified protein. Since proteins are multi-charge compounds, a number of components are observed representing mass-to-charge forms, each differing from the next by one charge. By imputing various charges to the mass-to-charge values, a molecular mass of the protein can be estimated. The latter step is empirical since only the mass-to-charge ratio is detected and not the net charge for that particular particle.
Figure 2.16 ■ 
MALDI-mass analysis of a purified recombinant human β-secretase. Numbers correspond to the singly charged and doubly charged ions.
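The step of imputing charges to mass-to-charge values can be sketched for two neighboring peaks in a series. The example assumes that every charge comes from an added proton and that the two peaks differ by exactly one charge; the 20,000 Da protein is hypothetical and is not the protein shown in Fig. 2.16.

```python
PROTON = 1.00728  # mass of a proton in Da

def assign_charge(mz_high, mz_low):
    """Charge of the peak at mz_high, assuming mz_low carries one more proton."""
    return round((mz_low - PROTON) / (mz_high - mz_low))

def neutral_mass(mz, z):
    """Mass of the uncharged protein from one peak and its assigned charge."""
    return z * (mz - PROTON)

# Hypothetical protein of 20,000 Da observed with 10 and 11 charges.
true_mass = 20000.0
mz_10 = true_mass / 10 + PROTON
mz_11 = true_mass / 11 + PROTON
z = assign_charge(mz_10, mz_11)
print(z, round(neutral_mass(mz_10, z), 2))
```

In practice the estimate is averaged over all peaks in the charge series, which reduces the effect of error in any single mass-to-charge value.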

Concluding Remarks

With the advent of recombinant proteins as human therapeutics, the need for methods to evaluate their structure, function, and homogeneity has become paramount. Various analytical techniques are used to characterize the primary, secondary, and tertiary structure of the protein and to determine the quality, purity, and stability of the recombinant product. Bioassays establish its activity.

Self-Assessment Questions

Questions
1. What is the net charge of granulocyte-colony-stimulating factor at pH 2.0, assuming that all the carboxyl groups are protonated?
2. Based on the above calculation, do you expect the protein to unfold at pH 2.0?
3. Design an experiment using blotting techniques to ascertain the presence of a ligand to a particular receptor.
4. What is the transfer of proteins to a membrane such as nitrocellulose or PVDF called?
5. What is the name of the assay in which an antibody is adsorbed to a plastic microtitration plate and then used to quantify the amount of a protein using a secondary antibody conjugated with horseradish peroxidase?
6. In 2-dimensional electrophoresis, what is the first method of separation?
7. What is the method for separating proteins in solution based on molecular size called?
8. Why are large protein particles more immunogenic?
Answers
1. Based on the assumption that glutamyl and aspartyl residues are uncharged at this pH, all the charges come from protonated histidyl, lysyl, and arginyl residues and the amino terminus, i.e., 5 His + 4 Lys + 5 Arg + N-terminal = 15.
2. Whether a protein unfolds or remains folded depends on the balance between the stabilizing and destabilizing forces. At pH 2.0, extensive positive charges destabilize the protein, but whether such destabilization is sufficient or insufficient to unfold the protein depends on how stable the protein is in the native state. The charged state alone cannot predict whether a protein will unfold.
3. A solution containing the putative ligand is subjected to SDS-PAGE. After blotting the proteins in the gel to a membrane, it is probed with a solution containing the receptor. The receptor, which binds the ligand, may be labeled with agents suitable for detection or, alternatively, the complex can subsequently be probed with an antibody to the receptor and developed as for an immunoblot. Note that the reciprocal of this can be done as well, in which the receptor is subjected to SDS-PAGE and the blot is probed with the ligand.
4. This method is called blotting. If an electric current is used, then the method is called electroblotting.
5. This assay is called an ELISA, enzyme-linked immunosorbent assay.
6. Either isoelectric focusing or native polyacrylamide electrophoresis. The second dimension is performed in the presence of the detergent sodium dodecyl sulfate.
7. Size-exclusion chromatography.
8. The immune system is designed to fight virus infections and hence generates antibodies against foreign particles of about the size of a virus. When pharmaceutical proteins aggregate into particles of this size, the immune system recognizes them as viruslike (cf. Chap. 6).
Further Reading
Butler JE (ed) (1991) Immunochemistry of solid-phase immunoassay. CRC Press, Boca Raton
Coligan J, Dunn B, Ploegh H, Speicher D, Wingfield P (eds) (1995) Current protocols in protein science. Wiley, New York
Crabb JW (ed) (1995) Techniques in protein chemistry VI. Academic, San Diego
Creighton TE (ed) (1989) Protein structure: a practical approach. IRL Press, Oxford
Crowther JR (1995) ELISA, theory and practice. Humana Press, Totowa
Dunbar BS (1994) Protein blotting: a practical approach. Oxford University Press, New York
Gregory RB (ed) (1994) Protein-solvent interactions. Marcel Dekker, New York
Hames BD, Rickwood D (eds) (1990) Gel electrophoresis of proteins: a practical approach, 2nd edn. IRL Press, New York
Jiskoot W, Crommelin DJA (eds) (2005) Methods for structural analysis of protein pharmaceuticals. AAPS Press, Arlington
Landers JP (ed) (1994) Handbook of capillary electrophoresis. CRC Press, Boca Raton
McEwen CN, Larsen BS (eds) (1990) Mass spectrometry of biological materials. Dekker, New York
Price CP, Newman DJ (eds) (1991) Principles and practice of immunoassay. Stockton, New York
Schulz GE, Schirmer RH (eds) (1979) Principles of protein structure. Springer, New York
Shirley BA (ed) (1995) Protein stability and folding. Humana Press, Totowa