Introduction
For a recombinant protein to become a human
therapeutic, its biophysical and biochemical characteristics must
be well understood. These properties serve as a basis for
comparison of lot-to-lot reproducibility; for establishing the
range of conditions to stabilize the protein during production,
storage, and shipping; and for identifying characteristics useful
for monitoring stability during long-term storage.
A number of techniques can be used to determine
the biophysical properties of proteins and to examine their
biochemical and biological integrity. Where possible, the results
of these experiments are compared with those obtained using
naturally occurring proteins in order to be confident that the
recombinant protein has the desired characteristics of the
naturally occurring one.
Protein Structure
Primary Structure
Most proteins which are developed for therapy
perform specific functions by interacting with other small and
large molecules, e.g., cell-surface receptors, binding proteins,
nucleic acids, carbohydrates, and lipids. The functional properties
of proteins are derived from their folding into distinct
three-dimensional structures. Each protein fold is based on its
specific polypeptide sequence in which different amino acids are
connected through peptide bonds in a specific way. This alignment
of the 20 amino acids, called a primary sequence, has in general
all the information necessary for folding into a distinct tertiary
structure comprising different secondary structures such as
α-helices and β-sheets (see below). Because the 20 amino acids
possess different side chains, polypeptides with widely diverse
properties are obtained.
All of the 20 amino acids consist of a
Cα carbon to which an amino group, a carboxyl group, a
hydrogen, and a side chain bind in L configuration (Fig.
2.1).
These amino acids are joined by condensation to yield a peptide
bond consisting of a carboxyl group of an amino acid joined with
the amino group of the next amino acid (Fig. 2.2).

Figure 2.1 ■ Structure of L-amino acids.

Figure 2.2 ■ Structure of peptide bond.
The condensation gives an amide group, NH, at the
N-terminal side of Cα and a carbonyl group, C = O, at
the C-terminal side. These groups, as well as the amino acyl side
chains, play important roles in protein folding. Due to their
ability to form hydrogen bonds, they make major energetic
contributions to the formation of two important secondary
structures, α-helix and β-sheet. The peptide bonds between various
amino acids are very much equivalent, however, so that they do not
determine which part of a sequence should form an α-helix or
β-sheet. Sequence-dependent secondary structure formation is
determined by the side chains.
The 20 amino acids commonly found in proteins are
shown in Fig. 2.3. They are described by their full names
and three- and one-letter codes. Their side chains are structurally
different in such a way that at neutral pH, aspartic and glutamic
acid are negatively charged and lysine and arginine are positively
charged. Histidine is positively charged to an extent that depends
on the pH. At pH 7.0, on average, about half of the histidine side
chains are positively charged. Tyrosine and cysteine are protonated
and uncharged at neutral pH, but become negatively charged above pH
10 and 8, respectively.



Figure 2.3 ■ (a and b) Structure of 20 amino acids.
Polar amino acids consist of serine, threonine,
asparagine, and glutamine, as well as cysteine, while nonpolar
amino acids consist of alanine, valine, phenylalanine, proline,
methionine, leucine, and isoleucine. Glycine behaves neutrally
while cystine, the oxidized form of cysteine, is characterized as
hydrophobic. Although tyrosine and tryptophan often enter into
polar interactions, they are better characterized as nonpolar, or
hydrophobic, as described later.
These 20 amino acids are incorporated into a
unique sequence based on the genetic code, as the example in Fig.
2.4 shows.
This is an amino acid sequence of granulocyte-colony-stimulating
factor (G-CSF), which selectively regulates proliferation and
maturation of neutrophils. Although the exact properties of this
protein depend on the location of each amino acid and hence the
location of each side chain in the three-dimensional structure, the
average properties can be estimated simply from the amino acid
composition, as shown in Table 2.1, i.e., a list of the total number of
each type of amino acid contained in this protein molecule.

Figure 2.4 ■ Amino acid sequence of granulocyte-colony-stimulating factor.
Using the pKa values of these side
chains and one amino and carboxyl terminus, one can calculate total
charges (positive plus negative charges) and net charges (positive
minus negative charges) of a protein as a function of pH, i.e., a
titration curve. Since cysteine can be oxidized to form a disulfide
bond or can be in a free form, accurate calculation above pH 8
requires knowledge of the status of cysteinyl residues in the
protein. The titration curve thus obtained is only an
approximation, since some charged residues may be buried and the
effective pKa values depend on the local environment of each
residue. Nevertheless, the calculated titration curve gives a first
approximation of the overall charged state of a protein at a given
pH and hence its solution property. Other molecular parameters,
such as isoelectric point (pI, where the net charge of a protein
becomes zero), molecular weight, extinction coefficient, partial
specific volume, and hydrophobicity, can also be estimated from the
amino acid composition, as shown in Table 2.1.
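The titration calculation described above can be sketched in a few lines. The sketch below applies the Henderson-Hasselbalch relation to each ionizable group and sums the results. The pKa values are generic textbook-style assumptions, not values given in this chapter, and the number of free cysteines (one) is likewise an assumption about the disulfide status of G-CSF. Because effective pKa values depend on the local environment, the output should be read as a first approximation and will not necessarily reproduce the isoelectric point or charge listed in Table 2.1.

```python
# Assumed generic pKa values; real residues deviate with their local environment.
PKA_POSITIVE = {"K": 10.5, "R": 12.5, "H": 6.0, "N_term": 8.0}
PKA_NEGATIVE = {"D": 3.9, "E": 4.1, "C": 8.3, "Y": 10.1, "C_term": 3.1}

# Ionizable groups of G-CSF: residue counts from Table 2.1 plus one of each terminus.
# Only free (reduced) cysteines titrate; one free cysteine is assumed here.
GCSF_COUNTS = {"D": 4, "E": 9, "H": 5, "K": 4, "R": 5, "Y": 3,
               "C": 1, "N_term": 1, "C_term": 1}


def charges(counts, ph):
    """Return (positive, negative) charge magnitudes at the given pH."""
    positive = sum(n / (1.0 + 10 ** (ph - PKA_POSITIVE[g]))
                   for g, n in counts.items() if g in PKA_POSITIVE)
    negative = sum(n / (1.0 + 10 ** (PKA_NEGATIVE[g] - ph))
                   for g, n in counts.items() if g in PKA_NEGATIVE)
    return positive, negative


def net_charge(counts, ph):
    """Positive minus negative charges."""
    positive, negative = charges(counts, ph)
    return positive - negative


def total_charge(counts, ph):
    """Positive plus negative charges."""
    positive, negative = charges(counts, ph)
    return positive + negative


def isoelectric_point(counts, low=0.0, high=14.0, tol=1e-4):
    """Bisection on the net charge, which decreases monotonically with pH."""
    while high - low > tol:
        mid = (low + high) / 2.0
        if net_charge(counts, mid) > 0:
            low = mid
        else:
            high = mid
    return (low + high) / 2.0


if __name__ == "__main__":
    for ph in (3, 5, 7, 9, 11):
        print(f"pH {ph:>2}: net {net_charge(GCSF_COUNTS, ph):+6.2f}, "
              f"total {total_charge(GCSF_COUNTS, ph):5.2f}")
    print(f"estimated pI: {isoelectric_point(GCSF_COUNTS):.2f}")
```

Changing the assumed cysteine count shows why the status of the cysteinyl residues matters mainly above pH 8: below that pH the thiol is almost fully protonated and contributes little charge either way.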
The primary structure of a protein, i.e., the
sequence of the 20 amino acids, can lead to the three-dimensional
structure because the amino acids have diverse physical properties.
First, each type of amino acid tends to be preferentially
incorporated into certain secondary structures. The
frequencies with which each amino acid is found in α-helix,
β-sheet, and β-turn, secondary structures that are discussed later
in this chapter, can be calculated as an average over a number of
proteins whose three-dimensional structures have been solved. These
frequencies are listed in Table 2.2. The β-turn has a distinct configuration
consisting of four sequential amino acids and there is a strong
preference for specific amino acids in these four positions. For
example, asparagine has an overall high frequency of occurrence in
a β-turn and is most frequently observed in the first and third
position of a β-turn. This characteristic of asparagine is
consistent with its side chain being a potential site of N-linked
glycosylation. Effects of glycosylation on the biological and
physicochemical properties of proteins are extremely important.
However, their contribution to structure is not readily predictable
based on the amino acid composition.
Table 2.1 ■ Amino acid composition and structural parameters of granulocyte-colony-stimulating factor.

Parameter | Value
---|---
Molecular weight | 18,673
Total number of amino acids | 174
1 μg | 53.5 picomoles
Molar extinction coefficient | 15,820
1 A (280) | 1.18 mg/ml
Isoelectric point | 5.86
Charge at pH 7 | −3.39

Amino acid | Number | % By weight | % By frequency
---|---|---|---
A Ala | 19 | 7.23 | 10.92
C Cys | 5 | 2.76 | 2.87
D Asp | 4 | 2.47 | 2.30
E Glu | 9 | 6.22 | 5.17
F Phe | 6 | 4.73 | 3.45
G Gly | 14 | 4.28 | 8.05
H His | 5 | 3.67 | 2.87
I Ile | 4 | 2.42 | 2.30
K Lys | 4 | 2.75 | 2.30
L Leu | 33 | 20.00 | 18.97
M Met | 3 | 2.11 | 1.72
N Asn | 0 | 0.00 | 0.00
P Pro | 13 | 6.76 | 7.47
Q Gln | 17 | 11.66 | 9.77
R Arg | 5 | 4.18 | 2.87
S Ser | 14 | 6.53 | 8.05
T Thr | 7 | 3.79 | 4.02
V Val | 7 | 3.71 | 4.02
W Trp | 2 | 1.99 | 1.15
Y Tyr | 3 | 2.62 | 1.72
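Two of the entries in Table 2.1 follow directly from the molecular weight and the molar extinction coefficient. The short sketch below shows the arithmetic. It assumes the extinction coefficient is expressed per molar per centimeter at 280 nm and that the path length is 1 cm; neither unit is stated in the table, and the function names are illustrative only.

```python
MOLECULAR_WEIGHT = 18_673.0   # g/mol, from Table 2.1
MOLAR_EXTINCTION = 15_820.0   # assumed units: 1/(M*cm) at 280 nm, from Table 2.1


def picomoles_per_microgram(molecular_weight):
    """Amount of protein, in pmol, contained in 1 ug."""
    # 1 ug = 1e-6 g; moles = g / (g/mol); 1 mol = 1e12 pmol
    return 1e-6 / molecular_weight * 1e12


def mg_per_ml_at_unit_absorbance(molecular_weight, molar_extinction, path_cm=1.0):
    """Protein concentration that gives an absorbance of 1 (Beer-Lambert law)."""
    # A = extinction * molar concentration * path length, with A = 1
    molar_conc = 1.0 / (molar_extinction * path_cm)   # mol/L
    return molar_conc * molecular_weight              # g/L, numerically equal to mg/ml


if __name__ == "__main__":
    print(f"{picomoles_per_microgram(MOLECULAR_WEIGHT):.1f} pmol per ug")
    print(f"{mg_per_ml_at_unit_absorbance(MOLECULAR_WEIGHT, MOLAR_EXTINCTION):.2f} mg/ml at A280 = 1")
```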
Table 2.2 ■ Frequency of occurrence of 20 amino acids in α-helix, β-sheet, and β-turn.

α-Helix | | β-Sheet | | β-Turn | | β-Turn position 1 | | β-Turn position 2 | | β-Turn position 3 | | β-Turn position 4 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---
Glu | 1.51 | Val | 1.70 | Asn | 1.56 | Asn | 0.161 | Pro | 0.301 | Asn | 0.191 | Trp | 0.167
Met | 1.45 | Ile | 1.60 | Gly | 1.56 | Cys | 0.149 | Ser | 0.139 | Gly | 0.190 | Gly | 0.152
Ala | 1.42 | Tyr | 1.47 | Pro | 1.52 | Asp | 0.147 | Lys | 0.115 | Asp | 0.179 | Cys | 0.128
Leu | 1.21 | Phe | 1.38 | Asp | 1.46 | His | 0.140 | Asp | 0.110 | Ser | 0.125 | Tyr | 0.125
Lys | 1.16 | Trp | 1.37 | Ser | 1.43 | Ser | 0.120 | Thr | 0.108 | Cys | 0.117 | Ser | 0.106
Phe | 1.13 | Leu | 1.30 | Cys | 1.19 | Pro | 0.102 | Arg | 0.106 | Tyr | 0.114 | Gln | 0.098
Gln | 1.11 | Cys | 1.19 | Tyr | 1.14 | Gly | 0.102 | Gln | 0.098 | Arg | 0.099 | Lys | 0.095
Trp | 1.08 | Thr | 1.19 | Lys | 1.01 | Thr | 0.086 | Gly | 0.085 | His | 0.093 | Asn | 0.091
Ile | 1.08 | Gln | 1.10 | Gln | 0.98 | Tyr | 0.082 | Asn | 0.083 | Glu | 0.077 | Arg | 0.085
Val | 1.06 | Met | 1.05 | Thr | 0.96 | Trp | 0.077 | Met | 0.082 | Lys | 0.072 | Asp | 0.081
Asp | 1.01 | Arg | 0.93 | Trp | 0.96 | Gln | 0.074 | Ala | 0.076 | Tyr | 0.065 | Thr | 0.079
His | 1.00 | Asn | 0.89 | Arg | 0.95 | Arg | 0.070 | Tyr | 0.065 | Phe | 0.065 | Leu | 0.070
Arg | 0.98 | His | 0.87 | His | 0.95 | Met | 0.068 | Glu | 0.060 | Trp | 0.064 | Pro | 0.068
Thr | 0.83 | Ala | 0.83 | Glu | 0.74 | Val | 0.062 | Cys | 0.053 | Gln | 0.037 | Phe | 0.065
Ser | 0.77 | Ser | 0.75 | Ala | 0.66 | Leu | 0.061 | Val | 0.048 | Leu | 0.036 | Glu | 0.064
Cys | 0.70 | Gly | 0.75 | Met | 0.60 | Ala | 0.060 | His | 0.047 | Ala | 0.035 | Ala | 0.058
Tyr | 0.69 | Lys | 0.74 | Phe | 0.60 | Phe | 0.059 | Phe | 0.041 | Pro | 0.034 | Ile | 0.056
Asn | 0.67 | Pro | 0.55 | Leu | 0.59 | Glu | 0.056 | Ile | 0.034 | Val | 0.028 | Met | 0.055
Pro | 0.57 | Asp | 0.54 | Val | 0.50 | Lys | 0.055 | Leu | 0.025 | Met | 0.014 | His | 0.054
Gly | 0.57 | Glu | 0.37 | Ile | 0.47 | Ile | 0.043 | Trp | 0.013 | Ile | 0.013 | Val | 0.053
Based on these frequencies, one can predict for
particular polypeptide segments which type of secondary structure
they are likely to form. As shown in Fig. 2.5a, there are a
number of methods developed to predict the secondary structure from
the primary sequence of the proteins. Using G-CSF (Fig.
2.5b) as
an example, regions of α-helix, β-sheets, turns, hydrophilicity,
and antigen sites can be suggested.
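As a rough illustration of frequency-based prediction, the sketch below averages the α-helix and β-sheet values of Table 2.2 over a sliding window and labels each window by the larger average. It is a deliberately simplified scheme, not the algorithm used by DNASTAR or any published method. The window length of 6, the threshold of 1.05, and the test peptides are all assumptions chosen for illustration.

```python
# Frequencies from Table 2.2, keyed by one-letter amino acid code.
HELIX = {"E": 1.51, "M": 1.45, "A": 1.42, "L": 1.21, "K": 1.16, "F": 1.13,
         "Q": 1.11, "W": 1.08, "I": 1.08, "V": 1.06, "D": 1.01, "H": 1.00,
         "R": 0.98, "T": 0.83, "S": 0.77, "C": 0.70, "Y": 0.69, "N": 0.67,
         "P": 0.57, "G": 0.57}
SHEET = {"V": 1.70, "I": 1.60, "Y": 1.47, "F": 1.38, "W": 1.37, "L": 1.30,
         "C": 1.19, "T": 1.19, "Q": 1.10, "M": 1.05, "R": 0.93, "N": 0.89,
         "H": 0.87, "A": 0.83, "S": 0.75, "G": 0.75, "K": 0.74, "P": 0.55,
         "D": 0.54, "E": 0.37}


def window_average(sequence, scale, window=6):
    """Mean frequency for every full window along the sequence."""
    values = [scale[residue] for residue in sequence]
    return [sum(values[i:i + window]) / window
            for i in range(len(values) - window + 1)]


def label_windows(sequence, window=6, threshold=1.05):
    """Label each window H (helix), E (sheet), or - (neither)."""
    labels = []
    helix = window_average(sequence, HELIX, window)
    sheet = window_average(sequence, SHEET, window)
    for h, s in zip(helix, sheet):
        if h >= threshold and h >= s:
            labels.append("H")
        elif s >= threshold:
            labels.append("E")
        else:
            labels.append("-")
    return "".join(labels)


if __name__ == "__main__":
    # Hypothetical peptides, not taken from any real protein.
    for peptide in ("AEELLKKA", "VIVYVT", "GPGPGP"):
        print(peptide, label_windows(peptide))
```

One label is produced per window rather than per residue, so a sequence of length n yields n − 5 labels with the default window.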



Figure 2.5 ■ (a) Predicted secondary structure of
granulocyte-colony-stimulating factor. Obtained using a program
“DNASTAR” (DNASTAR Inc., Madison, WI). (b) Secondary structure
of Filgrastim (recombinant G-CSF). Filgrastim is a 175-amino acid
polypeptide. Its four antiparallel alpha helices (A, B, C, and D)
and short 3-to-10 type helix (3₁₀) form a helical bundle. The two
biologically active sites (α and αL) are remote from modifications
at the N-terminus of the α-helix and the sugar chain attached to
loops C–D. Note: Filgrastim is not glycosylated; the sugar chain is
included to illustrate its location in endogenous G-CSF.
Another property of amino acids that affects
protein folding is the hydrophobicity of their side chains.
Although nonpolar amino acids are basically hydrophobic, it is
important to know how hydrophobic they are. This property has been
determined by measuring the partition coefficient or solubility of
amino acids in water and organic solvents and normalizing such
parameters relative to glycine. Relative to the side chain of
glycine, a single hydrogen, such normalization shows how strongly
the side chains of nonpolar amino acids prefer the organic phase to
the aqueous phase. A representation of such measurements is shown
in Table 2.3. The values indicate that the free
energy increases as the side chains of tryptophan and tyrosine are
transferred from an organic solvent to water and that such transfer
is thermodynamically unfavorable. Although it is unclear how
comparable the hydrophobic property is between an organic solvent
and the interior of protein molecules, the hydrophobic side chains
favor clustering together, resulting in a core structure with
properties similar to an organic solvent. These hydrophobic
characteristics of nonpolar amino acids and hydrophilic
characteristics of polar amino acids generate a partition of amino
acyl residues into a hydrophobic core and hydrophilic surface,
resulting in overall folding.
Table 2.3 ■ Hydrophobicity scale: transfer free energies of amino acid side chains from organic solvent to water.

Amino acid side chain | cal/mol
---|---
Tryptophan | 3,400
Norleucine | 2,600
Phenylalanine | 2,500
Tyrosine | 2,300
Dihydroxyphenylalanine | 1,800
Leucine | 1,800
Valine | 1,500
Methionine | 1,300
Histidine | 500
Alanine | 500
Threonine | 400
Serine | −300
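The transfer free energies in Table 2.3 can be converted into an equilibrium preference for the organic phase through the standard relation ΔG = RT ln K. The sketch below does this at an assumed temperature of 25 °C. It treats the tabulated values as standard free energies in cal/mol and ignores differences in concentration scale, so the ratios indicate orders of magnitude only.

```python
import math

GAS_CONSTANT = 1.987  # cal/(mol*K)

# Transfer free energies from organic solvent to water, cal/mol (Table 2.3).
TRANSFER_FREE_ENERGY = {
    "Tryptophan": 3400, "Norleucine": 2600, "Phenylalanine": 2500,
    "Tyrosine": 2300, "Dihydroxyphenylalanine": 1800, "Leucine": 1800,
    "Valine": 1500, "Methionine": 1300, "Histidine": 500, "Alanine": 500,
    "Threonine": 400, "Serine": -300,
}


def organic_to_water_ratio(delta_g_cal_per_mol, temperature_k=298.15):
    """Equilibrium ratio [organic]/[water] implied by the transfer free energy.

    A positive free energy for organic -> water transfer means the side
    chain prefers the organic phase, giving a ratio greater than 1.
    """
    return math.exp(delta_g_cal_per_mol / (GAS_CONSTANT * temperature_k))


if __name__ == "__main__":
    for name, delta_g in TRANSFER_FREE_ENERGY.items():
        print(f"{name:<24}{delta_g:>6} cal/mol  ratio {organic_to_water_ratio(delta_g):8.1f}")
```

Because the relation is exponential, a difference of about 1,400 cal/mol at this temperature corresponds to roughly a tenfold change in the preference for the organic phase.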
Secondary Structure
α-Helix
Immediately evident in the primary structure of a
protein is that each amino acid is linked by a peptide bond. The
amide, NH, is a hydrogen donor and the carbonyl, C = O, is a
hydrogen acceptor, and they can form a stable hydrogen bond when
they are positioned in an appropriate configuration of the
polypeptide chain. Such structures of the polypeptide chain are
called secondary structure. Two main structures, α-helix and
β-sheet, accommodate such stable hydrogen bonds. In an α-helix, the
main chain forms a right-handed helix, because only the L-form of
amino acids occurs in proteins, and makes one turn per 3.6 residues.
The overall
length of α-helices can vary widely. Figure 2.6 shows an example
of a short α-helix. In this case, the C = O group of residue 1
forms a hydrogen bond to the NH group of residue 5 and C = O group
of residue 2 forms a hydrogen bond with the NH group of residue 6.
Thus, at the start of an α-helix, four amide groups are always
free, and at the end of an α-helix, four carbonyl groups are also
free. As a result, both ends of an α-helix are highly polar.
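The hydrogen-bonding pattern just described, in which the C = O of residue i pairs with the NH of residue i + 4, can be enumerated for a helix of any length. The sketch below uses hypothetical function names and residue numbering starting at 1.

```python
def helix_hydrogen_bonds(n_residues):
    """Pairs (i, i + 4): C=O of residue i bonds to N-H of residue i + 4."""
    return [(i, i + 4) for i in range(1, n_residues - 3)]


def unpaired_groups(n_residues):
    """Residues whose backbone N-H or C=O has no partner within the helix."""
    bonds = helix_hydrogen_bonds(n_residues)
    acceptors = {co for co, _ in bonds}   # residues whose C=O is bonded
    donors = {nh for _, nh in bonds}      # residues whose N-H is bonded
    residues = range(1, n_residues + 1)
    free_nh = [r for r in residues if r not in donors]
    free_co = [r for r in residues if r not in acceptors]
    return free_nh, free_co


if __name__ == "__main__":
    print(helix_hydrogen_bonds(10))
    print(unpaired_groups(10))
```

For a ten-residue helix, the free NH groups are those of residues 1 to 4 and the free C = O groups are those of residues 7 to 10, which is why both ends are polar.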

Figure 2.6 ■ Schematic illustration of the structure of α-helix.
Moreover, all the hydrogen bonds are aligned
along the helical axis. Since both peptide NH and C = O groups have
electric dipole moments pointing in the same direction, they will
add to a substantial dipole moment throughout the entire α-helix,
with the negative partial charge at the C-terminal side and the
positive partial charge at the N-terminal side.
The side chains project outward from the α-helix.
This projection means that all the side chains surround the outer
surface of an α-helix and interact both with each other and with
side chains of other regions which come in contact with these side
chains. These interactions, so-called long-range interactions, can
stabilize the α-helical structure and help it to act as a folding
unit. Often an α-helix serves as a building block for the
three-dimensional structure of globular proteins by bringing
hydrophobic side chains to one side of a helix and hydrophilic side
chains to the opposite side of the same helix. Distribution of side
chains along the α-helical axis can be viewed using the helical
wheel. Since one turn in an α-helix is 3.6 residues long, each
residue can be plotted every 360°/3.6 = 100° around a circle
(viewed from the top of α-helix), as shown in Fig. 2.7. Such a plot shows
the projection of the position of the residues onto a plane
perpendicular to the helical axis. One of the predicted helices in
erythropoietin is shown in Fig. 2.7, using an open circle for hydrophobic
side chains and an open rectangle for hydrophilic side chains. It
becomes immediately obvious that one side of the α-helix is highly
hydrophobic, suggesting that this side forms an internal core,
while the other side is relatively hydrophilic and is hence most
likely exposed to the surface. Since many biologically important
proteins function by interacting with other macromolecules, the
information obtained from the helical wheel is extremely useful.
For example, mutations of amino acids in the solvent-exposed side
may lead to identification of regions responsible for biological
activity, while mutations in the internal core may lead to altered
protein stability.
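The helical wheel construction can be sketched by assigning each successive residue an angle 100° further around the circle. The sketch below also computes a simple measure of sidedness: the length of the vector sum of unit vectors pointing at the hydrophobic residues, divided by the sequence length. The hydrophobic set follows the nonpolar classification used earlier in this chapter, and both test peptides are hypothetical sequences, not the erythropoietin segment of Fig. 2.7.

```python
import math

DEGREES_PER_RESIDUE = 100.0   # 360 degrees / 3.6 residues per turn

# Nonpolar residues as classified in this chapter (Tyr and Trp included).
HYDROPHOBIC = set("AVFPMLIWY")


def wheel_angles(sequence):
    """Angular position of each residue viewed down the helical axis."""
    return [(i * DEGREES_PER_RESIDUE) % 360.0 for i in range(len(sequence))]


def hydrophobic_sidedness(sequence):
    """Vector sum of hydrophobic positions on the wheel, per residue.

    Values near 0 mean hydrophobic residues are spread evenly around the
    helix; larger values mean they cluster on one face.
    """
    x = y = 0.0
    for residue, angle in zip(sequence, wheel_angles(sequence)):
        if residue in HYDROPHOBIC:
            x += math.cos(math.radians(angle))
            y += math.sin(math.radians(angle))
    return math.hypot(x, y) / len(sequence)


if __name__ == "__main__":
    # Hypothetical 18-residue peptides used only for illustration.
    for peptide in ("LKKLLKKLLKKLLKKLLK", "LKLKLKLKLKLKLKLKLK"):
        print(peptide, round(hydrophobic_sidedness(peptide), 3))
```

After 18 residues the angle returns to 0°, so residue 19 would overlay residue 1 on the wheel, which is why an 18-residue window is a convenient length for this plot.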

Figure 2.7 ■ Helical wheel analysis of erythropoietin sequence, from His94 to Ala111 (Elliott S, personal communication, 1990).
β-Sheet
The second major secondary structural element
found in proteins is the β-sheet. In contrast to the α-helix, which
is built up from a continuous region with a peptide hydrogen bond
linking every fourth amino acid, the β-sheet is comprised of
peptide hydrogen bonds between different regions of the polypeptide
that may be far apart in sequence. β-strands can interact with each
other in one of the two ways shown in Fig. 2.8, i.e., either
parallel or antiparallel. In a parallel β-sheet, each strand is
oriented in the same direction with peptide hydrogen bonds formed
between the strands, while in antiparallel β-sheets, the
polypeptide sequences are oriented in the opposite direction. In
both structures, the C = O and NH groups project into opposite
sides of the polypeptide chain, and hence, a β-strand can interact
from either side of that particular chain to form peptide hydrogen
bonds with adjacent strands. Thus, more than two β-strands can
contact each other either in a parallel or in an antiparallel
manner, or even in combination. Such clustering can result in all
the β-strands lying in a plane as a sheet. The β-strands which are
at the edges of the sheet have unpaired alternating C = O and NH
groups.

Figure 2.8 ■ Schematic illustration of the structure of
antiparallel (left side) and parallel (right side) β-sheet. Arrow
indicates the direction of amino acid sequence from the N-terminus
to C-terminus.
Side chains project perpendicularly to this plane
in opposite directions and can interact with other side chains
within the same β-sheet or with other regions of the molecule, or
may be exposed to the solvent.
In almost all known protein structures, β-strands
have a right-handed twist, which lets them adopt widely
different conformations. Depending on how they are twisted, the
side chains in the same strand or in different strands do not
necessarily all project in the same direction.
Loops and Turns
Loops and turns form more or less linear
structures and interact with each other to form a folded
three-dimensional structure. They are comprised of an amino acid
sequence which is usually hydrophilic and exposed to the solvent.
These regions consist of β-turns (reverse turns), short hairpin
loops, and long loops. Many hairpin loops are formed to connect two
antiparallel β-strands.
As shown in Fig. 2.5a, the amino acid
sequences which form β-turns are relatively easy to predict, since
turns must be present periodically to fold a linear sequence into a
globular structure. Amino acids found most frequently in the β-turn
are usually not found in α-helical or β-sheet structures. Thus,
proline and glycine represent the least-observed amino acids in
these typical secondary structures. However, proline has an
extremely high frequency of occurrence at the second position in
the β-turn, while glycine has a high preference at the third and
fourth position of a β-turn.
Although loops are not as predictable as β-turns,
amino acids with high frequency for β-turns also can form a long
loop. Even though difficult to predict, loops are an important
secondary structure, since they form a highly solvent-exposed
region of the protein molecules and allow the protein to fold onto
itself.
Tertiary Structure
Combination of the various secondary structures
in a protein results in its three-dimensional structure. Many
proteins fold into a fairly compact, globular structure.
The folding of a protein molecule into a distinct
three-dimensional structure determines its function. Enzyme
activity requires the exact coordination of catalytically important
residues in the three-dimensional space. Binding of antibody to
antigen and binding of growth factors and cytokines to their
receptors all require a distinct, specific surface for
high-affinity binding. These interactions do not occur if the
tertiary structures of antibodies, growth factors, and cytokines
are altered.
A unique tertiary structure of a protein can
often result in the assembly of the protein into a distinct
quaternary structure consisting of a fixed stoichiometry of protein
chains within the complex. Assembly can occur between the same
proteins or between different polypeptide chains. Each molecule in
the complex is called a subunit. Actin and tubulin self-associate
into F-actin and microtubules, respectively, while hemoglobin is a tetramer
consisting of two α- and two β-subunits. Among the cytokines and
growth factors, interferon-γ is a homodimer, while platelet-derived
growth factor is a homodimer of either A or B chains or a
heterodimer of the A and B chain. The formation of a quaternary
structure occurs via non-covalent interactions or through disulfide
bonds between the subunits.
Forces
Interactions occurring between chemical groups in
proteins are responsible for formation of their specific secondary,
tertiary, and quaternary structures. Either repulsive or attractive
interactions can occur between different groups. Repulsive
interactions consist of steric hindrance and electrostatic effects.
Like charges repel each other and bulky side chains, although they
do not repel each other, cannot occupy the same space. Folding is
also against the natural tendency to move toward randomness, i.e.,
increasing entropy. Folding leads to a fixed position of each atom
and hence a decrease in entropy. For folding to occur, this
decrease in entropy, as well as the repulsive interactions, must be
overcome by attractive interactions, i.e., hydrophobic
interactions, hydrogen bonds, electrostatic attraction, and van der
Waals interactions. Hydration of proteins, discussed in the next
section, also plays an important role in protein folding.
These interactions are all relatively weak and
can be easily broken and formed. Hence, each folded protein
structure arises from a fine balance between these repulsive and
attractive interactions. The stability of the folded structure is a
fundamental concern in developing protein therapeutics.
Hydrophobic Interactions
The hydrophobic interaction reflects a summation
of the van der Waals attractive forces among nonpolar groups in the
protein interior and of the change in surrounding water structure
that would be needed to accommodate these groups if they became
exposed. The
transfer of nonpolar groups from the interior to the surface
requires a large decrease in entropy so that hydrophobic
interactions are essentially entropically driven. The resulting
large positive free energy change prevents the transfer of nonpolar
groups from the largely sheltered interior to the more
solvent-exposed exterior of the protein molecule. Thus, nonpolar
groups preferentially reside in the protein interior, while the
more polar groups are exposed to the surface and surrounding
environment. The partitioning of different amino acyl residues
between the inside and outside of a protein correlates well with
the hydration energy of their side chains, that is, their relative
affinity for water.
Hydrogen Bonds
The hydrogen bond is ionic in character since it
depends strongly on the sharing of a proton between two
electronegative atoms (generally oxygen and nitrogen atoms).
Hydrogen bonds may form either between a protein atom and a water
molecule or exclusively as protein intramolecular hydrogen bonds.
Intramolecular interactions can have significantly more favorable
free energies (because of entropic considerations) than
intermolecular hydrogen bonds, so the contribution of all hydrogen
bonds in the protein molecule to the stability of protein
structures can be substantial. In addition, when the hydrogen bonds
occur in the interior of protein molecules, the bonds become
stronger due to the hydrophobic environment.
Electrostatic Interactions
Electrostatic interactions occur between any two
charged groups. According to Coulomb’s law, if the charges are of
the same sign, the interaction is repulsive with an increase in
energy, but if they are opposite in sign, it is attractive, with a
lowering of energy. Electrostatic interactions are strongly
dependent upon distance, according to Coulomb’s law, and inversely
related to the dielectric constant of the medium. Electrostatic
interactions are much stronger in the interior of the protein
molecule because of a lower dielectric constant. The numerous
charged groups present on protein molecules can provide overall
stability by the electrostatic attraction of opposite charges, for
example, between negatively charged carboxyl groups and positively
charged amino groups. However, the net effects of all possible
pairs of charged groups must be considered. Thus, the free energy
derived from electrostatic interactions is actually a property of
the whole structure, not just of any single amino acid residue or
cluster.
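The dependence on distance and dielectric constant can be made concrete with Coulomb's law written in units convenient for molecular work. In the sketch below, the conversion factor of about 332 kcal·Å/(mol·e²) is a standard constant. The dielectric constants (about 80 for water, 4 for the protein interior) and the 4 Å separation are assumed illustrative values, not figures from this chapter.

```python
COULOMB_CONSTANT = 332.06  # kcal*angstrom/(mol*e^2)


def coulomb_energy(q1, q2, distance_angstrom, dielectric):
    """Interaction energy in kcal/mol between two point charges.

    Charges are in units of the elementary charge. Negative energy means
    attraction (opposite signs); positive energy means repulsion.
    """
    return COULOMB_CONSTANT * q1 * q2 / (dielectric * distance_angstrom)


if __name__ == "__main__":
    separation = 4.0  # angstroms, assumed
    for label, dielectric in (("water", 80.0), ("protein interior", 4.0)):
        energy = coulomb_energy(+1, -1, separation, dielectric)
        print(f"{label:<17} dielectric {dielectric:>4}: {energy:7.2f} kcal/mol")
```

Under these assumptions the same ion pair is twenty times more strongly attracted in the low-dielectric interior than in water, in line with the statement above.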
Van der Waals Interactions
Weak van der Waals interactions exist between
atoms (except the bare proton), whether they are polar or nonpolar.
They arise from net attractive interactions between permanent
dipoles and/or induced (temporary and fluctuating) dipoles.
However, when two atoms approach each other too closely, the
repulsion between their electron clouds becomes strong and
counterbalances the attractive forces.
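A common way to model this balance between attraction and close-range repulsion is the Lennard-Jones 12-6 potential. The chapter does not name a specific functional form, so its use here is an assumption, and the well depth and contact distance in the example are illustrative values only.

```python
def lennard_jones(distance, well_depth, sigma):
    """Lennard-Jones 12-6 energy.

    sigma is the distance at which the energy crosses zero; the minimum,
    equal to -well_depth, lies at 2**(1/6) * sigma. Shorter distances are
    dominated by the repulsive 12th-power term.
    """
    ratio6 = (sigma / distance) ** 6
    return 4.0 * well_depth * (ratio6 ** 2 - ratio6)


if __name__ == "__main__":
    well_depth = 0.1   # kcal/mol, illustrative
    sigma = 3.4        # angstroms, illustrative
    for distance in (3.0, 3.4, 3.8, 4.5, 6.0):
        print(f"{distance:4.1f} A: {lennard_jones(distance, well_depth, sigma):+7.3f} kcal/mol")
```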
Hydration
Water molecules are bound to proteins internally
and externally. Some water molecules occasionally occupy small
internal cavities in the protein structure and are hydrogen bonded
to peptide bonds and side chains of the protein and often to a
prosthetic group, or cofactor, within the protein. The protein
surface is large and consists of a mosaic of polar and nonpolar
amino acids, and it binds a large number of water molecules, i.e.,
it is hydrated, from the surrounding environment. As described in
the previous section, water molecules trapped in the interior of
protein molecules are bound more tightly to hydrogen-bonding donors
and acceptors because of a lower dielectric constant.
Solvent around the protein surface clearly has a
general role in hydrating peptide and side chains but might be
expected to be rather mobile and nonspecific in its interactions.
Well-ordered water molecules, however, can make significant contributions to
protein stability. One water molecule can hydrogen bond to two
groups distant in the primary structure on a protein molecule,
acting as a bridge between these groups. Such a water molecule may
be highly restricted in motion and can contribute to the stability,
at least locally, of the protein, since such tight binding may
exist only when these groups assume the proper configuration to
accommodate a water molecule that is present only in the native
state of the protein. Such hydration can also decrease the
flexibility of the groups involved.
There is also evidence for solvation over
hydrophobic groups on the protein surface. So-called hydrophobic
hydration occurs because of the unfavorable nature of the
interaction between water molecules and hydrophobic surfaces,
resulting in the clustering of water molecules. Since this
clustering is energetically unfavorable, such hydrophobic hydration
does not contribute to the protein stability. However, this
hydrophobic hydration facilitates hydrophobic interaction. This
unfavorable hydration is diminished as the various hydrophobic
groups come in contact either intramolecularly or intermolecularly,
leading to the folding of intrachain structures or to
protein-protein interactions.
Both the loosely and strongly bound water
molecules can have an important impact, not only on protein
stability but also on protein function. For example, certain
enzymes function in nonaqueous solvent provided that a small amount
of water, just enough to cover the protein surface, is present.
Bound water can modulate the dynamics of surface groups, and such
dynamics may be critical for enzyme function. Dried enzymes are, in
general, inactive and become active after they absorb 0.2 g water
per g protein. This amount of water is only sufficient to cover
surface polar groups, yet may give sufficient flexibility for
function.
Evidence that water bound to protein molecules
has a different property from bulk water can be demonstrated by the
presence of non-freezable water. Thus, when a protein solution is
cooled below −40 °C, a fraction of water, ~0.3 g water/g protein,
does not freeze and can be detected by high-resolution NMR. Several
other techniques also detect a similar amount of bound water. This
unfreezable water reflects the unique property of bound water that
prevents it from adopting an ice structure.
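To put the hydration figures in molecular terms, grams of water per gram of protein can be converted into water molecules per protein molecule. The sketch below uses the G-CSF molecular weight from Table 2.1 purely as an example protein; the chapter does not report hydration measurements for G-CSF specifically.

```python
WATER_MOLAR_MASS = 18.015  # g/mol


def waters_per_protein(grams_water_per_gram_protein, protein_molecular_weight):
    """Number of water molecules per protein molecule at a given hydration level."""
    return grams_water_per_gram_protein * protein_molecular_weight / WATER_MOLAR_MASS


if __name__ == "__main__":
    molecular_weight = 18_673.0  # g/mol, G-CSF from Table 2.1, used as an example
    for hydration in (0.2, 0.3):
        count = waters_per_protein(hydration, molecular_weight)
        print(f"{hydration} g water/g protein -> about {count:.0f} waters per molecule")
```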
Proteins are immersed under physiological
conditions or in test tubes in aqueous solutions containing not
only water but also other solution components, e.g., salts, metals,
amino acids, sugars, and many other minor components. These
components also interact with the protein surface and affect
protein folding and stability. For example, sugars and amino acids
are known to enhance folding and stability of the proteins, as
described below.
Protein Folding
Proteins become functional only when they assume
a distinct tertiary structure. Many physiologically and
therapeutically important proteins present their surface for
recognition by interacting with molecules such as substrates,
receptors, signaling proteins, and cell-surface adhesion
macromolecules. When recombinant proteins are produced in
Escherichia coli, they
often form inclusion bodies into which they are deposited as
insoluble proteins. Formation of such insoluble states does not
naturally occur in cells where they are normally synthesized and
transported. Therefore, an in vitro process is required to refold
insoluble recombinant proteins into their native, physiologically
active state. This is usually accomplished by solubilizing the
insoluble proteins with detergents or denaturants, followed by the
purification and removal of these reagents concurrent with
refolding the proteins (see Chap.
3).
Unfolded states of proteins are usually highly
stable and soluble in the presence of denaturing agents. Once the
proteins are folded correctly, they are also relatively stable.
During the transition from the unfolded form to the native state,
the protein must go through a multitude of other transition states
in which it is not fully folded, and denaturants or solubilizing
agents are at low concentrations or even absent.
The refolding of proteins can be achieved in
various ways. The dilution of proteins at high denaturant
concentration into aqueous buffer will decrease both denaturant and
protein concentration simultaneously. The addition of an aqueous
buffer to a protein-denaturant solution also causes a decrease in
concentrations of both denaturant and protein. The difference in
these procedures is that, in the first case, both denaturant and
protein concentrations are the lowest at the beginning of dilution
and gradually increase as the process continues. In the second
case, both denaturant and protein concentrations are highest at the
beginning of dilution and gradually decrease as the dilution
proceeds. Dialysis or the diafiltration of protein in the
denaturant against an aqueous buffer resembles the second case,
since the denaturant concentration decreases as the procedure
continues. In this case, however, the protein concentration remains
unchanged. Refolding can also be achieved by first binding the
protein in denaturants to a solid phase, i.e., to a column matrix,
and then equilibrating it with an aqueous buffer. In this case,
protein concentrations are not well defined. Each procedure has
advantages and disadvantages and may be applicable to one protein
but not to another.
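The difference between the two dilution procedures can be followed numerically. The sketch below tracks the concentration in the refolding vessel as liquid is added in each case. The same mass-balance expression applies to denaturant and to protein, and the 6 M stock concentration and volumes are hypothetical values chosen for illustration.

```python
def stock_into_buffer(stock_conc, buffer_volume, added_stock_volume):
    """First case: the protein-denaturant stock is added to a fixed volume of buffer."""
    return stock_conc * added_stock_volume / (buffer_volume + added_stock_volume)


def buffer_into_stock(stock_conc, stock_volume, added_buffer_volume):
    """Second case: buffer is added to a fixed volume of protein-denaturant stock."""
    return stock_conc * stock_volume / (stock_volume + added_buffer_volume)


if __name__ == "__main__":
    stock_conc = 6.0      # M denaturant, hypothetical
    stock_volume = 1.0    # ml
    buffer_volume = 9.0   # ml
    steps = 5
    for step in range(1, steps + 1):
        fraction = step / steps
        rising = stock_into_buffer(stock_conc, buffer_volume, stock_volume * fraction)
        falling = buffer_into_stock(stock_conc, stock_volume, buffer_volume * fraction)
        print(f"{fraction:4.0%} added: case 1 {rising:5.3f} M, case 2 {falling:5.3f} M")
```

Both procedures end at the same final concentration once all the liquid has been combined, but the protein passes through very different conditions on the way, which is one reason a given procedure may suit one protein and not another.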
If proteins in the native state have disulfide
bonds, cysteines must be correctly oxidized. Such oxidation may be
done in various ways, e.g., air oxidation, glutathione-catalyzed
disulfide exchange, or mixed-disulfide formation followed by
reduction and oxidation or by disulfide reshuffling.
Protein folding has been a topic of intensive
research since Anfinsen’s demonstration that ribonuclease can be
refolded from the fully reduced and denatured state in in vitro
experiments. This folding can be achieved only if the amino acid
sequence itself contains all information necessary for folding into
the native structure. This is the case, at least partially, for
many proteins. However, a lot of other proteins do not refold in a
simple one-step process. Rather, they refold via various
intermediates which are relatively compact and possess varying
degrees of secondary structures, but which lack a rigid tertiary
structure. Intrachain interactions of these preformed secondary
structures eventually lead to the native state. However, the
absence of a rigid structure in these preformed secondary
structures can also expose a cluster of hydrophobic groups to those
of other polypeptide chains, rather than to their own polypeptide
segments, resulting in intermolecular aggregation. High efficiency
in the recovery of native protein depends to a large extent on how
this aggregation of intermediate forms is minimized. The use of
chaperones or polyethylene glycol has been found quite effective
for this purpose. The former are proteins that aid in the proper
folding of other proteins by stabilizing intermediates in the
folding process, and the latter serves to solvate the protein during
folding and to diminish interchain aggregation.
Protein folding is often facilitated by
cosolvents, such as polyethylene glycol. As described above,
proteins are functional and highly hydrated in aqueous solutions.
True physiological solutions, however, contain not only water but
also various ions and low- and high-molecular-weight solutes, often
at very high concentrations. These ions and other solutes play a
critical role in maintaining the functional structure of the
proteins. When isolated from their natural environment, the protein
molecules may lose these stabilizing factors and hence must be
stabilized by certain compounds, often at high concentrations.
These solutes are also used in vitro to assist in protein folding
and to help stabilize proteins during large-scale purification and
production as well as for long-term storage. Such solutes are often
called cosolvents when used at high concentrations, since at such
high concentrations they also serve as a solvent along with water
molecules. These solutes encompass sugars, amino acids, inorganic
and organic salts, and polyols. They may not strongly bind to
proteins, but instead typically interact weakly with the protein
surface to provide significant stabilizing energy without
interfering with their functional structure.
When recombinant proteins are expressed in
eukaryotic cells and secreted into media, the proteins are
generally folded into the native conformation. If the proteins have
sites for N-linked or O-linked glycosylation, they undergo varying
degrees of glycosylation depending on the host cells used and level
of expression. For many glycoproteins, glycosylation is not
essential for folding, since they can be refolded into the native
conformation without carbohydrates, nor is glycosylation often
necessary for receptor binding and hence biological activity.
However, glycosylation can alter important biological and
physicochemical properties of proteins, such as pharmacokinetics,
solubility, and stability.
Techniques Specifically Suitable for Characterizing Protein Folding
Conventional spectroscopic techniques used to
obtain information on the folded structure of proteins are circular
dichroism (CD), fluorescence, and Fourier transform infrared
spectroscopies (FTIR). CD and FTIR are widely used to estimate the
secondary structure of proteins. The α-helical content of a protein
can be readily estimated by CD in the far-UV region (180–260 nm)
and by FTIR. FTIR signals from loop structures, however,
occasionally overlap with those arising from an α-helix. The
β-sheet gives weak CD signals, which are variable in peak positions
and intensities due to twists of interacting β-strands, making
far-UV CD unreliable for evaluation of these structures. On the
other hand, FTIR can reliably estimate the β-structure content as
well as distinguish between parallel and antiparallel forms.
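As an illustration of how a far-UV CD reading can be turned into a rough helix estimate, the sketch below first converts the observed ellipticity into mean residue ellipticity and then interpolates linearly between a fully helical and a fully non-helical reference value. This two-state interpolation is one common simplification, not the only analysis method; the reference values are deliberately left as arguments because they are assumptions the analyst must supply, and the numbers in the usage lines are purely illustrative.

```python
def mean_residue_ellipticity(theta_mdeg, mean_residue_weight, path_cm, conc_mg_ml):
    """Convert observed ellipticity (mdeg) to mean residue ellipticity (deg cm^2 dmol^-1)."""
    return theta_mdeg * mean_residue_weight / (10.0 * path_cm * conc_mg_ml)


def helix_fraction(mre_222, helix_ref, coil_ref):
    """Linear interpolation between assumed reference values at 222 nm, clamped to 0..1."""
    fraction = (mre_222 - coil_ref) / (helix_ref - coil_ref)
    return min(1.0, max(0.0, fraction))


# Illustrative numbers only: -20 mdeg, mean residue weight 110, 0.1 cm cell, 0.1 mg/mL.
mre = mean_residue_ellipticity(-20.0, 110.0, 0.1, 0.1)
estimate = helix_fraction(mre, helix_ref=-33000.0, coil_ref=-3000.0)
```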
CD in the near-UV region (250–340 nm) reflects
the environment of aromatic amino acids, i.e., tryptophan,
tyrosine, and phenylalanine, as well as that of disulfide
structures. Fluorescence spectroscopy yields information on the
environment of tyrosine and tryptophan residues. CD and
fluorescence signals in many cases are drastically altered upon
refolding and hence can be used to follow the formation of the
tertiary structure of a protein.
None of these techniques can give the folded
structure at the atomic level, i.e., they give no information on
the exact location of each amino acyl residue in the
three-dimensional structure of the protein. This information can
only be determined by X-ray crystallography or NMR. However, CD,
FTIR, and fluorescence spectroscopic methods are fast and require
lower protein concentrations than either NMR or X-ray
crystallography and are amenable for the examination of the protein
under widely different conditions. When a naturally occurring form
of the protein is available, these techniques, in particular
near-UV CD and fluorescence spectroscopies, can quickly address
whether the refolded protein assumes the native folded
structure.
Temperature dependence of these spectroscopic
properties also provides information about protein folding. Since
the folded structures of proteins are built upon cooperative
interactions of many side chains and peptide bonds in a protein
molecule, elimination of one interaction by heat can cause
cooperative elimination of other interactions, leading to the
unfolding of protein molecules. Thus, many proteins undergo a
cooperative thermal transition over a narrow temperature range.
Conversely, if the proteins are not fully folded, they may undergo
noncooperative thermal transitions as observed by a gradual signal
change over a wider range of temperature.
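The link between cooperativity and the width of the thermal transition can be sketched with a two-state model. The code below assumes a simple van't Hoff description in which the unfolding free energy is ΔH(1 − T/Tm) and heat-capacity changes are ignored; the enthalpy values used are hypothetical and serve only to contrast a sharp with a broad transition.

```python
import math

R_GAS = 8.314462618  # gas constant, J/(mol K)


def fraction_unfolded(temp_k, tm_k, delta_h_j_mol):
    """Two-state fraction unfolded, with delta_G = delta_H * (1 - T/Tm)."""
    delta_g = delta_h_j_mol * (1.0 - temp_k / tm_k)
    k_unfold = math.exp(-delta_g / (R_GAS * temp_k))
    return k_unfold / (1.0 + k_unfold)


def transition_width(tm_k, delta_h_j_mol, lo=0.1, hi=0.9, step=0.01):
    """Temperature span (K) over which the unfolded fraction rises from lo to hi."""
    t = tm_k - 60.0
    t_lo = t_hi = None
    while t <= tm_k + 60.0:
        f = fraction_unfolded(t, tm_k, delta_h_j_mol)
        if t_lo is None and f >= lo:
            t_lo = t
        if f >= hi:
            t_hi = t
            break
        t += step
    return t_hi - t_lo


# Hypothetical enthalpies: a highly cooperative and a weakly cooperative transition.
sharp = transition_width(333.15, 400_000.0)
broad = transition_width(333.15, 100_000.0)
```

A larger unfolding enthalpy, which reflects more interactions breaking together, compresses the transition into a narrower temperature range.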
Such a cooperative structure transition can also
be examined by differential scanning calorimetry. When the
structure unfolds, it requires heat. Such heat absorption can be
determined using this highly sensitive calorimetry technique.
Hydrodynamic properties of proteins change
greatly upon folding, going from elongated and expanded structures
to compact globular ones. Sedimentation velocity and size-exclusion
chromatography (see section “Analytical Techniques”) are two frequently
used techniques for the evaluation of hydrodynamic properties,
although the latter is much more accessible. The sedimentation
coefficient (how fast a molecule migrates in a centrifugal field)
is a function of both the molecular weight and hydrodynamic size of
the proteins, while elution position in size-exclusion
chromatography (how fast it migrates through pores) depends only on
the hydrodynamic size (see Chap. 3). In both methods, comparison of the
sedimentation coefficient or elution position with that of a
globular protein with an identical molecular weight (or upon
appropriate molecular-weight normalization) gives information on
how compactly the protein is folded.
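The dependence of the sedimentation coefficient on both mass and hydrodynamic size follows from the Svedberg relation, s = M(1 − v̄ρ)/(N_A f), with the frictional coefficient f = 6πηR for a particle of Stokes radius R. The sketch below is a minimal illustration; the default partial specific volume, solvent density, and viscosity are assumed typical values for a protein in water at 20 °C, and the mass and radii in the example are hypothetical.

```python
import math

AVOGADRO = 6.02214076e23  # 1/mol


def sedimentation_coefficient(mass_kg_mol, stokes_radius_m, vbar_m3_kg=0.73e-3,
                              density_kg_m3=998.0, viscosity_pa_s=1.002e-3):
    """Svedberg relation; returns s in seconds (1 Svedberg = 1e-13 s)."""
    friction = 6.0 * math.pi * viscosity_pa_s * stokes_radius_m
    buoyancy = 1.0 - vbar_m3_kg * density_kg_m3
    return mass_kg_mol * buoyancy / (AVOGADRO * friction)


# Same mass (66.4 kg/mol), compact versus expanded conformation (hypothetical radii).
s_compact = sedimentation_coefficient(66.4, 3.5e-9) / 1e-13
s_expanded = sedimentation_coefficient(66.4, 7.0e-9) / 1e-13
```

At constant mass, doubling the Stokes radius halves the sedimentation coefficient, which is why an unfolded or expanded chain sediments more slowly than its compact counterpart.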
For oligomeric proteins, the determination of
molecular weight of the associated states and acquisition of the
quaternary structure can be used to assess the folded structure.
For strong interactions, specific protein association requires that
intersubunit contact surfaces perfectly match each other. Such an
associated structure, if obtained by covalent bonding, may be
determined simply by sodium dodecyl sulfate-polyacrylamide gel
electrophoresis. If protein association involves non-covalent
interactions, sedimentation equilibrium or light scattering
experiments can assess this phenomenon. Although these techniques
have been used for many decades with some difficulty, emerging
technologies in analytical ultracentrifugation and laser light
scattering, and appropriate software for analyzing the results,
have greatly facilitated their general use, as described in detail
below.
Two fundamentally different light scattering
techniques can be used in characterizing recombinant proteins.
“Static” light scattering measures the intensity of the scattered
light. “Dynamic” light scattering measures the fluctuations in the
scattered light intensity as molecules diffuse in and out of a very
small scattering region (Brownian motion).
Static light scattering is often used online in
conjunction with size-exclusion chromatography (SEC). The
scattering signal is proportional to the product of molecular mass
times weight concentration. Dividing this signal by one
proportional to the concentration, such as obtained from a UV
absorbance or refractive index detector, then gives a direct and
absolute measure of the mass of each peak eluting from the column,
independent of molecular conformation and elution position. This
SEC-static scattering combination allows rapid identification of
whether the native state of a protein is a monomer or an oligomer
and the stoichiometry of multi-protein complexes. It is also very
useful in identifying the mass of aggregates which may be present
and thus is useful for evaluating protein stability.
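The ratio described above can be written as a short calculation. In this sketch the instrument constant is obtained from a standard of known mass and then applied to other peaks; it assumes, for simplicity, that all species have the same concentration-detector response (for example, the same refractive index increment), and all signal values are invented for illustration.

```python
def calibration_constant(ls_signal, conc_signal, known_mass_kda):
    """Instrument constant from a standard peak of known molecular mass."""
    return known_mass_kda * conc_signal / ls_signal


def peak_mass_kda(ls_signal, conc_signal, k_cal):
    """Mass of a peak from the ratio of scattering signal to concentration signal."""
    return k_cal * ls_signal / conc_signal


# Hypothetical signals: a 66.4 kDa monomer standard, then an unknown peak that
# scatters twice as much light per unit of concentration signal.
k_cal = calibration_constant(ls_signal=0.50, conc_signal=1.00, known_mass_kda=66.4)
unknown = peak_mass_kda(ls_signal=0.30, conc_signal=0.30, k_cal=k_cal)
```

Because only the signal ratio enters, the result does not depend on where the peak elutes, which is the point made in the text.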
Dynamic light scattering (DLS) measures the
diffusion rate of the molecules, which can be translated into the
Stokes radius, a measure of hydrodynamic size. Although the Stokes
radius is strongly correlated with molecular mass, it is also
strongly influenced by molecular shape (conformation), and thus,
DLS is far less accurate than static scattering for measuring
molecular mass. The great strength of DLS is its ability to cover a
very wide size range in one measurement and to detect very small
amounts of large aggregates (<0.01 % by weight). Other important
advantages over static scattering with SEC are a wide choice of
buffer conditions and no potential loss of species through sticking
to a column.
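The conversion from a measured diffusion coefficient to a Stokes radius uses the Stokes-Einstein relation, R = k_B T/(6πηD). The sketch below assumes water at 20 °C as the solvent; the diffusion coefficient in the example is a hypothetical value of the order expected for a mid-sized globular protein.

```python
import math

BOLTZMANN = 1.380649e-23  # J/K


def stokes_radius_m(diffusion_m2_s, temp_k=293.15, viscosity_pa_s=1.002e-3):
    """Stokes-Einstein relation: hydrodynamic radius from the diffusion coefficient."""
    return BOLTZMANN * temp_k / (6.0 * math.pi * viscosity_pa_s * diffusion_m2_s)


# Hypothetical diffusion coefficient of 6e-11 m^2/s.
radius_nm = stokes_radius_m(6.0e-11) * 1e9
```

A slower-diffusing species gives a proportionally larger radius, regardless of whether the slowdown comes from higher mass or from a more extended shape.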
An analytical ultracentrifuge incorporates an
optical system and special rotors and cells in a high-speed
centrifuge to permit measurement of the concentration of a sample
versus position within a spinning centrifuge cell. There are two
primary strategies: analyzing either the sedimentation velocity or
the sedimentation equilibrium. When analyzing the sedimentation
velocity, the rotor is spun at very high speed, so the protein
sample will completely sediment and form a pellet. The rate at
which the protein pellets is measured by the optical system to
derive the sedimentation coefficient, which depends on both mass
and molecular conformation. When more than one species is present
(e.g., a monomer plus a covalent dimer degradation product), a
separation is achieved based on the relative sedimentation
coefficient of each species.
Proteins form not only small oligomers that can
be measured by the above techniques but also much larger
aggregates, called subvisible and visible particles. Although such
particles are normally present only in minute quantities, they can
cause serious immunogenic responses because of their large size: as
their size approaches that of a virus, they become highly
immunogenic (cf. Chap. 6). Determination of their size and amount
is therefore critical for developing pharmaceutical protein
products. Such determination requires imaging of the particles,
since hydrodynamic techniques such as dynamic light scattering and
sedimentation velocity have neither the sensitivity nor the
resolution needed for such large aggregates with a heterogeneous
size distribution.
Because the sedimentation coefficient is
sensitive to molecular conformation and can be measured with high
precision (~0.5 %), sedimentation velocity can detect even fairly
subtle differences in conformation. This ability can be used, for
example, to confirm that a recombinant protein has the same
conformation as the natural wild-type protein or to detect small
changes in structure with changes in the pH or salt concentration
that may be too subtle to detect by other techniques, such as CD or
differential scanning calorimetry.
In sedimentation equilibrium, a much lower rotor
speed and milder centrifugal force are used than for sedimentation
velocity. The protein still accumulates toward the outside of the
rotor, but no pellet is formed. This concentration gradient across
the cell is continuously opposed by diffusion, which tries to
restore a uniform concentration. After spinning for a long time
(usually 12–36 h), an equilibrium is reached where sedimentation
and diffusion are balanced and the distribution of protein no
longer changes with time. At sedimentation equilibrium, the
concentration distribution depends only on the molecular mass and
is independent of molecular shape. Thus, self-association for the
formation of dimers or higher oligomers (whether reversible or
irreversible) is readily detected, as are binding interactions
between different proteins. For reversible association, it is
possible to determine the strength of the binding interaction by
measuring samples over a wide range of protein
concentrations.
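For a single ideal species, the equilibrium distribution takes the form c(r) = c(r0)·exp[M(1 − v̄ρ)ω²(r² − r0²)/(2RT)], so a plot of ln c against r² is a straight line whose slope gives the molecular mass. The sketch below generates such a profile and recovers the mass by least squares; the rotor speed, radii, and default solvent parameters are assumed values for illustration, and real data analysis would additionally account for non-ideality and mixtures of species.

```python
import math

R_GAS = 8.314462618  # J/(mol K)


def _sigma(mass_kg_mol, rpm, temp_k, vbar_m3_kg, density_kg_m3):
    omega = rpm * 2.0 * math.pi / 60.0  # angular velocity, rad/s
    return mass_kg_mol * (1.0 - vbar_m3_kg * density_kg_m3) * omega ** 2 / (2.0 * R_GAS * temp_k)


def equilibrium_profile(radii_m, c_ref, mass_kg_mol, rpm, temp_k=293.15,
                        vbar_m3_kg=0.73e-3, density_kg_m3=998.0):
    """Concentration at each radius for one ideal species at sedimentation equilibrium."""
    sigma = _sigma(mass_kg_mol, rpm, temp_k, vbar_m3_kg, density_kg_m3)
    r0 = radii_m[0]
    return [c_ref * math.exp(sigma * (r * r - r0 * r0)) for r in radii_m]


def mass_from_profile(radii_m, concentrations, rpm, temp_k=293.15,
                      vbar_m3_kg=0.73e-3, density_kg_m3=998.0):
    """Least-squares slope of ln(c) versus r^2, converted to molar mass in kg/mol."""
    xs = [r * r for r in radii_m]
    ys = [math.log(c) for c in concentrations]
    n = len(xs)
    x_mean, y_mean = sum(xs) / n, sum(ys) / n
    slope = (sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
             / sum((x - x_mean) ** 2 for x in xs))
    return slope / _sigma(1.0, rpm, temp_k, vbar_m3_kg, density_kg_m3)


# Hypothetical cell spanning 6.5 to 7.1 cm from the rotor axis at 10,000 rpm.
radii = [0.065 + 0.0005 * i for i in range(13)]
monomer = equilibrium_profile(radii, 0.2, 66.4, 10_000)
dimer = equilibrium_profile(radii, 0.2, 132.8, 10_000)
recovered = mass_from_profile(radii, monomer, 10_000)
```

No shape-dependent term appears anywhere in the expression, which is why the method reports mass independently of conformation; a dimer simply produces a steeper gradient than a monomer.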
In biotechnology applications, sedimentation
equilibrium is often used as the “gold standard” for confirming
that a recombinant protein has the expected molecular mass and
biologically active state of oligomerization in solution. It can
also be used to determine the average amount of glycosylation or
conjugation of moieties such as polyethylene glycol. The
measurement of binding affinities for receptor-cytokine,
antigen-antibody, or other interactions can also sometimes serve as
a functional characterization of recombinant proteins (although
some of these interactions are too strong to be measured by this
method).
Site-specific chemical modification and
proteolytic digestion are also powerful techniques for studying the
folding of proteins. The extent of chemical modification or
proteolytic digestion depends on whether the specific sites are
exposed to the solvent or are buried in the interior of the protein
molecules and are thus inaccessible to these modifications. For
example, trypsin cleaves peptide bonds on the C-terminal side of
basic residues. Although most proteins contain several basic
residues, brief exposure of the native protein to trypsin usually
generates only a few peptides, as cleavage occurs only at the
accessible basic residues, whereas the same treatment can generate
many more peptides when done on the denatured (unfolded) protein,
since all the basic residues are now accessible (see also peptide
mapping in section “Mass Spectrometry”).
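The effect of accessibility on the number of tryptic peptides can be illustrated with a toy digestion. The sketch below cleaves on the C-terminal side of lysine and arginine, as described above; the sequence and the set of solvent-exposed positions are invented for the example, and refinements of trypsin specificity beyond what the text states are not modeled.

```python
def tryptic_peptides(sequence, accessible=None):
    """Cleave after K or R. If `accessible` is given, only those 0-based positions are cut.

    accessible=None models the denatured protein, where every basic residue is exposed.
    """
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        exposed = accessible is None or i in accessible
        if residue in "KR" and exposed and i < len(sequence) - 1:
            peptides.append(sequence[start:i + 1])
            start = i + 1
    peptides.append(sequence[start:])
    return peptides


toy_sequence = "MAKTGRLLKDEFRGG"                               # hypothetical sequence
denatured = tryptic_peptides(toy_sequence)                     # all sites accessible
native = tryptic_peptides(toy_sequence, accessible={5})        # only one exposed site
```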
Protein Stability
Although freshly isolated proteins may be folded
into a distinct three-dimensional structure, this folded structure
is not necessarily retained indefinitely in aqueous solution. The
reason is that proteins are neither chemically nor physically
stable. The protein surface is chemically highly heterogeneous and
contains reactive groups. Long-term exposure of these groups to
environmental stresses causes various chemical alterations. Many
proteins, including growth factors and cytokines, have cysteine
residues. If some of them are in a free or sulfhydryl form, they
may undergo oxidation and disulfide exchange. Oxidation can also
occur on methionyl residues. Hydrolysis can occur on peptide bonds
and on amides of asparagine and glutamine residues. Other chemical
modifications can occur on peptide bonds, tryptophan, tyrosine, and
amino and carboxyl groups. Table 2.4 lists both a number of reactions that
can occur during purification and storage of proteins and methods
that can be used to detect such changes.
Table 2.4 ■ Common reactions affecting stability of proteins.

Reaction (residues affected) | Physical property affected | Method of analysis
---|---|---
Oxidation | Hydrophobicity, size | RP-HPLC, SDS-PAGE, size-exclusion chromatography, and mass spectrometry
Cys | Hydrophobicity |
Disulfide | |
Intrachain | |
Interchain | |
Met, Trp, Tyr | |
Peptide bond hydrolysis | Size | Size-exclusion chromatography, SDS-PAGE
N to O migration | Hydrophobicity | RP-HPLC; inactive in Edman reaction
Ser, Thr | Chemistry |
α-Carboxy to β-carboxy migration | Hydrophobicity | RP-HPLC; inactive in Edman reaction
Asp, Asn | Chemistry |
Deamidation | Charge | Ion-exchange chromatography
Asn, Gln | |
Acylation | Charge | Ion-exchange chromatography, mass spectrometry
α-Amino group, ε-amino group | |
Esterification/carboxylation | Charge | Ion-exchange chromatography, mass spectrometry
Glu, Asp, C-terminal | |
Secondary structure changes | Hydrophobicity | RP-HPLC
 | Size | Size-exclusion chromatography
 | Sec/tert structure | CD
 | Sec/tert structure | FTIR
 | Aggregation | Light scattering
 | Sec/tert structure, aggregation | Analytical ultracentrifugation

Physical stability of a protein is expressed as
the difference in free energy, ΔG_U, between the native and
denatured states. Thus, protein molecules are in equilibrium
between the above two states. As long as this unfolding is
reversible and ΔG_U is positive, it does not matter how small
ΔG_U is. In many cases, this reversibility does not hold. This is
often seen when ΔG_U is decreased by heating. Most proteins
denature upon heating, and subsequent aggregation of the denatured
molecules results in irreversible denaturation. Thus, unfolding is
made irreversible by aggregation:

$$\text{Native state} \;\overset{\Delta G_U}{\rightleftharpoons}\; \text{Denatured state} \;\xrightarrow{\;k\;}\; \text{Aggregated state}$$

where k is the rate constant of the irreversible aggregation step.
Therefore, any stress that decreases ΔG_U and
increases k will cause the
accumulation of irreversibly inactivated forms of the protein. Such
stresses may include chemical modifications as described above and
physical parameters, such as pH, ionic strength, protein
concentration, and temperature. Development of a suitable
formulation that prolongs the shelf life of a recombinant protein
is essential when it is to be used as a human therapeutic.
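The combined influence of ΔG_U and k can be sketched numerically for a scheme in which the native and denatured states equilibrate rapidly and the denatured state aggregates irreversibly with rate constant k. Treating the aggregation step as first order in the denatured state is a simplifying assumption made here for illustration (real aggregation is often of higher order), and the free energies and rate constant in the example are hypothetical.

```python
import math

R_GAS = 8.314462618  # J/(mol K)


def fraction_denatured(delta_g_u_j_mol, temp_k=298.15):
    """Equilibrium fraction in the denatured state, from K = exp(-delta_G_U / RT)."""
    k_eq = math.exp(-delta_g_u_j_mol / (R_GAS * temp_k))
    return k_eq / (1.0 + k_eq)


def apparent_inactivation_rate(delta_g_u_j_mol, k_agg, temp_k=298.15):
    """Apparent first-order loss rate: k times the fraction of denatured molecules."""
    return k_agg * fraction_denatured(delta_g_u_j_mol, temp_k)


# Hypothetical comparison: a stress lowers delta_G_U from 20 to 10 kJ/mol at fixed k.
stable = apparent_inactivation_rate(20_000.0, k_agg=1.0)
stressed = apparent_inactivation_rate(10_000.0, k_agg=1.0)
```

Under these assumptions, a modest drop in ΔG_U raises the loss rate by well over an order of magnitude, while a change in k scales the rate only linearly.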
The use of protein stabilizing agents to enhance
storage stability of proteins has become customary. These compounds
affect protein stability by increasing ΔG_U. These compounds,
however, may also increase k, and hence their net effect on
long-term storage may vary among proteins and with the storage
conditions.
When unfolding is irreversible due to
aggregation, minimizing the irreversible step should increase the
stability, and often, this may be attained by the addition of mild
detergents. Before the detergent type and concentration are
selected, however, the effects of the detergent on ΔG_U must be
carefully evaluated.
Another approach for enhancing storage stability
of proteins is to lyophilize, or freeze-dry, the proteins (see
Chap. 4). Lyophilization can minimize the aggregation
step during storage, since both chemical modification and
aggregation are reduced in the absence of water. The effects of the
lyophilization process itself on ΔG_U and k are not fully
understood, and hence such a process must be optimized for each
protein therapeutic.
Analytical Techniques
In one of the previous sections on “Techniques Specifically Suitable
for Characterizing Protein Folding,” a number of (spectroscopic)
techniques were mentioned that can be specifically used to monitor
protein folding. These were CD, FTIR, fluorescence spectroscopy,
and differential scanning calorimetry (DSC). Moreover, analytical
ultracentrifugation and light scattering techniques were discussed
in more detail. In this section, other techniques will be discussed.
Blotting Techniques
Blotting methods form an important niche in
biotechnology. They are used to detect very low levels of unique
molecules in a milieu of proteins, nucleic acids, and other
cellular components. They can detect aggregates or breakdown
products occurring during long-term storage and they can be used to
detect components from the host cells used in producing recombinant
proteins.
Biomolecules are transferred to a membrane
(“blotting”), and this membrane is then probed with specific
reagents to identify the molecule of interest. Membranes used in
protein blots are made of a variety of materials including
nitrocellulose, nylon, and polyvinylidene difluoride (PVDF), all of
which avidly bind protein.
Liquid samples can be analyzed by methods called
dot blots or slot blots. A solution containing the biomolecule of
interest is filtered through a membrane which captures the
biomolecule. The difference between a dot blot and a slot blot is
that the former uses a circular or disk format, while the latter is
a rectangular configuration. The latter method allows for a more
precise quantification of the desired biomolecule by scanning
methods and relating the integrated results to that obtained with
known amounts of material.
Often, the sample is subjected to some type of
fractionation, such as polyacrylamide gel electrophoresis, prior to
the blotting step. An early technique, Southern blotting, named
after its inventor, E. M. Southern, is used to detect DNA
fragments. When this procedure was adapted to RNA fragments and to
proteins, other compass directions were chosen as names for these
procedures, i.e., northern blots for RNA and western blots for
proteins. Western blots involve the use of labeled antibodies to
detect specific proteins.
Transfer of Proteins
Following polyacrylamide gel electrophoresis, the
transfer of proteins from the gel to the membrane can be
accomplished in a number of ways. Originally, blotting was achieved
by capillary action. In this commonly used method, the membrane is
placed between the gel and absorbent paper. Fluid from the gel is
drawn toward the absorbent paper and the protein is captured by the
intervening membrane. A blot, or impression, of the protein within
the gel is thus made.
The transfer of proteins to the membrane can
occur under the influence of an electric field, as well. The
electric field is applied perpendicularly to the original field
used in separation so that the maximum distance the protein needs
to migrate is only the thickness of the gel, and hence, the
transfer of proteins can occur very rapidly. This latter method is
called electroblotting.
Detection Systems
Once the transfer has occurred, the next step is
to identify the presence of the desired protein. In addition to
various colorimetric staining methods, the blots can be probed with
reagents specific for certain proteins, for example, antibodies
to a protein of interest. This technique is called immunoblotting.
In the biotechnology field, immunoblotting is used as an identity
test for the product of interest. An antibody that recognizes the
desired protein is used in this instance. Secondly, immunoblotting
is sometimes used to show the absence of host proteins. In this
instance, the antibodies are raised against proteins of the
organism in which the recombinant protein has been expressed. This
latter method can attest to the purity of the desired
protein.
Table 2.5 lists major steps needed for the
blotting procedure to be successful. Once the transfer of proteins
is completed, residual protein binding sites on the membrane need
to be blocked so that antibodies used for detection react only at
the location of the target molecule, or antigen, and not at some
nonspecific location. After blocking, the specific antibody is
incubated with the membrane.
Table 2.5 ■ Major steps in blotting proteins to membranes.

1. Transfer protein to membrane, e.g., by electroblotting
2. Block residual protein binding sites on membrane with extraneous proteins such as milk proteins
3. Treat membrane with antibody which recognizes the protein of interest. If this antibody is labeled with a detecting group, then go to step 5
4. Incubate membrane with secondary antibody which recognizes the primary antibody used in step 3. This antibody is labeled with a detecting group
5. Treat the membrane with suitable reagents to locate the site of membrane attachment of the labeled antibody from step 3 or step 4

The antibody reacts with a specific protein on
the membrane only at the location of that protein because of its
specific interaction with its antigen. When immunoblotting
techniques are used, methods are still needed to recognize the
location of the interaction of the antibody with its specific
protein. A number of procedures can be used to detect this complex
(see Table 2.6).
Table 2.6 ■ Detection methods used in blotting techniques.

1. Antibodies are labeled with radioactive markers such as ¹²⁵I
2. Antibodies are linked to an enzyme such as horseradish peroxidase (HRP) or alkaline phosphatase (AP). On incubation with substrate, an insoluble colored product is formed at the location of the antibody. Alternatively, the location of the antibody can be detected using a substrate which yields a chemiluminescent product, an image of which is made on photographic film
3. Antibody is labeled with biotin. Streptavidin or avidin is added to strongly bind to the biotin. Each streptavidin molecule has four binding sites. The remaining binding sites can combine with other biotin molecules which are covalently linked to HRP or to AP

The antibody itself can be labeled with a
radioactive marker such as ¹²⁵I, and the membrane is then placed in
direct contact with X-ray film. After exposure of the film to the
membrane for a suitable period, the film is developed and a
photographic negative is made of the location of radioactivity on
the membrane. Alternatively, the antibody can be linked to an enzyme
which, upon the addition of appropriate reagents, catalyzes a color
or light reaction at the site of the antibody. These procedures
require purifying the antibody and labeling it specifically. More
often, “secondary” antibodies are used. The primary antibody is the
one which recognizes the protein of interest. The secondary antibody
is then an antibody that specifically recognizes the primary
antibody. Quite commonly, the primary antibody is raised in rabbits.
The secondary antibody may then be an antibody raised in another
animal, such as goat, which recognizes rabbit antibodies. Since
this secondary antibody recognizes rabbit antibodies in general, it
can be used as a generic reagent to detect rabbit antibodies raised
against a number of different proteins of interest. Thus, the
primary antibody specifically recognizes and complexes a unique
protein, and the secondary antibody, suitably labeled, is used for
detection (see also section “ELISA” and Fig. 2.10).
The secondary antibody can be labeled with a
radioactive or enzymatic marker group and used to detect several
different primary antibodies. Thus, rather than purifying a number
of different primary antibodies, only one secondary antibody needs
to be purified and labeled for recognition of all the primary
antibodies. Because of their wide use, many common secondary
antibodies are commercially available in kits that contain the
detection system and are used with routine, straightforward
procedures.
In addition to antibodies raised against the
amino acyl constituents of proteins, specific antibodies can be
used which recognize unique posttranslational components in
proteins, such as phosphotyrosyl residues, which are important
during signal transduction, and carbohydrate moieties of
glycoproteins.
Figure 2.9 illustrates a number of detection
methods that can be used on immunoblots. The primary antibody, or
if convenient, the secondary antibody, can have an appropriate
label for detection. They may be labeled with a radioactive tag as
mentioned previously. Secondly, these antibodies can be coupled
with an enzyme such as horseradish peroxidase (HRP) or alkaline
phosphatase (AP). Substrate is added and is converted to an
insoluble, colored product at the site of the protein-primary
antibody-secondary antibody-HRP complex. An alternative substrate
can be used which yields a chemiluminescent product. A chemical
reaction leads to the production of light which can expose
photographic or X-ray film. The chromogenic and chemiluminescent
detection systems have comparable sensitivities to radioactive
methods. The former detection methods are displacing the latter
method, since problems associated with handling radioactive
material and radioactive waste solutions are eliminated.

Figure 2.9 ■ Common immunoblotting detection systems used to detect
antigens, Ag, on membranes. Abbreviations used: Ab, antibody;
E, enzyme, such as horseradish peroxidase or alkaline phosphatase;
S, substrate; P, product, either colored and insoluble or
chemiluminescent; B, biotin; Sa, streptavidin.
As illustrated in Fig. 2.9, streptavidin, or
alternatively avidin, and biotin can play an important role in
detecting proteins on immunoblots. First, biotin forms
very tight complexes with streptavidin and avidin. Second, these
proteins are multimeric and contain four binding sites for biotin.
When biotin is covalently linked to proteins such as antibodies and
enzymes, streptavidin binds to the covalently bound biotin, thus
recognizing the site on the membrane where the protein of interest
is located.
Immunoassays
ELISA
Enzyme-linked immunosorbent assay (ELISA)
provides a means to quantitatively measure extremely small amounts
of proteins in biological fluids and serves as a tool for analyzing
specific proteins during purification. This procedure is a
solid-phase assay that takes advantage of the observation that
plastic surfaces are able to adsorb low but detectable amounts of
proteins. Antibodies against the desired protein are therefore
allowed to adsorb to the surface of microtitration
plates. Each plate may contain up to 96 wells so that multiple
samples can be assayed. After incubating the antibodies in the
wells of the plate for a specific period of time, excess antibody
is removed and residual protein binding sites on the plastic are
blocked by incubation with an inert protein. Several microtitration
plates can be prepared at one time since the antibodies coating the
plates retain their binding capacity for an extended period. During
the ELISA, sample solution containing the protein of interest is
incubated in the wells and the protein (Ag) is captured by the
antibodies coating the well surface. Excess sample is removed and
other antibodies which now have an enzyme (E) linked to them are
added to react with the bound antigen.
The format described above is called a sandwich
assay since the antigen of interest is located between the antibody
on the titer well surface and the antibody containing the linked
enzyme. Figure 2.10 illustrates a number of formats that
can be used in an ELISA. A suitable substrate is added and the
enzyme linked to the antibody-antigen-antibody well complex
converts this compound to a colored product. The amount of product
obtained is proportional to the enzyme adsorbed in the well of the
plate. A standard curve can be prepared if known concentrations of
antigen are tested in this system and the amount of antigen in
unknown samples can be estimated from this standard curve. A number
of enzymes can be used in ELISAs. However, the most common ones are
horseradish peroxidase and alkaline phosphatase. A variety of
substrates for each enzyme are available which yield colored
products when catalyzed by the linked enzyme. Absorbance of the
colored product solutions is measured on plate readers, instruments
which rapidly measure the absorbance in all 96 wells of the
microtitration plate, and data processing can be automated for
rapid throughput of information. Note that detection approaches
partly parallel those discussed in the section on “Blotting.” The above
ELISA format is only one of many different methods. For example,
the microtitration wells may be coated directly with the antigen
rather than having a specific antibody attached to the surface.
Quantitation is made by comparison with known quantities of antigen
used to coat individual wells.
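Reading an unknown off the standard curve can be sketched as a simple interpolation. The code below uses straight-line interpolation between neighboring standards, which is only one possible choice (curve fitting over the whole range is also common); the concentrations and absorbance values are invented for illustration, and the approach assumes that absorbance rises steadily with antigen concentration over the range of the standards.

```python
def interpolate_concentration(standards, absorbance):
    """Estimate antigen concentration from a standard curve by linear interpolation.

    standards: list of (concentration, absorbance) pairs with absorbance
    increasing with concentration. Raises ValueError outside the calibrated range.
    """
    points = sorted(standards)
    for (c0, a0), (c1, a1) in zip(points, points[1:]):
        if a0 <= absorbance <= a1:
            return c0 + (absorbance - a0) * (c1 - c0) / (a1 - a0)
    raise ValueError("absorbance outside the range of the standard curve")


# Hypothetical standards: (concentration in ng/mL, absorbance).
curve = [(0.0, 0.05), (1.0, 0.25), (2.0, 0.45), (4.0, 0.80)]
estimate_ng_ml = interpolate_concentration(curve, 0.35)
```

Samples that read above the highest standard are rejected rather than extrapolated; in practice such samples would be diluted and measured again.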

Figure 2.10 ■ Examples of several formats for ELISA in
which the specific antibody is adsorbed to the surface of a
microtitration plate. See Fig. 2.9 for abbreviations used. The
antibody is represented by the Y-type structure. The product P is
colored and the amount generated is measured with a spectrometer or
plate reader.
Another approach, this time subsequent to the
binding of antigen either directly to the surface or to an antibody
on the surface, is to use an antibody specific to the antibody
binding the protein antigen, that is, a secondary antibody. This
latter, secondary, antibody contains the linked enzyme used for
detection. As already discussed in the section on “Blotting,” the
advantage to this approach is that such antibodies can be obtained
in high purity and with the desired enzyme linked to them from
commercial sources. Thus, a single source of enzyme-linked antibody
can be used in assays for different protein antigens. Should a
sandwich assay be used, then antibodies from different species need
to be used for each side of the sandwich. A possible scenario is
that rabbit antibodies are used to coat the microtitration wells;
mouse antibodies, possibly a monoclonal antibody, are used to
complex with the antigen; and then, a goat anti-mouse
immunoglobulin containing linked HRP or AP is used for detection
purposes.
As with immunoblots discussed above, streptavidin
or avidin can be used in these assays if biotin is covalently
linked to the antibodies and enzymes (Fig. 2.10).
If a radioactive label is used in place of the
enzyme in the above procedure, then the assay is a solid-phase
radioimmunoassay (RIA). Assays are moving away from the use of
radioisotopes, because of problems with safety and disposal of
radioactive waste and since nonradioactive assays have comparable
sensitivities.
Electrophoresis
Analytical methodologies for measuring protein
properties stem from those used in their purification. The major
difference is that systems used for analysis have a higher
resolving power and lower detection limit than those used in
purification. The two major methods for analysis have their bases
in chromatographic or electrophoretic techniques.
Polyacrylamide Gel Electrophoresis
One of the earliest methods for analysis of
proteins is polyacrylamide gel electrophoresis (PAGE). In this
assay, proteins, being amphoteric molecules with both positive and
negative charge groups in their primary structure, are separated
according to their net electrical charge. A second factor which is
responsible for the separation is the mass of the protein. Thus,
one can consider more precisely that the charge to mass ratio of
proteins determines how they are separated in an electrical field.
The charge of the protein can be controlled by the pH of the
solution in which the protein is separated. The farther away the
protein is from its pI value, that is, the pH at which it has a net
charge of zero, the greater is the net charge and hence the greater
is its charge to mass ratio. Therefore, the direction and speed of
migration of the protein depend on the pH of the gel. If the pH of
the gel is above its pI value, then the protein is negatively
charged and hence migrates toward the anode. The higher the pH of
the gel, the faster the migration. This type of electrophoresis is
called native gel electrophoresis.
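The dependence of net charge on pH can be illustrated with a rough Henderson-Hasselbalch estimate. The sketch below is not a validated tool: the pKa values are approximate textbook figures for free side chains (real values shift inside a folded protein), and the residue counts describe a hypothetical protein rather than any product discussed in this chapter.

```python
# Approximate pKa values for ionizable groups (assumed; they vary with
# the local environment in a folded protein).
PKA_POSITIVE = {"N_term": 9.0, "K": 10.5, "R": 12.5, "H": 6.0}
PKA_NEGATIVE = {"C_term": 3.1, "D": 3.9, "E": 4.1, "C": 8.3, "Y": 10.1}


def net_charge(counts, ph):
    """Henderson-Hasselbalch estimate of the net charge at a given pH."""
    charge = 0.0
    for group, pka in PKA_POSITIVE.items():
        charge += counts.get(group, 0) / (1.0 + 10 ** (ph - pka))
    for group, pka in PKA_NEGATIVE.items():
        charge -= counts.get(group, 0) / (1.0 + 10 ** (pka - ph))
    return charge


def isoelectric_point(counts, low=0.0, high=14.0, tol=1e-4):
    """Bisect for the pH at which the estimated net charge is zero."""
    while high - low > tol:
        mid = (low + high) / 2.0
        if net_charge(counts, mid) > 0:
            low = mid   # still positive, so the pI lies at higher pH
        else:
            high = mid
    return (low + high) / 2.0


def migration_direction(counts, gel_ph):
    """Negatively charged proteins move to the anode, positive to the cathode."""
    charge = net_charge(counts, gel_ph)
    if abs(charge) < 0.05:
        return "little or no migration"
    return "toward the anode" if charge < 0 else "toward the cathode"


# Hypothetical composition, for illustration only.
protein = {"N_term": 1, "C_term": 1, "K": 4, "R": 5, "H": 5, "D": 4, "E": 9}
pi = isoelectric_point(protein)
print(f"estimated pI: {pi:.2f}")
for ph in (pi - 2.0, pi + 2.0):
    print(f"pH {ph:.2f}: charge {net_charge(protein, ph):+.2f}, "
          f"{migration_direction(protein, ph)}")
```

The magnitude of the estimated charge grows as the gel pH moves away from the pI, which is the basis of the statement that a higher gel pH gives faster migration toward the anode.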
The major component of polyacrylamide gels is
water. However, the gels provide a flexible support: after
proteins have been subjected to an electrical field for an
appropriate period of time, the gel matrix holds the
proteins in place until they can be detected with suitable
reagents. By adjusting the amount of acrylamide that is used in
these gels, one can control the migration of material within the
gel. The more acrylamide, the more hindrance for the protein to
migrate in an electrical field.
The addition of a detergent, sodium dodecyl
sulfate (SDS), to the electrophoretic separation system allows for
the separation to take place primarily as a function of the size of
the protein. Dodecyl sulfate ions form complexes with proteins,
resulting in an unfolding of the proteins, and the amount of
detergent that is complexed is proportional to the mass of the
protein. The larger the protein, the more detergent that is
complexed. Dodecyl sulfate is a negatively charged ion. When
proteins are in a solution of SDS, the net effect is that the
intrinsic charge of the protein is overwhelmed by that of the dodecyl
sulfate complexed with it, so that the proteins take on a net negative
charge proportional to their mass.
Polyacrylamide gel electrophoresis in the
presence of sodium dodecyl sulfate is commonly known as SDS-PAGE.
All the proteins take on a net negative charge, with larger
proteins binding more SDS but with the charge to mass ratio being
fairly constant among the proteins. An example of SDS-PAGE is shown
in Fig. 2.11. Here, SDS-PAGE is used to monitor
expression of G-CSF receptor (panel a) and of G-CSF (panel b) in different
culture media.

Figure 2.11 ■ SDS-PAGE of G-CSF receptor, about 35 kDa
(panel a), and G-CSF, about 20 kDa (panel b). These
proteins are expressed in different culture media (lanes 1–9).
Positions of molecular weight standards are given on the left side.
The bands are developed with antibody against G-CSF receptor (panel
a) or G-CSF (panel b) after blotting.
Since all proteins have essentially the same
charge to mass ratio, how can separation occur? This is done by
controlling the concentration of acrylamide in the path of proteins
migrating in an electrical field. The greater the acrylamide
concentration, the more difficult it is for large protein molecules
to migrate relative to smaller protein molecules. This is sometimes
thought of as a sieving effect, since the greater the acrylamide
concentration, the smaller the pore size within the polyacrylamide
gel. Indeed, if the acrylamide concentration is sufficiently high,
some high-molecular-weight proteins may not migrate at all within
the gel. Since in SDS-PAGE the proteins are denatured, their
hydrodynamic size, and hence the degree of retardation by the
sieving effects, is directly related to their mass. Proteins
containing disulfide bonds will have a much more compact structure
and higher mobility for their mass unless the disulfides are
reduced prior to electrophoresis.
As described above, native gel electrophoresis
and SDS-PAGE are quite different in terms of the mechanism of
protein separation. In native gel electrophoresis, the proteins are
in the native state and migrate on their own charges. Thus, this
electrophoresis can be used to characterize proteins in the native
state. In SDS-PAGE, proteins are unfolded and migrate based on
their molecular mass. As an intermediate case, Blue native
electrophoresis has been developed, in which proteins are bound by
Coomassie blue, the dye also used to stain protein bands. This dye is believed
to bind to the hydrophobic surface of the proteins and to add
negative charges to the proteins. The dye-bound proteins are still
in the native state and migrate based on the net charges, which
depend on the intrinsic charges of the proteins and the amounts of
the negatively charged dye. This is particularly useful for
analyzing membrane proteins, which tend to aggregate in the absence
of detergents. The dye prevents the proteins from aggregation by
binding to their hydrophobic surface.
Isoelectric Focusing (IEF)
Another method to separate proteins based on
their electrophoretic properties is to take advantage of their
isoelectric point. In a first run, a pH gradient is established
within the gel using a mixture of small-molecular-weight ampholytes
with varying pI values. The high pH conditions are established at
the site of the cathode. Then, the protein is applied to the gel,
e.g., at the site where the pH is 7. In the electrical field, the
protein will migrate until it reaches the pH on the gel where its
net charge is zero. If the protein were to migrate away from this
pH value, it would gain a charge and migrate back toward its pI value,
leading to a focusing effect.
2-Dimensional Gel Electrophoresis
The above methods can be combined into a
procedure called 2-D gel electrophoresis. Proteins are first
fractionated by isoelectric focusing based upon their pI values.
They are then subjected to SDS-PAGE perpendicular to the first
dimension and fractionated based on the molecular weights of
proteins. SDS-PAGE cannot be performed before isoelectric focusing,
since once SDS binds to and denatures the proteins, they no longer
migrate based on their pI values.
Detection of Proteins Within Polyacrylamide Gels
Although the polyacrylamide gels provide a
flexible support for the proteins, with time, the proteins will
diffuse and spread within the gel. Consequently, the usual practice
is to fix the proteins or trap them at the location where they
migrated to. This is accomplished by placing the gels in a fixing
solution in which the proteins become insoluble.
There are many methods for staining proteins in
gels, but the two most common and well-studied are
staining with Coomassie blue and staining with silver. The
latter method is used if increased sensitivity is required. The
principle of developing the Coomassie blue stain is the hydrophobic
interaction of a dye with the protein. Thus, the gel takes on a
color wherever a protein is located. Using standard amounts of
proteins, the amount of protein or contaminant may be estimated.
Quantification using the silver staining method is less precise.
However, due to the increased sensitivity of this method, very low
levels of contaminants can be detected. These fixing and staining
procedures denature the proteins. Hence, proteins separated under
native conditions, as in native or Blue native gel electrophoresis,
will be denatured. To maintain the native state, the gels can be
stained with copper or other metal ions.
Capillary Electrophoresis
With recent advances in instrumentation and
technology, capillary electrophoresis has gained an increased
presence in the analysis of recombinant proteins. Rather than
having a matrix, as in polyacrylamide gel electrophoresis through
which the proteins migrate, they are free in solution in an
electric field within the confines of a capillary tube with a
diameter of 25–50 μm. The capillary tube passes through an
ultraviolet light or fluorescence detector that measures the
presence of proteins migrating in the electric field. The movement
of one protein relative to another is a function of the molecular
mass and the net charge on the protein. The latter can be
influenced by pH and analytes in the solution. This technique has
only partially gained acceptance for routine analysis, because of
difficulties in reproducibility of the capillaries and in
validating this system. Nevertheless, it is a powerful analytical
tool for the characterization of recombinant proteins during
process development and in stability studies.
Chromatography
Chromatography techniques are used extensively in
biotechnology not only in protein purification procedures (see
Chap. 3) but also in assessing the integrity of the
product. Routine procedures are highly automated so that
comparisons of similar samples can be made. An analytical system
consists of an autosampler which will take a known amount (usually
a known volume) of material for analysis and automatically places
it in the solution stream headed toward a separation column used to
fractionate the sample. Another part of this system is a pump
module which provides a reproducible flow rate. In addition, the
pumping system can provide a gradient which changes properties of
the solution such as pH, ionic strength, and hydrophobicity. A
detection system (or possibly multiple detectors in series) is
located at the outlet of the column. This measures the relative
amount of protein exiting the column. Coupled to the detector is a
data acquisition system which takes the signal from the detector
and integrates it into a value related to the amount of material
(see Fig. 2.12). When the protein appears, the signal
begins to increase, and as the protein passes through the detector,
the signal subsequently decreases. The area under the peak of the
signal is proportional to the amount of material which has passed
through the detector. By analyzing known amounts of protein, an
area versus amount of protein plot can be generated and this may be
used to estimate the amount of this protein in the sample under
other circumstances. Another benefit of this integrated
chromatography system is that low levels of components which appear
over time can be estimated relative to the major desired protein
being analyzed. This is a particularly useful function when the
long-term stability of the product is under evaluation.
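The integration step performed by the data acquisition system can be sketched as follows. The detector trace, the calibration amounts, and the function names are all hypothetical; the trapezoidal rule and a calibration line through the origin are assumed simplifications of what commercial chromatography software does.

```python
def peak_area(times, signal, baseline=0.0):
    """Trapezoidal area under a detector trace after baseline subtraction."""
    area = 0.0
    for i in range(1, len(times)):
        width = times[i] - times[i - 1]
        height = (signal[i] + signal[i - 1]) / 2.0 - baseline
        area += width * height
    return area


def response_factor(known_amounts, areas):
    """Least-squares slope through the origin: area = factor * amount."""
    numerator = sum(m * a for m, a in zip(known_amounts, areas))
    denominator = sum(m * m for m in known_amounts)
    return numerator / denominator


# Hypothetical triangular peak (time in min, signal in arbitrary units).
times = [0.0, 1.0, 2.0, 3.0, 4.0]
signal = [0.0, 5.0, 10.0, 5.0, 0.0]
area = peak_area(times, signal)

# Hypothetical calibration: known amounts (micrograms) and measured areas.
factor = response_factor([1.0, 2.0, 4.0], [10.0, 20.0, 40.0])

print(f"peak area: {area:.1f}")
print(f"estimated amount: {area / factor:.2f} micrograms")
```

The same area values can be expressed as a percentage of the main peak to follow minor components that appear during storage.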

Figure 2.12 ■ Components of a typical chromatography
station. The pump combines solvents one and two in appropriate
ratios to generate a pH, salt concentration, or hydrophobic
gradient. Proteins that are fractionated on the column pass through a
detector which measures their occurrence. Information from the
detector is used to generate chromatograms and the relative amount
of each component.
Chromatographic systems offer a multitude of
different strategies for successfully separating protein mixtures
and for quantifying individual protein components (see Chap. 3). The following describes
some of these strategies.
Size-Exclusion Chromatography
As the name implies, this procedure separates
proteins based on their size or molecular weight or shape. The
matrix consists of very fine beads containing cavities and pores
accessible to molecules of a certain size or smaller, but
inaccessible to larger molecules. The principle of this technique
is the distribution of molecules between the volume of solution
within the beads and the volume of solution surrounding the beads.
Small molecules have access to a larger volume than do large
molecules. As solution flows through the column, molecules can
diffuse back and forth, depending upon their size, in and out of
the pores of the beads. Smaller molecules can reside within the
pores for a finite period of time whereas larger molecules, unable
to enter these spaces, continue along in the fluid stream.
Intermediate-sized molecules spend an intermediate amount of time
within the pores. They can be fractionated from large molecules
that cannot access the matrix space at all and from small molecules
that have free access to this volume and spend most of the time
within the beads. Protein molecules can distribute between the
volume within these beads and the excluded volume based on the mass
and shape of the molecule. This distribution is based on the
relative concentration of the protein in the beads versus the
excluded volume.
Size-exclusion chromatography can be used to
estimate the mass of proteins by calibrating the column with a
series of globular proteins of known mass. However, the separation
depends on molecular shape (conformation) as well as mass, and
highly elongated proteins, proteins containing flexible, disordered
regions, and glycoproteins will often appear to have masses as much
as two to three times the true value. Other proteins may interact
weakly with the column matrix and be retarded, thereby appearing to
have a smaller mass. Thus, sedimentation or light scattering
methods are preferred for accurate mass measurement (see section
“Techniques Specifically Suitable for Characterizing Protein
Folding”). Over time, proteins can undergo a number of
changes that affect their mass. A peptide bond within the protein
can hydrolyze, yielding two smaller polypeptide chains. More
commonly, size-exclusion chromatography is used to assess
aggregated forms of the protein. Figure 2.13 shows an example
of this. The peak at 22 min represents the native protein. The peak
at 15 min is aggregated protein and that at 28 min depicts degraded
protein, yielding smaller polypeptide chains. Aggregation can occur
when a protein molecule unfolds to a slight extent and exposes
surfaces that are attracted to complementary surfaces on adjacent
molecules. This interaction can lead to dimerization or doubling of
molecular weight or to higher-molecular-weight oligomers. From the
chromatographic profile, the mechanism of aggregation can often be
inferred. If dimers, trimers, tetramers, etc., are observed, then
aggregation occurs by stepwise interaction of a monomer with a
dimer, trimer, etc. If dimers, tetramers, octamers, etc., are
observed, then aggregates can interact with each other. Sometimes,
only monomers and high-molecular-weight aggregates are observed,
suggesting that intermediate species are kinetically of short
duration and protein molecules susceptible to aggregation combine
into very large-molecular-weight complexes.
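The column calibration mentioned at the start of this section is commonly handled by fitting a straight line to the logarithm of mass versus elution time. The sketch below assumes that empirical log-linear relationship, which holds only within the fractionation range of a given column and only for globular proteins; the standards are invented and chosen for easy arithmetic, not taken from Fig. 2.13.

```python
import math

# Hypothetical globular standards: (elution time in min, mass in kDa).
STANDARDS = [(12.0, 1000.0), (18.0, 100.0), (24.0, 10.0), (30.0, 1.0)]


def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def calibrate(standards):
    """Fit log10(mass) against elution time for the standards."""
    times = [t for t, _ in standards]
    log_masses = [math.log10(m) for _, m in standards]
    return fit_line(times, log_masses)


def apparent_mass(elution_time, slope, intercept):
    """Apparent mass in kDa; only 'apparent' because shape also matters."""
    return 10 ** (slope * elution_time + intercept)


slope, intercept = calibrate(STANDARDS)
for label, minutes in [("early peak", 15.0), ("main peak", 22.0), ("late peak", 28.0)]:
    mass = apparent_mass(minutes, slope, intercept)
    print(f"{label} at {minutes:.0f} min: about {mass:.1f} kDa (apparent)")
```

Earlier elution maps to a larger apparent mass, consistent with aggregates appearing before the native protein and fragments after it.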

Figure 2.13 ■ Size-exclusion chromatography of a
recombinant protein which on storage yields aggregates and smaller
peptides.
Reversed-Phase High-Performance Liquid Chromatography
Reversed-phase high-performance liquid
chromatography (RP-HPLC) takes advantage of the hydrophobic
properties of proteins. The functional groups on the column matrix
contain from one to up to 18 carbon atoms in a hydrocarbon chain.
The longer this chain, the more hydrophobic is the matrix. The
hydrophobic patches of proteins interact with the hydrophobic
chromatographic matrix. Proteins are then eluted from the matrix by
increasing the hydrophobic nature of the solvent passing through
the column. Acetonitrile is a common solvent used, although other
organic solvents such as ethanol also may be employed. The solvent
is made acidic by the addition of trifluoroacetic acid, since
proteins have increased solubility at pH values further removed
from their pI. A gradient with increasing concentration of
hydrophobic solvent is passed through the column. Different
proteins have different hydrophobicities and are eluted from the
column depending on the “hydrophobic potential” of the
solvent.
This technique can be very powerful. It may
detect the addition of a single oxygen atom to the protein, as when
a methionyl residue is oxidized, or the hydrolysis of an amide
moiety on a glutaminyl or asparaginyl residue (deamidation). Disulfide bond
formation or shuffling also changes the hydrophobic characteristic
of the protein. Hence, RP-HPLC can be used not only to assess the
homogeneity of the protein but also to follow degradation pathways
occurring during long-term storage.
Reversed-phase chromatography of proteolytic
digests of recombinant proteins may serve to identify this protein.
Enzymatic digestion yields unique peptides that elute at different
retention times or at different organic solvent concentrations.
Moreover, the map, or chromatogram, of peptides arising from
enzymatic digestion of one protein is quite different from the map
obtained from another protein. Several different proteases, such as
trypsin, chymotrypsin, and other endoproteinases, are used for
these identity tests (see below under “Mass Spectrometry”).
Hydrophobic Interaction Chromatography
A companion to RP-HPLC is hydrophobic interaction
chromatography (HIC), although in principle, this latter method is
normal-phase chromatography, i.e., here an aqueous solvent system
rather than an organic one is used to fractionate proteins. The
hydrophobic characteristics of the solution are modulated by
inorganic salt concentrations. Ammonium sulfate and sodium chloride
are often used since these compounds are highly soluble in water.
In the presence of high salt concentrations (up to several molar),
proteins are attracted to hydrophobic surfaces on the matrix of
resins used in this technique. As the salt concentration decreases,
proteins have less affinity for the matrix and eventually elute
from the column. This method lacks the resolving power of RP-HPLC,
but is a more gentle method, since low pH values or organic
solvents as used in RP-HPLC can be detrimental to some
proteins.
Ion-Exchange Chromatography
This technique takes advantage of the electrical
charge properties of proteins. Some of the amino acyl residues are
negatively charged and others are positively charged. The net
charge of the protein can be modulated by the pH of its environment
relative to the pI value of the protein. At a pH value lower than
the pI, the protein has a net positive charge, whereas at a pH
value greater than the pI, the protein has a net negative charge.
Opposites attract in ion-exchange chromatography. The resins in
this procedure can contain functional groups with positive or
negative charges. Thus, positively charged proteins bind to
negatively charged matrices and negatively charged proteins bind to
positively charged matrices. Proteins are displaced from the resin
by increasing salt, e.g., sodium chloride, concentrations. Proteins
with different net charges can be separated from one another during
elution with an increasing salt gradient. The choice of charged
resin and elution conditions are dependent upon the protein of
interest.
In lieu of changing the ionic strength of the
solution, proteins can be eluted by changing the pH of the medium,
i.e., with the use of a pH gradient. This method is called
chromatofocusing and proteins are separated based on their pI
values. When the solvent pH reaches the pI value of a specific
protein, the protein has a zero net charge and is no longer
attracted to the charged matrix and hence is eluted.
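The rules above reduce to a comparison of the working pH with the pI of each protein. The following sketch encodes that comparison; the protein names and pI values are hypothetical, the terms "cation exchanger" and "anion exchanger" are the usual names for negatively and positively charged matrices, and the elution order assumes a descending pH gradient on a positively charged matrix.

```python
def net_charge_sign(pi, ph):
    """Sign of the net charge of a protein with isoelectric point pi at pH ph."""
    if ph < pi:
        return "positive"
    if ph > pi:
        return "negative"
    return "neutral"


def binding_resin(pi, ph):
    """Type of charged matrix the protein would be expected to bind at this pH."""
    sign = net_charge_sign(pi, ph)
    if sign == "positive":
        return "cation exchanger (negatively charged matrix)"
    if sign == "negative":
        return "anion exchanger (positively charged matrix)"
    return "neither (no net charge)"


def chromatofocusing_order(proteins):
    """Elution order for a descending pH gradient on a positively charged
    matrix: each protein is released when the pH falls to its pI, so the
    protein with the highest pI elutes first."""
    return [name for name, pi in sorted(proteins.items(), key=lambda kv: -kv[1])]


# Hypothetical proteins and pI values.
proteins = {"protein A": 8.2, "protein B": 6.1, "protein C": 4.9}
for name, pi in proteins.items():
    print(f"{name} (pI {pi}) at pH 7.0 binds a {binding_resin(pi, 7.0)}")
print("chromatofocusing elution order:", chromatofocusing_order(proteins))
```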
Other Chromatographic Techniques
Other functional groups may be attached to
chromatographic matrices to take advantage of unique properties of
certain proteins. These affinity methodologies, however, are more
often used in the manufacturing process than in analytical
techniques (see Chap. 3). For example, conventional affinity
purification schemes of antibodies use protein A or G columns.
Protein A or G specifically binds antibodies. Antibodies consist of
variable regions and constant regions (see Chap. 7). The variable regions are
antigen specific and hence vary in sequence from one antibody to
another, while the constant regions are common to each subgroup of
antibodies. The constant region binds to protein A or G.
Mixed-mode chromatography uses columns having
both hydrophobic and charged groups, i.e., combination of
ion-exchange and hydrophobic interaction chromatography. Mixed-mode
columns confer protein binding under conditions at which protein
binding normally does not occur. For example, protein binding to an
ion-exchange column requires low ionic strength. Under identical
conditions, mixed-mode columns can bind proteins through both ionic
and hydrophobic interactions.
Bioassays
Paramount to the development of a protein
therapeutic is to have an assay that identifies its biological
function. Chromatographic and electrophoretic methodologies can
address the homogeneity of a biotherapeutic and be useful in
investigating stability parameters. However, it is also necessary
to ascertain whether the protein has acceptable bioactivity.
Bioactivity can be determined either in vivo, i.e., by
administering the protein to an animal and ascertaining some change
within its body (function), or in vitro. Bioassays in vitro monitor
the response of a specific receptor or microbiological or tissue
cell line when the therapeutic protein is added to the system. An
example of an in vitro bioassay is the increase in DNA synthesis in
the presence of the therapeutic protein as measured by the
incorporation of radioactively labeled thymidine. The protein
factor binds to receptors on the cell surface, which triggers
secondary messengers to send signals to the cell nucleus to
synthesize DNA. The binding of the protein factor to the cell
surface is dependent upon the amount of factor present. Figure 2.14
presents a dose–response curve of thymidine incorporation as a
function of concentration of the factor. At low concentrations, the
amount of factor is too low to trigger a response. As the concentration
increases, the incorporation of thymidine occurs, and at higher
concentrations, the amount of thymidine incorporation ceases to
increase as DNA synthesis is occurring at the maximum rate. A
standard curve can be obtained using known quantities of the
protein factor. Comparison of other solutions containing unknown
amounts of the factor with this standard curve will then yield
quantitative estimates of the factor concentration. Through
experience during the development of the protein therapeutic, a
value is obtained for a fully functional protein. Subsequent
comparisons to this value can be used to ascertain any loss in
activity during stability studies or changes in activity when amino
acyl residues of the protein are modified.
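A sigmoidal dose-response curve of this kind is often described with a four-parameter logistic (Hill-type) equation, which can then be inverted to read a concentration off the standard curve. The sketch below assumes that model; the parameter values are invented and do not come from Fig. 2.14.

```python
def response(conc, bottom, top, ec50, hill):
    """Four-parameter logistic response at a given factor concentration."""
    if conc <= 0:
        return bottom
    return bottom + (top - bottom) / (1.0 + (ec50 / conc) ** hill)


def concentration(resp, bottom, top, ec50, hill):
    """Invert the curve: concentration that gives the observed response."""
    if not bottom < resp < top:
        # On the plateaus the curve is flat and cannot be inverted.
        raise ValueError("response lies outside the usable range of the curve")
    ratio = (top - bottom) / (resp - bottom) - 1.0
    return ec50 / ratio ** (1.0 / hill)


# Hypothetical curve parameters (counts per minute, ng/mL).
params = dict(bottom=500.0, top=20000.0, ec50=2.0, hill=1.5)

for dose in (0.1, 2.0, 50.0):
    print(f"{dose:>5.1f} ng/mL -> {response(dose, **params):8.0f} cpm")

observed = response(3.0, **params)
print(f"back-calculated concentration: {concentration(observed, **params):.2f} ng/mL")
```

Only the rising part of the curve is useful for quantitation; responses near either plateau give very uncertain concentration estimates, which is why the inverse function rejects them.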

Figure 2.14 ■ An in vitro bioassay showing a mitogenic
response in which radioactive thymidine is incorporated into DNA in
the presence of an increasing amount of a protein factor.
Other in vitro bioassays can measure changes in
cell number or production of another protein factor in response to
the stimulation of cells by the protein therapeutic. The amount of
the secondary protein produced can be estimated by using an
ELISA.
Mass Spectrometry
Recent advances in the measurement of the
molecular masses of proteins have made this technique an important
analytical tool. While this method was used in the past to analyze
small volatile molecules, the molecular weights of highly charged
proteins with masses of over 100 kilodaltons (kDa) can now be
accurately determined.
Because of the precision of this method,
posttranslational modifications such as acetylation or
glycosylation can be inferred. The masses of new protein forms
that arise during stability studies provide information on the
nature of these forms. For example, an increase in mass of 16 Da
suggests that an oxygen atom has been added to the protein as
happens when a methionyl residue is oxidized to a methionyl
sulfoxide residue. The molecular mass of peptides obtained after
proteolytic digestion and separation by HPLC indicates from which
region of the primary structure they are derived. Such an HPLC
chromatogram is called a “peptide map.” An example is shown in Fig. 2.15.
This is obtained by digesting a protein with pepsin and by
subsequently separating the digested peptides by reversed-phase HPLC. This
highly characteristic pattern for a protein is called a “protein
fingerprint.” Peaks are identified by elution times on HPLC. If
peptides have molecular masses differing from those expected from
the primary sequence, the nature of the modification to that
peptide can be inferred. Moreover, molecular mass estimates can
be made for peptides obtained from unfractionated proteolytic
digests. Molecular masses that differ from expected values indicate
that a part of the protein molecule has been altered, that
glycosylation or another modification has been altered, or that the
protein under investigation still contains contaminants.
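Interpreting a mass difference, such as the 16 Da increase described above, amounts to comparing the observed shift with a table of known modification masses. The sketch below uses monoisotopic shifts for three modifications mentioned in this chapter; the peptide masses, the tolerance, and the function name are hypothetical.

```python
# Monoisotopic mass shifts in Da for modifications mentioned in the text.
KNOWN_SHIFTS = {
    "oxidation (e.g., Met to Met sulfoxide)": 15.9949,
    "acetylation": 42.0106,
    "deamidation (Asn or Gln)": 0.9840,
}


def explain_shift(expected_mass, observed_mass, tolerance=0.02):
    """Suggest a modification consistent with the observed mass difference."""
    delta = observed_mass - expected_mass
    if abs(delta) <= tolerance:
        return "unmodified"
    for name, shift in KNOWN_SHIFTS.items():
        if abs(delta - shift) <= tolerance:
            return name
    return f"unassigned shift of {delta:+.4f} Da"


# Hypothetical peptide masses: (expected from sequence, observed).
peptides = [(1500.000, 1500.005), (1500.000, 1515.995), (2210.100, 2252.111)]
for expected, observed in peptides:
    print(f"{expected:.3f} -> {observed:.3f}: {explain_shift(expected, observed)}")
```

A match only shows that the shift is consistent with a modification; confirming its identity and position usually requires sequencing of the peptide.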

Figure 2.15 ■ Peptide map of pepsin digest of recombinant
human β-secretase. Each peptide is labeled by elution time in
HPLC.
Another way that mass spectrometry can be used as
an analytical tool is in the sequencing of peptides. A recurring
structure, the peptide bond, in peptides tends to yield fragments
of the mature peptide which differ stepwise by an amino acyl
residue. The difference in mass between two fragments indicates the
amino acid removed from one fragment to generate the other. Except
for leucine and isoleucine, each amino acid has a different mass
and hence a sequence can be read from the mass spectrum.
Stepwise removal can occur from either the amino terminus or
carboxy terminus.
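Reading a sequence from a fragment ladder can be illustrated by matching successive mass differences to residue masses. The table holds monoisotopic residue masses; the starting fragment mass, the tolerance, and the example ladder are hypothetical, and real spectra contain noise and overlapping ion series that this sketch ignores.

```python
# Monoisotopic residue masses in Da. Leucine and isoleucine share one
# entry because their masses are identical.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L/I": 113.08406,
    "N": 114.04293, "D": 115.02694, "Q": 128.05858, "K": 128.09496,
    "E": 129.04259, "M": 131.04049, "H": 137.05891, "F": 147.06841,
    "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}


def read_sequence(fragment_masses, tolerance=0.01):
    """Assign a residue to each step between consecutive fragment masses.

    Residues are reported in order of increasing fragment size; '?' marks
    a step that matches no single residue within the tolerance.
    """
    ordered = sorted(fragment_masses)
    residues = []
    for smaller, larger in zip(ordered, ordered[1:]):
        delta = larger - smaller
        matches = [name for name, mass in RESIDUE_MASS.items()
                   if abs(delta - mass) <= tolerance]
        residues.append(matches[0] if matches else "?")
    return residues


# Build a hypothetical ladder by adding residues to a starting fragment.
ladder = [500.0]
for residue in ["G", "A", "L/I", "K"]:
    ladder.append(ladder[-1] + RESIDUE_MASS[residue])

print(read_sequence(ladder))
```

Lysine and glutamine differ by only about 0.036 Da, so distinguishing them depends on the mass accuracy of the instrument; with a looser tolerance the two would be indistinguishable, just as leucine and isoleucine always are.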
By changing three basic components of the mass
spectrometer, the ion source, the analyzer, and the detector,
different types of measurement may be undertaken. Typical ion
sources which volatilize the proteins are electrospray ionization,
fast atom bombardment, and liquid secondary ion. Common analyzers
include quadrupole, magnetic sector, and time-of-flight
instruments. The function of the analyzer is to separate the
ionized biomolecules based on their mass-to-charge ratio. The
detector measures a current whenever impinged upon by charged
particles. Electrospray ionization (ESI) and matrix-assisted laser
desorption/ionization (MALDI) are two sources that can generate
high-molecular-weight volatile proteins. In the former method,
droplets are generated by spraying or nebulizing the protein
solution into the source of the mass spectrometer. As the solvent
evaporates, the protein remains behind in the gas phase and passes
through the analyzer to the detector. In MALDI, proteins are mixed
with a matrix which vaporizes when exposed to laser light, thus
carrying the protein into the gas phase. An example of MALDI-mass
analysis is shown in Fig. 2.16, indicating the singly charged ion
(116,118 Da) and the doubly charged ion (58,036.2) for a purified
protein. Since proteins are multi-charge compounds, a number of
components are observed representing mass-to-charge forms, each
differing from the next by one charge. By imputing various charges
to the mass-to-charge values, a molecular mass of the protein can
be estimated. The latter step is empirical since only the
mass-to-charge ratio is detected and not the net charge for that
particular particle.
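The step of imputing charges to mass-to-charge values can be sketched as follows. The code assumes positive ions that carry one proton per charge, which is the usual but not the only case; the function names are illustrative. The first loop uses the two values quoted for Fig. 2.16 (read as 116,118 and 58,036.2), and the second part uses a synthetic 20,000 Da protein to show how the charge can be deduced from two adjacent peaks.

```python
PROTON_MASS = 1.00728  # Da added per charge (assumes protonated ions)


def mz_of(mass, charge):
    """Mass-to-charge ratio of a protein carrying `charge` protons."""
    return (mass + charge * PROTON_MASS) / charge


def mass_from_peak(mz, charge):
    """Neutral mass implied by a peak once a charge has been assumed."""
    return charge * (mz - PROTON_MASS)


def charge_from_adjacent_peaks(mz_low, mz_high):
    """Charge of the ion at mz_high, assuming the ion at mz_low is the
    same molecule carrying exactly one more proton."""
    return round((mz_low - PROTON_MASS) / (mz_high - mz_low))


# Values quoted in the text for the singly and doubly charged ions.
for mz, charge in [(116118.0, 1), (58036.2, 2)]:
    print(f"m/z {mz:>9.1f} with charge {charge}: "
          f"mass about {mass_from_peak(mz, charge):,.1f} Da")

# Synthetic example: adjacent charge states of a 20,000 Da protein.
low, high = mz_of(20000.0, 11), mz_of(20000.0, 10)
z = charge_from_adjacent_peaks(low, high)
print(f"charge of higher m/z peak: {z}, mass about {mass_from_peak(high, z):,.1f} Da")
```

Agreement between the masses calculated from several charge states is what lends confidence to the assigned charges, since the instrument reports only the ratio.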

Figure 2.16 ■ MALDI-mass analysis of a purified
recombinant human β-secretase. Numbers correspond to the singly
charged and doubly charged ions.
Concluding Remarks
With the advent of recombinant proteins as human
therapeutics, the need for methods to evaluate their structure,
function, and homogeneity has become paramount. Various analytical
techniques are used to characterize the primary, secondary, and
tertiary structure of the protein and to determine the quality,
purity, and stability of the recombinant product. Bioassays
establish its activity.
Self-Assessment Questions
Questions
1. What is the net charge of granulocyte-colony-stimulating
factor at pH 2.0, assuming that all the carboxyl groups are
protonated?
2. Based on the above calculation, do you expect the protein to
unfold at pH 2.0?
3. Design an experiment using blotting techniques to ascertain
the presence of a ligand to a particular receptor.
4. What is the transfer of proteins to a membrane such as
nitrocellulose or PVDF called?
5. What is the name of the assay in which an antibody is adsorbed
to a plastic microtitration plate and then used to quantify the
amount of a protein using a secondary antibody conjugated with
horseradish peroxidase?
6. In 2-dimensional electrophoresis, what is the first method of
separation?
7. What is the method for separating proteins in solution based
on molecular size called?
8. Why are large protein particles more immunogenic?
Answers
1. Based on the assumption that glutamyl and aspartyl residues
are uncharged at this pH, all the charges come from protonated
histidyl, lysyl, and arginyl residues and the amino terminus,
i.e., 5 His + 4 Lys + 5 Arg + N-terminal = 15.
2. Whether a protein unfolds or remains folded depends on the
balance between the stabilizing and destabilizing forces. At pH
2.0, extensive positive charges destabilize the protein, but
whether such destabilization is sufficient to unfold the protein
depends on how stable the protein is in the native state. The
charged state alone cannot predict whether a protein will unfold.
3. A solution containing the putative ligand is subjected to
SDS-PAGE. After blotting the proteins in the gel to a membrane,
it is probed with a solution containing the receptor. The
receptor, which binds the ligand, may be labeled with agents
suitable for detection or, alternatively, the complex can
subsequently be probed with an antibody to the receptor and
developed as for an immunoblot. Note that the reciprocal of this
can be done as well, in which the receptor is subjected to
SDS-PAGE and the blot is probed with the ligand.
4. This method is called blotting. If an electric current is
used, then the method is called electroblotting.
5. This assay is called an ELISA, enzyme-linked immunosorbent
assay.
6. Either isoelectric focusing or native polyacrylamide
electrophoresis. The second dimension is performed in the
presence of the detergent sodium dodecyl sulfate.
7. Size-exclusion chromatography.
8. The immune system is designed to fight viral infections and
hence generates antibodies against foreign particles the size of
a virus. When pharmaceutical proteins aggregate into particles of
this size, the immune system recognizes them as viruslike (cf.
Chap. 6).