So far this book has focused mainly on
outcomes, the things that we can observe that tell us that
epigenetic events happen. But every biological phenomenon has a
physical basis and that’s what this chapter is about. The
epigenetic outcomes we’ve described are all a result of variations
in expression of genes. The cells of the retina express a different
set of genes from the cells in the bladder, for example. But how do
the different cell types switch different sets of genes on or
off?
The specialised cell types in the retina
and in the bladder are each at the bottom of one of the troughs in
Waddington’s epigenetic landscape. The work of both John Gurdon and
Shinya Yamanaka showed us that whatever mechanism cells use for
staying in these troughs, it’s not anything to do with changing the
DNA blueprint of the cell. That remains intact and unchanged.
Therefore keeping specific sets of genes turned on or off must
happen through some other mechanism, one that can be maintained for
a really long time. We know this must be the case because some
cells, like the neurons in our brains, are remarkably long-lived.
The neurons in the brain of an 85-year-old person, for example, are
about 85 years of age. They formed when the individual was very
young, and then stayed the same for the rest of their
life.
But other cells are different. The top
layer of skin cells, the epidermis, is replaced about every five
weeks, from constantly dividing stem cells in the deeper layers of
that tissue. These stem cells always produce new skin cells, and
not, for example, muscle cells. Therefore the system that keeps
certain sets of genes switched on or off must also be a mechanism
that can be passed on from parent cell to daughter cell every time
there is a cell division.
This creates a paradox. Researchers have
known since the work of Oswald Avery and colleagues in the
mid-1940s that DNA is the material in cells that carries our
genetic information. If the DNA stays the same in different cell
types in one individual, how can the incredibly precise patterns of
gene expression be transmitted down through the generations of cell
division?
Our analogy of actors reading a script is
again useful. Baz Luhrmann hands Leonardo DiCaprio Shakespeare’s
script for Romeo and Juliet, on which the director has
written or typed various notes – directions, camera placements and
lots of additional technical information. Whenever Leo’s copy of
the script is photocopied, Baz Luhrmann’s additional information is
copied along with it. Claire Danes also has the script for Romeo
and Juliet. The notes on her copy are different from those on
her co-star’s, but will also survive photocopying. That’s how
epigenetic regulation of gene expression occurs – different cells
have the same DNA blueprint (the original author’s script) but
carry varied molecular modifications (the shooting script) which
can be transmitted from mother cell to daughter cell during cell
division.
These modifications to DNA don’t change
the essential nature of the A, C, G and T alphabet of our genetic
script, our blueprint. When a gene is switched on and copied to
make mRNA, that mRNA has exactly the same sequence, controlled by
the base-pairing rules, irrespective of whether or not the gene is
carrying an epigenetic addition. Similarly, when the DNA is copied
to form new chromosomes for cell division, the same A, C, G and T
sequences are copied.
Since epigenetic modifications don’t
change what a gene codes for, what do they do? Basically, they can
dramatically change how well a gene is expressed, or if it is
expressed at all. Epigenetic modifications can also be passed on
when a cell divides, so this provides a mechanism for how control
of gene expression stays consistent from mother cell to daughter
cell. That’s why skin stem cells only give rise to more skin cells,
not to any other cell type.
Sticking a grape on DNA
The first epigenetic modification to be
identified was DNA methylation. Methylation means the addition of a
methyl group to another chemical, in this case DNA. A methyl group
is very small. It’s just one carbon atom linked to three hydrogen
atoms. Chemists describe atoms and molecules by their ‘molecular
weight’, where the atom of each element has a different weight. The
average molecular weight of a base-pair is around 600 Da (the Da
stands for Daltons, the unit that is used for molecular weight). A
methyl group only weighs 15 Da. By adding a methyl group the weight
of the base-pair is only increased by 2.5 per cent. A bit like
sticking a grape on a tennis ball.
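The arithmetic behind the grape comparison can be checked in a few lines. This is only a minimal sketch that reuses the round figures quoted above (about 600 Da for an average base-pair and 15 Da for a methyl group); the variable names are our own and are not standard terms.

```python
# Round figures quoted in the text, in Daltons (Da).
BASE_PAIR_DA = 600.0  # approximate average weight of one base-pair
METHYL_DA = 15.0      # one carbon (about 12 Da) plus three hydrogens (about 1 Da each)

# Relative increase in weight when a single methyl group is added.
percent_increase = METHYL_DA / BASE_PAIR_DA * 100

print(f"A methyl group adds about {percent_increase:.1f} per cent to the weight of a base-pair")
```

Swapping in a more exact base-pair weight changes the answer only slightly, which is the point of the comparison: the modification is tiny relative to the thing it decorates.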
Figure 4.1 shows
what DNA methylation looks like chemically.
The base shown is C – cytosine. It’s the
only one of the four DNA bases that gets methylated, to form
5-methylcytosine. The ‘5’ refers to the position on the ring where
the methyl is added, not to the number of methyl groups; there’s
always only one of these. This methylation reaction is carried out
in our cells, and those of most other organisms, by one of three
enzymes called DNMT1, DNMT3A or DNMT3B. DNMT stands for
DNA methyltransferase. The DNMTs are examples of
epigenetic ‘writers’ – enzymes that create the epigenetic code.
Most of the time these enzymes will only add a methyl group to a C
that is followed by a G. C followed by G is known as
CpG.

Figure 4.1 The
chemical structures of the DNA base cytosine and its epigenetically
modified form, 5-methylcytosine. C: carbon; H: hydrogen; N:
nitrogen; O: oxygen. For simplicity, some carbon atoms have not
been explicitly shown, but are present where there is a junction of
two lines.
This CpG methylation is an epigenetic
modification, which is also known as an epigenetic mark. The
chemical group is ‘stuck onto’ DNA but doesn’t actually alter the
underlying genetic sequence. The C has been decorated rather than
changed. Given that the modification is so small, it’s perhaps
surprising that it will come up over and over again in this book,
and in any discussion of epigenetics. This is because methylation
of DNA has profound effects on how genes are expressed, and
ultimately on cellular, tissue and whole-body
functions.
In the early 1980s it was shown that if
you injected DNA into mammalian cells, the amount of methylation on
the injected DNA affected how well it was transcribed into RNA. The
more methylated the injected DNA was, the less transcription
occurred [1]. In other words, high levels
of DNA methylation were associated with genes that were switched
off. However, it wasn’t clear how significant this was for the
genes normally found in the nuclei of cells, rather than ones that
were injected into cells.
The key work in establishing the
importance of methylation in mammalian cells came out of the
laboratory of Adrian Bird, who has spent most of his scientific
career in Edinburgh, Conrad Waddington’s old stomping ground.
Professor Bird is a Fellow of the Royal Society and a former
Governor of the Wellcome Trust, the enormously influential
independent funding agency in UK science. He is one of those
traditional British scientific types – understated, soft-spoken,
non-flashy and drily funny. His lack of self-promotion is in
contrast to his stellar international reputation, where he is
widely acknowledged as the godfather of DNA methylation and its
role in controlling gene expression.
In 1985 Adrian Bird published a key paper
in Cell showing that most CpG motifs were not randomly
distributed throughout the genome. Instead the majority of CpG
pairs were concentrated just upstream of certain genes, in the
promoter region [2]. Promoters are the stretches
of the genome where the DNA transcription complexes bind and start
copying DNA to form RNA. Regions where there is a high
concentration of CpG motifs are called CpG islands.
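To make the idea of a region being rich in CpG motifs concrete, here is a minimal sketch that counts CpGs in a stretch of sequence and compares the count with what chance alone would give. The function name `cpg_stats` and the example sequences are hypothetical, and the observed-to-expected ratio is a commonly used measure of CpG enrichment rather than something defined in this chapter. Real CpG island detection also applies length and window criteria that are left out here.

```python
def cpg_stats(sequence: str) -> dict:
    """Count CpG motifs and compare with the number expected by chance."""
    seq = sequence.upper()
    length = len(seq)
    c_count = seq.count("C")
    g_count = seq.count("G")
    # "CG" cannot overlap with itself, so str.count gives the exact number of motifs.
    cpg_count = seq.count("CG")
    # Expected number of CpGs if the Cs and Gs were scattered independently.
    expected = (c_count * g_count) / length if length else 0.0
    return {
        "cpg": cpg_count,
        "gc_fraction": (c_count + g_count) / length if length else 0.0,
        "obs_over_exp": cpg_count / expected if expected else 0.0,
    }


# Two made-up sequences, purely for illustration.
island_like = "CGCGGCGCTACGCGCGATCG"
background = "ATTGCATTAGCCTTAAGTCA"

for name, seq in [("island-like", island_like), ("background", background)]:
    print(name, cpg_stats(seq))
```

A sequence in which C is followed by G more often than chance predicts has a ratio well above that of the background sequence, which is the signature of the regions described above.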
In about 60 per cent of the genes that
code for proteins, the promoters lie within CpG islands. When these
genes are active, the levels of methylation in the CpG island are
low. The CpG islands tend to be highly methylated only when the
genes are switched off. Different cell types express different
genes, so unsurprisingly the patterns of CpG island methylation are
also different across different cell types.
For quite some time there was
considerable debate about what this association meant. It was the
old cause or effect debate. One interpretation was that DNA
methylation was essentially a historical modification – genes were
repressed by some unknown mechanism and then the DNA became
methylated. In this model, DNA methylation was just a downstream
consequence of gene repression. The other interpretation was that
the CpG island became methylated, and it was this methylation that
switched the gene off. In this model the epigenetic modification
actually causes the change in gene expression. Although there is
still the occasional argument about this among competing labs, the
vast majority of scientists in this field now believe that the data
generated in the quarter of a century since Adrian Bird’s paper are
consistent with the second, causal model. Under most circumstances,
methylation of the CpG island at the start of a gene turns that
gene off.
Adrian Bird went on to investigate how
DNA methylation switches genes off. He showed that when DNA is
methylated, it binds a protein called MeCP2 (Methyl CpG
binding protein 2) [3].
However, this protein won’t bind to unmethylated CpG motifs, which
is pretty amazing when we look back at Figure
4.1 and think how similar the methylated and unmethylated forms
of cytosine really are. The enzymes that add the methyl group to
DNA have been described as writers of the epigenetic code. MeCP2
doesn’t add any modifications to DNA. Its role is to enable the
cell to interpret the modifications on a DNA region. MeCP2 is an
example of a ‘reader’ of the epigenetic code.
Once MeCP2 binds to 5-methylcytosine in a
gene promoter it seems to do a number of things. It attracts other
proteins that also help to switch the gene off [4]. It
may also stop the DNA transcription machinery from binding to the
gene promoter, and this prevents the mRNA messenger molecule from
being produced [5]. Where genes and their
promoters are very heavily methylated, binding of MeCP2 seems to be
part of a process where that region of a chromosome gets shut down
almost permanently. The DNA becomes incredibly tightly coiled up
and the gene transcription machinery can’t get access to the
base-pairs to make mRNA copies.
This is one of the reasons why DNA
methylation is so important. Remember those 85-year-old neurons in
the brains of senior citizens? For over eight decades DNA
methylation has kept certain regions of the genome incredibly
tightly compacted and so the neuron has kept certain genes
completely repressed. This is why our brain cells never produce
haemoglobin, for example, or digestive enzymes.
But what about the other situation, the
example of skin stem cells dividing very frequently but always just
creating new skin cells, rather than some other cell type such as
bone? In this situation, the pattern of DNA methylation is passed
from mother cell to daughter cells. When the two strands of the DNA
double helix separate, each gets copied using the base-pairing
principle, as we saw in Chapter 3. Figure 4.2 illustrates what happens when this
replication occurs in a region where the CpG is methylated on the
C.

Figure 4.2 This
schematic shows how DNA methylation patterns can be preserved when
DNA is replicated. The methyl group is represented by the black
circle. Following separation of the parent DNA double helix in step
1, and replication of the DNA strands in step 2, the new strands
are ‘checked’ by the DNA methyltransferase 1 (DNMT1) enzyme. DNMT1
can recognise that a methyl group at a cytosine motif on one strand
of a DNA molecule is not matched on the newly synthesised strand.
DNMT1 transfers a methyl group to the cytosine on the new strand
(step 3). This only occurs where a C and a G are next to each other
in a CpG motif. This process ensures that the DNA methylation
patterns are maintained following DNA replication and cell
division.
DNMT1 can recognise if a CpG motif is
only methylated on one strand. When DNMT1 detects this imbalance,
it replaces the ‘missing’ methylation on the newly copied strand.
The daughter cells will therefore end up with the same DNA
methylation patterns as the parent cell. As a consequence, they
will repress the same genes as the parent cell and the skin cells
will stay as skin cells.
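The copying logic just described can be written out as a toy model. This is a deliberately simplified sketch, not a description of how the enzyme works chemically: each CpG motif is reduced to two true/false flags (one per DNA strand), and the names `CpGSite`, `replicate` and `dnmt1_maintain` are invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CpGSite:
    """One CpG motif in double-stranded DNA.

    Each flag records whether the C on that strand carries a methyl group.
    """
    top: bool
    bottom: bool


def replicate(duplex):
    """Separate the two strands and pair each with a new, unmethylated strand."""
    daughter_a = [CpGSite(top=s.top, bottom=False) for s in duplex]     # keeps the old top strand
    daughter_b = [CpGSite(top=False, bottom=s.bottom) for s in duplex]  # keeps the old bottom strand
    return daughter_a, daughter_b


def dnmt1_maintain(duplex):
    """Add the 'missing' methyl group wherever only one strand is methylated."""
    return [CpGSite(True, True) if (s.top or s.bottom) else s for s in duplex]


# A parent region with three CpG motifs: methylated, unmethylated, methylated.
parent = [CpGSite(True, True), CpGSite(False, False), CpGSite(True, True)]

half_a, half_b = replicate(parent)
daughter_a = dnmt1_maintain(half_a)
daughter_b = dnmt1_maintain(half_b)

print("Daughters match parent:", daughter_a == parent and daughter_b == parent)
```

The important feature is that a motif with no methylation on either strand is left alone, so unmethylated regions stay unmethylated and the whole pattern, not just the methylated part, is inherited.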
Miracle mice on YouTube
Epigenetics has a tendency to crop up
in places where scientists really aren’t expecting it. One of the
most interesting examples of this in recent years has related to
MeCP2, the protein that reads the DNA methylation mark. Several
years ago, the now discredited theory of the MMR vaccine causing
autism was at its height, and getting lots of coverage in the
general media. One very respected UK broadsheet newspaper covered
in depth the terribly sad story of a little girl. As a baby she
initially met all the usual developmental milestones. Shortly after
receiving an MMR jab not long before her first birthday she began
to deteriorate rapidly, losing most of the skills she had gained.
By the time the journalist wrote the article, the little girl was
about four years old and was described as having the most severely
autistic symptoms the author had ever seen. She had not developed
language, appeared to have very severe learning difficulties and
her actions were very limited and repetitive, with very few
purposeful hand actions (she no longer reached out for food, for
example). Development of this incredibly severe disability was
undoubtedly a tragedy for her and for her family.
But if a reader with any sort of
background in neurogenetics read this article, two things probably
struck them immediately. The first was that it’s very unusual – not
unheard of but pretty uncommon – for girls to present with such
severe autism. This is much more common in boys. The second thing
that would have struck them was that this case sounded exactly the
same as a rare genetic disorder called Rett syndrome, right down to
the normal early development and the timing and types of symptoms.
It’s just coincidence that the symptoms of Rett syndrome, and
indeed of most types of autism, first start becoming obvious at
around the same age as when infants are typically given the MMR
vaccination.
But what does this have to do with
epigenetics? In 1999, a group led by the eminent neurogeneticist
Huda Zoghbi at the Baylor College of Medicine in Houston, Texas
showed that the majority of cases of Rett syndrome are caused by
mutations in MeCP2, the gene which encodes the reader of
methylated DNA. The children with this disorder have a mutation in
the MeCP2 gene which means that they don’t produce a
functional MeCP2 protein. Although their cells are perfectly
capable of methylating DNA correctly, the cells can’t read this
part of the epigenetic code properly.
The severe clinical symptoms of children
with the MeCP2 mutation tell us that reading the epigenetic
code properly is very important. But they also tell us other
things. Not all the tissues of girls with Rett syndrome are equally
affected, so perhaps this particular epigenetic pathway is more
important in some tissues than others. Because the girls develop
severe mental retardation, we can deduce that having the right
amount of normal MeCP2 protein is really important in the brain.
Given that these children seem to be fairly unaffected in other
tissues such as liver or kidney, perhaps MeCP2 activity isn’t as
important in these tissues. It could be that DNA methylation itself
isn’t so critical in these organs, or maybe these tissues contain
other proteins in addition to MeCP2 that can read this part of the
epigenetic code.
Long-term, scientists, physicians and
families of children with Rett syndrome would dearly love to be
able to use our increased understanding of the disease to help us
find better treatments. This is a huge challenge, as we would be
trying to intervene in a condition that affects the brain as a
result of a gene mutation that is present throughout development,
and beyond.
One of the most debilitating aspects of
Rett syndrome is the profound mental retardation that is an almost
universal symptom. Nobody knew if it would be possible to reverse a
neurodevelopmental problem such as mental retardation once it had
become established, but the general feeling about this wasn’t
optimistic. Adrian Bird remains a major figure in our story. In
2007 he published an astonishing paper in Science, in which
he and his colleagues showed that Rett syndrome could be reversed,
in a mouse model of the disease.
Adrian Bird and his colleagues created a
cloned strain of mice in which the Mecp2 gene was
inactivated. They used the types of technologies pioneered by
Rudolf Jaenisch. These mice developed severe neurological symptoms,
and as adults they exhibited hardly any normal mouse activities. If
you put a normal mouse in the middle of a big white box, it will
almost immediately begin to explore its surroundings. It will move
around a lot, it will tend to follow the edges of the box just like
a normal house mouse scurrying along by the skirting boards, and it
will frequently rear up on its back legs to get a better view. A
mouse with the Mecp2 mutation does very few of these things
– put it in the middle of a big white box and it will tend to stay
there.
When Adrian Bird created his mouse strain
with the Mecp2 mutation, he engineered it so that the
mice would also be carrying a normal copy of Mecp2. However,
this normal copy was silent – it wasn’t switched on in the mouse
cells. The really clever bit of this experiment was that if the
mice were given a specific harmless chemical, the normal
Mecp2 gene became activated. This allowed the experimenters
to let the mice develop and grow up with no Mecp2 in their cells,
and then at a time of the scientists’ choosing, the Mecp2
gene could be switched on.
The results of switching on the
Mecp2 gene were extraordinary. Mice which previously just
sat in the middle of the white box suddenly turned into the curious
explorers that mice should be [6]. You can find clips of this
on YouTube, along with interviews with Adrian Bird where he
basically concedes that he really never expected to see anything so
dramatic [7].
The reason this experiment is so
important is that it offers hope that we may be able to find new
treatments for really complex neurological conditions. Prior to the
publication of this Science paper, there had been an
assumption that once a complex neurological condition has
developed, it is impossible to reverse it. This was especially
presumed to be the case for any condition that arises
developmentally, i.e. in the womb or in early infancy. This is a
critical period when the mammalian brain is making so many of the
connections and structures that are used throughout the rest of
life. The results from the Mecp2 mutant mice suggest that in
Rett syndrome, maybe all the bits of cellular machinery that are
required for normal neurological function are still there in the
brain – they just need to be activated properly. If this holds true
for humans (and at a brain level we aren’t really that
different from mice) this offers hope that maybe we can start to
develop therapies to reverse conditions as complex as mental
retardation. We can’t do this the way it was done in the mouse, as
that was a genetic approach that can only be used in experimental
animals and not in humans, but it suggests that it is worth trying
to develop suitable drugs that have a similar effect.
DNA methylation is clearly really
important. Defects in reading DNA methylation can lead to a complex
and devastating neurological disorder that leaves children with
Rett syndrome severely disabled throughout their lives. DNA
methylation is also essential for maintaining the correct patterns
of gene expression in different cell types, either for several
decades in the case of our long-lived neurons, or in all daughters
of a stem cell in a constantly-replaced tissue such as
skin.
But we still have a conceptual problem.
Neurons are very different from skin cells. If both cell types use
DNA methylation to switch off certain genes, and to keep them
switched off, they must be using the methylation at different sets
of genes. Otherwise they would all be expressing the same genes, to
the same extent, and they would inevitably then be the same types
of cells instead of being neurons and skin cells.
The solution to how two cell types can
use the same mechanism to create such different outcomes lies in
how DNA methylation gets targeted to different regions of the
genome in different cell types. This takes us into the second great
area of molecular epigenetics. Proteins.
DNA has a friend
DNA is often described as if it’s a
naked molecule, i.e. DNA and nothing else. If we visualise it at
all in our minds, a DNA double helix probably looks like a very
long twisty railway track. This is pretty much how we described it
in the previous chapter. But in reality it’s actually nothing like
that, and many of the great breakthroughs in epigenetics came about
when scientists began to appreciate this fully.
DNA is intimately associated with
proteins, and in particular with proteins called histones. At the
moment most attention in epigenetics and gene regulation is focused
on four particular histone proteins called H2A, H2B, H3 and H4.
These histones have a structure known as ‘globular’, as they are
folded into compact ball-like shapes. However, each also has a
loose floppy chain of amino acids that sticks out of the ball,
which is called the histone tail. Two copies of each of these four
histone proteins come together to form a tight structure called the
histone octamer (so called because it’s formed of eight individual
histones).
It might be easiest to think of this
octamer as eight ping-pong balls stacked on top of each other in
two layers. DNA coils tightly around this protein stack like a long
liquorice whip around marshmallows, to form a structure called the
nucleosome. One hundred and forty-seven base-pairs of DNA coil
around each histone octamer. Figure 4.3 is a very
simplified representation of the structure of a nucleosome, where
the white strand is DNA and the grey wiggles are the histone
tails.
If we had read anything about histones
even just fifteen years ago, they would probably have been
described as ‘packaging proteins’, and left at that. It’s certainly
true that DNA has to be packaged. The nucleus of a cell is usually
only about 10 microns in diameter – that’s 1/100th of a millimetre
– and if the DNA in a cell was just left all floppy and loose it
could stretch for 2 metres. The DNA is curled tightly around the
histone octamers and these are all stacked closely on top of each
other.
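The scale of the packaging problem is easy to work out from the two figures just given. The short sketch below only restates that arithmetic, using the rounded values from the text; the variable names are our own.

```python
# Rounded figures from the text, converted to metres.
NUCLEUS_DIAMETER_M = 10e-6  # 10 microns
DNA_LENGTH_M = 2.0          # total length of DNA in one cell if stretched out

# Express the nucleus diameter in millimetres to check the "1/100th" comparison.
nucleus_diameter_mm = NUCLEUS_DIAMETER_M * 1000

# How many times longer the DNA is than the nucleus is wide.
length_ratio = DNA_LENGTH_M / NUCLEUS_DIAMETER_M

print(f"Nucleus diameter: {nucleus_diameter_mm:.2f} mm")
print(f"The DNA is roughly {length_ratio:,.0f} times longer than the nucleus is wide")
```

This is a comparison of lengths only. It says nothing about volumes, but it gives a feel for how tightly the DNA has to be wound.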
Certain regions of our chromosomes have
an extreme form of that sort of structure almost all the time.
These tend to be regions that don’t really code for any genes.
Instead, they are structural regions such as the very ends of
chromosomes, or areas that are important for separating chromosomes
after DNA has been duplicated for cell division.

Figure 4.3 The
histone octamer (2 molecules each of histones H2A, H2B, H3 and H4)
stacked tightly together, and with DNA wrapped around it, forms the
basic unit of chromatin called the nucleosome.
The regions of DNA that are really
heavily methylated also have this hyper-condensed structure and the
methylation is very important in establishing this configuration.
It’s one of the mechanisms used to keep certain genes switched off
for decades in long-lived cell types such as neurons.
But what about those regions that aren’t
screwed down tight, where there are genes that are switched on or
have the potential to be switched on? This is where the histones
really come into play. There is so much more to histones than just
acting as a molecular reel for wrapping DNA around. If DNA
methylation represents the semi-permanent additional notes on our
script of Romeo and Juliet, histone modifications are the
more tentative additions. They may be like pencil marks, that
survive a few rounds of photocopying but eventually fade out. They
may be even more transient, like Post-It notes, used very
temporarily.
A substantial number of the breakthroughs
in this field have come from the lab of Professor David Allis at
Rockefeller University in New York. He’s a trim, neat, clean-shaven
American who looks much younger than his 60 years and is
exceptionally popular amongst his peers. Like many epigeneticists,
he began his career in the field of developmental biology. Just
like Adrian Bird, and John Gurdon before him, David Allis wears his
stellar reputation in epigenetics very lightly. In a remarkable
flurry of papers in 1996, he and his colleagues showed that histone
proteins were chemically modified in cells, and that this
modification increased expression of genes near a specific modified
nucleosome [8].
The histone modification that David Allis
identified was called acetylation. This is the addition of a
chemical group called an acetyl, in this case to a specific amino
acid named lysine on the floppy tail of one of the histones.
Figure 4.4 shows the structures of lysine
and acetyl-lysine, and we can again see that the modification is
relatively small. Like DNA methylation, lysine acetylation is an
epigenetic mechanism for altering gene expression which doesn’t
change the underlying gene sequence.

Figure 4.4 The
chemical structures of the amino acid lysine and its epigenetically
modified form, acetyl-lysine. C: carbon; H: hydrogen; N: nitrogen;
O: oxygen. For simplicity, some carbon atoms have not been
explicitly shown, but are present where there is a junction of two
lines.
So back in 1996 there was a nice simple
story. DNA methylation turned genes off and histone acetylation
turned genes on. But gene expression is much more subtle than genes
being either on or off. Gene expression is rarely an on-off toggle
switch; it’s much more like the volume dial on a traditional radio.
So perhaps it was unsurprising that there turned out to be more
than one histone modification. In fact, more than 50 different
epigenetic modifications to histone proteins have been identified
since David Allis’s initial work, both by him and by a large number
of other laboratories [9]. These modifications all
alter gene expression but not always in the same way. Some histone
modifications push gene expression up, others drive it down. The
pattern of modifications is referred to as a histone code [10].
The problem that epigeneticists face is that this is a code that is
extraordinarily difficult to read.
Imagine a chromosome as the trunk of a
very big Christmas tree. The branches sticking out all over the
tree are the histone tails and these can be decorated with
epigenetic modifications. We pick up the purple baubles and we put
one, two or three purple baubles on some of the branches. We also
have green icicle decorations and we can put either one or two of
these on some branches, some of which already have purple baubles
on them. Then we pick up the red stars but are told we can’t put
these on a branch if the adjacent branch has any purple baubles.
The gold snowflakes and green icicles can’t be present on the same
branch. And so it goes on, with increasingly complex rules and
patterns. Eventually, we’ve used all our decorations and we wind
the lights around the tree. The bulbs represent individual genes.
By a magical piece of software programming, the brightness of each
bulb is determined by the precise conformation of the decorations
surrounding it. The likelihood is that we would really struggle to
predict the brightness of most of the bulbs because the pattern of
Christmas decorations is so complicated.
That’s where scientists currently are in
terms of predicting how all the various histone modification
combinations work together to influence gene expression. It’s
reasonably clear in many cases what individual modifications can
do, but it’s not yet possible to make accurate predictions from
complex combinations.
There are major efforts being made to
learn how to understand this code, with multiple labs throughout
the world collaborating or competing in the use of the fastest and
most complex technologies to address this problem. The reason for
this is that although we may not be able to read the code properly
yet, we know enough about it to understand that it’s extremely
important.
Build a better mousetrap
Some of the key evidence comes from
developmental biology, the field from which so many great
epigenetic investigators have emerged. As we have already
described, the single-celled zygote divides, and very quickly
daughter cells start to take on discrete functions. The first
noticeable event is that the cells of the early embryo split into
the inner cell mass (ICM) and the trophectoderm. The ICM cells in
particular start to differentiate to form an increasing number of
different cell types. This rolling of the cells down the epigenetic
landscape is, to quite a large degree, a self-perpetuating
system.
The key concept to grasp at this stage is
the way that waves of gene expression and epigenetic modifications
follow on from each other. A useful analogy for this is the game of
Mousetrap, first produced in the early 1960s and still on
sale today. Players have to build an insanely complex mouse trap
during the course of the game. The trap is activated at one end by
the simple act of releasing a ball. This ball passes down and
through all sorts of contraptions including a slide, a kicking
boot, a flight of steps and a man jumping off a diving board. As
long as the pieces have been put together properly, the whole
ridiculous cascade operates perfectly, and the toy mice get caught
under a net. If one of the pieces is just slightly mis-aligned, the
crazy sequence judders to a halt and the trap doesn’t
work.
The developing embryo is like
Mousetrap. The zygote is pre-loaded with certain proteins,
mainly from the egg cytoplasm. These egg-derived proteins move into
the nucleus and bind to target genes, which we’ll call Boots
(in honour of Mousetrap), and regulate their expression.
They also attract a select few epigenetic enzymes to the
Boots genes. These epigenetic enzymes may also have been
‘donated’ from the egg cytoplasm and they set up longer-lasting
modifications to the DNA and histone proteins of chromatin, also
influencing how these Boots genes are switched on or off.
The Boots proteins bind to the Divers genes, and
switch these on. Some of these Divers genes may themselves
encode epigenetic enzymes, which will form complexes on members of
the Slides family of genes, and so on. The genetic and
epigenetic proteins work together in a seamless orderly procession,
just like the events in Mousetrap once the ball has been
released. Sometimes a cell will express a little more or a little
less of a key factor, one whose expression is on a finely balanced
threshold. This has the potential to alter the developmental path
that the cell takes, as if twenty Mousetrap games had been
connected up. Slight deviations in how the pieces were fitted
together, or how the ball rolled at critical moments, would trigger
one trap and not another.
The names in our analogy are made up, but
we can apply this to a real example. One of the key proteins in the
very earliest stages of embryonic development is Oct4. Oct4 protein
binds to certain key genes, and also attracts a specific epigenetic
enzyme. This enzyme modifies the chromatin and alters the
regulation of that gene. Both Oct4 and the epigenetic enzyme with
which it works are essential for development of the early embryo.
If either is absent, the zygote can’t even develop as far as
creating an ICM.
The patterns of gene expression in the
early embryo eventually feed back on themselves. When certain
proteins are expressed, they can bind to the Oct4 promoter
and switch off expression of this gene. Under normal circumstances,
somatic cells just don’t express Oct4. It would be too dangerous
for them to do so because Oct4 could disrupt the normal patterns of
gene expression in differentiated cells, and make them more like
stem cells.
This is exactly what Shinya Yamanaka did
when he used Oct4 as a reprogramming factor. By artificially
creating very high levels of Oct4 in differentiated cells, he was
able to ‘fool’ the cells into acting like early developmental
cells. Even the epigenetic modifications were reset – that’s how
powerful this gene is.
Normal development has yielded important
evidence of the significance of epigenetic modifications in
controlling cell fate. Cases where development goes awry have also
shown us how important epigenetics can be.
For example, a 2010 publication in
Nature Genetics identified the mutations that cause a rare
disease called Kabuki syndrome. Kabuki syndrome is a complex
developmental disorder with a range of symptoms that include mental
retardation, short stature, facial abnormalities and cleft palate.
The paper showed that Kabuki syndrome is caused by mutations in a
gene called MLL2 [11]. The MLL2 protein is an
epigenetic writer that adds methyl groups to a specific lysine
amino acid at position 4 on histone H3. Patients with this mutation
are unable to write their epigenetic code properly, and this leads
to their symptoms.
Human diseases can also be caused by
mutations in enzymes that remove epigenetic modifications, i.e.
‘erasers’ of the epigenetic code. Mutations in a gene called
PHF8, whose protein removes methyl groups from a lysine at position
20 on histone H4, cause a syndrome of mental retardation and cleft
palate [12]. In these cases, the
patient’s cells put epigenetic modifications on without problems,
but don’t remove them properly.
It’s interesting that although the MLL2
and PHF8 proteins have different roles, the clinical symptoms
caused by mutations in these genes have overlaps in their
presentation. Both lead to cleft palate and mental retardation.
Both of these symptoms are classically considered as reflecting
problems during development. Epigenetic pathways are important
throughout life, but seem to be particularly significant during
development.
In addition to these histone writers and
erasers there are over 100 proteins that act as ‘readers’ of this
histone code by binding to epigenetic marks. These readers attract
other proteins and build up complexes that switch on or turn off
gene expression. This is similar to the way that MeCP2 helps turn
off expression of genes that are carrying DNA
methylation.
Histone modifications are different to
DNA methylation in a very important way. DNA methylation is a very
stable epigenetic change. Once a DNA region has become methylated
it will tend to stay methylated under most conditions. That’s why
this epigenetic modification is so important for keeping neurons as
neurons, and why there are no teeth in our eyeballs. Although DNA
methylation can be removed in cells, this is usually only
under very specific circumstances and it’s quite unusual for this
to happen.
Most histone modifications are much more
plastic than this. A specific modification can be put on a histone
at a particular gene, removed and then later put back on again.
This happens in response to all sorts of stimuli from outside the
cell nucleus. The stimuli can vary enormously. In some cell types
the histone code may change in response to hormones. These include
insulin signalling to our muscle cells, and oestrogen affecting the
cells of the breast during the menstrual cycle. In the brain the
histone code can change in response to addictive drugs such as
cocaine, whereas in the cells lining the gut, the pattern of
epigenetic modifications will alter depending on the amounts of
fatty acids produced by the bacteria in our intestines. These
changes in the histone code are one of the key ways in which
nurture (the environment) interacts with nature (our genes) to
create the complexity of every higher organism on
earth.
Histone modifications also allow cells to
‘try out’ particular patterns of gene expression, especially during
development. Genes become temporarily inactivated when repressive
histone modifications (those which drive gene expression down) are
established on the histones near those genes. If there is an
advantage to the cell in those genes being switched off, the
histone modifications may last long enough to lead to DNA
methylation. The histone modifications attract reader proteins that
build up complexes of other proteins on the nucleosome. In some
cases the complexes may include DNMT3A or DNMT3B, two of the
enzymes that deposit methyl groups on CpG DNA motifs. Under these
circumstances, the DNMT3A or 3B can ‘reach across’ from the complex
on the histone and methylate the adjacent DNA. If enough DNA
methylation takes place, expression of the gene will shut down. In
extreme circumstances the whole chromosome region may become
hyper-compacted and inactivated for multiple cell divisions, or for
decades in a non-dividing cell like a neuron.
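The ‘try out’ idea can also be written as a toy model in Python. It is a sketch under loud assumptions: the function name becomes_methylated and the rule that a repressive mark must persist for three consecutive time steps before DNMT3A or DNMT3B acts are hypothetical, chosen only to show the logic of a plastic mark leading to a stable one.

```python
def becomes_methylated(mark_present_by_step, steps_needed=3):
    """Return True if the gene ends up DNA-methylated.

    mark_present_by_step: one boolean per time step, saying whether the
        repressive histone mark is on the gene at that step.
    steps_needed: hypothetical number of consecutive steps the mark must
        persist before DNMT3A/3B methylates the adjacent DNA.
    """
    consecutive = 0
    for mark_present in mark_present_by_step:
        if mark_present:
            consecutive += 1
        else:
            consecutive = 0  # histone marks are plastic: removal resets the count
        if consecutive >= steps_needed:
            return True  # DNA methylation is treated as stable once laid down
    return False


if __name__ == "__main__":
    flickering = [True, False, True, False, True]
    sustained = [True, True, True, False]
    print(becomes_methylated(flickering), becomes_methylated(sustained))
```

In this sketch a mark that flickers on and off never leads to DNA methylation, while one that is sustained long enough does, and the later loss of the histone mark makes no difference.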
Why have organisms evolved such complex
patterns of histone modifications to regulate gene expression? The
systems seem particularly complex when you contrast them with the
fairly all-or-nothing effects of DNA methylation. One
reason is probably that the complexity allows sophisticated
fine-tuning of gene expression. Because of this, cells and
organisms can adapt their gene expression appropriately in response
to changes in their environment, such as availability of nutrients
or exposure to viruses. But as we shall see in the next chapter,
this fine-tuning can result in some very strange consequences
indeed.