The structure and the dynamics of the
Universe are determined by the so-called fundamental interactions:
gravitational, electromagnetic, weak, and strong. In their absence,
the Universe would be an immense space filled with ideal gases of
structureless particles. Interactions between “matter” particles
(fermions) are in relativistic quantum physics associated with the
exchange of “wave” particles (bosons)—note that bosons can also
interact among themselves. Such a picture can be visualized (and
observables related to the process can be computed) using the
schematic diagrams invented in 1948 by Richard Feynman: the
Feynman diagrams (Fig. 6.1), which we briefly
presented in Chap. 1.
Each Feynman diagram corresponds to a
specific term of a perturbative expansion of the scattering
amplitude. It is a symbolic graph, where initial and final state
particles are represented by incoming and outgoing lines (which are
not space–time trajectories), and the internal lines represent the
exchange of virtual particles (the term “virtual” meaning that
their energy and momentum are not necessarily related
through the relativistic equation $E^2 = p^2 + m^2$; if they are not, they are said to be
off the mass shell). Solid straight lines are associated with
fermions while wavy, curly, or broken lines are associated with
bosons. Arrows indicate the time flow of the external particles and
antiparticles (in the plot time runs usually from left to right,
but having it running from bottom to top is also a possible
convention). A particle (antiparticle) moving backward in time is
equivalent to its antiparticle (particle) moving forward in
time.
At the lowest order, the two initial-state particles exchange only one
particle mediating the interaction (for instance a photon). Associated
with each vertex (a point where at least three lines meet) is a
number, the coupling parameter (in the case of the electromagnetic
interaction, $z\sqrt{\alpha}$ for a particle with electric charge
$z$), which is related to the
probability of the emission/absorption of the field particle and
thus to the strength of the interaction. Energy–momentum, as well as
all quantum numbers, is conserved at each vertex.
At higher orders, more than one field
particle can be exchanged (second diagram from the left in Fig.
6.1) and there
is an infinite number of possibilities (terms in the perturbative
expansion), whose amplitudes are proportional
to increasing powers of the coupling parameters. Although the
transition probability is proportional to the squared modulus of the sum of
all the terms, if the coupling parameters are small enough, just
the first diagrams will be relevant. However, even low-order
diagrams can give an infinite contribution. Indeed in the second
diagram, there is a loop of internal particles and an integration
over the exchanged energy–momentum has to be carried out. Since
the virtual particles in the loop are not constrained to the mass shell,
this integration is unbounded and it might, in principle, diverge. Curing divergent
integrals (or, in jargon, “canceling infinities”) became the
central problem of quantum field theory in the middle of the
twentieth century (classically the electrostatic self-energy of a
point charged particle is also infinite) and it was successfully
solved in the case of the electromagnetic interaction, as will be
briefly discussed in Sect. 6.2.12, within the renormalization scheme.
Fig. 6.1 Feynman diagrams
The quantum equations for “matter”
(Schrödinger, Klein–Gordon, Dirac equations) must be modified to
incorporate explicitly the couplings with the interaction fields.
The introduction of these new terms makes the equations invariant
under a combined local (space–time
dependent) transformation of the matter and of the interaction
fields (the phase of the fermion wave function and the four-potential
in the case of the electromagnetic interaction).
Conversely requiring that the “matter” quantum equations should be
invariant with respect to local transformation within some internal
symmetry groups implies the existence of well-defined interaction
fields, the gauge fields. These ideas, developed in particular by
Feynman and by Yang and Mills in the 1950s, were applied to the
electromagnetic, weak, and strong interactions field theories; they
provided the framework for the unification of the electromagnetic
and weak interactions (electroweak interactions) which has been
extensively tested with an impressive success (see next chapter)
and may lead to further unification involving strong interaction
(GUTs—Grand Unified Theories) and even gravity (ToE—Theories of
Everything). One could think that we are close to the “end of
physics.” However, the experimental discovery that most of the
energy of the Universe cannot be explained by the known physical
objects quickly dismissed such claim—in fact dark matter and dark
energy represent around 95% of the total energy budget of the
Universe, and they are not explained by present theories.
6.1 The Lagrangian Representation of a Dynamical System
In the quantum world, we usually find
it convenient to use the Lagrangian or the
Hamiltonian representation of a system to compute the equations of
motion. The Lagrangian $L$ of a
system of particles is defined as
$$L = K - V \tag{6.1}$$
where $K$ is the total kinetic
energy of the system and $V$ its
total potential energy.
Any system with $n$ degrees of freedom is fully described
by $n$ generalized coordinates $q_j$
and $n$ generalized velocities $\dot{q}_j$. The equations of motion of the
system are the so-called Euler–Lagrange equations
$$\frac{d}{dt}\frac{\partial L}{\partial \dot{q}_j} = \frac{\partial L}{\partial q_j} \tag{6.2}$$
where the index $j = 1, \ldots, n$ runs over the degrees of freedom. For
example, in the case of a single particle in a conservative field
in one dimension, $x$, one can
write
$$L = \frac{1}{2} m \dot{x}^2 - V(x) \tag{6.3}$$
and applying the Euler–Lagrange equations
$$\frac{d}{dt}\left(m\dot{x}\right) = -\frac{dV}{dx} \;\Longrightarrow\; F = m a$$
(Newton’s law).
Although the mathematics required for
Lagrange’s equations might seem more complicated than Newton’s law,
Lagrange’s equations often make the solution easier, since the
generalized coordinates can be conveniently chosen to exploit
symmetries in the system, and constraint forces are incorporated in
the geometry of the problem.
The Lagrangian is of course not unique:
you can multiply it by a constant factor, for example, or add a
constant, and the equations will not change. You can also add the
four-divergence of an arbitrary vector function: it will cancel
when you apply the Euler–Lagrange equations, and thus the dynamical
equations are not affected.
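The so-called Hamiltonian representation uses instead the
Hamiltonian function $H(p_j, q_j, t)$:
$$H = K + V \tag{6.4}$$
We have already briefly discussed this function in the previous chapter;
it represents the total energy in terms of the generalized
coordinates $q_j$ and of the generalized momenta
$$p_j = \frac{\partial L}{\partial \dot{q}_j} \tag{6.5}$$
The time evolution of the system is obtained from Hamilton’s
equations:
$$\dot{q}_j = \frac{\partial H}{\partial p_j}, \qquad \dot{p}_j = -\frac{\partial H}{\partial q_j} \tag{6.6}$$
The two representations, Lagrangian and Hamiltonian, are
equivalent. For example, in the case of a single particle in a
conservative field in one dimension,
$$H = \frac{p^2}{2m} + V(x) \tag{6.7}$$
and Hamilton’s equations become
$$\frac{dx}{dt} = \frac{p}{m}, \qquad \frac{dp}{dt} = -\frac{dV}{dx} = F \tag{6.8}$$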
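As a minimal numerical sketch of Hamilton's equations for a single particle in one dimension, $dx/dt = p/m$ and $dp/dt = -dV/dx$, the code below integrates them with a leapfrog scheme. The harmonic potential $V(x) = kx^2/2$, the function names, and all numerical values are assumptions chosen for illustration; they do not come from the text.

```python
import math

def dV_dx(x, k=1.0):
    """Derivative of the assumed potential V(x) = k x^2 / 2."""
    return k * x

def hamiltonian(x, p, m=1.0, k=1.0):
    """Total energy H = p^2 / (2m) + V(x) for the assumed potential."""
    return p * p / (2.0 * m) + 0.5 * k * x * x

def evolve(x, p, dt, n_steps, m=1.0, k=1.0):
    """Leapfrog (kick-drift-kick) integration of Hamilton's equations."""
    for _ in range(n_steps):
        p -= 0.5 * dt * dV_dx(x, k)   # half step of dp/dt = -dV/dx
        x += dt * p / m               # full step of dx/dt = p/m
        p -= 0.5 * dt * dV_dx(x, k)   # second half step of dp/dt = -dV/dx
    return x, p

dt, n_steps = 1e-3, 10_000
x0, p0 = 1.0, 0.0
x1, p1 = evolve(x0, p0, dt, n_steps)

# For m = k = 1 the exact solution with these initial conditions is x(t) = cos(t).
x_exact = math.cos(n_steps * dt)
energy_drift = abs(hamiltonian(x1, p1) - hamiltonian(x0, p0))
```

The leapfrog scheme is used here only because it keeps the energy drift small over long integration times; any other integrator would illustrate the same equations.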
We shall use more frequently Lagrangian mechanics. Let us now see
how Lagrangian mechanics simplifies the description of a complex
system.
6.1.1 The Lagrangian and the Noether Theorem
Noether’s theorem is particularly simple when the Lagrangian
representation is used. If the Lagrangian does not depend on the
variable $q_i$, the Euler–Lagrange equation related
to this coordinate becomes
$$\frac{d}{dt}\frac{\partial L}{\partial \dot{q}_i} = 0 \tag{6.9}$$
and thus the quantity
$$\frac{\partial L}{\partial \dot{q}_i} = p_i \tag{6.10}$$
is conserved. For example, the invariance under space translations
implies that linear momentum is conserved. By a similar approach,
we could see that the invariance under rotations implies
that angular momentum is conserved.
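As a worked illustration of the rotational case, consider a particle of mass $m$ moving in a plane under a central potential $V(r)$. The choice of system and the polar coordinates $(r, \theta)$ are assumptions made for this example.

```latex
\begin{align}
L &= \frac{1}{2} m \left( \dot{r}^2 + r^2 \dot{\theta}^2 \right) - V(r) \\
% L does not depend on \theta, so the Euler-Lagrange equation for \theta gives
\frac{d}{dt} \frac{\partial L}{\partial \dot{\theta}} &= \frac{\partial L}{\partial \theta} = 0 \\
% hence the momentum conjugate to \theta (the angular momentum) is constant
p_\theta &= \frac{\partial L}{\partial \dot{\theta}} = m r^2 \dot{\theta} = \text{const.}
\end{align}
```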
6.1.2 Lagrangians and Fields; Lagrangian Density
The Euler–Lagrange equations are derived by
imposing the stationarity of an action $S$ defined as $S = \int dt\, L$; such a form, giving a special role
to time, does not allow a relativistically covariant Lagrangian
$L$.
We can recover relativistic covariance by
using, instead of the Lagrangian, a “Lagrangian density” $\mathcal{L}$, such that the Lagrangian is the
integral of $\mathcal{L}$ over all space,
$$L = \int d^3x\, \mathcal{L} \tag{6.11}$$
Now we can write
$$S = \int d^4x\, \mathcal{L} \tag{6.12}$$
In a quantum mechanical world $\mathcal{L}$ can depend, instead of on
coordinates and velocities, on fields, $\phi(\mathbf{r}, t) = \phi(x^\mu)$, which are meaningful quantities in
the four-dimensional space of relativity. Quantum mechanics
guarantees the invariance of physics with respect to a global rotation $\theta$ of the wave function in
complex space, i.e., the multiplication by a constant phase:
$\phi \to \phi\, e^{i\theta}$. This means that, in general, a
Lagrangian will be a combination of functions $|\phi|^2$ or $|\partial\phi|^2$. The latter are called, with obvious
meaning, kinetic terms.
The same argument leading to the
Euler–Lagrange equations leads now to generalized Euler–Lagrange
equations
$$\partial_\mu \left( \frac{\partial \mathcal{L}}{\partial(\partial_\mu \phi_i)} \right) = \frac{\partial \mathcal{L}}{\partial \phi_i} \tag{6.13}$$
for fields $\phi_i$ ($i = 1, \ldots, n$).
Noether’s theorem guarantees that, if
the Lagrangian density does not depend explicitly on the field
$\phi$, we have a four-current
$$J^\mu \equiv \frac{\partial \mathcal{L}}{\partial(\partial_\mu \phi)} \tag{6.14}$$
subject to the continuity condition
$$\partial_\mu J^\mu = 0 \;\Longleftrightarrow\; \frac{\partial J^0}{\partial t} + \nabla \cdot \mathbf{J} = 0 \tag{6.15}$$
where $J^0$ is the charge density and
$\mathbf{J}$ is the current density. The total
(conserved) charge will be
$$Q = \int_{\text{all space}} J^0\, d^3x \tag{6.16}$$
Hamilton’s formalism can also be extended to relativistic quantum
fields.
In the rest of the book, we shall in
general make use of Lagrangian densities $\mathcal{L}$, but unless otherwise specified we
shall refer to the Lagrangian densities simply as Lagrangians.
6.1.3 Lagrangian Density and Mass
A Lagrangian is in general composed of
generalized coordinates and of their derivatives (or of fields and
their derivatives).
We shall show later that a nonzero
mass (i.e., a positive energy for a state at rest) is associated in
field theory with an expression quadratic in the field; for instance,
in the case of a scalar field,
$$\mathcal{L}_K = \frac{1}{2} m^2 |\phi|^2 \tag{6.17}$$
In natural units, the dimension of the Lagrangian density is [energy$^4$] since the action (6.12) is dimensionless;
the scalar field $\phi$ has thus the dimension of an
energy.
6.2 Quantum Electrodynamics (QED)
Electromagnetic effects have been known since
antiquity, but only during the nineteenth century was the
(classical) theory of electromagnetic interactions firmly
established. In the twentieth century, the marriage between electrodynamics and quantum
mechanics (Maxwell’s equations were
already relativistic even before the formulation of Einstein’s
relativity) gave birth to the theory of Quantum Electrodynamics
(QED), which is the most accurate theory ever formulated. QED
describes the interactions between electrically charged particles
mediated by a quantized electromagnetic field.
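6.2.1 Electrodynamics
In 1864, James Clerk Maxwell accomplished the “second great unification
in Physics” (the first one was realized by Isaac Newton)
formulating the theory of the electromagnetic field and summarizing it
in a set of coupled differential equations. Maxwell’s equations can
be written using the vector notation introduced by Heaviside and
following the Lorentz–Heaviside
convention for units (see Chap. 2) as
$$\nabla \cdot \mathbf{E} = \rho \tag{6.18}$$
$$\nabla \times \mathbf{E} = -\frac{\partial \mathbf{B}}{\partial t} \tag{6.19}$$
$$\nabla \cdot \mathbf{B} = 0 \tag{6.20}$$
$$\nabla \times \mathbf{B} = \mathbf{j} + \frac{\partial \mathbf{E}}{\partial t} \tag{6.21}$$
A scalar potential $\phi$ and a vector potential $\mathbf{A}$ can be introduced such that
$$\mathbf{E} = -\nabla\phi - \frac{\partial \mathbf{A}}{\partial t} \tag{6.22}$$
$$\mathbf{B} = \nabla \times \mathbf{A} \tag{6.23}$$
Then two of the Maxwell equations are automatically satisfied:
$$\nabla \times \mathbf{E} = -\frac{\partial \mathbf{B}}{\partial t} \tag{6.24}$$
$$\nabla \cdot \mathbf{B} = 0 \tag{6.25}$$
and the other two can be written as:
$$\nabla^2 \phi + \frac{\partial}{\partial t}\left(\nabla \cdot \mathbf{A}\right) = -\rho \tag{6.26}$$
$$\left(\nabla^2 \mathbf{A} - \frac{\partial^2 \mathbf{A}}{\partial t^2}\right) - \nabla\left(\nabla \cdot \mathbf{A} + \frac{\partial \phi}{\partial t}\right) = -\mathbf{j} \tag{6.27}$$
However, the potential fields are not totally determined, having a
local degree of freedom. In fact, if $\chi(t, \mathbf{x})$ is a scalar function of the time and
space coordinates, then the potentials defined as
$$\phi' = \phi - \frac{\partial \chi}{\partial t} \tag{6.28}$$
$$\mathbf{A}' = \mathbf{A} + \nabla\chi \tag{6.29}$$
give origin to the same $\mathbf{E}$ and $\mathbf{B}$ fields. These transformations are
designated as gauge transformations and
generalize the freedom that exists in electrostatics in the
definition of the space points where the electric potential is zero
(the electrostatic field is invariant under a global transformation
of the electrostatic potential, but the electromagnetic field is
invariant under a joint local transformation of the scalar and
vector potential).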
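The invariance of the fields can be checked explicitly. The short derivation below is a sketch assuming the sign conventions $\mathbf{E} = -\nabla\phi - \partial\mathbf{A}/\partial t$, $\mathbf{B} = \nabla\times\mathbf{A}$, $\phi' = \phi - \partial\chi/\partial t$ and $\mathbf{A}' = \mathbf{A} + \nabla\chi$, with $\chi$ an arbitrary smooth function.

```latex
\begin{align}
\mathbf{E}' &= -\nabla\phi' - \frac{\partial \mathbf{A}'}{\partial t}
  = -\nabla\phi + \nabla\frac{\partial \chi}{\partial t}
    - \frac{\partial \mathbf{A}}{\partial t} - \frac{\partial}{\partial t}\nabla\chi
  = \mathbf{E} \\
% the curl of a gradient vanishes identically
\mathbf{B}' &= \nabla \times \left(\mathbf{A} + \nabla\chi\right)
  = \nabla \times \mathbf{A} + \nabla \times \nabla\chi
  = \mathbf{B}
\end{align}
```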
The arbitrariness of these
transformations can be used to write the Maxwell equations in a
simpler way. What we are going to do is to use our choice to fix
things so that the equations for $\phi$ and for $\mathbf{A}$ are separated but have the same form.
We can do this by taking (this is called the Lorenz gauge):
$$\nabla \cdot \mathbf{A} = -\frac{\partial \phi}{\partial t} \tag{6.30}$$
Thus
$$\frac{\partial^2 \phi}{\partial t^2} - \nabla^2 \phi = \rho \tag{6.31}$$
$$\frac{\partial^2 \mathbf{A}}{\partial t^2} - \nabla^2 \mathbf{A} = \mathbf{j} \tag{6.32}$$
The last two equations can be written in an extremely compact way
if the four-vectors $A^\mu$ and $j^\mu$ are introduced and if the D’Alembert operator $\Box \equiv \partial_\mu \partial^\mu$ is used. Defining
$$A^\mu = (\phi, \mathbf{A}), \qquad j^\mu = (\rho, \mathbf{j}) \tag{6.33}$$
(notice that the Lorenz gauge condition $\partial_\mu A^\mu = 0$ is covariant), the two equations are summarized by
$$\Box A^\mu = j^\mu \tag{6.34}$$
In the absence of charges and currents (free electromagnetic field)
$$\Box A^\mu = 0 \tag{6.35}$$
This equation is similar to the Klein–Gordon
equation for a particle with $m = 0$ (see Sects. 3.2.1 and 6.2.5) but with spin 1.
$A^\mu$ is identified with the wave function
of a free photon, and the solution of the above equation is,
up to some normalization factor:
$$A^\mu = \epsilon^\mu(q)\, e^{-i q \cdot x} \tag{6.36}$$
where $q$ is the four-momentum of
the photon and $\epsilon^\mu$ its polarization four-vector. The
four components of $\epsilon^\mu$ are not independent. The Lorenz
condition imposes one constraint, reducing the number of
independent components to three. However, even after imposing the
Lorenz condition, there is still the possibility, if $\partial_\mu \partial^\mu \Lambda = 0$, of a further gauge transformation
$$A^\mu \to A'^\mu = A^\mu - \partial^\mu \Lambda \tag{6.37}$$
This extra gauge transformation can be used to set the time
component of the polarization four-vector to zero ($\epsilon^0 = 0$), thus converting the Lorenz
condition into
$$\boldsymbol{\epsilon} \cdot \mathbf{q} = 0 \tag{6.38}$$
This choice is known as the Coulomb
gauge, and it makes clear that there are just two degrees of
freedom left for the polarization, as is the case for massless
spin 1 particles ($m_s = \pm 1$).
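6.2.1.1 Modification for a Nonzero Mass: The Proca Equation
In the case of a photon with a tiny
mass $\mu_\gamma$:
$$\left(\Box + \mu_\gamma^2\right) A^\mu = j^\mu \tag{6.39}$$
the Maxwell equations would be transformed into the
Proca equations:
$$\nabla \cdot \mathbf{E} = \rho - \mu_\gamma^2 \phi \tag{6.40}$$
$$\nabla \times \mathbf{E} = -\frac{\partial \mathbf{B}}{\partial t} \tag{6.41}$$
$$\nabla \cdot \mathbf{B} = 0 \tag{6.42}$$
$$\nabla \times \mathbf{B} = \mathbf{j} + \frac{\partial \mathbf{E}}{\partial t} - \mu_\gamma^2 \mathbf{A} \tag{6.43}$$
In this scenario, the electrostatic field would show a Yukawa-type
exponential attenuation, $e^{-\mu_\gamma r}$. Experimental tests of the validity
of the Coulomb inverse square law have been performed for many
years in experiments using different techniques, leading to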
stringent limits: eV 10g. Stronger limits eV) are reported from the
analyses of astronomical data, but are model dependent.
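To get a feeling for the scales involved, the sketch below converts a hypothetical photon mass into the range $1/\mu_\gamma$ of the Yukawa factor $e^{-\mu_\gamma r}$ and into grams. The example mass is an arbitrary illustrative value, not an experimental limit taken from the text; the function names are likewise assumptions.

```python
HBAR_C_EV_M = 1.973269804e-7   # hbar * c in eV * m
EV_TO_GRAM = 1.78266192e-33    # mass equivalent of 1 eV/c^2, in grams

def yukawa_range_m(mass_ev):
    """Range (reduced Compton wavelength) hbar / (mu c) in metres."""
    return HBAR_C_EV_M / mass_ev

def mass_in_grams(mass_ev):
    """Convert a mass given in eV/c^2 to grams."""
    return mass_ev * EV_TO_GRAM

example_mass_ev = 1e-18                      # illustrative value only
range_m = yukawa_range_m(example_mass_ev)    # about 2e11 m, of the order of the Sun-Earth distance
mass_g = mass_in_grams(example_mass_ev)
```

The smaller the mass, the longer the distance over which deviations from the inverse square law would have to be probed, which is why astronomical data can give stronger (if model-dependent) limits.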
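6.2.2 Minimal Coupling
Classically, the coupling between a
particle with charge $e$ and the
electromagnetic field is given by the Lorentz
force:
$$\mathbf{F} = e\left(\mathbf{E} + \mathbf{v} \times \mathbf{B}\right) \tag{6.44}$$
which can be written in terms of the scalar and vector potentials as
$$\mathbf{F} = e\left[-\nabla\phi - \frac{\partial \mathbf{A}}{\partial t} + \mathbf{v} \times \left(\nabla \times \mathbf{A}\right)\right]$$
Referring to the Euler–Lagrange equations:
$$\frac{d}{dt}\frac{\partial L}{\partial \dot{q}_j} - \frac{\partial L}{\partial q_j} = 0$$
with the nonrelativistic Lagrangian $L$ defined as
$$L = \frac{1}{2} m \dot{\mathbf{q}}^2 - U(\mathbf{q}, \dot{\mathbf{q}}, t) \tag{6.45}$$
a generalized potential for this dynamics is
$$U = e\left(\phi - \dot{\mathbf{q}} \cdot \mathbf{A}\right) \tag{6.46}$$
The momentum being given by
$$p_i = \frac{\partial L}{\partial \dot{q}_i}$$
one has for $\mathbf{p}$ and for the Hamiltonian $H$
$$\mathbf{p} = m\dot{\mathbf{q}} + e\mathbf{A} \tag{6.47}$$
$$H = \frac{\left(\mathbf{p} - e\mathbf{A}\right)^2}{2m} + e\phi \tag{6.48}$$
Then the free-particle equation
$$E = \frac{\mathbf{p}^2}{2m}$$
is transformed, in the case of the coupling with the electromagnetic
field, into:
$$E - e\phi = \frac{\left(\mathbf{p} - e\mathbf{A}\right)^2}{2m} \tag{6.49}$$
This is equivalent to the following replacements for the
free-particle energy and momentum:
$$E \to E - e\phi, \qquad \mathbf{p} \to \mathbf{p} - e\mathbf{A} \tag{6.50}$$
i.e., in terms of the relativistic energy–momentum four-vector:
$$p^\mu \to p^\mu - eA^\mu \tag{6.51}$$
or, in the operator view ($p^\mu \to i\partial^\mu$):
$$\partial^\mu \to D^\mu \equiv \partial^\mu + ieA^\mu \tag{6.52}$$
The operator $D^\mu$ is designated the covariant derivative.
The replacement $\partial^\mu \to D^\mu$ is called the minimal coupling prescription. This prescription involves only the charge
distribution and is able to account for all electromagnetic
interactions.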
Wave equations can now be generalized to
account for the coupling with the electromagnetic field using the
minimal coupling prescription.
For instance, the free-particle
Schrödinger equation
$$i\frac{\partial \Psi}{\partial t} = \frac{1}{2m}\left(-i\nabla\right)^2 \Psi \tag{6.53}$$
becomes under such a replacement
$$\left(i\frac{\partial}{\partial t} - e\phi\right)\Psi = \frac{1}{2m}\left(-i\nabla - e\mathbf{A}\right)^2 \Psi \tag{6.54}$$
The Schrödinger equation couples directly to the scalar and vector
potentials and not to the force, and quantum effects not foreseen in
classical physics appear. One of them is the well-known
Bohm–Aharonov effect, predicted in 1959 by
David Bohm and his student Yakir Aharonov. Whenever a
particle is confined in a region where the electric and the
magnetic fields are zero but the potential four-vector is not, the
phase of its wave function changes.
This is the case of particles crossing
a region outside an infinitely long thin solenoid (Fig. 6.2, left). In this
region, the magnetic field $\mathbf{B}$ is zero but the vector potential
$\mathbf{A}$ is not:
$$\nabla \times \mathbf{A} = \mathbf{B}, \qquad \oint \mathbf{A} \cdot d\mathbf{l} = \int_S \mathbf{B} \cdot d\mathbf{s}$$
The line integral of the vector potential $\mathbf{A}$ around a closed loop is equal to the
magnetic flux through the area enclosed by the loop. As $\mathbf{B}$
inside the solenoid is not zero, the
flux is also not zero and therefore $\mathbf{A}$ is not null.
This effect was experimentally
verified by observing shifts in an interference pattern depending on whether or not
the current in a microscopic solenoid placed between the two
slits is turned on (Fig. 6.2, right).
Fig. 6.2 Left: Vector potential in the region outside
an infinite solenoid. Right: Double-slit experiment demonstrating
the Bohm–Aharonov effect.
From D. Griffiths, “Introduction to quantum
mechanics,” second edition, Pearson 2004
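The statement that the circulation of the vector potential equals the enclosed flux, even where the magnetic field vanishes, can be checked numerically. The sketch below assumes an ideal, infinitely thin solenoid along the $z$ axis carrying a flux `flux`, for which the vector potential outside is azimuthal with magnitude $\Phi/(2\pi r)$; the loop positions, radii and the flux value are arbitrary illustrative choices.

```python
import math

def vector_potential(x, y, flux):
    """(A_x, A_y) outside an ideal thin solenoid on the z axis (assumed model)."""
    r2 = x * x + y * y
    return (-flux * y / (2.0 * math.pi * r2), flux * x / (2.0 * math.pi * r2))

def loop_integral(flux, radius, center, n=20000):
    """Line integral of A . dl around a circle, using midpoint values of A."""
    total = 0.0
    for i in range(n):
        t0 = 2.0 * math.pi * i / n
        t1 = 2.0 * math.pi * (i + 1) / n
        tm = 0.5 * (t0 + t1)
        x = center[0] + radius * math.cos(tm)
        y = center[1] + radius * math.sin(tm)
        dx = radius * (math.cos(t1) - math.cos(t0))
        dy = radius * (math.sin(t1) - math.sin(t0))
        ax, ay = vector_potential(x, y, flux)
        total += ax * dx + ay * dy
    return total

flux = 0.7
# Loop that encloses the solenoid: the circulation equals the flux.
enclosing = loop_integral(flux, radius=2.0, center=(0.5, 0.0))
# Loop that does not enclose the solenoid: the circulation vanishes.
not_enclosing = loop_integral(flux, radius=1.0, center=(5.0, 0.0))
```

In natural units the relative phase acquired by two paths passing on opposite sides of the solenoid is $e$ times the enclosed flux, which is what shifts the interference pattern.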
6.2.3 Gauge Invariance
We have seen that physical observables
connected to a wave function are invariant under a global change in the
phase of the wave function itself
$$\Psi'(\mathbf{r}, t) = e^{i\alpha}\, \Psi(\mathbf{r}, t) \tag{6.55}$$
where $\alpha$ is a real number.
The free-particle Schrödinger equation
in particular is invariant with respect to a global change in the
phase of the wave function. It is easy, however, to verify that
this does not apply, in general, to a local change
$$\Psi'(\mathbf{r}, t) = e^{i\alpha(\mathbf{r}, t)}\, \Psi(\mathbf{r}, t) \tag{6.56}$$
On the other hand, the electromagnetic field is, as
discussed in Sect. 6.2.1, invariant under a combined local
transformation of the scalar and vector potentials:
$$\mathbf{A} \to \mathbf{A}' = \mathbf{A} + \nabla\alpha(\mathbf{r}, t) \tag{6.57}$$
$$\phi \to \phi' = \phi - \frac{\partial \alpha(\mathbf{r}, t)}{\partial t} \tag{6.58}$$
where $\alpha(\mathbf{r}, t)$ is a scalar function of the time and
space coordinates.
Remarkably, the Schrödinger equation
modified using the minimal coupling prescription is invariant under
a joint local transformation both of the phase of the wave function
and of the electromagnetic four-potential:
$$\Psi'(\mathbf{r}, t) = e^{ie\alpha(\mathbf{r}, t)}\, \Psi(\mathbf{r}, t) \tag{6.59}$$
$$A^\mu \to A'^\mu = A^\mu - \partial^\mu \alpha(\mathbf{r}, t) \tag{6.60}$$
Applying the minimal coupling prescription to the relativistic wave
equations (Klein–Gordon and Dirac equations), these equations
also become invariant under local gauge transformations, as we
shall verify later.
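The invariance of the minimally coupled Schrödinger equation can be made explicit. The derivation below is a sketch assuming the conventions $\Psi' = e^{ie\alpha}\Psi$, $\mathbf{A}' = \mathbf{A} + \nabla\alpha$ and $\phi' = \phi - \partial\alpha/\partial t$, with the coupled equation written as $(i\partial_t - e\phi)\Psi = \frac{1}{2m}(-i\nabla - e\mathbf{A})^2\Psi$.

```latex
\begin{align}
\left(-i\nabla - e\mathbf{A}'\right)\Psi'
  &= e^{ie\alpha}\left(-i\nabla + e\nabla\alpha - e\mathbf{A} - e\nabla\alpha\right)\Psi
   = e^{ie\alpha}\left(-i\nabla - e\mathbf{A}\right)\Psi \\
\left(i\frac{\partial}{\partial t} - e\phi'\right)\Psi'
  &= e^{ie\alpha}\left(i\frac{\partial}{\partial t} - e\frac{\partial\alpha}{\partial t}
     - e\phi + e\frac{\partial\alpha}{\partial t}\right)\Psi
   = e^{ie\alpha}\left(i\frac{\partial}{\partial t} - e\phi\right)\Psi
\end{align}
% Both sides of the coupled equation acquire the same overall factor e^{ie\alpha},
% so \Psi' obeys the same equation with the transformed potentials.
```

Applying the first identity twice shows that the kinetic term $(-i\nabla - e\mathbf{A})^2$ also transforms with the same overall phase.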
Conversely, imposing the invariance
under a local gauge transformation of the free-particle wave
equations implies the introduction of a gauge field.
The gauge transformation of the wave functions can be
written in a more general form as
$$\Psi'(\mathbf{r}, t) = \exp\left(i\,\alpha(\mathbf{r})\,\hat{A}\right) \Psi(\mathbf{r}, t) \tag{6.61}$$
where $\alpha(\mathbf{r})$ is a real function of the space
coordinates and $\hat{A}$ a unitary operator (see
Sect. 5.3.3).
In the case of QED, Hermann Weyl, Vladimir
Fock, and Fritz London found in the late 1920s that the invariance
of a Lagrangian including fermion and field terms with respect to
transformations associated with the U(1) group, corresponding to
local rotations by $\alpha(x)$ of the wave function phase, requires
(and provides) the interaction term with the electromagnetic field,
whose quantum is the photon.
The generalization of this symmetry to
non-Abelian groups was introduced in 1954 by Chen Ning Yang and Robert Mills. Indeed we shall see that:
• The weak interaction is modeled by a
“weak isospin” symmetry linking “weak isospin up” particles
(identified, e.g., with the $u$-type quarks and with the neutrinos) and
“weak isospin down” particles (identified, e.g., with the
$d$-type quarks and with the
charged leptons). We have seen that SU(2) is the minimal
representation for such a symmetry. If $\hat{A}$ is chosen to be one of the generators
of the SU(2) group, then the associated gauge transformation
corresponds to a local rotation in a spinor space. The gauge fields
needed to ensure the invariance of the wave equations under such
transformations are the weak fields, which imply the existence of
the $W^\pm$ and $Z$ mediators (see Sect. 6.3).
• The strong interaction is modeled by
QCD, a theory exploiting the invariance of the strong interaction
with respect to a rotation in color space. We shall see that SU(3)
is the minimal representation for such a symmetry. If $\hat{A}$ is chosen to be one of the generators
of the SU(3) group, then the associated gauge transformation
corresponds to a local rotation in a complex three-dimensional
vector space, which represents the color space. The gauge fields
needed to ensure the invariance of the wave equations under such
transformations are the strong fields whose quanta are called
gluons (see Sect. 6.4).
Figure 6.3 shows schematic
representations of such transformations.
6.2.4 Dirac Equation Revisited
The Dirac equation was briefly introduced in Sect. 3.2.1. It is a linear equation describing free
relativistic particles with spin 1/2 (electrons and
positrons for instance); being of first order in the time and space derivatives, it overcomes some
difficulties coming from the second-order nature of the Klein–Gordon
equation, which was the translation in quantum mechanical form of
the relativistic Hamiltonian
$$H^2 = p^2 + m^2$$
replacing the Hamiltonian itself and the momentum with the
appropriate operators:
$$\hat{H} = i\frac{\partial}{\partial t}, \qquad \hat{\mathbf{p}} = -i\nabla \tag{6.62}$$
Fig. 6.3 Schematic representations of U(1), SU(2), and
SU(3) transformations applied to the models of QED, weak, and
strong interactions
Dirac searched for an alternative
relativistic equation starting from the generic form describing the
evolution of a wave function, in the familiar form:
$$i\frac{\partial \psi}{\partial t} = \hat{H}\psi \tag{6.63}$$
with a Hamiltonian operator linear in $\hat{\mathbf{p}}$ (Lorentz invariance requires that if
the Hamiltonian has first derivatives with respect to time also the
spatial derivatives should be of first order):
$$\hat{H} = \boldsymbol{\alpha} \cdot \hat{\mathbf{p}} + \beta m \tag{6.64}$$
This must be compatible with the Klein–Gordon equation, and thus
$$\alpha_i^2 = 1; \quad \beta^2 = 1; \quad \alpha_i \beta + \beta \alpha_i = 0; \quad \alpha_i \alpha_j + \alpha_j \alpha_i = 0 \;\; (i \neq j) \tag{6.65}$$
Therefore, the parameters $\alpha_i$ and $\beta$ cannot be numbers. However, things
work if they are matrices (and if these matrices are Hermitian it
is guaranteed that the Hamiltonian is also Hermitian). It can be
demonstrated that their lowest possible dimension is 4 (i.e., they are at least $4 \times 4$ matrices).
Using the explicit form of the
momentum operator $\hat{\mathbf{p}} = -i\nabla$, the Dirac equation can be written as
$$i\frac{\partial \psi}{\partial t} = \left(-i\boldsymbol{\alpha} \cdot \nabla + \beta m\right)\psi \tag{6.66}$$
The wave functions $\psi$ must thus be of the form:
$$\psi(\mathbf{r}, t) = \begin{pmatrix} \psi_1(\mathbf{r}, t) \\ \psi_2(\mathbf{r}, t) \\ \psi_3(\mathbf{r}, t) \\ \psi_4(\mathbf{r}, t) \end{pmatrix} \tag{6.67}$$
We arrived at an interpretation of the Dirac equation as a
four-dimensional matrix equation in which the solutions are
four-component wave functions called bi-spinors. Plane wave solutions are
$$\psi(x) = u(\mathbf{p})\, e^{i(\mathbf{p} \cdot \mathbf{r} - Et)} \tag{6.68}$$
where $u(\mathbf{p})$ is also a four-component bi-spinor
satisfying the eigenvalue equation
$$\left(\boldsymbol{\alpha} \cdot \mathbf{p} + \beta m\right) u(\mathbf{p}) = E\, u(\mathbf{p}) \tag{6.69}$$
This equation has four solutions: two with positive energy
$E = +\sqrt{p^2 + m^2}$ and two with negative energy
$E = -\sqrt{p^2 + m^2}$. We will discuss later the
interpretation of the negative energy solutions. The Dirac equation
accounts “for free” for the existence of two spin states, which had
to be inserted by hand in the Schrödinger equation of
nonrelativistic quantum mechanics, and therefore explains the
magnetic moment of point-like fermions. In addition, since spin is
embedded in the equation, the Dirac equation allows computing
correctly the energy splitting of atomic levels with the same
quantum numbers due to the spin–orbit and spin–spin interactions in
atoms (fine and hyperfine splitting).
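We shall now write the free-particle
Dirac equation in a more compact form, from which relativistic
covariance is immediately visible. This requires the introduction
of a new set of important $4 \times 4$ matrices, the $\gamma^\mu$ matrices,
which replace the $\alpha_i$ and $\beta$ matrices discussed before. To account for
electromagnetic interactions, the minimal coupling prescription can
once again be used.
A possible choice, the Dirac–Pauli
representation, for $\alpha_i$ and $\beta$ satisfying the conditions
(6.65) is the
set of matrices:
$$\beta = \begin{pmatrix} I & 0 \\ 0 & -I \end{pmatrix}, \qquad \alpha_i = \begin{pmatrix} 0 & \sigma_i \\ \sigma_i & 0 \end{pmatrix} \tag{6.70}$$
$\sigma_i$ being the Pauli
matrices (see Sect. 5.7.2) and $I$ the $2 \times 2$ unit matrix.
Multiplying the Dirac equation (6.66) on the left by $\beta$ and introducing the Pauli–Dirac matrices $\gamma^\mu$ defined as
$$\gamma^0 = \beta, \qquad \gamma^i = \beta\alpha_i = \begin{pmatrix} 0 & \sigma_i \\ -\sigma_i & 0 \end{pmatrix} \tag{6.71}$$
then:
$$i\gamma^0 \frac{\partial \psi}{\partial t} + i\boldsymbol{\gamma} \cdot \nabla\psi - m\psi = 0 \tag{6.72}$$
If we use a four-vector notation
$$\gamma^\mu = \left(\gamma^0, \boldsymbol{\gamma}\right) = \left(\beta, \beta\boldsymbol{\alpha}\right) \tag{6.73}$$
taking into account that
$$\partial_\mu = \left(\frac{\partial}{\partial t}, \nabla\right) \tag{6.74}$$
the Dirac equation can be finally written as:
$$\left(i\gamma^\mu \partial_\mu - m\right)\psi = 0 \tag{6.75}$$
This is an extremely compact form of writing a set of four
differential equations applied to a four-component vector $\psi$
(often called a bi-spinor). We call it the covariant form of the Dirac
equation (its form is preserved in all the inertial frames).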
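The algebraic conditions on the Dirac matrices can be checked numerically. The sketch below builds, in pure Python, the Dirac–Pauli representation ($\beta = \mathrm{diag}(I, -I)$, $\alpha_i$ with the Pauli matrices in the off-diagonal blocks, $\gamma^0 = \beta$, $\gamma^i = \beta\alpha_i$) and verifies the anticommutation relations, including $\gamma^\mu\gamma^\nu + \gamma^\nu\gamma^\mu = 2g^{\mu\nu}$ with metric signature $(+,-,-,-)$. The helper names are assumptions made for this example.

```python
I2 = [[1, 0], [0, 1]]
Z2 = [[0, 0], [0, 0]]
SIGMA = [
    [[0, 1], [1, 0]],       # sigma_1
    [[0, -1j], [1j, 0]],    # sigma_2
    [[1, 0], [0, -1]],      # sigma_3
]

def neg(a):
    """Return -a for a matrix stored as nested lists."""
    return [[-x for x in row] for row in a]

def block(a, b, c, d):
    """Assemble the 4x4 matrix [[a, b], [c, d]] from 2x2 blocks."""
    return [a[0] + b[0], a[1] + b[1], c[0] + d[0], c[1] + d[1]]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def anticommutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] + ba[i][j] for j in range(4)] for i in range(4)]

def equals_scaled_identity(m, c):
    """True if the 4x4 matrix m equals c times the identity."""
    return all(abs(m[i][j] - (c if i == j else 0)) < 1e-12
               for i in range(4) for j in range(4))

beta = block(I2, Z2, Z2, neg(I2))
alphas = [block(Z2, s, s, Z2) for s in SIGMA]
gammas = [beta] + [matmul(beta, a) for a in alphas]   # gamma^0, gamma^1..3
METRIC = [1, -1, -1, -1]

# Conditions on alpha_i and beta: squares equal 1, distinct matrices anticommute.
dirac_conditions_ok = (
    equals_scaled_identity(anticommutator(beta, beta), 2)
    and all(equals_scaled_identity(anticommutator(a, beta), 0) for a in alphas)
    and all(equals_scaled_identity(anticommutator(alphas[i], alphas[j]),
                                   2 if i == j else 0)
            for i in range(3) for j in range(3))
)

# Clifford algebra of the gamma matrices.
clifford_ok = all(
    equals_scaled_identity(anticommutator(gammas[mu], gammas[nu]),
                           2 * METRIC[mu] if mu == nu else 0)
    for mu in range(4) for nu in range(4)
)
```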
Let us examine now the solutions of the
Dirac equation in some particular cases.
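6.2.4.1 Particle at Rest
Particles at rest have $\mathbf{p} = 0$ and thus
$$\left(i\gamma^0 \frac{\partial}{\partial t} - m\right)\psi = 0 \tag{6.76}$$
$$\begin{pmatrix} I & 0 \\ 0 & -I \end{pmatrix} \frac{\partial}{\partial t} \begin{pmatrix} \psi_A \\ \psi_B \end{pmatrix} = -im \begin{pmatrix} \psi_A \\ \psi_B \end{pmatrix} \tag{6.77}$$
$\psi_A$ and $\psi_B$ being spinors:
$$\psi_A = \begin{pmatrix} \psi_1 \\ \psi_2 \end{pmatrix} \tag{6.78}$$
$$\psi_B = \begin{pmatrix} \psi_3 \\ \psi_4 \end{pmatrix} \tag{6.79}$$
In this simple case, the two spinors are subject to two independent
differential equations:
$$\frac{\partial \psi_A}{\partial t} = -im\,\psi_A \tag{6.80}$$
$$\frac{\partial \psi_B}{\partial t} = +im\,\psi_B \tag{6.81}$$
which have as solution (up to some normalization factor):
$\psi_A = e^{-imt}\,\psi_A(0)$ with energy $E = m > 0$;
$\psi_B = e^{+imt}\,\psi_B(0)$ with energy $E = -m < 0$
or in terms of each component of the wave function vector
$$\psi_1 = e^{-imt} \begin{pmatrix} 1 \\ 0 \\ 0 \\ 0 \end{pmatrix}, \qquad \psi_2 = e^{-imt} \begin{pmatrix} 0 \\ 1 \\ 0 \\ 0 \end{pmatrix} \tag{6.82}$$
$$\psi_3 = e^{+imt} \begin{pmatrix} 0 \\ 0 \\ 1 \\ 0 \end{pmatrix}, \qquad \psi_4 = e^{+imt} \begin{pmatrix} 0 \\ 0 \\ 0 \\ 1 \end{pmatrix} \tag{6.83}$$
There are then four solutions which can accommodate a spin
1/2 particle or antiparticle. The positive energy
solutions $\psi_1$ and $\psi_2$ correspond to fermions (electrons for
instance) with spin up and down, respectively, while the negative
energy solutions $\psi_3$ and $\psi_4$ correspond to antifermions (positrons
for instance) with spin up and down.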
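6.2.4.2 Free Particle
Free particles have $\mathbf{p} = \text{constant}$ and their wave function is a plane
wave of the form:
$$\psi(\mathbf{r}, t) = u(E, \mathbf{p})\, e^{-i(Et - \mathbf{p} \cdot \mathbf{r})} \tag{6.84}$$
where
$$u = N \begin{pmatrix} \phi \\ \chi \end{pmatrix}$$
is a bi-spinor ($\phi$, $\chi$ are spinors) and $N$ a normalization factor.
The Dirac equation can be written as a
function of the energy–momentum operators as
$$\left(\gamma^0 p_0 - \boldsymbol{\gamma} \cdot \mathbf{p} - m\right)\psi = 0 \tag{6.85}$$
Inserting the equation of a plane wave as a trial solution and
using the Pauli–Dirac representation of the $\gamma$ matrices:
$$\begin{pmatrix} (E - m)\,I & -\boldsymbol{\sigma} \cdot \mathbf{p} \\ \boldsymbol{\sigma} \cdot \mathbf{p} & -(E + m)\,I \end{pmatrix} \begin{pmatrix} \phi \\ \chi \end{pmatrix} = 0 \tag{6.86}$$
$I$ is again the unit matrix, which is often omitted when
writing the equations, and
$$\boldsymbol{\sigma} \cdot \mathbf{p} = \begin{pmatrix} p_z & p_x - ip_y \\ p_x + ip_y & -p_z \end{pmatrix} \tag{6.87}$$
For $\mathbf{p} = 0$, the “particle at rest” solution
discussed above is recovered. Otherwise, there are two coupled
equations for the spinors $\phi$ and $\chi$:
$$\phi = \frac{\boldsymbol{\sigma} \cdot \mathbf{p}}{E - m}\, \chi \tag{6.88}$$
$$\chi = \frac{\boldsymbol{\sigma} \cdot \mathbf{p}}{E + m}\, \phi \tag{6.89}$$
and then the $u$ bi-spinor can be
written either in terms of the spinor $\phi$ or in terms of the spinor $\chi$:
$$u_1 = N \begin{pmatrix} \phi \\ \dfrac{\boldsymbol{\sigma} \cdot \mathbf{p}}{E + m}\, \phi \end{pmatrix} \tag{6.90}$$
$$u_2 = N \begin{pmatrix} \dfrac{\boldsymbol{\sigma} \cdot \mathbf{p}}{E - m}\, \chi \\ \chi \end{pmatrix} \tag{6.91}$$
The first solution corresponds to states with $E > 0$ (particles) and the second to states
with $E < 0$ (antiparticles) as can be seen by
going to the $\mathbf{p} \to 0$ limit. These last states can be
rewritten changing the sign of $E$ and $\mathbf{p}$ and labeling the bi-spinor $u_2$
as $v$ ($u_1$ is then labeled just as $u$).
$$v = N \begin{pmatrix} \dfrac{\boldsymbol{\sigma} \cdot \mathbf{p}}{E + m}\, \chi \\ \chi \end{pmatrix} \tag{6.92}$$
Both $\phi$ and $\chi$ can be written in a basis of unit
vectors $\chi_s$ with
$$\chi_1 = \begin{pmatrix} 1 \\ 0 \end{pmatrix} \tag{6.93}$$
$$\chi_2 = \begin{pmatrix} 0 \\ 1 \end{pmatrix} \tag{6.94}$$
Finally, we have then again four solutions: two for the particle
states and two for the antiparticle states.
The normalization factor $N$ is often defined as
$$N = \sqrt{\frac{E + m}{V}} \tag{6.95}$$
ensuring a standard relativistic normalization convention of
$2E$ particles per box of volume
$V$. In fact, introducing the
bi-spinors’ transpose conjugates $u^\dagger$ and $v^\dagger$,
$$u^\dagger u = v^\dagger v = \frac{2E}{V} \tag{6.96}$$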
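The normalization and the eigenvalue property of the positive-energy bi-spinor can be checked numerically. The sketch below assumes the Dirac–Pauli representation, a unit volume $V = 1$, the spinor form $u = N(\phi, \frac{\boldsymbol{\sigma}\cdot\mathbf{p}}{E+m}\phi)$ with $N = \sqrt{E+m}$, and arbitrary illustrative values for the mass and momentum; the function names are hypothetical.

```python
import math

def sigma_dot_p(px, py, pz):
    """2x2 matrix sigma . p built from the Pauli matrices."""
    return [[pz, px - 1j * py], [px + 1j * py, -pz]]

def apply2(m, v):
    """Apply a 2x2 matrix to a 2-component spinor."""
    return [m[0][0] * v[0] + m[0][1] * v[1],
            m[1][0] * v[0] + m[1][1] * v[1]]

def particle_spinor(mass, p, phi):
    """Positive-energy bi-spinor u = N (phi, sigma.p phi / (E + m)), with V = 1."""
    px, py, pz = p
    energy = math.sqrt(px * px + py * py + pz * pz + mass * mass)
    norm = math.sqrt(energy + mass)
    chi = [c / (energy + mass) for c in apply2(sigma_dot_p(px, py, pz), phi)]
    return energy, [norm * c for c in list(phi) + chi]

def dirac_hamiltonian_on(u, mass, p):
    """Apply H = alpha . p + beta m, i.e. [[m, sigma.p], [sigma.p, -m]], to u."""
    upper, lower = u[:2], u[2:]
    sp = sigma_dot_p(*p)
    sp_lower, sp_upper = apply2(sp, lower), apply2(sp, upper)
    return ([mass * upper[i] + sp_lower[i] for i in range(2)]
            + [sp_upper[i] - mass * lower[i] for i in range(2)])

mass, p = 0.511, (0.3, -0.2, 1.1)        # illustrative values, natural units
energy, u = particle_spinor(mass, p, phi=[1.0, 0.0])

norm_sq = sum(abs(c) ** 2 for c in u)    # u^dagger u, expected to equal 2E
h_u = dirac_hamiltonian_on(u, mass, p)
residual = max(abs(h - energy * c) for h, c in zip(h_u, u))  # |H u - E u|
```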
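6.2.4.3 Helicity
The spin operator $\mathbf{S}$ introduced in Sect. 5.7.2 can now be generalized in
this bi-spinor space as
$$\mathbf{S} = \frac{1}{2}\boldsymbol{\Sigma} \tag{6.97}$$
where
$$\boldsymbol{\Sigma} = \begin{pmatrix} \boldsymbol{\sigma} & 0 \\ 0 & \boldsymbol{\sigma} \end{pmatrix} \tag{6.98}$$
More generally, defining the helicity operator $h$ as the projection
of the spin over the momentum direction:
$$h = \frac{\mathbf{S} \cdot \mathbf{p}}{|\mathbf{p}|} = \frac{1}{2} \begin{pmatrix} \boldsymbol{\sigma} \cdot \hat{\mathbf{p}} & 0 \\ 0 & \boldsymbol{\sigma} \cdot \hat{\mathbf{p}} \end{pmatrix} \tag{6.99}$$
there are always four eigenstates of this operator. Indeed, using
spherical polar coordinates $(\theta, \phi)$:
$$\mathbf{p} = |\mathbf{p}|\left(\sin\theta\cos\phi,\; \sin\theta\sin\phi,\; \cos\theta\right) \tag{6.100}$$
and the helicity operator is given by
$$h = \frac{1}{2} \begin{pmatrix} \cos\theta & \sin\theta\, e^{-i\phi} & 0 & 0 \\ \sin\theta\, e^{i\phi} & -\cos\theta & 0 & 0 \\ 0 & 0 & \cos\theta & \sin\theta\, e^{-i\phi} \\ 0 & 0 & \sin\theta\, e^{i\phi} & -\cos\theta \end{pmatrix} \tag{6.101}$$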
The eigenstates of the operator h can also be written as
(6.102)
(6.103)
Note that helicity is Lorentz invariant only in the case of
massless particles (otherwise the direction of $\mathbf{p}$ can be inverted by choosing an
appropriate reference frame).
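6.2.4.4 Dirac Adjoint, the $\gamma^5$ Matrix, and Bilinear Covariants
The Dirac bi-spinors are not real
four-vectors, and it can be shown that the product $\psi^\dagger \psi$ is not a Lorentz invariant (a scalar).
On the contrary, the product $\bar{\psi}\psi$ is a Lorentz invariant, $\bar{\psi}$ being
named the adjoint Dirac spinor and defined as:
$$\bar{\psi} = \psi^\dagger \gamma^0 \tag{6.104}$$
The parity operator $P$ in the
Dirac bi-spinor space is just the matrix $\gamma^0$ (it reverses the sign of the terms
which are functions of $\mathbf{p}$), and
$$P\left(\bar{\psi}\psi\right) = \psi^\dagger \gamma^0 \gamma^0 \gamma^0 \psi = \bar{\psi}\psi \tag{6.105}$$
as $(\gamma^0)^2 = 1$.
Other quantities can be constructed
using $\psi$ and $\bar{\psi}$ (bilinear covariants). In particular, introducing
$\gamma^5$ as
$$\gamma^5 = i\gamma^0\gamma^1\gamma^2\gamma^3 \tag{6.106}$$
• $\bar{\psi}\gamma^5\psi$ is a pseudoscalar.
• $\bar{\psi}\gamma^\mu\psi$ is a four-vector.
• $\bar{\psi}\gamma^\mu\gamma^5\psi$ is a pseudo four-vector.
• $\bar{\psi}\sigma^{\mu\nu}\psi$, where $\sigma^{\mu\nu} = \frac{i}{2}\left(\gamma^\mu\gamma^\nu - \gamma^\nu\gamma^\mu\right)$, is an antisymmetric tensor.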
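6.2.4.5 Dirac Equation in the Presence of an Electromagnetic Field
The Dirac equation in the presence of
an electromagnetic field can be obtained by applying the minimal
coupling prescription discussed in Sect. 6.2.2. In practice this is
obtained by replacing the derivatives $\partial_\mu$ by the covariant
derivative $D_\mu$:
$$\partial_\mu \to D_\mu \equiv \partial_\mu + ieA_\mu \tag{6.107}$$
Then
$$\left(i\gamma^\mu D_\mu - m\right)\psi = 0 \tag{6.108}$$
$$\left(i\gamma^\mu \partial_\mu - e\gamma^\mu A_\mu - m\right)\psi = 0 \tag{6.109}$$
The interaction with a magnetic field can then be described by
introducing the two spinors $\phi$ and $\chi$ and using the Pauli–Dirac
representation of the $\gamma$ matrices:
$$\begin{pmatrix} (E - m)\,I & -\boldsymbol{\sigma} \cdot (\mathbf{p} - e\mathbf{A}) \\ \boldsymbol{\sigma} \cdot (\mathbf{p} - e\mathbf{A}) & -(E + m)\,I \end{pmatrix} \begin{pmatrix} \phi \\ \chi \end{pmatrix} = 0 \tag{6.110}$$
In the nonrelativistic limit ($E \simeq m$), the Dirac equation reduces to
$$\left[\frac{1}{2m}\left(\mathbf{p} - e\mathbf{A}\right)^2 - \frac{e}{2m}\,\boldsymbol{\sigma} \cdot \mathbf{B}\right]\phi = (E - m)\,\phi \tag{6.111}$$
where the magnetic field $\mathbf{B}$ has been reintroduced.
There is thus a coupling of the form
$-\boldsymbol{\mu}_S \cdot \mathbf{B}$ between the magnetic field and the
spin of a point-like charged particle (the
electron or the muon for instance), and the quantity
$$\boldsymbol{\mu}_S = \frac{e}{m}\,\mathbf{S} = \frac{e}{2m}\,\boldsymbol{\sigma} \tag{6.112}$$
can be identified with the intrinsic magnetic moment of a charged particle with
spin $\mathbf{S}$.
Defining the gyromagnetic ratio $g$ as the ratio
between $\boldsymbol{\mu}_S$ and the classical magnetic moment
$\boldsymbol{\mu}_L = \frac{e}{2m}\mathbf{L}$ of a charged particle with an angular
momentum $\mathbf{L} = \mathbf{S}$:
$$g = 2 \tag{6.113}$$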
6.2.4.6 $g - 2$
The value of the coupling between the
magnetic field and the spin of the point charged particle is
however modified by higher-order corrections which can be
translated into successive Feynman diagrams, such as the ones we have seen
in Fig. 6.1. At second order, the main correction is
introduced by a vertex correction, described by the diagram
represented in Fig. 6.4 and computed in 1948 by Schwinger, leading to a
deviation of $g$ from 2
(anomalous magnetic moment) with magnitude:
$$a_e \equiv \frac{g - 2}{2} \simeq \frac{\alpha}{2\pi} \simeq 0.0011614 \tag{6.114}$$
Fig. 6.4 Second-order vertex correction to $g$
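The size of the lowest-order correction computed by Schwinger, $a = \alpha/2\pi$, is easy to evaluate. The snippet below is a minimal sketch; the numerical value used for the fine-structure constant is an approximate input, and the function name is an assumption.

```python
import math

ALPHA = 1.0 / 137.035999   # fine-structure constant (approximate value)

def schwinger_term(alpha=ALPHA):
    """Lowest-order anomalous magnetic moment, a = alpha / (2 pi)."""
    return alpha / (2.0 * math.pi)

a_lowest = schwinger_term()          # about 0.00116141
g_lowest = 2.0 * (1.0 + a_lowest)    # about 2.0023228
```

This single term already accounts for the bulk of the measured anomaly, which is slightly smaller (about 0.00115965); the remaining difference is accounted for by the higher-order terms.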
Nowadays, the theoretical corrections
are completely computed up to the eighth-order (891 diagrams)
and the most significant tenth-order terms as well as electroweak
and hadronic corrections are also computed. There is a remarkable
agreement with the present experimental value of:
(6.115)
Historically, the first high-precision measurements were accomplished by H. Richard
Crane and his group in the years
1950–1967 at the University of Michigan, USA. A beam of electrons
is first polarized and then trapped in a magnetic bottle for a
(long) time $T$. After this time, the beam is extracted and the
polarization is measured (Fig. 6.5).
Fig. 6.5 Schematic drawing of the $g - 2$ experiment from H.
Richard Crane
Under the influence of the magnetic
field $B$ in the box, the spin of
the electron precesses with angular velocity
$$\omega_p = \frac{g}{2}\,\frac{eB}{m} \tag{6.116}$$
while the electron follows a helical trajectory with an angular
velocity of
$$\omega_{rot} = \frac{eB}{m} \tag{6.117}$$
The polarization of the outgoing beam is thus proportional to the
ratio
$$\frac{\omega_p}{\omega_{rot}} = \frac{g}{2} \tag{6.118}$$
Nowadays, Penning traps are used to keep
electrons (and positrons) confined for months. Such a device,
invented by H. Dehmelt in the 1980s, uses a homogeneous static
magnetic field and a spatially inhomogeneous static electric field
to trap charged particles (Fig. 6.6).
Fig. 6.6 Schematic representation of the electric and
magnetic fields inside a Penning trap.
The muon and electron magnetic moments
are equal at first order. However, the loop corrections are
proportional to the square of the respective masses and thus those
of the muon are much larger ($m_\mu^2 / m_e^2 \sim 4 \times 10^4$). In particular, the sensitivity to
loops involving hypothetical new particles (see
Chap. 7 for a survey) is much higher, and a
precise measurement of the muon anomalous magnetic moment
$a_\mu$ may be used as a test of the standard
model.
The most precise measurement of
$a_\mu$ so far was done by the experiment
E821 at Brookhaven National Laboratory (BNL). A beam of polarized
muons circulates in a storage ring with a diameter of 14 m under the influence of a
uniform magnetic field (Fig. 6.7). The muon spin
precesses, and the polarization of the beam is a function of time.
After many turns, muons decay to electrons (and neutrinos) whose
momentum is basically aligned with the direction of the muon spin
(see Sect. 6.3). The measured value is
(6.119)
This result is more than $3\sigma$ away from the expected one, which
has led to a wide discussion both on the accuracy of the theoretical
computation (in particular of the hadronic contribution) and on the
possibility of an indication of new physics (SUSY particles, dark
photon, extra dimensions, additional Higgs bosons, ...). Meanwhile
the E821 storage ring has been moved to Fermilab, and it is
presently used by the E989 experiment, which aims to improve the
precision by a factor of four. Results are expected in a few years
(2018–2020).
Fig. 6.7 The E821 storage ring.
From Brookhaven National Laboratory
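6.2.4.7 The Lagrangian Density Corresponding to the Dirac Equation
Consider the Lagrangian density
$$\mathcal{L} = i\bar{\psi}\gamma^\mu \partial_\mu \psi - m\bar{\psi}\psi \tag{6.120}$$
and apply the Euler–Lagrange equations to $\bar{\psi}$. One finds
$$i\gamma^\mu \partial_\mu \psi - m\psi = 0$$
which is indeed the Dirac equation for a free particle. Notice
that:
• the mass (i.e., the energy associated
with rest, whatever this can mean in quantum mechanics) is
associated with a term quadratic in the field, $m\bar{\psi}\psi$;
• the dimension of the field $\psi$ is [energy$^{3/2}$] ($\int m\,\bar{\psi}\psi\, d^4x$ is dimensionless).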
6.2.5
Klein–Gordon Equation Revisited
The Klein–Gordon equation was briefly introduced in Sect. 3.2.1. It describes free
relativistic particles with spin 0 (scalars or pseudoscalars). With
the introduction of the four-vector notation, it can be written in
a covariant form. To account for electromagnetic interactions, the
minimal coupling prescription can be used.
6.2.5.1
Covariant Form of the Klein–Gordon Equation
In Sect. 5.7.2, the Klein–Gordon equation
was written as
where is a scalar wave function.
Remembering that
the Klein–Gordon equation can be written in a covariant form:
(6.121)
The solutions are, as discussed before, plane waves
$$\phi(x) = N e^{-i p_\mu x^\mu} \tag{6.122}$$
with
$$E = \pm\sqrt{|\mathbf{p}|^2 + m^2} \tag{6.123}$$
(the positive solutions correspond to particles and the negative
ones to antiparticles).
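Doing some arithmetic with the
Klein–Gordon equation and its conjugate, a continuity equation can
also be obtained for a particle with charge e:
$$\frac{\partial\rho}{\partial t} + \nabla\cdot\mathbf{j} = 0 \tag{6.124}$$
where
$$\rho = ie\left(\phi^*\frac{\partial\phi}{\partial t} - \phi\frac{\partial\phi^*}{\partial t}\right), \qquad \mathbf{j} = -ie\left(\phi^*\nabla\phi - \phi\nabla\phi^*\right)$$
or in terms of four-vectors:
$$\partial_\mu j^\mu = 0 \tag{6.125}$$
where
$$j^\mu = (\rho, \mathbf{j}) = ie\left(\phi^*\partial^\mu\phi - \phi\,\partial^\mu\phi^*\right) \tag{6.126}$$
In the case of plane waves:
$$j^\mu = 2e|N|^2 p^\mu \tag{6.127}$$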
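6.2.5.2 Klein–Gordon Equation in Presence of an Electromagnetic Field
In the presence of an electromagnetic
field, the Klein–Gordon equation can be modified by applying, as
was done previously for the Schrödinger and the Dirac equations,
the minimal coupling prescription. The normal derivatives are
replaced by the covariant derivatives:
$$\partial_\mu \to D_\mu = \partial_\mu + ieA_\mu \tag{6.128}$$
and thus
$$\partial_\mu\partial^\mu \to D_\mu D^\mu = \partial_\mu\partial^\mu + ie\left(\partial_\mu A^\mu + A_\mu\partial^\mu\right) - e^2 A_\mu A^\mu$$
The term $e^2 A_\mu A^\mu$ is of second order and can be
neglected. Then the Klein–Gordon equation in presence of an
electromagnetic field can be written at first order as
$$\left(\partial_\mu\partial^\mu + m^2\right)\phi(x) = -V(x)\,\phi(x) \tag{6.129}$$
where
$$V = ie\left(\partial_\mu A^\mu + A_\mu\partial^\mu\right) \tag{6.130}$$
is the potential.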
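6.2.5.3 The Lagrangian Density Corresponding to the Klein–Gordon Equation
Consider the Lagrangian density
$$\mathcal{L} = \frac{1}{2}\left(\partial_\mu\phi\right)\left(\partial^\mu\phi\right) - \frac{1}{2}m^2\phi^2 \tag{6.131}$$
and apply the Euler–Lagrange equations to $\phi$. We find
$$\partial_\mu\left(\frac{\partial\mathcal{L}}{\partial(\partial_\mu\phi)}\right) - \frac{\partial\mathcal{L}}{\partial\phi} = \partial_\mu\partial^\mu\phi + m^2\phi = 0$$
which is indeed the Klein–Gordon equation for a free scalar
field.
Notice that:
the mass (i.e., the energy associated
with rest or, better, in a quantum mechanical language, with the
ground state) is associated with a term quadratic in the field, $\frac{1}{2}m^2\phi^2$;
the dimension of the field $\phi$ is [energy] ($m^2\phi^2\,d^4x$ is a scalar).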
6.2.6 The Lagrangian for a Charged Fermion in an Electromagnetic Field: Electromagnetism as a Field Theory
Let us write a field theory equivalent to
the Dirac equation in the presence of an external field.
We already wrote a Lagrangian density
equivalent to the Dirac equation for a free particle (Eq.
6.120):
$$\mathcal{L} = \bar{\psi}\left(i\gamma^\mu\partial_\mu - m\right)\psi \tag{6.132}$$
Electromagnetism can be translated into the quantum world by
assuming a Lagrangian density
$$\mathcal{L}_{\rm QED} = \bar{\psi}\left(i\gamma^\mu D_\mu - m\right)\psi - \frac{1}{4}F_{\mu\nu}F^{\mu\nu} \tag{6.133}$$
where $D_\mu = \partial_\mu + ieA_\mu$ is called the covariant derivative
(recall the “minimal prescription”), and $A_\mu$ is the four-potential of the
electromagnetic field; $F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu$ is the electromagnetic field tensor
(see Sect. 2.9.8).
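If the field $A_\mu$ transforms under a local gauge
transformation as
$$A_\mu(x) \to A_\mu(x) - \frac{1}{e}\partial_\mu\theta(x) \tag{6.134}$$
the Lagrangian is invariant with respect to a local U(1) gauge transformation
$\psi(x) \to e^{i\theta(x)}\psi(x)$.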
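Substituting the definition of
D into the Lagrangian gives us
$$\mathcal{L} = i\bar{\psi}\gamma^\mu\partial_\mu\psi - e\bar{\psi}\gamma^\mu A_\mu\psi - m\bar{\psi}\psi - \frac{1}{4}F_{\mu\nu}F^{\mu\nu} \tag{6.135}$$
Differentiating with respect to $\bar{\psi}$, one finds
$$i\gamma^\mu\partial_\mu\psi - e\gamma^\mu A_\mu\psi - m\psi = 0 \tag{6.136}$$
This is the Dirac equation including electrodynamics, as we have
seen when discussing the minimal coupling prescription.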
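Let us now apply the Euler–Lagrange
equations this time to the field $A_\mu$ in the Lagrangian (6.133):
$$\partial_\nu\left(\frac{\partial\mathcal{L}}{\partial(\partial_\nu A_\mu)}\right) - \frac{\partial\mathcal{L}}{\partial A_\mu} = 0 \tag{6.137}$$
We find
$$\partial_\nu\left(\frac{\partial\mathcal{L}}{\partial(\partial_\nu A_\mu)}\right) = \partial_\nu\left(\partial^\mu A^\nu - \partial^\nu A^\mu\right), \qquad \frac{\partial\mathcal{L}}{\partial A_\mu} = -e\bar{\psi}\gamma^\mu\psi$$
and substituting these two terms into (6.137) gives:
$$\partial_\nu F^{\nu\mu} = e\bar{\psi}\gamma^\mu\psi \tag{6.138}$$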
For the spinor matter fields, the current takes the simple form:
(6.139)
where is the charge of the field
in units of e. The equation
(6.140)
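is equivalent, as we discussed in Chap. 2, to the nonhomogeneous Maxwell
equations. Notice that the two homogeneous Maxwell equations
$$\partial_\lambda F_{\mu\nu} + \partial_\mu F_{\nu\lambda} + \partial_\nu F_{\lambda\mu} = 0$$
are automatically satisfied due to the definition of the tensor
$F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu$, whatever the gauge choice.
If we impose the Lorenz gauge
$\partial_\mu A^\mu = 0$, the nonhomogeneous equations become
$$\partial_\nu\partial^\nu A^\mu = j^\mu \tag{6.141}$$
which is a wave equation for the four-potential, the QED version of
the classical Maxwell equations in the Lorenz gauge.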
Notice that the Lagrangian
(6.133) of
QED, based on a local gauge invariance, contains all the physics of
electromagnetism. It reflects also some remarkable properties,
confirmed by the experiments:
The interaction conserves separately
P, C, and T.
The current is diagonal in flavor space
(i.e., it does not change the flavors of the particles).
We can see how the massless
electromagnetic field “appears” thanks to gauge
invariance. This is the basis of QED, quantum electrodynamics.
If a mass $m \neq 0$ were associated with A, this new field would enter in the
Lagrangian with a Proca term
$$\frac{1}{2}m^2 A^\mu A_\mu \tag{6.142}$$
which is not invariant under local phase transformation. The field
must, thus, be massless.
Summarizing, the requirement of local
phase invariance under U(1), applied to the free Dirac Lagrangian,
generates all of electrodynamics and specifies the electromagnetic
current associated to Dirac particles; moreover, it introduces a
massless field which can be interpreted as the photon. This is
QED.
Notice that introducing local phase
transformations just implies a simple difference in the calculation
of the derivatives: we pick up an extra piece involving
$A_\mu$. We replace the derivative with the
covariant derivative
$$\partial_\mu \to D_\mu = \partial_\mu + ieA_\mu \tag{6.143}$$
and the invariance of the Lagrangian is restored. Substituting
$\partial_\mu$ with $D_\mu$ transforms a globally invariant
Lagrangian into a locally invariant one.
6.2.7 An Introduction to Feynman Diagrams: Electromagnetic Interactions Between Charged Spinless Particles
Electrons and muons have spin 1/2;
but, for a moment, let us see how to compute transition
probabilities in QED in the case of hypothetical spinless charged
point particles, since the computation of the electromagnetic
scattering amplitudes between charged spinless particles is much
simpler.
Fig. 6.8 Left: Schematic representation of the first-order interaction of a particle in a field. Right: Schematic representation (Feynman diagram) of the first-order elastic scattering of two charged nonidentical particles
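6.2.7.1 Spinless Particles in an Electromagnetic Field
The scattering of a particle due to an
interaction that acts only in a finite time interval can be
described, as discussed in Sect. 2.7, as the transition between
initial and final stationary states characterized by well-defined
momentum. The first-order amplitude for such a transition is written,
in relativistic perturbative quantum mechanics, as (see
Fig. 6.8,
left):
$$T_{fi} = -i\int \phi_f^*(x)\,V(x)\,\phi_i(x)\,d^4x \tag{6.144}$$
In the case of the electromagnetic field, the potential is given by
$V = ie\left(\partial_\mu A^\mu + A_\mu\partial^\mu\right)$ (see Eq. 6.130) and
$$T_{fi} = e\int \phi_f^*(x)\left(\partial_\mu A^\mu + A_\mu\partial^\mu\right)\phi_i(x)\,d^4x \tag{6.145}$$
Integrating by parts, assuming that the field vanishes at $t \to \pm\infty$ or
$|\mathbf{x}| \to \infty$, and introducing a “transition” current between the initial and final states
defined as:
$$j_\mu^{fi} = ie\left(\phi_f^*\,\partial_\mu\phi_i - \phi_i\,\partial_\mu\phi_f^*\right)$$
this amplitude can be transformed into:
$$T_{fi} = -i\int j_\mu^{fi}A^\mu\,d^4x \tag{6.146}$$
In the case of plane waves describing particles with charge
e, the current can be written as:
$$j_\mu^{fi} = e\,N_i N_f^*\left(p_i + p_f\right)_\mu e^{i(p_f - p_i)\cdot x} \tag{6.147}$$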
Considering now, as an example, the classical case of the
Rutherford scattering (i.e., the elastic scattering of a spin-0
positive particle with charge e
by a Coulomb potential originated by a static point particle
(infinite mass) with a charge Ze in the origin), we have:
with
Then
(6.148)
Factorizing the integrals in time and space and remarking that
(6.149)
The first integral is in fact a $\delta$ function which ensures energy
conservation (there is no recoil of the scattering point particle
and therefore no energy transfer),
(6.150)
while the second integral gives
(6.151)
where
is the transferred momentum.
The transition amplitude for the
Rutherford scattering is, in this way, given by:
(6.152)
The corresponding differential cross section can now be computed
applying the relativistic Fermi golden rule discussed in
Chap. 2:
(6.153)
Taking into account the convention adopted for:
the invariant wave function
normalization factor:
the invariant phase space:
the incident flux for a single
incident particle:
then
(6.154)
Since
we find again the Rutherford differential cross section, previously
obtained in the classical mechanics and in the nonrelativistic
quantum mechanical frameworks (Chap. 2):
(6.155)
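As a numerical illustration, the result can be sketched in a few lines of Python. This is only a sketch under stated assumptions: natural units, a cross section written as $d\sigma/d\Omega = Z^2\alpha^2E^2/(4p^4\sin^4(\theta/2))$ (which follows from an amplitude proportional to $2E\,Ze^2/|\mathbf{q}|^2$ with $|\mathbf{q}|^2 = 4p^2\sin^2(\theta/2)$), and illustrative function names that do not appear in the text. In the nonrelativistic limit it should reduce to the classical expression $Z^2\alpha^2/(16T^2\sin^4(\theta/2))$, with $T = p^2/2m$ the kinetic energy.

```python
import math

ALPHA = 1 / 137.035999  # fine structure constant (low-energy value)


def rutherford_spin0(p, m, theta, Z=1):
    """Spin-0 Rutherford cross section (GeV^-2 per steradian, natural units).

    Assumed form: (2 Z alpha E)^2 / |q|^4 with |q|^2 = 4 p^2 sin^2(theta/2).
    """
    E = math.hypot(p, m)                      # E = sqrt(p^2 + m^2)
    q2 = 4 * p**2 * math.sin(theta / 2) ** 2  # squared momentum transfer
    return (2 * Z * ALPHA * E) ** 2 / q2**2


def rutherford_classical(kinetic_energy, theta, Z=1):
    """Classical (nonrelativistic) Rutherford formula in the same units."""
    return (Z * ALPHA) ** 2 / (16 * kinetic_energy**2 * math.sin(theta / 2) ** 4)


# Slow particle: p = 1 MeV, m = 1 GeV (values chosen only for illustration)
p, m, theta = 1e-3, 1.0, 1.0
print(rutherford_spin0(p, m, theta), rutherford_classical(p**2 / (2 * m), theta))
```

The two printed numbers agree up to corrections of order $p^2/m^2$, and the $\sin^{-4}(\theta/2)$ dependence makes the cross section grow steeply at small angles.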
6.2.7.2 Elastic Scattering of Two Nonidentical Charged Spinless Particles
The interaction of two charged
particles can be treated as the interaction of one of the particles
with the field created by the other (which thus acts as the source
of the field).
The initial and final states of
particle 1 are labeled as the states A and C, respectively, while
for the particle 2 (taken as the source of the field) the
corresponding labels are B and D (see Fig. 6.8, right). Let us assume
that particles 1 and 2 are not of the same type (otherwise they
would be indistinguishable) and have charge e. Then:
(6.156)
with
(6.157)
Being generated by the current associated
with particle 2 (see Sect. 6.2.1)
(6.158)
with
(6.159)
defining the exchanged four-momentum q as:
and since
(6.160)
the field is given by
(6.161)
Therefore
(6.162)
Solving the integral ():
(6.163)
where ensures the conservation of
energy–momentum, and the amplitude is defined as
(6.164)
Fig. 6.9 Scattering of two charged particles in the center-of-mass reference frame
With $\theta$ the scattering angle in the
center-of-mass (c.m.) reference frame (see Fig. 6.9) and p the modulus of the momentum, also in the
c.m. frame, the four-vectors of the initial and final states at
high energy can be written as
Then:
and
(6.165)
On the other hand, the differential cross section of an elastic
two-body scattering between spinless nonidentical particles in the
c.m. frame is given by (see Sect. 2.9.7):
$$\frac{d\sigma}{d\Omega} = \frac{|\mathcal{M}|^2}{64\pi^2 s}$$
where $s = (p_A + p_B)^2$ is the square of the c.m. energy
(s is one of the Mandelstam
variables, see Sect. 2.9.6).
Thus:
$$\frac{d\sigma}{d\Omega} = \frac{\alpha^2}{4s}\left(\frac{3+\cos\theta}{1-\cos\theta}\right)^2 \tag{6.166}$$
where
$$\alpha = \frac{e^2}{4\pi} \tag{6.167}$$
is the fine structure constant.
Note that when $\theta \to 0$ the cross section diverges. This fact
is a consequence of the infinite range of the electromagnetic
interactions, translated into the fact that photons are
massless.
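The chain from the invariant amplitude to the angular distribution can be checked numerically. The sketch below assumes the high-energy (massless) limit, in which $(p_A+p_C)\cdot(p_B+p_D) = s-u$ and $q^2 = t$, so that the amplitude is $e^2(s-u)/t$ up to an overall sign convention; the function names are illustrative only.

```python
import math

ALPHA = 1 / 137.035999  # fine structure constant


def mandelstam_massless(p, theta):
    """Mandelstam variables in the c.m. frame for massless particles."""
    s = 4 * p**2
    t = -2 * p**2 * (1 - math.cos(theta))
    u = -2 * p**2 * (1 + math.cos(theta))
    return s, t, u


def dsigma_from_amplitude(p, theta):
    """dsigma/dOmega = |M|^2 / (64 pi^2 s), with M = e^2 (s - u) / t (assumed form)."""
    s, t, u = mandelstam_massless(p, theta)
    e2 = 4 * math.pi * ALPHA
    amplitude = e2 * (s - u) / t
    return amplitude**2 / (64 * math.pi**2 * s)


def dsigma_closed_form(p, theta):
    """Closed form: alpha^2 / (4 s) * ((3 + cos theta) / (1 - cos theta))^2."""
    s = 4 * p**2
    c = math.cos(theta)
    return ALPHA**2 / (4 * s) * ((3 + c) / (1 - c)) ** 2


for theta in (0.3, 1.0, 2.5):
    print(theta, dsigma_from_amplitude(10.0, theta), dsigma_closed_form(10.0, theta))
```

Both columns coincide, and $s+t+u = 0$ as expected when all masses are neglected.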
6.2.7.3 Feynman Diagram Rules
The invariant amplitude computed in
the previous subsection,
can be obtained directly from the Feynman diagram (Fig. 6.8, right) using
appropriate “Feynman rules.”
In particular, for this simple case,
the different factors present in the amplitude are:
the vertex factors: , corresponding to the vertex
A-C-photon, and , corresponding to the vertex
B-D-photon;
the propagator factor: , corresponding to the only internal
line, the exchanged photon, existing in the diagram.
The energy–momentum is conserved at
each vertex, which is trivially ensured by the definition of
.
6.2.8 Electron–Muon Elastic Scattering ($e^-\mu^- \to e^-\mu^-$)
Electron and muon have spin 1/2 and
are thus described by Dirac bi-spinors (see Sect. 6.2.4). The computation
of the scattering amplitudes is more complex than the one discussed
in the previous subsection for the case of spinless particles but
the main steps, summarized hereafter, are similar.
Fig. 6.10 Lowest-order Feynman diagram for electron–muon scattering
The Dirac equation in presence of an
electromagnetic field is written as
(6.168)
The corresponding current is
(6.169)
The transition amplitude for the electron (states A and C)/muon
(states B and D) scattering can then be written as
(Fig. 6.10):
(6.170)
where
(6.171)
(6.172)
with
Solving the integral,
(6.173)
where the amplitude is given by
(6.174)
The cross section is proportional to the square of the transition
amplitude (see the Fermi golden rule—Chap.
2). However, the amplitude written
above depends on the initial and final spin configurations. In
fact, as there are four possible initial configurations (two for
the electron and two for the muon) and also four possible final
configurations, there are sixteen such amplitudes to be computed.
Using the orthogonal helicity state basis (Sect. 6.2.4.3), each of these
amplitudes is independent (there is no interference between the
corresponding processes) and can be labeled according to the
helicities of the corresponding initial and final states. For
instance, if all the states have right (positive) helicity the
amplitude is labeled as $\mathcal{M}_{RR\to RR}$.
In the case of an experiment with
unpolarized beams (all the initial helicity configurations are
equiprobable) and in which no polarization measurements of the
helicities of the final states are made, the corresponding cross
section must be obtained averaging over the initial configurations
and summing over the final ones. A mean squared amplitude is then
defined as:
(6.175)
Luckily, in the limit of high energies (whenever the electron and
the muon masses can be neglected), many of these amplitudes are
equal to zero. Taking for example ,
(6.176)
the last factor corresponding to the muonic current is equal to
zero,
(6.177)
Indeed, remembering the definitions of the helicity eigenvectors
(Eqs. 6.102, Sect. 6.2.4.3), and of the
matrix (Sect. 6.2.4) and working in the
c.m. frame (), (), :
(6.178)
and since
(6.179)
(6.180)
then
(6.181)
The only amplitudes that are nonzero are those where the helicity
of the electron and the helicity of the muon are conserved, i.e.,
This fact is a direct consequence of the conservation of chirality
in the QED vertices and of the fact that, in the limit of high energies,
chirality and helicity coincide (see Sect. 6.3.4). If the fermion
masses cannot be neglected, all the currents are nonzero but the
total angular momentum of the interaction will be conserved, as it
should. In this case, the computation of the amplitudes is more
complex but the sum over all internal indices and products of
matrices can be considerably
simplified using the so-called trace theorems (for a pedagogical
introduction see for instance the books of Thomson [F6.1] and of
Halzen and Martin [F6.6]).
In the case of unpolarized beams, of
no polarization measurements of the helicities of the final states
and whenever masses can be neglected, the mean squared amplitude is
thus:
(6.182)
Each of the individual amplitudes is expressed as a function of
the electronic and muonic currents, which can be computed following
a procedure similar to the one sketched above for the computation
of . The relevant four-vector currents are:
(6.183)
(6.184)
(6.185)
(6.186)
and the amplitudes are given by:
(6.187)
(6.188)
(6.189)
(6.190)
The angular dependence of the denominators reflects the t channel character of this interaction
() while the angular dependence of the
numerators reflects the total angular momentum of the initial and
final states ( and correspond to initial and final
states with a total angular momentum , the other two amplitudes correspond
to initial and final states with a total angular momentum
).
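The mean squared amplitude
(Eq. 6.182) is
now easily computed to be:
$$\langle|\mathcal{M}|^2\rangle = 2e^4\,\frac{4 + (1+\cos\theta)^2}{(1-\cos\theta)^2} \tag{6.191}$$
This amplitude is often expressed in terms of the Mandelstam
variables s, t, u, as:
$$\langle|\mathcal{M}|^2\rangle = 2e^4\,\frac{s^2 + u^2}{t^2} \tag{6.192}$$
since, in this case, $s = 4p^2$, $t = -2p^2(1-\cos\theta)$, and $u = -2p^2(1+\cos\theta)$.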
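The spin average can be made concrete by enumerating the sixteen helicity configurations explicitly. The sketch below assumes the massless limit and takes, as an assumption, the magnitudes $4e^2/(1-\cos\theta)$ for the two configurations in which electron and muon have the same helicity and $2e^2(1+\cos\theta)/(1-\cos\theta)$ for the two with opposite helicities (overall signs and phases drop out of the squares); the function names are hypothetical.

```python
import math
from itertools import product


def helicity_amplitude(h_e_in, h_mu_in, h_e_out, h_mu_out, theta, e2=1.0):
    """Magnitude of the helicity amplitude for massless e mu -> e mu (assumed forms)."""
    # Helicity is conserved separately along the electron and the muon lines.
    if h_e_in != h_e_out or h_mu_in != h_mu_out:
        return 0.0
    c = math.cos(theta)
    if h_e_in == h_mu_in:                  # LL->LL and RR->RR
        return 4 * e2 / (1 - c)
    return 2 * e2 * (1 + c) / (1 - c)      # LR->LR and RL->RL


def mean_squared_amplitude(theta, e2=1.0):
    """Average over the 4 initial and sum over the 4 final helicity configurations."""
    total = sum(helicity_amplitude(*h, theta, e2) ** 2
                for h in product("LR", repeat=4))
    return total / 4


def mean_squared_mandelstam(theta, e2=1.0, p=1.0):
    """Same quantity written as 2 e^4 (s^2 + u^2) / t^2."""
    s = 4 * p**2
    t = -2 * p**2 * (1 - math.cos(theta))
    u = -2 * p**2 * (1 + math.cos(theta))
    return 2 * e2**2 * (s**2 + u**2) / t**2


print(mean_squared_amplitude(1.0), mean_squared_mandelstam(1.0))
```

Only four of the sixteen terms contribute, and the explicit sum reproduces the Mandelstam form for any value of p, since the ratio depends only on the scattering angle.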
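Remembering once again the Fermi
golden rule for the differential cross section of two-body elastic
scattering discussed in Chap. 2, we then have in the c.m. reference
frame:
$$\frac{d\sigma}{d\Omega} = \frac{\langle|\mathcal{M}|^2\rangle}{64\pi^2 s} = \frac{\alpha^2}{2s}\,\frac{4 + (1+\cos\theta)^2}{(1-\cos\theta)^2} \tag{6.193}$$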
which in the laboratory reference frame (muon at rest) is converted
to:
(6.194)
This is the Rosenbluth formula referred to in
Sect. 5.5.1.
6.2.9 Feynman Diagram Rules for QED
The invariant amplitude computed in
the previous subsection,
can be obtained directly from the Feynman diagram (Fig. 6.10) using appropriate
“Feynman rules.”
The Feynman rules consist in drawing
all topologically distinct and connected Feynman diagrams for a
given process and making the product of appropriate multiplicative
factors associated with the various elements of each diagram.
In particular the different factors
present in the amplitude computed in the previous subsection are:
the vertex factors: ;
the propagator factor: , corresponding to the only internal
line, the exchanged photon;
the external lines factors: for the
initial particles A and B, the spinors and ; for the final particles C and D, the
adjoint spinors and ,
and again energy–momentum conservation is imposed at each
vertex.
The Dirac currents (e.g.,
) involve both the electric and
magnetic interactions of the charged spin 1/2 particles. This can
be explicitly shown using the so-called Gordon
decomposition of the vectorial current,
(6.195)
where the tensor is defined as
(6.196)
Higher-order terms correspond to more complex diagrams which may
have internal loops and fermion internal lines (see
Fig. 6.14). In this case, the factor associated with
each internal fermion line is
and one should not forget that every internal four-momentum loop
has to be integrated over the full momentum range.
The complete set of Feynman diagram
rules for QED should thus involve all the possible particles
and antiparticles (spin 0, spin 1/2, spin 1) in the external and
internal lines.
Multiplicative factors associated
with each element of Feynman diagrams in the Feynman rules are
summarized in Table 6.1 (from Ref. [F6.6]).
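Table 6.1 Feynman rules for $-i\mathcal{M}$
Element                                   Multiplicative factor
External Lines
  Spin-0 boson                            1
  Spin-1/2 fermion (in, out)              $u$, $\bar{u}$
  Spin-1/2 antifermion (in, out)          $\bar{v}$, $v$
  Spin-1 photon (in, out)                 $\epsilon_\mu$, $\epsilon_\mu^*$
Internal Lines (Propagators)
  Spin-0 boson                            $i/(p^2 - m^2)$
  Spin-1/2 fermion                        $i(\gamma^\mu p_\mu + m)/(p^2 - m^2)$
  Massive spin-1 boson                    $-i(g_{\mu\nu} - p_\mu p_\nu/M^2)/(p^2 - M^2)$
  Massless spin-1 boson (Feynman gauge)   $-ig_{\mu\nu}/p^2$
Vertex Factors
  Photon, spin-0 (charge e)               $-ie(p + p')^\mu$
  Photon, spin-1/2 (charge e)             $-ie\gamma^\mu$
Loops: $\int d^4k/(2\pi)^4$ over loop momentum; include $-1$ if fermion loop and take the trace of associated $\gamma$-matrices
Identical fermions: $-1$ between diagrams which differ only in $e^- \leftrightarrow e^-$ or initial $e^- \leftrightarrow$ final $e^+$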
The total amplitude at a given order is
then obtained adding up the amplitudes corresponding to all the
diagrams that can be drawn up to that order. Minus signs
(antisymmetrization) must be included between diagrams that differ
only in the interchange of two incoming or outgoing fermions (or
antifermions), or of an incoming fermion with an outgoing
antifermion (or vice versa).
Some applications follow in the next
subsections.
6.2.10 Muon Pair Production from $e^-e^+$ Annihilation ($e^-e^+ \to \mu^-\mu^+$)
Applying directly the Feynman diagram
rules discussed above, the invariant amplitude for $e^-e^+ \to \mu^-\mu^+$ (see Fig. 6.11) is:
(6.197)
where the spinors v are used to
describe the antiparticles.
Fig. 6.11 Lowest-order Feynman diagram for electron–positron annihilation into a muon pair
As we already know this amplitude
depends on the initial and final spin configurations and each
configuration can be computed independently. In the limit where
masses can be neglected it can be shown, similarly to the case of
the $e^-\mu^-$ channel discussed above, that only
four helicity combinations give a nonzero result. These
configurations correspond to initial and final states and:
where is the angle in the c.m. reference
frame between the electron and the muon.
The angular dependence of these
amplitudes could have been predicted by observing the total angular
momentum of the initial states. In fact, these amplitudes
correspond, as stated before, to initial and final states with a
total angular momentum $J = 1$. The projection of the initial and
final angular momentum along the beam direction then implies, according to the
quantum mechanics spin-1 rotation matrices, the factor $(1 \pm \cos\theta)$.
Once again, in the case of an
experiment with unpolarized beams and in which no polarization
measurements of the helicities of the final states are made, the
cross section is obtained by averaging over the initial configurations
and summing over the final ones. The mean squared amplitude is
therefore:
$$\langle|\mathcal{M}|^2\rangle = e^4\left(1+\cos^2\theta\right) \tag{6.198}$$
The differential cross section in the c.m. reference frame is then
given by:
$$\frac{d\sigma}{d\Omega} = \frac{\alpha^2}{4s}\left(1+\cos^2\theta\right) \tag{6.199}$$
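To get a feeling for the numbers, the angular distribution can be integrated over the solid angle and converted to nanobarns. The sketch below assumes the massless lowest-order form $d\sigma/d\Omega = \alpha^2(1+\cos^2\theta)/(4s)$, the low-energy value of $\alpha$, and the conversion $(\hbar c)^2 \simeq 0.3894$ GeV$^2$ mb; the function names are illustrative.

```python
import math

ALPHA = 1 / 137.035999       # fine structure constant (low-energy value)
GEV2_TO_NB = 0.3893794e6     # (hbar c)^2 in GeV^2 nb


def dsigma_domega(sqrt_s, cos_theta):
    """Lowest-order e+ e- -> mu+ mu- angular distribution in GeV^-2 (massless limit)."""
    s = sqrt_s**2
    return ALPHA**2 / (4 * s) * (1 + cos_theta**2)


def total_cross_section_nb(sqrt_s, n_steps=10000):
    """Integrate over cos(theta) with the midpoint rule; the azimuth gives 2*pi."""
    h = 2.0 / n_steps
    integral = sum(dsigma_domega(sqrt_s, -1 + (i + 0.5) * h)
                   for i in range(n_steps)) * h
    return 2 * math.pi * integral * GEV2_TO_NB


sqrt_s = 10.0  # GeV, chosen only for illustration
print(total_cross_section_nb(sqrt_s))                            # numerical integral
print(4 * math.pi * ALPHA**2 / (3 * sqrt_s**2) * GEV2_TO_NB)     # 4 pi alpha^2 / (3 s)
```

The numerical integral reproduces the closed form $\sigma = 4\pi\alpha^2/(3s)$, i.e., roughly 87 nb divided by s expressed in GeV$^2$.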
Finally, one should note that the mean squared amplitude obtained
above can also be expressed in terms of the Mandelstam variables
s, t, u, as:
$$\langle|\mathcal{M}|^2\rangle = 2e^4\,\frac{t^2+u^2}{s^2} \tag{6.200}$$
This formula is equivalent to the one obtained in the case of the
elastic scattering of $e^-$ and $\mu^-$ (see Eq. 6.192) if one makes the
following correspondences between the Mandelstam variables computed
in the two channels:
$$s \leftrightarrow t, \qquad u \leftrightarrow u \tag{6.201}$$
In fact, the scattering (t
channel) and the pair production (s channel) Feynman diagrams can be
transformed into each other by exchanging an incoming (in) external line with an outgoing
(out) external line and
transforming in this operation the corresponding particle into its
antiparticle with opposite momentum and helicity (and vice versa).
These exchanges translate into exchanging the four-momenta as
follows:
Such relations between amplitudes corresponding to similar Feynman
diagrams are called Crossing Symmetries.
Fig. 6.12 Feynman diagrams contributing at first order to the Bhabha cross section
6.2.11 Bhabha Scattering ($e^-e^+ \to e^-e^+$)
Two first-order (tree level) diagrams
(Fig. 6.12) contribute to this process:
The first diagram corresponds to the
exchange of a photon in the s
channel and is, if masses are neglected, identical to the
$e^-e^+ \to \mu^-\mu^+$ diagram we computed above:
(6.202)
The second diagram corresponds to the
exchange of a photon in the t
channel and is, if masses are neglected, similar (just exchanging a
particle with an antiparticle) to the $e^-\mu^-$ scattering diagram computed above:
(6.203)
The total amplitude is the sum of
these two amplitudes:
(6.204)
The minus sign comes from the antisymmetrization imposed by the
Fermi statistics, and it is included in the Feynman rules (see
Sect. 6.2.9).
Remembering the amplitudes computed
before for the s and t channels, the nonzero spin
configuration amplitudes are:
One should note that the and amplitudes are the sum of two
amplitudes corresponding to the s and t channels and therefore when squaring
them interference terms will appear.
The mean squared amplitude is, in the
case of an experiment with unpolarized beams and in which no
polarization measurements of the helicities of the final states are
made:
(6.205)
that, using the Mandelstam variables, gives (for a more detailed
calculation see reference [F6.8]):
(6.206)
or
(6.207)
The first and second terms correspond to the mean squared
amplitudes obtained, respectively, for the s and the t channels and the third is the
contribution from the interference terms discussed above.
Since, in the center-of-mass
reference frame,
and
the mean squared amplitude can be expressed as:
(6.208)
Finally the differential cross section in the c.m. reference frame
is:
(6.209)
This differential cross section is highly peaked forward (in the
limit of massless fermions it diverges).
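The three contributions (s channel, t channel, and interference) can be compared numerically. The sketch below assumes the standard massless lowest-order result $d\sigma/d\Omega = \frac{\alpha^2}{2s}\left[\frac{t^2+u^2}{s^2} + \frac{s^2+u^2}{t^2} + \frac{2u^2}{st}\right]$ with $t = -s\sin^2(\theta/2)$ and $u = -s\cos^2(\theta/2)$; the function names are illustrative.

```python
import math

ALPHA = 1 / 137.035999  # fine structure constant


def bhabha_mandelstam(sqrt_s, theta):
    """Lowest-order massless Bhabha dsigma/dOmega (GeV^-2) from s, t, u (assumed form)."""
    s = sqrt_s**2
    t = -s * math.sin(theta / 2) ** 2
    u = -s * math.cos(theta / 2) ** 2
    s_channel = (t**2 + u**2) / s**2
    t_channel = (s**2 + u**2) / t**2
    interference = 2 * u**2 / (s * t)
    return ALPHA**2 / (2 * s) * (s_channel + t_channel + interference)


def bhabha_angular(sqrt_s, theta):
    """The same expression written directly in terms of the scattering angle."""
    s = sqrt_s**2
    c = math.cos(theta)
    sin2 = math.sin(theta / 2) ** 2
    cos4 = math.cos(theta / 2) ** 4
    bracket = (1 + c * c) / 2 + (1 + cos4) / sin2**2 - 2 * cos4 / sin2
    return ALPHA**2 / (2 * s) * bracket


for theta in (0.1, 0.5, math.pi / 2, 2.5):
    print(theta, bhabha_mandelstam(91.0, theta), bhabha_angular(91.0, theta))
```

The t-channel term dominates at small angles, which is where the forward peak comes from; the sum of the three terms can also be written compactly as $\frac{\alpha^2}{4s}\left(\frac{3+\cos^2\theta}{1-\cos\theta}\right)^2$.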
Fig. 6.13 Differential Bhabha cross section measured by L3 collaboration at GeV. From L3 Collaboration, Phys. Lett. B623 (2005) 26
The agreement between the QED
predictions (including higher-order diagrams) and the experimental
measurements is so remarkable (Fig. 6.13) that this process
was used at LEP to determine the beam luminosity thanks to small
but precise calorimeters installed at low angles.
Fig. 6.14 A higher-order diagram with a fermion loop
6.2.12 Renormalization and Vacuum Polarization
High-order diagrams often involve
closed loops where integration over momentum should be performed
(see Fig. 6.14). As these loops are virtual, they
represent phenomena that occur in timescales compatible with the
Heisenberg uncertainty relations. Since there is no limit on the
range of the integration and on the number of diagrams, the
probabilities may a priori diverge to infinity. We shall see,
however, that the effect of higher-order diagrams is the
redefinition of some quantities; for example, the “bare” (naked)
charge of the electron becomes a new quantity e that we measure in experiments. A
theory with such characteristics—i.e., a theory for which the
series of the contributions from all diagrams converges—is said to
be renormalizable.
To avoid confusion in what follows,
shall call now the “pure” electromagnetic
coupling.
Following the example of the
amplitude corresponding to the diagram represented in
Fig. 6.14, the photon propagator is modified by the
introduction of the integration over the virtual
fermion/antifermion loop leading to
where is the “bare” coupling parameter
, in the case of QED; refers to the “bare” coupling,
without renormalization).
The integral can be computed by
setting some energy cutoff M
and taking the limit $M \to \infty$ at the end of the calculation. Then
it can be shown that
having dimensions of [m], and
The divergence is now logarithmic but it is still present.
Fig. 6.15 Higher-order diagrams with a fermion loop leading to the renormalization of the fermion mass (left) and of the magnetic moment (right)
The “renormalization miracle” consists in absorbing the
infinity in the definition of the coupling parameter. Defining
(6.210)
and neglecting terms (for that many other diagrams
have to be summed up, but the associated probability is expected to
become negligible)
is no longer divergent but the coupling
parameter (the electric charge) is now a
function of :
(6.211)
Other diagrams, such as those represented in Fig. 6.15, lead to the
renormalization of fundamental constants. In the left diagram,
“emission” and “absorption” of a virtual photon by one of the
fermion external lines contribute to the renormalization of the
fermion mass, while in the one on the right, “emission” and
“absorption” of a virtual photon between the fermion external lines
from a same vertex contribute to the renormalization of the fermion
magnetic moment and thus are central in the calculation of
as discussed in
Sect. 6.2.4.6. The contribution of these kinds of
diagrams to the renormalization of the charge cancels out, ensuring
that the electron and the muon charges remain the same.
The result in Eq. 6.211 can be written at
first order as
$$\alpha(q^2) \simeq \alpha(\mu^2)\,\frac{1}{1 - \frac{\alpha(\mu^2)}{3\pi}\ln\left(\frac{q^2}{\mu^2}\right)} \tag{6.212}$$
The electromagnetic coupling can be obtained by an appropriate
renormalization of the electron charge defined at an arbitrary
scale $\mu^2$. The electric charge, and the
electromagnetic coupling parameter, “run” and increase with
$q^2$. At momentum transfers close to the
electron mass $\alpha \simeq 1/137$, while close to the Z mass $\alpha \simeq 1/128$. The “running” behavior of the
coupling parameters is not a mathematical artifact: it is
experimentally well established that the strength of the
electromagnetic interaction between two charged particles increases
as the center-of-mass energy of the collision increases
(Fig. 6.16).
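The size of the effect can be estimated with a short numerical sketch. It assumes the first-order expression $\alpha(q^2) = \alpha(\mu^2)/\left[1 - \frac{\alpha(\mu^2)}{3\pi}\ln(q^2/\mu^2)\right]$ with a single fermion (the electron) in the loop, a reference scale $\mu$ equal to the electron mass, and a hypothetical function name.

```python
import math


def alpha_running(q2, alpha_mu=1 / 137.035999, mu2=0.000511**2):
    """One-loop running of alpha with an electron loop only (q2 and mu2 in GeV^2)."""
    return alpha_mu / (1 - alpha_mu / (3 * math.pi) * math.log(q2 / mu2))


m_z = 91.1876  # GeV
print(1 / alpha_running(0.000511**2))  # 1/alpha at the reference scale
print(1 / alpha_running(m_z**2))       # 1/alpha at the Z mass, electron loop only
```

With the electron loop alone, $1/\alpha$ decreases only from about 137 to about 134.5 at the Z mass. Reaching the measured value close to 128 requires the loops of all charged fermions (the other leptons and the quarks, with their charges and colors), which this sketch deliberately leaves out.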
Fig. 6.16 Evolution of the QED effective coupling parameter with momentum transfer. The theoretical curve is compared with measurements at the Z mass at CERN’s LEP collider. From CERN Courier, August 2001
Fig. 6.17 Left: Artistic representation of the screening of a charge by its own cloud of virtual charged particle–antiparticle pairs. Right: Artistic view of the Casimir effect. From the Scientific American blog of Jennifer Ouellette, April 19, 2012
Such an effect can be qualitatively
described by the polarization of the cloud of the virtual
fermion/antifermion pairs (mainly electron/positron) by the
“bare” charge that is at the same time the source of the
electromagnetic field (Fig. 6.17, left). This bare charge is screened by
this polarized medium and its intensity decreases with the distance
to the charge (increases with the square of the transferred
momentum).
Even in the absence of any “real”
matter particle (i.e., in the vacuum), there is no empty space in
quantum field theory. A rich spectrum of virtual wave particles
(e.g., photons) can be created and destroyed under the protection
of the Heisenberg uncertainty relations and within its limits be
transformed into fermion/antifermion pairs. Space is thus full
of electromagnetic waves and the energy of its ground state (the
zero point energy ) is, like the ground state of any harmonic
oscillator, different from zero. The integral over all space of
this ground-state energy will be infinite, which leads to an
enormous challenge to theoretical physicists: what is the relation
of this effect with a nonzero cosmological constant which may
explain the accelerated expansion of the Universe observed in
recent years, as discussed in Sect. 8.1?
A spectacular consequence is the
attraction experienced by two neutral conducting plates when
placed face to face at very short distances, typically of the order
of the micrometer (see Fig. 6.17, right). This effect is known as the
Casimir effect, since it was predicted by
Hendrik Casimir5 in 1948 and later
experimentally demonstrated. The two plates impose boundary
conditions to the electromagnetic waves originated by the vacuum
fluctuations, and the total energy decreases with the distance in
such a way that the net result is a very small but measurable
attractive force.
A theory is said to be renormalizable
if (as in QED) all the divergences at all orders can be absorbed
into physical constants; corrections are then finite at any order
of the perturbative expansion. The present theory of the so-called
standard model of particle physics was proven to be renormalizable.
In contrast, the quantization of general relativity leads easily to
non-renormalizable terms and this is one of the strong motivations
for alternative theories (see Chap. 7). Nevertheless, the fact that a
theory is not renormalizable does not mean that it is useless: it
might just be an effective theory that works only up to some
physical scale.
6.3 Weak Interactions
Weak interactions have short range and, contrary to the
other interactions, do not bind particles together. Their existence
was first revealed in $\beta$ decay, and their universality was the
object of many controversies until being finally established in the
second half of the twentieth century. All fermions have weak
charges and are thus subject to their subtle or dramatic effects.
The structure of the weak interactions was found to be similar to
the structure of QED, and this fact is at the basis of one of the
most important and beautiful pieces of theoretical work in the
twentieth century: the Glashow–Weinberg–Salam model of electroweak
interactions, which, together with the theory of strong
interactions (QCD), constitutes the standard model (SM) of particle physics, that will be discussed in the
next chapter.
There are however striking differences
between QED and weak interactions: parity is conserved, as it was
expected, in QED, but not in weak interactions; the relevant
symmetry group in weak interactions is SU(2) (fermions are grouped in left doublets and
right singlets) while in QED the symmetry group is U(1); in QED there is only one massless vector
boson, the photon, while weak interactions are mediated by three
massive vector bosons, the $W^\pm$ and the
$Z$.
6.3.1 The Fermi Model of Weak Interactions
$\beta$ decay had been known for a long time when
Enrico Fermi realized in 1933 that the
associated transition amplitude could be written in a way similar to
QED (see Sect. 6.2.8). Assuming time reversal symmetry (see
discussion on crossing symmetries at the end of Sect. 6.2.10), one can see that
the transition amplitude for $\beta$ decay,
$$n \to p\, e^-\, \bar{\nu}_e \tag{6.213}$$
is, for instance, the same as:
$$\nu_e\, n \to p\, e^- \tag{6.214}$$
The transition amplitude can then be seen as the interaction of a
hadronic and a leptonic current (Fig. 6.18) and may be written,
in analogy to the electron–muon elastic scattering discussed before
(Fig. 6.10), as
(6.215)
Fig. 6.18 Current–current description of the $\beta$ decay in the Fermi model
Contrary to QED, in the Fermi model of
weak interactions fermions change their identity in the interaction
($n \to p$; $\nu_e \to e^-$), currents mix different charges (the
electric charges of the initial states are not the same as those of
the final states) and there is no propagator (the currents meet at
a single point: we are dealing with a contact interaction).
The coupling parameter $G_F$, known nowadays as the Fermi constant, replaces the factor $e^2/q^2$ present in the QED amplitudes
and thus has dimensions $E^{-2}$ (GeV$^{-2}$ in natural units). Its order of
magnitude, deduced from the measurements of the $\beta$ decay rates, is $G_F \sim 10^{-5}$ GeV$^{-2}$ (see Sect. 6.3.3). Assuming
point-like interactions has striking consequences: the Fermi weak
interaction cross sections diverge at high energies. On a
dimensional basis, one can deduce for instance that the
neutrino–nucleon cross section behaves like:
$$\sigma \sim G_F^2\, s \tag{6.216}$$
The cross section grows with the square of the center-of-mass
energy, and this behavior is indeed observed in low-energy neutrino
scattering experiments.
However, from quantum mechanics, it
is well known that a cross section can be decomposed in a sum over
all the possible angular momenta l and then
$$\sigma \le \frac{4\pi}{k^2}\sum_{l}(2l+1) \tag{6.217}$$
Since $\lambda = 1/k$, this relation just means that the
contribution of each partial wave is bounded and its scale is given
by the area ($\pi\lambda^2$) “seen” by the incident particle. In a
contact interaction, the impact parameter is zero and so the only
possible contribution is the S
wave ($l = 0$). Thus, the neutrino–nucleon cross
section cannot increase forever. Given the magnitude of the Fermi
constant $G_F$, the Fermi model of weak interactions
cannot be valid for center-of-mass energies above a few hundred
GeV (this bound is commonly said to be imposed by unitarity in the
sense that the probability of an interaction cannot be larger than
1).
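The order of magnitude of this bound can be estimated by equating the dimensional estimate of the cross section with the S-wave limit. The sketch below is only indicative: it assumes $\sigma \sim G_F^2 s$ with all numerical factors dropped, the S-wave bound $4\pi/k^2$ with $k = \sqrt{s}/2$ (massless particles), and hypothetical function names.

```python
import math

G_F = 1.1663787e-5  # Fermi constant in GeV^-2


def fermi_cross_section(s):
    """Dimensional estimate sigma ~ G_F^2 s (GeV^-2), numerical factors dropped."""
    return G_F**2 * s


def s_wave_bound(s):
    """S-wave unitarity bound 4 pi / k^2 with c.m. momentum k = sqrt(s) / 2."""
    k = math.sqrt(s) / 2
    return 4 * math.pi / k**2


def unitarity_limit_sqrt_s():
    """Energy at which the two expressions meet: G_F^2 s = 16 pi / s."""
    return math.sqrt(4 * math.sqrt(math.pi) / G_F)


print(unitarity_limit_sqrt_s())  # c.m. energy in GeV
```

The crossing point falls at several hundred GeV; a careful treatment keeping all numerical factors lowers the value somewhat, but the order of magnitude is the same.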
In 1938 Oskar Klein suggested that the
weak interactions may be mediated by a new field of short range,
the weak field, whose massive charged bosons (the $W^\pm$) act as propagators. In practice (see
Sect. 6.3.5),
(6.218)
Within this frame the weak cross sections no longer diverge and
the Fermi model is a low-energy approximation which is valid
whenever the center-of-mass energy $\sqrt{s} \ll m_W$ ($m_W \simeq 80$ GeV).
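How good the contact approximation is can be quantified by comparing a massive propagator with its low-energy limit. The sketch below assumes, as is standard for a massive vector boson, a propagator factor proportional to $1/(q^2 - m_W^2)$, which tends to $-1/m_W^2$ when $|q^2| \ll m_W^2$; the function name is illustrative.

```python
M_W = 80.4  # W boson mass in GeV (approximate)


def propagator_over_contact(q2):
    """Ratio of 1/(q^2 - M_W^2) to its contact limit -1/M_W^2 (q2 in GeV^2)."""
    return 1.0 / (1.0 - q2 / M_W**2)


# Spacelike momentum transfers q^2 = -Q^2, as in t-channel exchange
for big_q in (1.0, 10.0, 80.4, 300.0):
    print(big_q, propagator_over_contact(-big_q**2))
```

For momentum transfers of a few GeV the ratio differs from 1 by less than a percent, so the Fermi description is an excellent approximation there, while at $Q \simeq m_W$ the amplitude is already halved.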
The discovery of the muon extended
the applicability of the Fermi model of weak interactions. Bruno
Pontecorvo realized in the late 1940s that the capture of a muon by
a nucleus,
Current–current description of the muon decay
in the Fermi model
Although $\beta$ and $\mu$ decays are due to the same type of
interaction, their phenomenology is different:
the neutron lifetime is about 900 s while the muon lifetime is
2.2 $\mu$s;
the energy spectrum of the decay
electron is in both cases continuous (three-body decay) but its
shape is quite different (Fig. 6.20). While in
$\beta$ decay it vanishes at the endpoint, in
the case of $\mu$ decay it is clearly nonzero.
These striking differences are
basically a reflection of the decay kinematics.
Fig. 6.20 Electron energy spectrum in $\beta$ decay of thallium 206 (left) and in
$\mu$ decay (right). Sources: F.A. Scott,
Phys. Rev. 48 (1935) 391; ICARUS Collaboration (S. Amoruso et al.),
Eur. Phys. J. C33 (2004) 233
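Using once again dimensional
arguments, the decay width of these particles should behave as
$$\Gamma \sim G_F^2\, \Delta m^5 \tag{6.220}$$
where $\Delta m$ is the energy released in the decay.
In the case of the $\beta$ decay of the neutron:
$$\Delta m \simeq m_n - m_p \simeq 1.29~\mathrm{MeV}$$
while in the $\mu$ decay
$$\Delta m \simeq m_\mu \simeq 105~\mathrm{MeV}$$
and therefore
$$\Gamma_\mu \gg \Gamma_n , \quad \text{i.e.,} \quad \tau_\mu \ll \tau_n .$$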
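As a rough numerical illustration of this scaling, the following sketch compares the ratio of widths expected from $\Gamma \sim G_F^2 \Delta m^5$ with the ratio of the measured lifetimes. The mass differences used (about 1.29 MeV for the neutron and about 105.7 MeV for the muon) and the variable names are assumptions chosen for the example; a purely dimensional argument is only expected to reproduce the order of magnitude.

```python
DM_NEUTRON = 1.293    # MeV, approximately m_n - m_p (assumed input)
DM_MUON = 105.66      # MeV, approximately m_mu (assumed input)
TAU_NEUTRON = 900.0   # s, value quoted in the text
TAU_MUON = 2.2e-6     # s, value quoted in the text


def width_ratio(dm_a: float, dm_b: float) -> float:
    """Ratio Gamma_a / Gamma_b assuming Gamma ~ G_F^2 * dm^5 (G_F cancels)."""
    return (dm_a / dm_b) ** 5


predicted = width_ratio(DM_MUON, DM_NEUTRON)   # Gamma_mu / Gamma_n from scaling
observed = TAU_NEUTRON / TAU_MUON              # Gamma_mu / Gamma_n from lifetimes

print(f"predicted Gamma_mu/Gamma_n ~ {predicted:.1e}")
print(f"observed  Gamma_mu/Gamma_n ~ {observed:.1e}")
```

The two ratios differ by roughly one order of magnitude, which is the level of agreement one can expect when phase-space factors and form factors are ignored.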
On the other hand, the shape of the electron energy spectrum at the
endpoint is determined by the available phase space. At the
endpoint, the electron is aligned against the other two decay
products but, while in the $\beta$ decay the proton is basically at rest
(or remains “imprisoned” inside the nucleus) and there is only one
possible configuration in the final state, in the case of
$\mu$ decay, as neutrinos have negligible
mass, the number of endpoint configurations is quite large
reflecting the different ways to share the remaining energy between
the neutrino and the antineutrino.
6.3.2 Parity Violation
The conservation of parity (see Sect. 5.3.6) was a dogma for physicists
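until the 1950s. Then, a puzzle appeared: apparently two strange
mesons, denominated $\theta^+$ and $\tau^+$ (we know nowadays that $\theta^+$ and $\tau^+$ are the same particle: the
$K^+$ meson), had
the same mass, the same lifetime but different parities according
to their decay modes:
$$\theta^+ \to \pi^+ \pi^0 \quad (\text{even parity}) \tag{6.221}$$
$$\tau^+ \to \pi^+ \pi^+ \pi^- \quad (\text{odd parity}) \tag{6.222}$$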
At the 1956 Rochester conference, the conservation of parity in
weak decays was questioned by Feynman reporting a suggestion of
Martin Block. A few months later, Lee and
Yang reviewed all the past experimental data and found that there
was no evidence of parity conservation in weak interactions, and
they proposed new experimental tests based on the measurement of
observables depending on axial vectors.
Fig. 6.21 Conceptual (left) and schematic (right)
diagram of the experimental apparatus used by Wu et al. (1957) to
detect the violation of the parity symmetry in $\beta$ decay. The green arrow in the left
panel indicates the direction of the electron flow through the
solenoid coils. The left plot comes from Wikimedia commons; the
right plot from the original article by Wu et al. Physical Review
105 (1957) 1413
C. S. Wu (known as “Madame Wu”) was
able, in a few months, to design and perform a $\beta$ decay experiment where nuclei of
$^{60}$Co (with total angular momentum
$J = 5$) decay into an excited state
$^{60}$Ni$^*$ (with total angular momentum
$J = 4$):
$${}^{60}\mathrm{Co} \to {}^{60}\mathrm{Ni}^* + e^- + \bar\nu_e \tag{6.223}$$
The $^{60}$Co was polarized (a strong magnetic field
was used, and the temperatures were as low as a few mK) and the
number of decay electrons emitted in the direction of (or opposite to)
the polarization field was measured (Fig. 6.21). The observed angle $\theta$
between the electron and the
polarization direction followed a distribution of the form:
$$I(\theta) \propto 1 - P\,\beta\cos\theta \tag{6.224}$$
where $P$ is the degree of
polarization of the nuclei and $\beta$ is the speed of the electron
normalized to the speed of light.
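To see what such a distribution implies for a counting experiment, the sketch below integrates an angular distribution of the assumed form $1 - P\beta\cos\theta$ over the forward and backward hemispheres and builds the forward-backward asymmetry. The functional form, the function names, and the sample values of $P$ and $\beta$ are assumptions used for illustration only.

```python
def hemisphere_rate(polarization: float, beta: float,
                    cos_lo: float, cos_hi: float, n_steps: int = 10_000) -> float:
    """Midpoint-rule integral of (1 - P*beta*cos(theta)) over cos(theta)."""
    step = (cos_hi - cos_lo) / n_steps
    total = 0.0
    for i in range(n_steps):
        cos_theta = cos_lo + (i + 0.5) * step
        total += (1.0 - polarization * beta * cos_theta) * step
    return total


def forward_backward_asymmetry(polarization: float, beta: float) -> float:
    """(N_forward - N_backward) / (N_forward + N_backward)."""
    forward = hemisphere_rate(polarization, beta, 0.0, 1.0)    # along the polarization
    backward = hemisphere_rate(polarization, beta, -1.0, 0.0)  # opposite to it
    return (forward - backward) / (forward + backward)


# Illustrative values: 60% polarization, electrons with beta = 0.6
asymmetry = forward_backward_asymmetry(0.6, 0.6)
print(f"forward-backward asymmetry = {asymmetry:+.3f}")
```

For this distribution the asymmetry equals $-P\beta/2$: it is negative (more electrons opposite to the polarization) and vanishes for unpolarized nuclei.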
The electrons were emitted
preferentially in the direction opposite to the polarization of the
nuclei, thus violating parity conservation. In fact under a parity
transformation, the momentum of the electron (a vector) reverses
its direction while the magnetic field (an axial vector) does not
(Fig. 6.22). Pauli placed a bet: “I don’t believe
that the Lord is a weak left-hander, and I am ready to bet a very
high sum that the experiment will give a symmetric angular
distribution of electrons,” and lost.
Fig. 6.22 Parity transformation of electron and
magnetic field direction. The Wu experiment preferred the right
side of the mirror to the left one
6.3.3 V-A Theory
The universality of the Fermi model of weak interactions was questioned long
before the Wu experiment. In the original Fermi model, only
$\beta$ decays in which there was no angular
momentum change in the nucleus (Fermi transitions) were allowed,
while the existence of $\beta$ decays where the spin of the nucleus
changed by one unit (the Gamow–Teller transitions) was already
well established. The Fermi model had to be generalized.
In the most general way, the currents
involved in the weak interactions could be written as a sum of
Scalar (S), Pseudoscalar (P), Vector (V), Axial (A), or Tensor (T)
terms following the Dirac bilinear forms referred to in
Sect. 6.2.4:
$$J_{1,\mu}\, J_2^{\mu} = \sum_{i,j} C_{ij}\,(\bar u_1 \Gamma_i u_2)(\bar u_3 \Gamma_j u_4) \tag{6.225}$$
where $C_{ij}$ are arbitrary complex constants and
the $\Gamma_i$, $\Gamma_j$ are S, P, V, A, T operators.
At the
end of 1956, George Sudarshan, a young Indian Ph.D. student working
at Rochester University under the supervision of Robert Marshak,
realized that the results on the electron–neutrino angular
correlation reported by several experiments were not consistent.
Sudarshan suggested that the weak interaction had a V-A structure. This structure was (in
Feynman's own words) “publicized by
Feynman and Gell-Mann” in 1958 in a widely cited
article.
Each vectorial current in the Fermi
model is, in the (V-A) theory, replaced by a vectorial minus an
axial-vectorial current. For instance, the neutrino–electron
vectorial current present in the $\beta$ decay and in the muon decay
amplitudes (Eqs. 6.215, 6.219, and Fig. 6.18, respectively):
$$\bar u_e\, \gamma^\mu\, u_{\nu_e} \tag{6.226}$$
is replaced by
$$\bar u_e\, \gamma^\mu (1-\gamma^5)\, u_{\nu_e} \tag{6.227}$$
In terms of the Feynman diagrams, the factor associated with the
vertex becomes
$$\gamma^\mu (1-\gamma^5) \tag{6.228}$$
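Within the (V-A) theory, the transition amplitude of the muon
decay, which is a golden example of a leptonic weak interaction,
can then be written as:
$$\mathcal{M} = \frac{G_F}{\sqrt{2}}\left[\bar u_{\nu_\mu}\gamma^\rho(1-\gamma^5)\,u_\mu\right]\left[\bar u_e\gamma_\rho(1-\gamma^5)\,u_{\nu_e}\right] \tag{6.229}$$
The factor $\sqrt{2}$ is introduced in order that
$G_F$ keeps the same numerical value. The
only relevant change in relation to the Fermi model is the
replacement:
$$\gamma^\mu \to \gamma^\mu(1-\gamma^5)$$
The muon lifetime can now be computed using the Fermi golden rule.
This detailed computation, which is beyond the scope of the present
text, leads to:
$$\tau_\mu = \frac{192\,\pi^3}{G_F^2\, m_\mu^5} \tag{6.230}$$
showing the $m_\mu^5$ dependence anticipated in
Sect. 6.3.1 based just on dimensional
arguments.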
In practice, it is the measurement of
the muon lifetime which is used to derive the value of the Fermi
constant:
$$G_F \simeq 1.166 \times 10^{-5}~\mathrm{GeV}^{-2} \tag{6.231}$$
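The extraction of the Fermi constant from the muon lifetime can be sketched numerically. The example assumes the lowest-order relation $\tau_\mu = 192\pi^3/(G_F^2 m_\mu^5)$ in natural units, with no radiative or electron-mass corrections, and converts the lifetime to GeV$^{-1}$ through $\hbar$. The input constants and function name are assumptions chosen for the illustration.

```python
import math

HBAR_GEV_S = 6.582119569e-25   # hbar in GeV*s, used to convert seconds to GeV^-1
TAU_MUON_S = 2.1969811e-6      # muon lifetime in seconds (assumed input)
M_MUON_GEV = 0.1056583745      # muon mass in GeV (assumed input)


def fermi_constant_from_muon(tau_s: float, mass_gev: float) -> float:
    """G_F in GeV^-2 from tau = 192*pi^3 / (G_F^2 * m^5), lowest order only."""
    tau_natural = tau_s / HBAR_GEV_S          # lifetime in GeV^-1
    g_f_squared = 192.0 * math.pi ** 3 / (tau_natural * mass_gev ** 5)
    return math.sqrt(g_f_squared)


g_f = fermi_constant_from_muon(TAU_MUON_S, M_MUON_GEV)
print(f"G_F = {g_f:.4e} GeV^-2")
```

The lowest-order result agrees with the quoted value at the level of a few per mille; the residual difference is the size expected from the neglected corrections.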
The transition amplitude of the $\beta$ decay can, in an analogous way, be
written as
$$\mathcal{M} = \frac{G_F}{\sqrt{2}}\left[\bar u_p\gamma^\mu(C_V - C_A\gamma^5)\,u_n\right]\left[\bar u_e\gamma_\mu(1-\gamma^5)\,u_{\nu_e}\right] \tag{6.232}$$
The $C_V$ and $C_A$ constants reflect the fact that the
neutron and the proton are not point-like particles and thus form
factors may lead to a change in their weak charges. Experimentally,
the measurement of many nuclear $\beta$ decays is compatible with the
preservation of the value of the “vector weak charge” and a 25%
change in the axial charge:
$$C_V \simeq 1.00 , \qquad C_A \simeq 1.25$$
The value of $G_F$ obtained from $\beta$ decays was found to be slightly lower (2%)
than the one found from the muon decay. This “discrepancy” was
cured with the introduction of the Cabibbo angle, as will be
discussed in Sect. 6.3.6.
6.3.4 “Left” and “Right” Chiral Particle States
The violation of parity in weak
interactions observed in the Wu experiment and embedded in the
(V-A) structure can be translated in terms of interactions between
particles with well-defined states of chirality.
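“Chiral” states are eigenstates of
$\gamma^5$, and they coincide with the helicity
states for massless particles; however, no such particles (massless
4-spinors) appear to exist, to our present knowledge: neutrinos have
a very tiny mass. The operators $\frac{1}{2}(1+\gamma^5)$ and $\frac{1}{2}(1-\gamma^5)$, when applied to a generic particle
bi-spinor $u$
(Sect. 6.2.4), project, respectively, on eigenstates
with chirality $+1$ (R, right) and $-1$ (L, left). Chiral particle spinors can thus
be defined as
$$u_L = \frac{1}{2}(1-\gamma^5)\,u \; ; \quad u_R = \frac{1}{2}(1+\gamma^5)\,u \tag{6.233}$$
with $u = u_L + u_R$. The adjoint spinors are given by
$$\bar u_L = \bar u\,\frac{1}{2}(1+\gamma^5) \; ; \quad \bar u_R = \bar u\,\frac{1}{2}(1-\gamma^5) \tag{6.234}$$
For antiparticles
$$v_L = \frac{1}{2}(1+\gamma^5)\,v \; ; \quad v_R = \frac{1}{2}(1-\gamma^5)\,v \tag{6.235}$$
$$\bar v_L = \bar v\,\frac{1}{2}(1-\gamma^5) \; ; \quad \bar v_R = \bar v\,\frac{1}{2}(1+\gamma^5) \tag{6.236}$$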
Chiral states are closely related to helicity states but they are
not identical. In fact, applying the chiral projection operators
defined above to the helicity eigenstates (Sect. 6.2.4) one obtains, for
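instance, for the right helicity eigenstate:
$$u_\uparrow \propto \frac{1}{2}\left(1+\frac{p}{E+m}\right)u_R + \frac{1}{2}\left(1-\frac{p}{E+m}\right)u_L \tag{6.237}$$
In the limit $m \to 0$ or $E \to \infty$, right helicity and right chiral
eigenstates coincide, otherwise not.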
There is also a subtle but important
difference: helicity is not Lorentz invariant but it is time
invariant , while chirality is Lorentz invariant
but it is not time invariant . The above relation is basically valid
for .
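Now, since
$$\gamma^\mu\,\frac{1-\gamma^5}{2} = \frac{1+\gamma^5}{2}\,\gamma^\mu\,\frac{1-\gamma^5}{2} \tag{6.238}$$
the weak (V-A) neutrino–electron current (Eq. 6.227) can be written
as:
$$\bar u_e\,\gamma^\mu(1-\gamma^5)\,u_{\nu_e} = 2\,\bar u_{e_L}\gamma^\mu u_{\nu_{eL}} \tag{6.239}$$
The weak charged leptonic current thus involves only left chiral
particles (and right chiral antiparticles).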
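The algebra behind the chiral projectors can be verified with explicit matrices. The following sketch builds the gamma matrices in the Dirac representation (a choice made here for convenience; the identities hold in any representation) and checks that $\frac{1}{2}(1\pm\gamma^5)$ are complementary projectors and that $\gamma^\mu P_L = P_R\,\gamma^\mu P_L$, which is the identity that turns a V-A current into a current between left chiral spinors. The names `P_L` and `P_R` are introduced only for this example.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
PAULI = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]

# Gamma matrices in the Dirac representation
gamma0 = np.block([[I2, Z2], [Z2, -I2]])
gammas = [gamma0] + [np.block([[Z2, s], [-s, Z2]]) for s in PAULI]
gamma5 = 1j * gammas[0] @ gammas[1] @ gammas[2] @ gammas[3]

I4 = np.eye(4, dtype=complex)
P_L = 0.5 * (I4 - gamma5)  # projects on chirality -1 (left)
P_R = 0.5 * (I4 + gamma5)  # projects on chirality +1 (right)

# Projector properties
is_idempotent = np.allclose(P_L @ P_L, P_L) and np.allclose(P_R @ P_R, P_R)
is_orthogonal = np.allclose(P_L @ P_R, np.zeros((4, 4)))
is_complete = np.allclose(P_L + P_R, I4)

# gamma^mu P_L = P_R gamma^mu P_L for every mu
current_identity = all(np.allclose(g @ P_L, P_R @ g @ P_L) for g in gammas)

print(is_idempotent, is_orthogonal, is_complete, current_identity)
```

The last check relies on the anticommutation of $\gamma^5$ with every $\gamma^\mu$, which moves a projector through $\gamma^\mu$ while flipping its chirality.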
In the case of the $^{60}$Co $\beta$ decay (the Wu experiment), the
electron and antineutrino masses can be neglected and so the
antineutrino must have right helicity and the electron left
helicity. Thus, as the electron and antineutrino have to add up
their spins to compensate the change by one unit in the spin
of the nucleus, the electron is preferentially emitted in the
direction opposite to the polarization of the nucleus
(Fig. 6.23).
Fig. 6.23 Schematic representation of the spin
alignment in the $^{60}$Co $\beta$ decay
The confirmation of the negative
helicity of neutrinos came from a sophisticated and elegant
experiment by M. Goldhaber, L. Grodzins, and A. Sunyar in 1957,
studying neutrinos produced in a K capture process ($e^- p \to n\,\nu_e$). A source emits europium nuclei
($^{152}$Eu, $J = 0$) on a polarized electron target
producing excited $^{152}$Sm$^*$ ($J = 1$) and a neutrino,
and the $^{152}$Sm$^*$ decays into the ground state
$^{152}$Sm ($J = 0$),
$$e^- + {}^{152}\mathrm{Eu} \to {}^{152}\mathrm{Sm}^* + \nu_e , \qquad {}^{152}\mathrm{Sm}^* \to {}^{152}\mathrm{Sm} + \gamma .$$
The longitudinal polarization of the decay photon was then
correlated with the helicity of the emitted neutrino in the K
capture process. The result was conclusive: neutrinos were indeed
left-handed particles.
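The accurate calculation of the ratio
of the decay width of charged pions into electron neutrinos with
respect to muon neutrinos was also one of the successes of the
(V-A) theory. According to (V-A) theory at first order:
$$\frac{BR(\pi^- \to e^-\bar\nu_e)}{BR(\pi^- \to \mu^-\bar\nu_\mu)} = \frac{m_e^2}{m_\mu^2}\left(\frac{m_\pi^2 - m_e^2}{m_\pi^2 - m_\mu^2}\right)^2 \simeq 1.28\times 10^{-4} \tag{6.240}$$
while at the time this ratio was first computed the experimental
limit was wrongly much smaller. In fact, the (V-A) theoretical
prediction is confirmed by the present experimental determination:
$$\frac{BR(\pi^- \to e^-\bar\nu_e)}{BR(\pi^- \to \mu^-\bar\nu_\mu)} = (1.230 \pm 0.004)\times 10^{-4} \tag{6.241}$$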
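The size of this helicity suppression is easy to evaluate. The sketch below assumes the lowest-order expression for the ratio of the electronic to the muonic width of the charged pion, $\frac{m_e^2}{m_\mu^2}\left(\frac{m_\pi^2-m_e^2}{m_\pi^2-m_\mu^2}\right)^2$, with no radiative corrections; the masses and the function name are assumptions introduced for the example.

```python
M_E = 0.51100     # MeV, electron mass (assumed input)
M_MU = 105.658    # MeV, muon mass (assumed input)
M_PI = 139.570    # MeV, charged pion mass (assumed input)


def pion_decay_ratio(m_e: float, m_mu: float, m_pi: float) -> float:
    """Lowest-order Gamma(pi -> e nu) / Gamma(pi -> mu nu)."""
    helicity_factor = (m_e / m_mu) ** 2
    phase_space_factor = ((m_pi**2 - m_e**2) / (m_pi**2 - m_mu**2)) ** 2
    return helicity_factor * phase_space_factor


ratio = pion_decay_ratio(M_E, M_MU, M_PI)
phase_space_only = ((M_PI**2 - M_E**2) / (M_PI**2 - M_MU**2)) ** 2
print(f"ratio with helicity suppression = {ratio:.3e}")
print(f"phase-space factor alone        = {phase_space_only:.2f}")
```

Phase space alone would favor the electron channel by a factor of about 5; it is the factor $m_e^2/m_\mu^2$ that brings the ratio down to the $10^{-4}$ level.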
In the framework of the (V-A) theory, if leptons were massless
these weak decays would be forbidden. In fact, the pion has spin 0,
the antineutrino is a right-handed particle and thus to conserve
angular momentum the helicity of the electron should be positive
(Fig. 6.24) which is impossible for a massless left
electron. However, the suppression of the decay into electron
neutrino relative to the decay into muon neutrino, contrary to what
would be expected from the available decay phase space, is not a
proof of the (V-A) theory. It can be shown that a theory with V or
A couplings (or any combination of them) would also imply a
suppression factor of the order $m_e^2/m_\mu^2$ (for a detailed discussion see Sect.
7.4 of reference [F6.2]).
Fig. 6.24 Schematic representation of the spin
alignment in the $\pi^-$ decay
As a last example, the neutrino and
antineutrino handedness is revealed in the observed ratio of cross
sections for neutrino and antineutrino in isoscalar nuclei (with an
equal number of protons and neutrons) $N$ at GeV energies:
$$\frac{\sigma(\bar\nu_\mu N)}{\sigma(\nu_\mu N)} \simeq \frac{1}{3} \tag{6.242}$$
Note that at these energies, the neutrinos and the antineutrinos
interact directly with the quarks and antiquarks the protons and
neutrons are made of (similarly to the electrons in the deep
inelastic scattering discussed in Sect. 5.5.3).
Let us now consider just valence
quarks in a first approximation. As electric charge and leptonic
number are conserved, a neutrino can just pick up a $d$ quark transforming it into a $u$ quark and emitting a $\mu^-$. Antineutrinos will do the opposite.
In these conditions, neglecting masses, all fermions have negative
helicity and all antifermions have positive helicity. The total
angular momentum is therefore 0 for neutrino interactions and 1 for
antineutrino interactions (Fig. 6.25). Thus, the former
interaction will be isotropic while the amplitude of the latter
will be weighted by a factor $\frac{1}{2}(1+\cos\theta)$. Then
$$\frac{d\sigma(\bar\nu_\mu u)}{d\Omega} = \frac{d\sigma(\nu_\mu d)}{d\Omega}\,\frac{(1+\cos\theta)^2}{4} \tag{6.243}$$
and integrating over the solid angle
$$\frac{\sigma(\bar\nu_\mu u)}{\sigma(\nu_\mu d)} = \frac{1}{3} \tag{6.244}$$
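The factor obtained from the angular integration can be checked numerically. The sketch assumes that the antineutrino-quark differential cross section equals the isotropic neutrino-quark one multiplied by $(1+\cos\theta)^2/4$, and averages this weight over the full solid angle with a midpoint rule; the function name is an assumption for the example.

```python
def solid_angle_average(n_steps: int = 100_000) -> float:
    """Average of (1 + cos(theta))^2 / 4 over the full solid angle.

    The azimuthal integral cancels in the ratio, so the average reduces to
    (1/2) * integral over cos(theta) from -1 to 1.
    """
    step = 2.0 / n_steps
    total = 0.0
    for i in range(n_steps):
        cos_theta = -1.0 + (i + 0.5) * step
        total += 0.25 * (1.0 + cos_theta) ** 2 * step
    return 0.5 * total


ratio = solid_angle_average()
print(f"sigma(antineutrino) / sigma(neutrino) = {ratio:.6f}")
```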
Fig. 6.25 Schematic representation of the spin
alignments in $\nu_\mu d$ (left) and in $\bar\nu_\mu u$ (right) interactions
6.3.5 Intermediate Vector Bosons
Four-fermion interaction theories (like the
Fermi model, see Sect. 6.3.1) violate unitarity at high energy and are
not renormalizable (not all infinities can be absorbed into running
physical constants, see Sect. 6.2.12). The path to
solve this problem was to construct, in analogy with QED, a gauge
theory of weak interactions leading to the introduction of
intermediate vector bosons with spin 1: the $W^\pm$ and the $Z$. However, in order to model the short
range of the weak interactions, such bosons could not have zero
mass, and thus would violate the gauge symmetry. The problem was
solved by the introduction of spontaneously broken symmetries,
which then led to the prediction of the existence of the so-called
Higgs boson.
In this section, the modification
introduced in the structure of the charged weak currents as well as
the discovery of the neutral currents and of the $W^\pm$ and the $Z$ bosons will be briefly reviewed. The
overall discussion on the electroweak unification and its
experimental tests will be the object of the next chapter.
6.3.5.1 Charged Weak Currents
The structure of the weak charged and
of the electromagnetic interactions became similar with the
introduction of the $W^\pm$ bosons, with the relevant difference
that weak-charged interactions couple left-handed fermions
(right-handed antifermions) belonging to SU(2) doublets, while
electromagnetic interactions couple fermions belonging to U(1)
singlets irrespective of chirality.
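The muon decay amplitude deduced in (V-A) theory
(Eq. 6.229) is now, introducing the massive $W$
propagator
(Fig. 6.26), written as:
$$\mathcal{M} = \left[\frac{g_W}{\sqrt{2}}\,\bar u_{\nu_\mu}\gamma^\rho\,\frac{1}{2}(1-\gamma^5)\,u_\mu\right]\frac{g_{\rho\sigma} - q_\rho q_\sigma/M_W^2}{M_W^2 - q^2}\left[\frac{g_W}{\sqrt{2}}\,\bar u_e\gamma^\sigma\,\frac{1}{2}(1-\gamma^5)\,u_{\nu_e}\right] \tag{6.245}$$
or
$$\mathcal{M} = \frac{g_W^2}{8}\left[\bar u_{\nu_\mu}\gamma^\rho(1-\gamma^5)\,u_\mu\right]\frac{g_{\rho\sigma} - q_\rho q_\sigma/M_W^2}{M_W^2 - q^2}\left[\bar u_e\gamma^\sigma(1-\gamma^5)\,u_{\nu_e}\right] \tag{6.246}$$
Introducing explicitly the left and right spinors:
$$\mathcal{M} = \frac{g_W^2}{2}\left[\bar u_{\nu_{\mu L}}\gamma^\rho\,u_{\mu_L}\right]\frac{g_{\rho\sigma} - q_\rho q_\sigma/M_W^2}{M_W^2 - q^2}\left[\bar u_{e_L}\gamma^\sigma\,u_{\nu_{eL}}\right] \tag{6.247}$$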
The derivation of the expression of the propagator for a massive spin
1 boson is based on the Proca equation (Sect. 6.2.1) and it is out of
the scope of the present text. But whenever the term $q_\rho q_\sigma/M_W^2$ can be neglected, a Yukawa-type
expression, $g_{\rho\sigma}/(M_W^2 - q^2)$, is recovered. In the low-energy
limit ($q^2 \ll M_W^2$), the two coupling parameters
(Eqs. 6.229 and 6.245) are thus related by:
$$G_F = \frac{\sqrt{2}}{8}\,\frac{g_W^2}{M_W^2} \tag{6.248}$$
$G_F$ is thus much smaller than
$g_W$, which is of the same order of
magnitude as the electromagnetic coupling $g$.
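The statement that the weak coupling is comparable to the electromagnetic one can be made quantitative. The sketch below assumes the low-energy matching $G_F = \frac{\sqrt{2}}{8}\, g_W^2/M_W^2$, inverts it for $g_W$, and compares the result with the electromagnetic coupling $e = \sqrt{4\pi\alpha}$. The numerical inputs ($M_W \simeq 80.4$ GeV, $\alpha \simeq 1/137$) and the function name are assumptions for the example.

```python
import math

G_F = 1.1664e-5      # GeV^-2, Fermi constant
M_W = 80.4           # GeV, W mass (assumed input)
ALPHA_EM = 1.0 / 137.036


def weak_coupling(g_f: float, m_w: float) -> float:
    """g_W from G_F = sqrt(2)/8 * g_W^2 / M_W^2 (low-energy matching)."""
    return math.sqrt(8.0 * g_f * m_w**2 / math.sqrt(2.0))


g_w = weak_coupling(G_F, M_W)
e_charge = math.sqrt(4.0 * math.pi * ALPHA_EM)
alpha_w = g_w**2 / (4.0 * math.pi)

print(f"g_W     = {g_w:.3f}")
print(f"e       = {e_charge:.3f}")
print(f"alpha_W = {alpha_w:.4f}  (alpha_em = {ALPHA_EM:.4f})")
```

The weak coupling turns out to be about twice the electromagnetic one: the apparent weakness of the interaction at low energy comes from the factor $1/M_W^2$, not from a small coupling.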
Fig. 6.26 First-order Feynman diagram for muon
decay
6.3.5.2 Neutral Weak Currents
Neutral weak currents were predicted
long before their discovery at CERN in 1973 (N. Kemmer 1937, O.
Klein 1938, S. A. Bludman 1958). Indeed the SU(2) structure of
charged interactions (leptons organized in weak isospin doublets)
suggested the existence of a triplet of weak bosons similarly to
the pion triplet responsible for the proton–neutron strong isospin
rotations.
However, if the charged components
were the $W^\pm$, the neutral boson could not be the
$\gamma$, which has no weak charge.
Furthermore, in the 1960s it was discovered that
strangeness-changing neutral currents (for instance $K^0 \to \mu^+\mu^-$) were highly suppressed and thus some
thought that neutral weak interactions might not exist. Many
theorists however became enthusiastic about neutral currents around
the 1970s since they were embedded in the work by Glashow, Salam,
and Weinberg on electroweak unification (the GSW model, see
Sect. 7.2). From the experimental point of
view, it was clearly a very difficult issue and the previous
experimental searches on neutral weak processes had led just to upper
limits.
Neutrino beams were the key to such
searches. In fact, as neutrinos do not have electromagnetic and
strong charges, their only possible interaction is the weak one.
Neutrino beams are produced in the laboratory
(Fig. 6.27, left) by the decay of secondary pions and
kaons coming from a primary high-energy proton interaction on a
fixed target. The charge and the momentum range of the pions and
kaons can be selected using a sophisticated focusing magnetic
optics system (narrow-band beam) or just loosely selected
maximizing the beam intensity (wide-band beam). The energy spectra
of such beams are quite different (Fig. 6.27, right). While the
narrow-band beam has an almost flat energy spectrum, the wide band
is normally peaked at low energies.
Fig. 6.27 Left: Neutrino narrow-band beam (top) and
wide-band beam (bottom) production. Right: Narrow-band (lower curve)
and wide-band (upper curve) neutrino energy spectra. The y-axis represents the number of particles
per bunch
In the 1960s, a large heavy liquid
bubble chamber (18 tons of freon under a pressure of 10–15
atmospheres, in a magnetic field of 2 T) called Gargamelle was proposed by André Lagarrigue from
the École Polytechnique in Paris. The chamber was built in Saclay
and installed at CERN. Gargamelle could collect a significant
number (one order of magnitude above the previous experiments) of
neutrino interactions (Fig. 6.28). Its first physics priority was, in the
beginning of the 1970s, the test of the structure of protons and
neutrons just revealed in the deep inelastic scattering experiment
at SLAC (Sect. 5.5.3).
Fig.
6.28
Technicians at work in the Gargamelle bubble
chamber at CERN. Source: CERN
In a batch of about 700 000
photos of neutrino interactions, one event emerged as anomalous. In
that photo (Fig. 6.29, left), taken with an antineutrino beam,
just an electron was visible (giving rise to a small
electromagnetic cascade). This event is a perfect candidate for a
$\bar\nu_\mu e^- \to \bar\nu_\mu e^-$ interaction (Fig. 6.29, right). The
background in the antineutrino beam was estimated to be negligible.
Fig. 6.29 Left: Gargamelle image (top) and sketch
(bottom) of the first observed neutral-current process $\bar\nu_\mu e^- \to \bar\nu_\mu e^-$. A muon antineutrino coming from the
left knocks an electron forward, creating a small shower of
electron–positron pairs. Source: CERN. Right: First-order Feynman
diagram for the neutral leptonic weak interactions
Neutral-current interactions should
be even more visible in the semileptonic channel. Their signature
should be clear: in charged semileptonic weak interactions, an
isolated muon and several hadrons could be produced in the final
state, while in the interactions mediated by the neutral current
there could be no muon (Fig. 6.30).
Fig.
6.30
First-order Feynman diagrams for the charged
(left) and neutral (right) semileptonic weak interactions
However, the background resulting from
neutron interactions in the chamber, the neutrons being produced in
neutrino interactions upstream of the detector, was not negligible.
Careful background estimation had to be performed. The final
result, after several months of work and public discussions, was
that the number of events without a muon was clearly above the
expected number of background events. The existence of the weak
neutral currents was finally firmly established.
6.3.5.3 The Discovery of the W and Z Bosons
Neutral currents did exist, and the GSW
model proposed a complete and unified framework for electroweak
interactions: the intermediate vector bosons should be there (with
expected masses around 65 and 80 GeV for the $W^\pm$ and the $Z$, respectively, based on the data known
at that time). They had to be found.
In 1976, Carlo Rubbia pushed the idea to convert the existing
Super Proton Synchrotron accelerator at CERN (or the equivalent
machine at Fermilab) into a proton/antiproton collider. It was not
necessary to build a new accelerator (protons and antiprotons would
travel in opposite directions within the same vacuum tube) but
antiprotons had to be produced and kept alive during many hours to
be accumulated in an auxiliary storage ring. Another big challenge
was to keep the beam focused. Simon van der Meer made this possible by developing an ingenious
strategy of beam cooling, to decrease the angular dispersion while
maintaining monochromaticity. At the beginning of the 1980s, the CERN
SPS collider operating at a center-of-mass energy of 540 GeV was
able to produce the first $W^\pm$ and $Z$ (Fig. 6.31) by quark/antiquark
annihilation ($u\bar d \to W^+$; $d\bar u \to W^-$; $u\bar u \to Z$; $d\bar d \to Z$).
Fig. 6.31 $W^\pm$ and $Z$ production in proton/antiproton
colliders
The leptonic decay channels with
electrons and muons in the final state were the most obvious
signatures to detect the long-awaited bosons. The hadronic decay
channels as well as final states with tau leptons suffer from a
huge hadronic background due to the “normal” quark and gluon strong
interactions. Priority was then given to searches in the
channels:
$$p\bar p \to Z\,X \to \ell^+\ell^-\,X \tag{6.249}$$
and
$$p\bar p \to W^\pm X \to \ell^\pm\,\nu_\ell\,X \tag{6.250}$$
Two general-purpose experiments, UA1 and
UA2, were built having the usual “onion”
structure (a tracking detector surrounded by electromagnetic and
hadronic calorimeters, surrounded by an exterior layer of muon
detectors). In the case of UA1, the central detector (tracking and
electromagnetic calorimeter) was immersed in a 0.7 T magnetic
field, perpendicular to the beam line, produced by a magnetic coil
(Fig. 6.32); the iron return yoke of the field was
instrumented to operate as a hadronic calorimeter. UA1 was designed
to be as hermetic as possible.
The first $W^\pm$ and $Z$ events were recorded in 1983.
$Z \to e^+e^-$ events were characterized by two
isolated high-energy deposits in the cells of the electromagnetic
calorimeter (Fig. 6.33 left) while $W \to e\nu$ events were characterized by an
isolated high-energy deposit in the cells of the electromagnetic
calorimeter and a significant missing transverse energy
(Fig. 6.33 right).
Fig. 6.33 Left: Two high-energy deposits from a
$Z \to e^+e^-$ event seen in the electromagnetic
calorimeter of the UA2 experiment. Right: A high-energy deposit
with accompanying missing transverse momentum from a $W \to e\nu$ event.
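The $Z$ mass in this type of events can be
reconstructed just computing the invariant mass of the final state
electron and positron:
$$m_Z^2 \simeq 2\,E_{e^+}E_{e^-}(1-\cos\alpha) = 4\,E_{e^+}E_{e^-}\sin^2\frac{\alpha}{2} \tag{6.251}$$
where $E_{e^\pm}$ are the lepton energies (the electron mass being neglected) and $\alpha$ is the angle between the electron and
positron.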
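The invariant-mass reconstruction amounts to a one-line formula once the lepton masses are neglected. The sketch below assumes $m^2 = 2E_1E_2(1-\cos\alpha)$ for two massless leptons with energies $E_1$, $E_2$ and opening angle $\alpha$; the function name and the sample energies are assumptions for the example.

```python
import math


def dilepton_invariant_mass(energy_1: float, energy_2: float, opening_angle: float) -> float:
    """Invariant mass (same units as the energies) of two massless leptons.

    opening_angle is the angle between the two lepton momenta, in radians.
    """
    mass_squared = 2.0 * energy_1 * energy_2 * (1.0 - math.cos(opening_angle))
    return math.sqrt(mass_squared)


# A Z decaying at rest gives two back-to-back leptons of equal energy
mass_at_rest = dilepton_invariant_mass(45.6, 45.6, math.pi)

# A boosted configuration: unequal energies and a smaller opening angle
mass_boosted = dilepton_invariant_mass(70.0, 35.0, math.radians(134.0))

print(f"back-to-back: {mass_at_rest:.1f} GeV")
print(f"boosted:      {mass_boosted:.1f} GeV")
```

Because the invariant mass is Lorentz invariant, very different combinations of energies and opening angles reconstruct to the same resonance mass.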
The distribution of the measured invariant mass
for the first $Z \to e^+e^-$ and $Z \to \mu^+\mu^-$ candidate events by UA1 and UA2 is
represented in Fig. 6.34. The best-fit value presented by Carlo
Rubbia in his Nobel lecture (1984) was of GeV—the present value, after LEP, is
GeV.
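The reconstruction of the
$W^\pm$ mass is more subtle: the missing
energy does not allow a full kinematical constraint. The best way
is to take it from the shape of the differential cross section as a function of the electron
transverse momentum (the so-called Jacobian peak method). In fact,
neglecting the electron and neutrino masses, the transverse
momentum of the electron is given by
$$p_T \simeq \frac{m_W}{2}\sin\theta^* \tag{6.252}$$
where $\theta^*$ is the production angle in the
center-of-mass reference frame. Then
$$\frac{d\sigma}{dp_T} = \frac{d\sigma}{d\cos\theta^*}\left|\frac{d\cos\theta^*}{dp_T}\right| = \frac{d\sigma}{d\cos\theta^*}\,\frac{2p_T}{m_W}\,\frac{1}{\sqrt{(m_W/2)^2 - p_T^2}} ,$$
which peaks at $p_T = m_W/2$.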
The measured value for by UA1 and UA2 was, respectively,
GeV and ) GeV—the present world average is
GeV.
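The origin of the Jacobian peak can be illustrated with a toy simulation. The sketch assumes a $W$ at rest decaying into massless leptons, so that $p_T = (m_W/2)\sin\theta^*$, and, as a deliberate simplification, an isotropic decay angle; it ignores the $W$ width, its transverse motion, and detector resolution. The mass value and function names are assumptions for the example.

```python
import math
import random

M_W = 80.4  # GeV, illustrative W mass (assumed input)


def simulate_lepton_pt(n_events: int, seed: int = 1) -> list:
    """Toy lepton transverse momenta for W decays at rest, isotropic in cos(theta*)."""
    rng = random.Random(seed)
    transverse_momenta = []
    for _ in range(n_events):
        cos_theta = rng.uniform(-1.0, 1.0)
        sin_theta = math.sqrt(1.0 - cos_theta**2)
        transverse_momenta.append(0.5 * M_W * sin_theta)
    return transverse_momenta


pts = simulate_lepton_pt(100_000)
edge = 0.5 * M_W

# Fraction of events in the last 10% of the allowed pT range
fraction_near_edge = sum(pt > 0.9 * edge for pt in pts) / len(pts)
print(f"kinematic edge at pT = {edge:.1f} GeV")
print(f"fraction of events with pT > 0.9 * edge: {fraction_near_edge:.3f}")
```

Although the decay angle is flat in $\cos\theta^*$, a large fraction of the events piles up just below $p_T = m_W/2$; the position of this edge is what carries the information on the mass.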
Fig. 6.35 Differential cross section as a function of
transverse momentum. The gray (black) line refers to a measurement
with an ideal (real) detector
Finally the V-A character of the
charged weak interactions, as well as the fact that the $W$ has spin 1, is revealed by the
differential cross section as a function of $\cos\theta^*$ for the electron produced in the
$W$ leptonic decay, which
displays a $(1+\cos\theta^*)^2$ dependence (Fig. 6.36).
In fact, at CERN collider energies,
neglecting the masses of the quarks and leptons and considering
that $W^\pm$ are mainly produced by the
interaction of valence quarks (from the proton) and valence
antiquarks (from the antiproton), the
spin of the $W^\pm$ is aligned along the antiproton beam
direction and thus the electron (positron) is emitted
preferentially in the proton (antiproton) beam direction
(Fig. 6.37).
Fig. 6.37 Helicity in the $W^\pm$ production and decay
6.3.6 The Cabibbo Angle and the GIM Mechanism
The universality of weak interactions
established at the end of the 1940s (see Sect. 6.3.1) was questioned
when it was discovered that some strange particle decays (as for
instance $K^- \to \mu^-\bar\nu_\mu$ or $\Lambda \to p\,e^-\bar\nu_e$) were suppressed by a factor around
20 in relation to what was expected.
The problem was solved in 1963 by
Nicola Cabibbo,7 who suggested
that the quark weak and strong eigenstates may not be the same. At
that time only the $u$,
$d$, and $s$ quarks were known
(Sect. 5.7.2) and Cabibbo conjectured that
the two quarks with electromagnetic charge $-1/3$ ($d$ and $s$) mixed into a weak eigenstate
$d'$ such as:
$$d' = d\cos\theta_c + s\sin\theta_c \tag{6.257}$$
where $\theta_c$ is a mixing angle, designated as the
Cabibbo angle.
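Then the couplings involved in
leptonic decays, in semileptonic decays not involving strange quarks, and in semileptonic decays involving strange quarks are, respectively, $g_W$, $g_W\cos\theta_c$, and $g_W\sin\theta_c$ (Fig. 6.38). The value of the
Cabibbo angle is not predicted in the theory of electroweak
interactions. Its present (PDG 2016) experimental value is
$\sin\theta_c \simeq 0.225$, which corresponds to an angle of
about $13^\circ$.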
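The link between the Cabibbo angle and the observed suppression of strange-particle decays can be sketched numerically. The example assumes $\sin\theta_c \simeq 0.225$ and that the rates of strangeness-changing and strangeness-conserving transitions scale as $\sin^2\theta_c$ and $\cos^2\theta_c$, respectively, ignoring phase-space and form-factor differences; names and inputs are assumptions for the illustration.

```python
import math

SIN_THETA_C = 0.225  # sine of the Cabibbo angle (assumed input)

theta_c_deg = math.degrees(math.asin(SIN_THETA_C))
cos_theta_c = math.sqrt(1.0 - SIN_THETA_C**2)

# Rates scale with the square of the coupling at the quark vertex
strange_rate_factor = SIN_THETA_C**2       # s -> u transitions
non_strange_rate_factor = cos_theta_c**2   # d -> u transitions
suppression = non_strange_rate_factor / strange_rate_factor

print(f"Cabibbo angle = {theta_c_deg:.1f} degrees")
print(f"strangeness-changing decays suppressed by a factor ~ {suppression:.0f}")
```

The resulting factor, close to 20, is the suppression quoted at the beginning of this section; the few-percent reduction of the non-strange coupling ($\cos^2\theta_c \simeq 0.95$) is of the size of the small discrepancy between the Fermi constant measured in $\beta$ decays and in muon decay mentioned in Sect. 6.3.3.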
Fig. 6.38 Weak decay couplings: Leptonic (top),
semileptonic involving (bottom), and not involving (middle) strange
quarks
Fig. 6.39 Possible $s$ and $d$ quark transitions generated by
$Z$ (top) and $W^\pm$ (bottom) couplings (three
families)
In the Cabibbo model transitions
between the $s$ and $d$ quarks would happen both via neutral
currents (through the $Z$) and
charged currents (through double $W^\pm$ exchange) as shown in
Fig. 6.39. Decays like $K^0 \to \mu^+\mu^-$ would then be allowed
(Fig. 6.40), both at leading order and at one loop.
However, the experimental branching ratio of the process $K^0_L \to \mu^+\mu^-$ is of the order of
$7\times 10^{-9}$: flavor-changing neutral currents
(FCNC) appear to be strongly suppressed, even below what is
predicted taking into account only the diagram involving double
$W$ exchange.
Fig. 6.40 $K^0 \to \mu^+\mu^-$ decay diagrams
Fig. 6.41 FCNC suppression by diagram cancellation
Fig. 6.42 The two orthogonal combinations of the quarks
$s$ and $d$ in the $d'$ and $s'$ states
Glashow, Iliopoulos, and Maiani
proposed in 1970 the introduction of a fourth quark, the charm $c$, to
symmetrize the weak currents, organizing the quarks into two SU(2)
doublets. Such a scheme, known as the GIM mechanism, solves the
FCNC puzzle and was spectacularly
confirmed with the discovery of the $J/\psi$ meson (see
Sect. 5.4.4). FCNC are in this mechanism
suppressed by the cancellation of the two lowest diagrams in
Fig. 6.41. In fact, in the limit of equal masses the
cancellation would be perfect but, as the $c$ mass is much higher than the $u$ mass, the sum of the diagrams will lead
to terms proportional to $m_c^2 - m_u^2$.
There are now two orthogonal
combinations of the quarks $s$
and $d$ (Fig. 6.42):
$$d' = d\cos\theta_c + s\sin\theta_c , \qquad s' = -d\sin\theta_c + s\cos\theta_c$$
which couple, via the $W^\pm$, respectively to the $u$ and $c$ quarks.
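The GIM mechanism can be translated
into matrix form as
$$\begin{pmatrix} d' \\ s' \end{pmatrix} = V_C \begin{pmatrix} d \\ s \end{pmatrix} = \begin{pmatrix} \cos\theta_c & \sin\theta_c \\ -\sin\theta_c & \cos\theta_c \end{pmatrix}\begin{pmatrix} d \\ s \end{pmatrix} \tag{6.258}$$
where $V_C$ is a $2 \times 2$ rotation matrix.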
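The way the second doublet removes flavor-changing neutral currents at tree level can be seen with a few lines of linear algebra. In the sketch below the neutral current is represented, as a simplifying assumption, by the matrix of coefficients of the bilinears in the $(d, s)$ basis: with the Cabibbo combination $d'$ alone this matrix has an off-diagonal $d$–$s$ entry, while summing over $d'$ and $s'$ gives the identity because the mixing matrix is orthogonal. Variable names are assumptions for the example.

```python
import numpy as np

SIN_THETA_C = 0.225                      # assumed input
COS_THETA_C = np.sqrt(1.0 - SIN_THETA_C**2)

# Rows are the weak eigenstates d' and s' expressed in the (d, s) basis
V_C = np.array([[COS_THETA_C, SIN_THETA_C],
                [-SIN_THETA_C, COS_THETA_C]])

d_prime = V_C[0]
s_prime = V_C[1]

# Coefficients of the neutral-current bilinears in the (d, s) basis
cabibbo_only = np.outer(d_prime, d_prime)                 # d' alone (three quarks)
with_charm = np.outer(d_prime, d_prime) + np.outer(s_prime, s_prime)  # GIM

print("d' only:\n", cabibbo_only)
print("d' and s':\n", with_charm)
```

The off-diagonal term $\sin\theta_c\cos\theta_c$ present with $d'$ alone is exactly cancelled by the contribution of $s'$.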
6.3.7 Extension to Three Quark Families: The CKM Matrix
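A generic mixing matrix for three
families can be written as
$$V_{\mathrm{CKM}} = \begin{pmatrix} V_{ud} & V_{us} & V_{ub} \\ V_{cd} & V_{cs} & V_{cb} \\ V_{td} & V_{ts} & V_{tb} \end{pmatrix} \tag{6.259}$$
meaning that, for example, the square of the coupling of the
$b$ quark to the $u$ quark in the weak transition (which is
in turn proportional to the probability of the transition) would
be:
$$g_W^2\,|V_{ub}|^2 \tag{6.260}$$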
The Japanese physicists Makoto Kobayashi
and Toshihide Maskawa proposed this form
of quark mixing matrix in 1973. Their work was built on that of
Cabibbo and extended the concept of quark mixing from two to three
generations of quarks. It should be noted that, at that time, the
third generation had not been observed yet and even the second was
not fully established. But, as we shall see, the extension to three
families makes it possible to explain qualitatively the violation of the
CP symmetry, i.e., of the
product of the operations of charge conjugation and parity. In
2008, Kobayashi and Maskawa shared one half of the Nobel Prize in
Physics “for the discovery of the origin of the broken symmetry
which predicts the existence of at least three families of quarks
in nature.”
A priori, the $V_{ij}$ being complex numbers, the CKM matrix might
have $2 \times 9 = 18$ degrees of freedom; however, the
physical constraints reduce the number of free parameters to 4. The physical constraints are:
Unitarity. If there are only three
quark families, one must have
$$V^\dagger V = I \tag{6.261}$$
where $I$ is the identity matrix.
This will guarantee that in an effective transition each $u$-type quark will transform into one of
the three $d$-type quarks (i.e.,
that the current is conserved and no fourth generation is present).
This constraint reduces the number of degrees of freedom to
$18 - 9 = 9$; the six equations underneath can be
written explicitly as (the so-called weak universality):
$$\sum_{k} |V_{ik}|^2 = 1 \quad (i = 1, 2, 3) \tag{6.262}$$
and
$$\sum_{k} V_{ik} V_{jk}^{*} = 0 \quad (i \neq j) \tag{6.263}$$
This last equation is a constraint on three sets of three complex
numbers, telling that these numbers form the sides of a triangle in
the complex plane. There are three independent choices of
$i$ and $j$, and hence three independent triangles;
they are called unitarity triangles, and
we shall discuss them later in more detail.
Phase invariance. $2 \times 3 - 1 = 5$ of these parameters leave physics
invariant, since one phase can be absorbed into each quark field,
and an overall common phase is unobservable. Hence, the total
number of free variables is $9 - 5 = 4$.
Four independent parameters are thus
required to fully define the CKM matrix. This implies that the most general
$3 \times 3$ unitary matrix cannot be constructed
using real numbers only: Eq. 6.261 implies that a
real matrix has only three degrees of freedom, and thus at least
one imaginary parameter is required.
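The counting above generalizes to any number of families, and a short function makes the bookkeeping explicit. The sketch assumes the standard argument: an $N \times N$ complex matrix has $2N^2$ real parameters, unitarity removes $N^2$ of them, and $2N-1$ relative quark phases are unobservable; of what remains, $N(N-1)/2$ are rotation angles and the rest are physical phases. The function name is an assumption for the example.

```python
def mixing_matrix_parameters(n_families: int) -> tuple:
    """Return (rotation angles, physical phases) of an n x n quark mixing matrix."""
    real_parameters = 2 * n_families**2           # complex n x n matrix
    after_unitarity = real_parameters - n_families**2
    removable_phases = 2 * n_families - 1         # one per quark field, minus a common one
    free_parameters = after_unitarity - removable_phases
    angles = n_families * (n_families - 1) // 2   # parameters of a real rotation
    phases = free_parameters - angles
    return angles, phases


for n in (2, 3, 4):
    angles, phases = mixing_matrix_parameters(n)
    print(f"{n} families: {angles} angle(s), {phases} phase(s)")
```

With two families there is a single angle (the Cabibbo angle) and no phase; three families is the minimum for which a physical complex phase survives.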
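Many parameterizations have been
proposed in the literature. An exact parametrization derived from
the original work by Kobayashi and Maskawa (KM) extends the concept of Cabibbo angle;
it uses three angles $\theta_{12}$, $\theta_{13}$, $\theta_{23}$, and a phase $\delta$:
$$V_{\mathrm{CKM}} = \begin{pmatrix} c_{12}c_{13} & s_{12}c_{13} & s_{13}e^{-i\delta} \\ -s_{12}c_{23} - c_{12}s_{23}s_{13}e^{i\delta} & c_{12}c_{23} - s_{12}s_{23}s_{13}e^{i\delta} & s_{23}c_{13} \\ s_{12}s_{23} - c_{12}c_{23}s_{13}e^{i\delta} & -c_{12}s_{23} - s_{12}c_{23}s_{13}e^{i\delta} & c_{23}c_{13} \end{pmatrix} \tag{6.264}$$
with the standard notations $s_{ij} = \sin\theta_{ij}$ and $c_{ij} = \cos\theta_{ij}$ ($\theta_{12}$ is the Cabibbo angle).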
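Another frequently used
parametrization of the CKM matrix is the so-called Wolfenstein
parametrization. It refers to four free
parameters $\lambda$, $A$, $\rho$, and $\eta$, defined as
$$\lambda = s_{12} \tag{6.265}$$
$$A\lambda^2 = s_{23} \tag{6.266}$$
$$A\lambda^3(\rho - i\eta) = s_{13}\,e^{-i\delta} \tag{6.267}$$
($\lambda$ is the sine of the Cabibbo angle). We
can use the experimental fact that $s_{13} \ll s_{23} \ll s_{12} \ll 1$ and expand the matrix in powers of
$\lambda$. We obtain at order $\lambda^3$:
$$V_{\mathrm{CKM}} = \begin{pmatrix} 1 - \lambda^2/2 & \lambda & A\lambda^3(\rho - i\eta) \\ -\lambda & 1 - \lambda^2/2 & A\lambda^2 \\ A\lambda^3(1 - \rho - i\eta) & -A\lambda^2 & 1 \end{pmatrix} + \mathcal{O}(\lambda^4) \tag{6.268}$$
As we shall see in the following, the combinations of parameters
$\bar\rho = \rho\,(1 - \lambda^2/2)$ and $\bar\eta = \eta\,(1 - \lambda^2/2)$ can be very useful.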
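The relation between the exact three-angle form and the Wolfenstein expansion can be checked numerically. The sketch below assumes the definitions $s_{12} = \lambda$, $s_{23} = A\lambda^2$, $s_{13}e^{-i\delta} = A\lambda^3(\rho - i\eta)$, builds the exact matrix in the standard three-angle form, and compares it with the expansion truncated at order $\lambda^3$. The parameter values are illustrative assumptions, not the experimental determination, and the function names are hypothetical.

```python
import numpy as np

# Illustrative values only (assumed for the example)
LAM, A, RHO, ETA = 0.225, 0.81, 0.13, 0.36


def exact_ckm(lam: float, a: float, rho: float, eta: float) -> np.ndarray:
    """Exact unitary matrix in the standard three-angle, one-phase form."""
    s12, s23 = lam, a * lam**2
    s13 = a * lam**3 * np.hypot(rho, eta)
    delta = np.arctan2(eta, rho)
    c12, c23, c13 = (np.sqrt(1.0 - s**2) for s in (s12, s23, s13))
    e_plus, e_minus = np.exp(1j * delta), np.exp(-1j * delta)
    return np.array([
        [c12 * c13, s12 * c13, s13 * e_minus],
        [-s12 * c23 - c12 * s23 * s13 * e_plus, c12 * c23 - s12 * s23 * s13 * e_plus, s23 * c13],
        [s12 * s23 - c12 * c23 * s13 * e_plus, -c12 * s23 - s12 * c23 * s13 * e_plus, c23 * c13],
    ])


def wolfenstein_ckm(lam: float, a: float, rho: float, eta: float) -> np.ndarray:
    """Wolfenstein expansion truncated at order lambda^3."""
    return np.array([
        [1 - lam**2 / 2, lam, a * lam**3 * (rho - 1j * eta)],
        [-lam, 1 - lam**2 / 2, a * lam**2],
        [a * lam**3 * (1 - rho - 1j * eta), -a * lam**2, 1.0],
    ])


v_exact = exact_ckm(LAM, A, RHO, ETA)
v_approx = wolfenstein_ckm(LAM, A, RHO, ETA)

unitarity_violation = np.max(np.abs(v_exact.conj().T @ v_exact - np.eye(3)))
max_difference = np.max(np.abs(v_exact - v_approx))

print(f"unitarity violation of the exact form: {unitarity_violation:.1e}")
print(f"largest |exact - Wolfenstein| element: {max_difference:.1e}")
print(f"lambda^4 = {LAM**4:.1e}")
```

The exact form is unitary to machine precision, while the truncated expansion differs from it by terms of the size of $\lambda^4$, as expected.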
The experimental knowledge of the
terms of the CKM matrix comes essentially from the comparative study
of the probabilities of transitions between quarks. It is anyway
challenging and difficult, since quarks are embedded in hadrons,
and form factors for which only numerical QCD calculations are
possible play a relevant role. In any case, the present (PDG 2017)
experimental knowledge of the CKM matrix can be summarized in terms
of the Wolfenstein parameters as:
6.3.8 CP Violation
Weak interactions violate the parity
and the charge conjugation symmetries. But, for a while, it was
thought that the combined action of charge and parity
transformation (CP) would
restore the harmony physicists like so much. Indeed a left-handed
neutrino transforms under CP
into a right-handed antineutrino and the conjugate CP world still obeys the V-A theory.
However, surprisingly, the study of the $K^0$–$\bar K^0$ system revealed in 1964 a small
violation of the CP symmetry.
At the turn of the century, CP
violation was observed in many channels in the B sector. Since
then, intense theoretical and experimental work has been
carried out for the interpretation of these effects in the framework
of the standard model, in particular by the precise determination
of the parameters of the CKM matrix and by testing its
self-consistency.
6.3.8.1 $K^0$–$\bar K^0$ Mixing
Already in
1955, Gell-Mann and Pais had observed that the $K^0$ and the $\bar K^0$, which are eigenstates of the strong
interaction, could mix through weak box diagrams such as those
represented in Fig. 6.43.
Fig. 6.43 Leading box diagrams for the $K^0$–$\bar K^0$ mixing
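A pure $K^0$ ($\bar K^0$) beam will thus develop a
$\bar K^0$ ($K^0$) component, and at each time, a
linear combination of the $K^0$ and of the $\bar K^0$ may be observed. Since CP is conserved in hadronic decays, the
combinations which are eigenstates of CP are of particular relevance.
$K^0$ and $\bar K^0$ are not CP eigenstates: in fact, they are the
antiparticle of each other and the action of the CP operator may be, choosing an
appropriate phase convention, written as:
$$CP\,|K^0\rangle = +|\bar K^0\rangle \tag{6.269}$$
$$CP\,|\bar K^0\rangle = +|K^0\rangle \tag{6.270}$$
Then the linear combinations
$$|K_1\rangle = \frac{1}{\sqrt{2}}\left(|K^0\rangle + |\bar K^0\rangle\right) \tag{6.271}$$
$$|K_2\rangle = \frac{1}{\sqrt{2}}\left(|K^0\rangle - |\bar K^0\rangle\right) \tag{6.272}$$
are CP eigenstates with
eigenvalues $+1$ and $-1$, respectively.
The $K_1$ can thus decay into a two-pion system
(which has CP eigenvalue $+1$), while the $K_2$ can, if CP is conserved, only decay into a
three-pion system (which has CP eigenvalue $-1$).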
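The phase spaces associated with
these decay modes are, however, quite different: $m_K - 2m_\pi \simeq 220$ MeV; $m_K - 3m_\pi \simeq 80$ MeV. Thus, the corresponding
lifetimes are also quite different:
$$\tau(K_1 \to \pi\pi) \simeq 0.9\times10^{-10}~\mathrm{s} ; \qquad \tau(K_2 \to \pi\pi\pi) \simeq 5\times10^{-8}~\mathrm{s}$$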
The short and the long lifetime states are usually designated by
K-short ($K_S$) and K-long ($K_L$), respectively. These states are
eigenstates of the free-particle Hamiltonian, which includes weak
mixing terms, and if CP were a
perfect symmetry, they would coincide with $K_1$ and $K_2$, respectively. The $K_S$ and $K_L$ wavefunctions evolve with time,
respectively, as
$$|K_S(t)\rangle = |K_S(0)\rangle\, e^{-(i m_S + \Gamma_S/2)\,t} \tag{6.273}$$
$$|K_L(t)\rangle = |K_L(0)\rangle\, e^{-(i m_L + \Gamma_L/2)\,t} \tag{6.274}$$
where $m_S$ ($m_L$) and $\Gamma_S$ ($\Gamma_L$) are, respectively, the mass and the
width of the $K_S$ ($K_L$) mesons (see Sect. 2.6).
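$K^0$ and $\bar{K}^0$, being a combination of $K_S$ and $K_L$,
$$|K^0\rangle = \frac{1}{\sqrt{2}}\left(|K_S\rangle + |K_L\rangle\right) \tag{6.275}$$
$$|\bar{K}^0\rangle = \frac{1}{\sqrt{2}}\left(|K_S\rangle - |K_L\rangle\right) \tag{6.276}$$
will also evolve in time. Indeed, considering initially a beam of
pure $K^0$ with an energy of a few GeV, just
after a few tens of cm the large majority of the $K_S$ mesons will have decayed and the beam will
become a pure $K_L$ beam. The probability to find a
$K^0$ in this beam after a time t can be expressed as:
$$P_{K^0 \to K^0}(t) = \frac{1}{4}\left[e^{-\Gamma_S t} + e^{-\Gamma_L t} + 2\,e^{-(\Gamma_S + \Gamma_L)t/2}\cos(\Delta m\, t)\right] \tag{6.277}$$
where $\Gamma_S = 1/\tau_S$, $\Gamma_L = 1/\tau_L$, and $\Delta m$ is the difference between the masses
of the two eigenstates. The last term, coming from the interference
of the two amplitudes, provides a direct measurement of
$\Delta m$.
Similarly, the probability to find a
$\bar{K}^0$ in this beam after a time t can be expressed as:
$$P_{K^0 \to \bar{K}^0}(t) = \frac{1}{4}\left[e^{-\Gamma_S t} + e^{-\Gamma_L t} - 2\,e^{-(\Gamma_S + \Gamma_L)t/2}\cos(\Delta m\, t)\right] \tag{6.278}$$
In the limit $\Gamma_S \to 0$, $\Gamma_L \to 0$, a pure flavor oscillation between
$K^0$ and $\bar{K}^0$ would occur:
$$P_{K^0 \to K^0}(t) = \cos^2\left(\frac{\Delta m\, t}{2}\right); \qquad P_{K^0 \to \bar{K}^0}(t) = \sin^2\left(\frac{\Delta m\, t}{2}\right).$$
In the real case, however, the oscillation is damped and the
survival probability of both $K^0$ and $\bar{K}^0$ converges quickly to $\frac{1}{4}e^{-\Gamma_L t}$.
By measuring the initial oscillation
through the study of semileptonic decays, which will be discussed
later on in this section, $\Delta m$ was determined to be
$$\Delta m \simeq 3.5 \times 10^{-15}\ \mathrm{GeV}.$$
$K_S$ and $K_L$ have quite different lifetimes but
almost the same mass.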
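The damped oscillation described above can be explored numerically. The following is a minimal sketch, assuming the standard two-state expressions $P = \frac{1}{4}[e^{-\Gamma_S t} + e^{-\Gamma_L t} \pm 2 e^{-(\Gamma_S+\Gamma_L)t/2}\cos(\Delta m\,t)]$ with CP violation neglected; the numerical inputs (rounded lifetimes and $\Delta m\,\tau_S \simeq 0.47$) and the function name are illustrative assumptions, not values taken from this text.

```python
import math

# Illustrative inputs (rounded values, assumed for this sketch):
TAU_S = 0.0895     # K_S lifetime in ns
TAU_L = 51.2       # K_L lifetime in ns
DM_TAU_S = 0.474   # Delta m * tau_S in natural units (dimensionless)


def kaon_probabilities(t_over_tau_s):
    """Return (P(K0), P(K0bar)) at proper time t (in units of tau_S)
    for a beam that is pure K0 at t = 0. CP violation is neglected."""
    gamma_s = 1.0              # Gamma_S in units of 1/tau_S
    gamma_l = TAU_S / TAU_L    # Gamma_L in the same units
    t = t_over_tau_s
    decay = math.exp(-gamma_s * t) + math.exp(-gamma_l * t)
    interference = (2.0 * math.exp(-0.5 * (gamma_s + gamma_l) * t)
                    * math.cos(DM_TAU_S * t))
    return 0.25 * (decay + interference), 0.25 * (decay - interference)


if __name__ == "__main__":
    for t in (0, 1, 2, 5, 10, 20):
        p, pbar = kaon_probabilities(t)
        print(f"t = {t:4.1f} tau_S   P(K0) = {p:.4f}   P(K0bar) = {pbar:.4f}")
```

After a few $K_S$ lifetimes the interference term has died out and both probabilities approach $\frac{1}{4}e^{-\Gamma_L t}$, which is the behavior stated in the text.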
6.3.8.2 CP Violation in $2\pi$ Modes
In 1964, Christenson, Cronin, Fitch, and
Turlay8 performed the historic experiment
(Fig. 6.44) that revealed for the first time the
existence of a small fraction of two-pion decays in a $K_L$ beam:
$$R = \frac{\Gamma(K_L \to \pi^+\pi^-)}{\Gamma(K_L \to \text{all charged modes})} = (2.0 \pm 0.4) \times 10^{-3}.$$
The $K$ beam was produced in a primary target
placed upstream of the experiment, and the
observed decays occurred in a volume of He gas to minimize
interactions. Two spectrometers, each composed of two spark chambers
separated by a magnet and terminated by a scintillator and a water
Cherenkov counter, measured and identified the charged decay products.
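The presence of two-pion decay modes
implied that the long-lived $K_L$ was not a pure eigenstate of
CP. The $K_S$ and $K_L$ should then have a small component of
$K_2$ and $K_1$, respectively:
$$|K_S\rangle = \frac{1}{\sqrt{1 + |\varepsilon|^2}}\left(|K_1\rangle + \varepsilon\,|K_2\rangle\right) \tag{6.279}$$
$$|K_L\rangle = \frac{1}{\sqrt{1 + |\varepsilon|^2}}\left(|K_2\rangle + \varepsilon\,|K_1\rangle\right) \tag{6.280}$$
where $\varepsilon$ is a small complex parameter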
(6.281)
and $\Delta m$ and $\Delta\Gamma$ are, respectively, the differences
between the masses and the decay widths of the two
eigenstates.
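Alternatively, $K_S$ and $K_L$ can be expressed as a function of the
flavor eigenstates $K^0$ and $\bar{K}^0$ as
$$|K_S\rangle = \frac{1}{\sqrt{2\left(1 + |\varepsilon|^2\right)}}\left[(1 + \varepsilon)\,|K^0\rangle + (1 - \varepsilon)\,|\bar{K}^0\rangle\right] \tag{6.282}$$
$$|K_L\rangle = \frac{1}{\sqrt{2\left(1 + |\varepsilon|^2\right)}}\left[(1 + \varepsilon)\,|K^0\rangle - (1 - \varepsilon)\,|\bar{K}^0\rangle\right] \tag{6.283}$$
or, inverting the last two equations,
$$|K^0\rangle = \frac{\sqrt{1 + |\varepsilon|^2}}{\sqrt{2}\,(1 + \varepsilon)}\left(|K_S\rangle + |K_L\rangle\right) \tag{6.284}$$
$$|\bar{K}^0\rangle = \frac{\sqrt{1 + |\varepsilon|^2}}{\sqrt{2}\,(1 - \varepsilon)}\left(|K_S\rangle - |K_L\rangle\right) \tag{6.285}$$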
The probability that a state initially produced as a pure
$K^0$ or $\bar{K}^0$ will decay into a $2\pi$ system will then evolve in time. A
“$2\pi$ asymmetry” is usually defined as:
(6.286)
Fig. 6.45 Asymmetry in $2\pi$ decays between $K^0$ and $\bar{K}^0$ tagged events. Time is measured in
$K_S$ lifetimes.
From A. Angelopoulos et al. Physics Reports 374 (2003) 165
This asymmetry depends on
$\varepsilon$ and $\Delta m$ and was measured, for instance, by
the CPLEAR experiment at CERN (Fig. 6.45) as a function of
time. Fixing $\Delta m$ to the world average, the following was
obtained:
6.3.8.3 CP Violation in Semileptonic $K^0$, $\bar{K}^0$ Decays
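$K^0$ and $\bar{K}^0$ also decay semileptonically through
the channels:
$$K^0 \to \pi^- \ell^+ \nu_\ell; \qquad \bar{K}^0 \to \pi^+ \ell^- \bar{\nu}_\ell$$
and thus CP violation can also
be tested by measuring the charge asymmetry $A_L$,
$$A_L = \frac{\Gamma(K_L \to \pi^-\ell^+\nu_\ell) - \Gamma(K_L \to \pi^+\ell^-\bar{\nu}_\ell)}{\Gamma(K_L \to \pi^-\ell^+\nu_\ell) + \Gamma(K_L \to \pi^+\ell^-\bar{\nu}_\ell)}. \tag{6.287}$$
This asymmetry is related to the CP violating parameter $\varepsilon$:
$$A_L = \frac{2\,\mathrm{Re}(\varepsilon)}{1 + |\varepsilon|^2} \simeq 2\,\mathrm{Re}(\varepsilon). \tag{6.288}$$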
The measured value of $A_L$ is positive, and it is in good
agreement with the measurement of $\varepsilon$ obtained in the $2\pi$ decay modes. The number of
$K_L$ having in their decay products an
electron is slightly smaller (0.66%) than the number of
$K_L$ having in their decay products a
positron. There is thus an unambiguous way to define what is matter
and what is antimatter.
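The link between the charge asymmetry and $\varepsilon$ can be checked with a few lines of arithmetic. This is a sketch under stated assumptions: the $K_L$ is taken to be proportional to $(1+\varepsilon)|K^0\rangle - (1-\varepsilon)|\bar{K}^0\rangle$, positive leptons are assumed to come only from the $K^0$ component and negative leptons only from the $\bar{K}^0$ component, and the modulus and phase chosen for $\varepsilon$ are illustrative round numbers rather than quoted measurements.

```python
import cmath
import math


def charge_asymmetry(eps):
    """Semileptonic charge asymmetry of the K_L for a given complex eps.

    Assumes K_L ~ (1 + eps)|K0> - (1 - eps)|K0bar>, with l+ produced only
    by the K0 component and l- only by the K0bar component.
    """
    n_plus = abs(1 + eps) ** 2    # relative rate to positive leptons
    n_minus = abs(1 - eps) ** 2   # relative rate to negative leptons
    return (n_plus - n_minus) / (n_plus + n_minus)


# Illustrative (assumed) modulus and phase of eps
eps = cmath.rect(2.23e-3, math.radians(43.5))
a_l = charge_asymmetry(eps)

if __name__ == "__main__":
    print(f"A_L        = {a_l:.3e}")
    print(f"2 Re(eps)  = {2 * eps.real:.3e}")
    # Fractional deficit of negative leptons with respect to positive ones
    print(f"(N+ - N-)/N+ = {2 * a_l / (1 + a_l):.2%}")
```

The exact result of the function is $2\,\mathrm{Re}(\varepsilon)/(1+|\varepsilon|^2)$, so the approximation $A_L \simeq 2\,\mathrm{Re}(\varepsilon)$ holds to better than one part in $10^5$ for $|\varepsilon| \sim 10^{-3}$.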
6.3.8.4 Direct CP Violation
CP violation was so far discussed, in the $K^0$–$\bar{K}^0$ mixing system, in terms of an imperfect
identification between the free-particle Hamiltonian eigenstates
($K_S$, $K_L$) and the CP eigenstates ($K_1$, $K_2$), as expressed in Eqs. 6.279 and 6.280.
In this context, the decays of
$K_S$ and $K_L$ into $2\pi$ modes are only due to the presence in
both states of a $K_1$ component. It is then expected that
the ratio of the decay amplitudes of the $K_L$ and of the $K_S$ into $2\pi$ modes should be equal to $\varepsilon$ and independent of the charges of the
two pions:
$$\eta = \frac{A(K_L \to \pi\pi)}{A(K_S \to \pi\pi)} = \varepsilon. \tag{6.289}$$
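However, it was experimentally established that
$$\eta_{+-} = \frac{A(K_L \to \pi^+\pi^-)}{A(K_S \to \pi^+\pi^-)} \tag{6.290}$$
and
$$\eta_{00} = \frac{A(K_L \to \pi^0\pi^0)}{A(K_S \to \pi^0\pi^0)} \tag{6.291}$$
although both having a similar value (about $2 \times 10^{-3}$), are significantly different. In
fact, their present experimental ratio is:
$$\left|\frac{\eta_{00}}{\eta_{+-}}\right| = 0.9950 \pm 0.0007. \tag{6.292}$$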
This difference is interpreted as the existence of a direct CP violation in the $K_L$ decays. In other words, the decay
rate of a meson to a given final state is not equal to the decay
rate of its antimeson to the corresponding CP-conjugated final state:
$$\Gamma(K \to f) \neq \Gamma(\bar{K} \to \bar{f}). \tag{6.293}$$
The CP violation discussed
previously in the mixing of the $K^0$–$\bar{K}^0$ system is now called indirect CP violation.
This CP violation is related to the observation that the
oscillation of a given meson to its antimeson may be different from
the inverse oscillation of the antimeson to the meson:
$$\Gamma(K^0 \to \bar{K}^0) \neq \Gamma(\bar{K}^0 \to K^0). \tag{6.294}$$
Finally, CP violation may also
occur whenever both the meson and its antimeson can decay to a
common final state with or without mixing:
(6.295)
In this case, both direct and indirect CP violations may be present.
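The direct CP violation is usually quantified by a
parameter $\varepsilon'$. Assuming that this direct CP violation occurs in the K decays into $2\pi$ modes due to the fact that the
$2\pi$ system may be formed in different
isospin states ($I = 0, 2$) and the corresponding decay
amplitudes may interfere, it can be shown that $\eta_{+-}$ and $\eta_{00}$ can be written as
$$\eta_{+-} \simeq \varepsilon + \varepsilon' \tag{6.296}$$
$$\eta_{00} \simeq \varepsilon - 2\varepsilon'. \tag{6.297}$$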
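The ratio between the CP
violating parameters can also be related to the double ratio of the
decay probabilities of the $K_L$ and the $K_S$ into specific $2\pi$ modes:
$$\mathrm{Re}\left(\frac{\varepsilon'}{\varepsilon}\right) \simeq \frac{1}{6}\left(1 - \frac{\Gamma(K_L \to \pi^0\pi^0)\,/\,\Gamma(K_S \to \pi^0\pi^0)}{\Gamma(K_L \to \pi^+\pi^-)\,/\,\Gamma(K_S \to \pi^+\pi^-)}\right). \tag{6.298}$$
The present (PDG 2016) experimental value for this ratio is
$$\mathrm{Re}\left(\frac{\varepsilon'}{\varepsilon}\right) = (1.66 \pm 0.23) \times 10^{-3}. \tag{6.299}$$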
6.3.8.5 CP Violation in the B Sector
Around 40 years after the discovery
of the CP violation in the $K^0$–$\bar{K}^0$
system, a large CP violation in the $B^0$–$\bar{B}^0$ system was observed. The $B^0$ ($\bar{B}^0$) differs at the quark level from the
$K^0$ ($\bar{K}^0$) just by the replacement of the
$\bar{s}$ ($s$) quark by a $\bar{b}$ ($b$) quark. Thus, $B^0$ and $\bar{B}^0$ should mix through similar weak box
diagrams (Fig. 6.46), and the CP eigenstates should also be a
combination of both.
Fig. 6.46 Leading box diagrams for the $B^0$–$\bar{B}^0$ mixing.
From S. Braibant, G. Giacomelli, and M.
Spurio, “Particles and fundamental interactions”, Springer 2012
However, these CP eigenstates have similar lifetimes
since the b quark has a much
larger mass than the s quark
and thus the decay phase space is large for both CP eigenstates. These eigenstates are
called B-Light ($B_L$) and B-Heavy ($B_H$) according to their masses, although
their mass difference, $\Delta m_B \simeq 3.3 \times 10^{-13}$ GeV, is small. The $B_L$ and $B_H$ mesons cannot, therefore, be
disentangled just by allowing one of them to decay and thus there
are no pure $B_L$ or $B_H$ beams. Another strategy has to be
followed.
In fact, CP violation in the B sector was first observed by studying the
time evolution of the decay rates of the $B^0$ and the $\bar{B}^0$ mesons to a common final state
($f_{CP}$), namely to $J/\psi\, K_S$.
At the BaBar
experiment,9 B meson pairs were produced in the
reaction
$$e^+ e^- \to \Upsilon(4S) \to B^0 \bar{B}^0.$$
The states evolved entangled, and
therefore, if one of the mesons was observed (“tagged”) at a given
time, the other had to be its antiparticle. The “tag” of the flavor
of the B mesons could be done
through the determination of the charge of the lepton in B semileptonic decays:
(6.300)
Fig. 6.47 Decay rate to $J/\psi\, K_S$ as a function of time of each of the
B flavor states (top) and the
derived time asymmetry (bottom).
From C. Chen (BaBar), Contribution to the
34th International Conference on High-Energy Physics (July 2008)
It was thus possible to determine the
decay rate of the untagged B
meson to $J/\psi\, K_S$ as a function of its decay time. This
rate is shown, both for “tagged” $B^0$ and $\bar{B}^0$, in Fig. 6.47. The observed
asymmetry:
(6.301)
is a clear proof of the CP
violation in this channel. This asymmetry can be explained by the
fact that the decays can occur with or without mixing. The decay
amplitudes for these channels may interfere. In the case of the
$B^0$, the relevant amplitudes are
$A(B^0 \to J/\psi\, K_S)$ and $A(B^0 \to \bar{B}^0 \to J/\psi\, K_S)$.
Nowadays, after the experiments Belle
and BaBar at the B factories at
KEK and SLAC, respectively, and after the first years of the LHCb
experiment at the LHC, there is already a rich spectrum of B channels where CP violation was observed at a level
above $5\sigma$. These results allowed a precise
determination of most of the parameters of the CKM matrix and
intensive tests of its unitarity, as will be briefly discussed in
the next section.
6.3.8.6 CP Violation in the Standard Model
CP violation in weak interactions can be
linked to the existence of the complex phase of the CKM matrix
which is expressed by the parameters $\delta$ and $\eta$, respectively, in the KM and in the
Wolfenstein parametrizations (see Sect. 6.3.7). As a consequence,
a necessary condition for the appearance of the complex phase, and
thus for CP violation, is the
presence of at least three generations of quarks (this clarifies
the power of the intuition by Kobayashi and Maskawa). The reason
why a complex phase in the CKM matrix causes CP violation can be seen as follows.
Consider a process $A \to B$ and the CP-conjugated $\bar{A} \to \bar{B}$ between their antiparticles, with the
appropriate helicity reversal. If there is no CP violation, the amplitudes, let us call
them $\mathcal{M}$ and $\bar{\mathcal{M}}$, respectively, must be given by the
same complex number (except that the CKM terms get conjugated). We
can separate the magnitude and phase by writing
$$\mathcal{M} = |\mathcal{M}|\, e^{i\theta} e^{+i\phi} \tag{6.302}$$
$$\bar{\mathcal{M}} = |\mathcal{M}|\, e^{i\theta} e^{-i\phi} \tag{6.303}$$
where $\phi$ is the phase term introduced from the
CKM matrix (often called “weak phase”) and $\theta$ is the phase term generated by
CP-invariant interactions in
the decay (often called “strong phase”). The exact values of these
phases depend on the convention, but the differences between the
weak phases and between the strong phases in any two different
terms of the decay amplitude are independent of the
convention.
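Since physically measurable reaction
rates are proportional to $|\mathcal{M}|^2$, so far nothing is different.
However, consider a process for which there are different paths
(say for simplicity two paths). Now we have:
$$\mathcal{M} = |\mathcal{M}_1|\, e^{i\theta_1} e^{+i\phi_1} + |\mathcal{M}_2|\, e^{i\theta_2} e^{+i\phi_2} \tag{6.304}$$
$$\bar{\mathcal{M}} = |\mathcal{M}_1|\, e^{i\theta_1} e^{-i\phi_1} + |\mathcal{M}_2|\, e^{i\theta_2} e^{-i\phi_2} \tag{6.305}$$
and in general $|\mathcal{M}|^2 \neq |\bar{\mathcal{M}}|^2$. Thus, a complex phase may give rise
to processes that proceed at different rates for particles and
antiparticles, and the CP
symmetry may be violated. For example, the decay $B^0 \to K^+\pi^-$ is 13% more common than its
CP conjugate $\bar{B}^0 \to K^-\pi^+$.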
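The role of the two phases can be made concrete with a short numerical sketch. It assumes amplitudes of the form $\sum_k |\mathcal{M}_k|\, e^{i\theta_k} e^{\pm i\phi_k}$, with the weak phase $\phi_k$ changing sign under CP and the strong phase $\theta_k$ unchanged; the magnitudes and phases below are arbitrary illustrative numbers, and the function names are hypothetical.

```python
import cmath
import math


def amplitudes(paths):
    """paths: list of (magnitude, strong_phase, weak_phase) tuples.

    Returns (M, Mbar): the amplitude and its CP conjugate. Only the weak
    phase flips sign under CP; the strong phase is left unchanged.
    """
    m = sum(a * cmath.exp(1j * (theta + phi)) for a, theta, phi in paths)
    mbar = sum(a * cmath.exp(1j * (theta - phi)) for a, theta, phi in paths)
    return m, mbar


def rate_asymmetry(paths):
    """(|M|^2 - |Mbar|^2) / (|M|^2 + |Mbar|^2)."""
    m, mbar = amplitudes(paths)
    r, rbar = abs(m) ** 2, abs(mbar) ** 2
    return (r - rbar) / (r + rbar)


# Arbitrary illustrative inputs
one_path = [(1.0, 0.3, 1.1)]
two_paths = [(1.0, 0.0, 0.0), (0.3, math.radians(40), math.radians(70))]
same_strong = [(1.0, 0.5, 0.0), (0.3, 0.5, math.radians(70))]

if __name__ == "__main__":
    print("single path:          ", rate_asymmetry(one_path))
    print("two paths:            ", rate_asymmetry(two_paths))
    print("equal strong phases:  ", rate_asymmetry(same_strong))
```

For two paths the difference of the rates is $-4\,|\mathcal{M}_1||\mathcal{M}_2| \sin(\theta_1 - \theta_2)\sin(\phi_1 - \phi_2)$, so an asymmetry requires both a weak-phase difference and a strong-phase difference; with a single path, or with equal strong phases, it vanishes.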
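The unitarity of the CKM matrix
imposes, as we have discussed in Sect. 6.3.7, three independent
orthogonality conditions:
$$V_{ud}V_{us}^* + V_{cd}V_{cs}^* + V_{td}V_{ts}^* = 0$$
$$V_{us}V_{ub}^* + V_{cs}V_{cb}^* + V_{ts}V_{tb}^* = 0$$
$$V_{ud}V_{ub}^* + V_{cd}V_{cb}^* + V_{td}V_{tb}^* = 0.$$
These conditions are sums of three complex numbers and thus can be
represented in a complex plane as triangles, usually called the
unitarity triangles.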
In the triangles obtained by taking
scalar products of neighboring rows or columns, the modulus of one
of the sides is much smaller than the other two. The equation for
which the moduli of the sides of the triangle are most comparable is
$$V_{ud}V_{ub}^* + V_{cd}V_{cb}^* + V_{td}V_{tb}^* = 0. \tag{6.306}$$
Fig. 6.48 One of the six unitarity triangles. The
description of the sides in terms of the parameters in the
Wolfenstein parametrization is shown
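The triangle is represented in the
($\bar\rho$, $\bar\eta$) plane (see the discussion on
the Wolfenstein parametrization in Sect. 6.3.7); its sides were
divided by $|V_{cd}V_{cb}^*|$, which is the best-known element in
the sum; and it is rotated in order that the side with unit length
is aligned along the real ($\bar\rho$) axis. The apex of the triangle is by
construction located at ($\bar\rho$, $\bar\eta$), and the angles can be defined by:
$$\alpha \equiv \arg\left(-\frac{V_{td}V_{tb}^*}{V_{ud}V_{ub}^*}\right); \quad \beta \equiv \arg\left(-\frac{V_{cd}V_{cb}^*}{V_{td}V_{tb}^*}\right); \quad \gamma \equiv \arg\left(-\frac{V_{ud}V_{ub}^*}{V_{cd}V_{cb}^*}\right). \tag{6.307}$$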
It can also be demonstrated that the areas of all unitarity
triangles are the same, and they equal half of the so-called
Jarlskog invariant (from the Swedish
physicist Cecilia Jarlskog), which can be expressed as $J \simeq A^2\lambda^6\eta$ in the Wolfenstein
parametrization.
The fact that the Jarlskog invariant is
proportional to $\eta$ shows that the unitarity triangle is
a measure of CP violation: if
there is no CP violation, the
triangle degenerates into a line. If the three sides do not close
to a triangle, this might indicate that the CKM matrix is not
unitary, which would imply the existence of new physics, in
particular the existence of a fourth quark family.
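The statements about the closure of the triangle and its area can be verified numerically. The sketch below builds a unitary 3×3 mixing matrix in the standard three-angle, one-phase parametrization (an assumption of this example, chosen because it is exactly unitary, unlike a truncated Wolfenstein expansion) and uses angles of roughly the measured size as illustrative inputs; the function name `ckm` is hypothetical.

```python
import cmath
import math


def ckm(theta12, theta23, theta13, delta):
    """Unitary 3x3 quark-mixing matrix in the standard parametrization."""
    s12, c12 = math.sin(theta12), math.cos(theta12)
    s23, c23 = math.sin(theta23), math.cos(theta23)
    s13, c13 = math.sin(theta13), math.cos(theta13)
    e = cmath.exp(1j * delta)
    return [
        [c12 * c13, s12 * c13, s13 / e],
        [-s12 * c23 - c12 * s23 * s13 * e,
         c12 * c23 - s12 * s23 * s13 * e, s23 * c13],
        [s12 * s23 - c12 * c23 * s13 * e,
         -c12 * s23 - s12 * c23 * s13 * e, c23 * c13],
    ]


# Illustrative (assumed) mixing angles and phase, in radians
ANGLES = (0.2265, 0.0422, 0.00369, 1.14)
(Vud, Vus, Vub), (Vcd, Vcs, Vcb), (Vtd, Vts, Vtb) = ckm(*ANGLES)

# Sides of the triangle V_ud V_ub* + V_cd V_cb* + V_td V_tb* = 0
sides = [Vud * Vub.conjugate(), Vcd * Vcb.conjugate(), Vtd * Vtb.conjugate()]
closure = abs(sum(sides))

# Area of a triangle whose sides (as complex numbers) sum to zero
area = 0.5 * abs((sides[0] * sides[1].conjugate()).imag)

# Jarlskog invariant from a quartet of matrix elements
jarlskog = (Vus * Vcb * Vub.conjugate() * Vcs.conjugate()).imag

if __name__ == "__main__":
    print(f"|sum of sides| = {closure:.2e}")
    print(f"area           = {area:.3e}")
    print(f"J / 2          = {jarlskog / 2:.3e}")
```

Setting the phase `delta` to zero makes both `jarlskog` and `area` vanish, which is the degenerate triangle mentioned in the text.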
The present (2016) experimental
constraints on the CKM unitarity triangle, as well as a global fit
to all the existing measurements by the CKMfitter
group,10 are shown in Fig. 6.49.
Fig. 6.49 Unitarity triangle and global CKM fit in
the plane ($\bar\rho$, $\bar\eta$). Results from PDG 2017; updated
results and plots are available at http://ckmfitter.in2p3.fr
All present results are consistent with
the CKM matrix being the only source of CP violation in the standard model.
Nevertheless, it is widely believed that the observed
matter–antimatter asymmetry in the Universe (see next section)
requires the existence of new sources of CP violation that might be revealed
either in the quark sector as small inconsistencies in the CKM
matrix, or elsewhere, for instance in precise measurements of neutrino
oscillations or of the neutron electric dipole moment. The real
nature of CP violation is still
to be understood.
6.3.9 Matter–Antimatter Asymmetry
The existence of antimatter predicted
by Dirac in 1930 and discovered by Anderson (see
Chap. 3) is still today the object of
intense study and speculation: Would the physics of an
antimatter-dominated Universe be identical to the physics of the
matter-dominated Universe we are living in? Is there any other
CP violation process than the
tiny ones observed so far? How, in the framework of the Big Bang
model, did the Universe become matter dominated?
Antiparticles are currently produced in
accelerators and observed in cosmic-ray interactions at a
low level (for instance, $\bar{p}/p \sim 10^{-4}$) (see Chap. 10). At CERN the study of antimatter
atoms has been pursued in the last 20 years. Antihydrogen
atoms have been formed and trapped for periods as long as
16 min and recently the first antihydrogen beams were
produced. The way is open to detailed studies of the antihydrogen
hyperfine transitions and to the measurement of the gravitational
interactions between matter and antimatter. The electric charge of
the antihydrogen atom was found by the ALPHA experiment to be
compatible with zero to eight decimal places ().
No primordial antimatter was observed
so far, while the relative abundance of baryons ($n_B$) to photons ($n_\gamma$) was found to be (see
Sect. 8.1.3):
(6.308)
Although apparently small, this number is many orders of magnitude
higher than what could be expected if there had been in the early
Universe an equal number of baryons and antibaryons. Indeed, in such a
case the annihilation between baryons and antibaryons would have
proceeded until its interaction rate equaled the expansion rate of
the Universe (see Sect. 8.1.2), and the expected ratios were
computed to be:
(6.309)
The excess of matter over antimatter should then be present before
nucleons and antinucleons are formed. On the other hand, inflation
(see Sect. 8.3.2) would wipe out any excess of
baryonic charge present in the beginning of the Big Bang. Thus,
this excess had to be originated by some unknown mechanism
(baryogenesis) after inflation and before or during the quark–gluon
plasma stage.
In 1967, soon after the discovery of
the CMB and of the violation of CP in the $K^0$–$\bar{K}^0$ system (see Sect. 6.3.8.2), Andrej
Sakharov11 modeled the evolution of the Universe from a
$B = 0$ baryonic number initial state to the present state. This model imposed
three conditions which are nowadays known as the Sakharov conditions:
1. Baryonic number (B) should be violated.
2. Charge (C) and Charge and Parity (CP) symmetries should be violated.
3. Baryon-number violating interactions should have occurred in the early Universe out of thermal equilibrium.
The first condition is obvious. The
second is necessary since if C
and CP were conserved any
baryonic charge excess produced in a given reaction would be
compensated by the conjugated reaction. The third is more subtle:
if the baryon-number violating interactions would have occurred in
thermal equilibrium, other processes would restore the symmetry
between baryons and antibaryons imposed by Boltzmann
distribution.
Thermal equilibrium may have been
broken when symmetry-breaking processes occurred. Whenever two
phases are present, the boundary regions between these (for
instance the surfaces of bubbles in boiling water) are out of
thermal equilibrium. In the framework of the standard model (see
Chap. 7), this could in principle have
occurred at the electroweak phase transition. However, it was
demonstrated analytically and numerically that, for a Higgs with a
mass like the one observed recently ($m_H \simeq 125$ GeV), the electroweak phase transition
does not provide the thermal instability required for the formation
of the present baryon asymmetry in the Universe.
The exact mechanism responsible for the
observed matter–antimatter asymmetry in the Universe is still to be
discovered. Clearly the standard model is not the end of
physics.
6.4 Strong Interactions and QCD
The quark model simplifies the
description of hadrons. We saw that deep inelastic scattering
evidences a physical reality for quarks, although the interaction
between these particles is very peculiar, since no free quarks have
been observed up to now. A heuristic form of the potential between
quarks with the characteristics needed has been shown.
Within the quark model, we needed to
introduce a new quantum number, the color, to explain how bound states of three identical
quarks can exist and not violate the Pauli exclusion principle.
Invariance with respect to color can be described by a symmetry
group SU(3)$_c$, where the subscript c indicates color.
The theory of quantum chromodynamics
(QCD) enhances the concept of color from a role of
label to the role of charge and is the basis for the description of
the interactions binding quarks in hadrons. The phenomenological
description through an effective potential can be seen as a limit
of this exact description, and the strong interactions binding
nucleons can be explained as van der Waals forces between neutral
objects.
QCD has been extensively tested and is
very successful. The American physicists David J. Gross, David Politzer,
and Frank Wilczek shared the 2004 Nobel
Prize for physics for devising an elegant mathematical framework to
express the asymptotic (i.e., in the limit of very short distances,
equivalent to the high momentum transfer limit) freedom of quarks
in hadrons, leading to the development of QCD.
However, a caveat should be stressed.
At very short distances, QCD is essentially a theory of free quarks
and gluons with relatively weak interactions, and observables can
be perturbatively calculated. At longer wavelengths, of the order of
the proton size, $\sim 1$ fm, the coupling parameter between
partons becomes too large to compute observables (recall that
exact solutions are in general impossible, and perturbative
calculations must be performed): the Lagrangian of QCD, which in
principle contains all physics, becomes de facto of little help in
this regime. Parts of QCD can thus be calculated in terms of the
fundamental parameters using the full dynamical (Lagrangian)
representation, while for other sectors one should use models,
guided by the characteristics of the theory, whose effective
parameters cannot be calculated but can be constrained by
experimental data.
6.4.1 Yang–Mills Theories
Before formulating QCD as a gauge
theory, we must extend the formalism shown for the description of
electromagnetism (Sect. 6.2.6) to a symmetry group like SU(3). This
extension is not trivial, and it was formulated by Yang and Mills
in the 1950s.
U(1). Let us first summarize the
ingredients of the U(1) gauge theory—which is the prototype
of the abelian gauge theories, i.e., of the
gauge theories defined by symmetry groups for which the generators
commute. We have seen in Sect. 6.2.3 that the
requirement that physics is invariant under local U(1) phase
transformation implies the existence of the photon gauge field. QED
can be derived by requiring the Lagrangian to be invariant under
local U(1) transformations of the form —note the identity operator I, which, in the case of U(1), is just
unity. The recipe is:
Find the gauge invariance of the
theory—in the case of electromagnetism U(1):
(6.310)
Replace the derivative in the
Lagrangian with a covariant
derivative
(6.311)
where transforms as
(6.312)
The Lagrangian
(6.313)
with
(6.314)
is invariant for the local gauge transformation, and the field
and its interactions with
are defined by the invariance itself.
Note that the Lagrangian can be written as
where is the locally invariant Lagrangian
for the particle, is the field Lagrangian.
What we have seen for U(1) can be
trivially extended to symmetries with more than one generator, if
the generators commute (Abelian symmetry groups).
Non-Abelian Symmetry Groups and Yang–Mills
Theories. When the symmetry
group is non-Abelian, i.e., generators do not commute, the above
recipes must be generalized. If the generators of the symmetry are
$T^a$, with $a = 1, \ldots, n$, one can write the gauge
invariance as
(6.315)
From now on, we shall not explicitly write the sum over a—the index varying within the set of the
generators, or of the gauge bosons, which will be assumed
implicitly when the index is repeated; generators are a group. We
do not associate any particular meaning to the fact that a is subscript or superscript.
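If the commutation relations hold
$$[T^a, T^b] = i f^{abc}\, T^c \tag{6.316}$$
one can define the covariant derivative as
$$D_\mu = I\,\partial_\mu - i g\, T^a A^a_\mu \tag{6.317}$$
where $A^a_\mu$ are the vector potentials, and
g is the coupling parameter. In
four dimensions, the coupling parameter g is a pure number and for a SU(n) group
one has $a, b, c = 1, \ldots, n^2 - 1$.
The gauge field Lagrangian has the
form
$$\mathcal{L}_{gf} = -\frac{1}{4} F^{a\,\mu\nu} F^a_{\mu\nu}. \tag{6.318}$$
The relation
$$F^a_{\mu\nu} = \partial_\mu A^a_\nu - \partial_\nu A^a_\mu + g f^{abc} A^b_\mu A^c_\nu \tag{6.319}$$
can be derived from the commutator
$$[D_\mu, D_\nu] = -i g\, T^a F^a_{\mu\nu}. \tag{6.320}$$
The field is self-interacting: from the given Lagrangian, one can
derive the equations
$$\partial^\mu F^a_{\mu\nu} + g f^{abc} A^{\mu b} F^c_{\mu\nu} = 0. \tag{6.321}$$
A source $J^a_\mu$ enters into the equations of motion
as
$$\partial^\mu F^a_{\mu\nu} + g f^{abc} A^{\mu b} F^c_{\mu\nu} = -J^a_\nu. \tag{6.322}$$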
One can demonstrate that a Yang–Mills theory is not renormalizable
for dimensions greater than four.
6.4.2 The Lagrangian of QCD
QCD is based on the gauge group SU(3),
the Special Unitary group in 3 dimensions (each dimension is a
color, conventionally $R$, $G$, $B$). This group is represented by the
set of unitary $3 \times 3$ complex matrices with determinant one
(see Sect. 5.3.5).
Since there are nine linearly
independent unitary $3 \times 3$ complex matrices and the unit-determinant
condition removes one of them, there are a total of eight
independent directions in this matrix space, i.e., the carriers of
color (called gluons) are eight. Another
way of seeing that the number of gluons is eight is that SU(3) has
eight generators; each generator represents a color exchange, and
thus a gauge boson (a gluon) in color space.
These matrices can operate both on each
other (combinations of successive gauge transformations, physically
corresponding to successive gluon emissions and/or gluon
self-interactions) and on a set of complex 3-vectors, representing
quarks in color space.
Due to the presence of color, a
generic particle wave function can be written as a three-vector
$\psi = (\psi_{qR}, \psi_{qG}, \psi_{qB})$ which is a superposition of fields
with a definite color index $i = R, G, B$. The SU(3) symmetry corresponds to
the freedom of rotation in this three-dimensional space. As we did
for the electromagnetic gauge invariance, we can express the local
gauge invariance as the invariance of the Lagrangian with respect
to the gauge transformation
$$\psi \to \psi' = e^{i g_s \alpha_a(x)\, t_a}\, \psi \tag{6.323}$$
where the $t_a$ ($a = 1, \ldots, 8$) are the eight generators of the SU(3)
group, and the $\alpha_a(x)$ are generic local transformations.
$g_s$ is the strong coupling, related to
$\alpha_s$ by the relation $g_s^2 = 4\pi\alpha_s$; we shall return to the strong
coupling in more detail later.
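Usually, the generators of SU(3) are
written as
$$t_a = \frac{1}{2}\lambda_a \tag{6.324}$$
where the $\lambda_a$ are the so-called Gell–Mann matrices,
defined as:
$$\lambda_1 = \begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}; \quad \lambda_2 = \begin{pmatrix} 0 & -i & 0 \\ i & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}; \quad \lambda_3 = \begin{pmatrix} 1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 0 \end{pmatrix}; \quad \lambda_4 = \begin{pmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ 1 & 0 & 0 \end{pmatrix};$$
$$\lambda_5 = \begin{pmatrix} 0 & 0 & -i \\ 0 & 0 & 0 \\ i & 0 & 0 \end{pmatrix}; \quad \lambda_6 = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix}; \quad \lambda_7 = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & -i \\ 0 & i & 0 \end{pmatrix}; \quad \lambda_8 = \frac{1}{\sqrt{3}}\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & -2 \end{pmatrix}.$$
As discussed in Sect. 5.3.5, these generators are just
the SU(3) analogs of the Pauli matrices in SU(2) (one can see it by
looking at $\lambda_1$, $\lambda_2$ and $\lambda_3$). Note that writing an index of a matrix
as a superscript or as a subscript makes no difference in this
case.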
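As a consequence of the local gauge
symmetry, eight massless fields $A^a_\mu$ will appear (one for each generator);
these are the gluon fields. The covariant derivative can be written
as
$$D_\mu = \partial_\mu - i g_s\, t_a A^a_\mu. \tag{6.325}$$
Finally, the QCD Lagrangian can be written as
$$\mathcal{L} = \sum_q \bar\psi_q \left(i\gamma^\mu D_\mu - m_q\right)\psi_q - \frac{1}{4} G^a_{\mu\nu} G^{a\,\mu\nu} \tag{6.326}$$
where $m_q$ is the quark mass, and $G^a_{\mu\nu}$ is the gluon field strength tensor
for a gluon with color index a,
defined as
$$G^a_{\mu\nu} = \partial_\mu A^a_\nu - \partial_\nu A^a_\mu + g_s f_{abc} A^b_\mu A^c_\nu \tag{6.327}$$
and the $f_{abc}$ are defined by the commutation
relation $[t_a, t_b] = i f_{abc}\, t_c$. These terms arise since the
generators do not commute.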
To guarantee the local invariance,
the field transforms as:
(6.328)
6.4.3 Vertices in QCD; Color Factors
The only stable hadronic states are
neutral in color. The simplest example is the combination of a
quark and antiquark, which in color space corresponds to
$$3 \otimes \bar{3} = 8 \oplus 1. \tag{6.329}$$
A random (color-uncorrelated) quark–antiquark pair has a
$1/N^2 = 1/9$ chance to be in a singlet state,
corresponding to the symmetric wave function $\frac{1}{\sqrt{3}}\left(|R\bar{R}\rangle + |G\bar{G}\rangle + |B\bar{B}\rangle\right)$; otherwise it is in an overall octet
state (Fig. 6.50).
Fig. 6.50 Combinations of a quark and an antiquark in
color space
Correlated production processes like
$Z \to q\bar{q}$ or $g \to q\bar{q}$ will project out specific components
(here the singlet and octet, respectively).
In final states, we average over all
incoming colors and sum over all possible outgoing ones. Color factors are thus associated with
QCD processes; such factors basically count the number of “paths
through color space” that the process can take, and multiply the
probability for a process to happen.
A simple example is given by the
decay $Z \to q\bar{q}$ (see Sect. 7.5.1). This vertex contains a
$\delta_{ij}$ in color space: the outgoing quark
and antiquark must have identical (anti-)colors. Squaring the
corresponding matrix element and summing over final state colors
yields a color factor
$$\sum_{\text{colors}} |M|^2 \propto \delta_{ij}\delta^*_{ji} = \mathrm{Tr}\{\delta\} = N_C = 3 \tag{6.330}$$
since i and j are quark indices.
Another example is given by the
so-called Drell–Yan process, $q\bar{q} \to \gamma^*/Z \to \ell^+\ell^-$
(Sect. 6.4.7.1), which is just
the reverse of the previous one. The square of the matrix element
must be the same as before, but since the quarks are here incoming,
we must average rather than sum
over their colors, leading to
$$\frac{1}{9}\sum_{\text{colors}} |M|^2 \propto \frac{1}{9}\,\delta_{ij}\delta^*_{ji} = \frac{1}{3} \tag{6.331}$$
and the color factor now entails a suppression due to the fact that only
quarks of matching colors can produce a Z boson. The chance that a quark and an
antiquark picked at random have a corresponding color–anticolor is
$1/N_C$.
Color factors also enter in the
calculation of probabilities for the vertices of QCD. In
Fig. 6.51, one can see the definition of color
factors for the three-body vertices $q \to qg$, $g \to gg$ (notice the difference from QED:
since gluons are colored, the “triple gluon vertex” can exist, while
the $\gamma \to \gamma\gamma$ vertex does not exist) and
$g \to q\bar{q}$.
After tedious calculations, the color
factors are
$$T_R = \frac{1}{2}; \qquad C_F = \frac{4}{3}; \qquad C_A = N_C = 3. \tag{6.332}$$
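The "tedious calculations" can be delegated to a short script. The sketch below assumes the conventional normalizations (generators $t_a = \lambda_a/2$, $\mathrm{Tr}(t_a t_b) = T_R\,\delta_{ab}$, $\sum_a t_a t_a = C_F\,\mathbb{1}$, $\sum_{c,d} f_{acd} f_{bcd} = C_A\,\delta_{ab}$) and evaluates the three factors directly from the Gell–Mann matrices; the helper names are hypothetical and indices are 0-based in the code.

```python
import math


def matmul(a, b):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(a[r][k] * b[k][c] for k in range(3)) for c in range(3)]
            for r in range(3)]


def trace(a):
    return sum(a[k][k] for k in range(3))


I = 1j
R3 = 1 / math.sqrt(3)
GELL_MANN = [
    [[0, 1, 0], [1, 0, 0], [0, 0, 0]],
    [[0, -I, 0], [I, 0, 0], [0, 0, 0]],
    [[1, 0, 0], [0, -1, 0], [0, 0, 0]],
    [[0, 0, 1], [0, 0, 0], [1, 0, 0]],
    [[0, 0, -I], [0, 0, 0], [I, 0, 0]],
    [[0, 0, 0], [0, 0, 1], [0, 1, 0]],
    [[0, 0, 0], [0, 0, -I], [0, I, 0]],
    [[R3, 0, 0], [0, R3, 0], [0, 0, -2 * R3]],
]
# Generators t_a = lambda_a / 2
T = [[[x / 2 for x in row] for row in lam] for lam in GELL_MANN]


def structure_constant(a, b, c):
    """f_abc from [t_a, t_b] = i f_abc t_c, using Tr(t_c t_d) = delta_cd / 2."""
    ab, ba = matmul(T[a], T[b]), matmul(T[b], T[a])
    comm = [[ab[r][s] - ba[r][s] for s in range(3)] for r in range(3)]
    return (trace(matmul(comm, T[c])) / 0.5j).real


# g -> q qbar: Tr(t_a t_b) = T_R delta_ab
T_R = trace(matmul(T[0], T[0])).real
# q -> q g: sum_a t_a t_a = C_F * identity (take one diagonal element)
C_F = sum(matmul(T[a], T[a])[0][0] for a in range(8)).real
# g -> g g: sum_{c,d} f_acd f_bcd = C_A delta_ab (take a = b = first index)
C_A = sum(structure_constant(0, c, d) ** 2
          for c in range(8) for d in range(8))

if __name__ == "__main__":
    print(f"T_R = {T_R:.4f}, C_F = {C_F:.4f}, C_A = {C_A:.4f}")
```

The same helpers also confirm that $f_{123} = 1$, the SU(2)-like subalgebra generated by $\lambda_1$, $\lambda_2$, $\lambda_3$.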
6.4.4 The Strong Coupling
When we discussed QED, we analyzed the
fact that renormalization can be absorbed in a running value for
the charge, or a running value for the coupling parameter.
This can be interpreted physically as
follows. A point-like charge polarizes the vacuum, creating
electron–positron pairs which orient themselves as dipoles
screening the charge itself. As $Q^2$ increases (i.e., as the distance from
the bare charge decreases), the effective charge perceived
increases, because there is less screening. Mathematically, this is
equivalent to the assumption that the coupling parameter increases
as $Q^2$ increases.
Fig. 6.51 Basic three-body vertices of QCD, and
definition of the color factors
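Also in the case of QCD, the
calculation based on the currents gives a logarithmic expression
for the coupling parameter, which is governed by the so-called
beta function $\beta(\alpha_s)$,
$$Q^2 \frac{\partial \alpha_s}{\partial Q^2} = \beta(\alpha_s) \tag{6.333}$$
where
$$\beta(\alpha_s) = -\alpha_s^2\left(b_0 + b_1\alpha_s + b_2\alpha_s^2 + \cdots\right) \tag{6.334}$$
with
$$b_0 = \frac{11 C_A - 2 n_f}{12\pi} = \frac{33 - 2 n_f}{12\pi} \tag{6.335}$$
$$b_1 = \frac{17 C_A^2 - 5 C_A n_f - 3 C_F n_f}{24\pi^2} = \frac{153 - 19 n_f}{24\pi^2}. \tag{6.336}$$
In the expression for $b_0$, the first term is due to gluon loops
and the second to the quark loops. In the same way, the first term
in the $b_1$ coefficient comes from double gluon
loops, and the others represent mixed quark–gluon loops.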
Fig. 6.52 Dependence of $\alpha_s$ on the energy scale Q; a fit to QCD is superimposed.
From K.A. Olive et al. (Particle Data Group),
Chin. Phys. C 38 (2014) 090001
At variance with the QED expression
(6.212), the
running parameter increases with decreasing $Q^2$:
$$\alpha_s(Q^2) = \frac{\alpha_s(\mu^2)}{1 + b_0\,\alpha_s(\mu^2)\,\ln\left(Q^2/\mu^2\right)}. \tag{6.337}$$
There is thus no possibility to define a limiting value for
$\alpha_s$, starting from which a perturbative
expansion could be made (this was the case for QED). The value of
the strong coupling must thus be specified at a given reference
scale, typically $Q^2 = M_Z^2$ (where most measurements have been
performed thanks to LEP), from which we can obtain its value at any
other scale by solving Eq. 6.333,
$$\alpha_s(Q^2) = \frac{\alpha_s(M_Z^2)}{1 + b_0\,\alpha_s(M_Z^2)\,\ln\left(Q^2/M_Z^2\right) + \mathcal{O}(\alpha_s^2)}. \tag{6.338}$$
The running coupling parameter is shown as calculated from
$\alpha_s(M_Z)$, in Fig. 6.52, and compared to the
experimental data.
The dependence of $b_0$ on the number of flavors $n_f$ entails a dependence of the slope of
the energy evolution on the number of contributing flavors: the
running changes slope across quark flavor thresholds. However, from
$Q \sim 1$ GeV to present accelerator energies,
an effective $n_f = 3$ approximation is reasonable, since
the production of heavier quarks is strongly suppressed.
Notice that in QCD, quark–antiquark
pairs screen the color charge, like $e^+e^-$ pairs in QED. Antiscreening (which
leads to an increase of the charge at larger distances) comes from gluon
loops; getting closer to a quark the antiscreening effect of the
virtual gluons is reduced. Since the contribution from virtual
quarks and virtual gluons to screening is opposite, the winner is
decided by the number of different flavors. For standard QCD with
three colors, antiscreening prevails for $n_f \le 16$.
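The running described in this section can be sketched numerically at leading order. The example assumes the one-loop solution $\alpha_s(Q^2) = \alpha_s(\mu^2)/[1 + b_0\,\alpha_s(\mu^2)\ln(Q^2/\mu^2)]$ with $b_0 = (33 - 2n_f)/(12\pi)$, a fixed number of flavors (flavor thresholds are ignored), and an illustrative reference value $\alpha_s(M_Z) = 0.118$; these inputs and the function names are assumptions of the sketch.

```python
import math

M_Z = 91.1876        # GeV, reference scale
ALPHA_S_MZ = 0.118   # assumed illustrative value of alpha_s at M_Z


def b0(n_f):
    """One-loop coefficient of the QCD beta function for n_f flavors."""
    return (33 - 2 * n_f) / (12 * math.pi)


def alpha_s(q, n_f=5, alpha_ref=ALPHA_S_MZ, mu=M_Z):
    """One-loop running coupling at scale q (GeV), fixed n_f."""
    return alpha_ref / (1 + b0(n_f) * alpha_ref * math.log(q ** 2 / mu ** 2))


def landau_pole(n_f=5, alpha_ref=ALPHA_S_MZ, mu=M_Z):
    """Scale (GeV) at which the one-loop expression diverges."""
    return mu * math.exp(-1.0 / (2 * b0(n_f) * alpha_ref))


if __name__ == "__main__":
    for q in (2.0, 10.0, M_Z, 1000.0):
        print(f"Q = {q:8.2f} GeV   alpha_s = {alpha_s(q):.4f}")
    print(f"one-loop Landau pole (n_f = 5): {landau_pole():.3f} GeV")
    print(f"b0 > 0 up to n_f = {max(n for n in range(30) if b0(n) > 0)}")
```

The coupling decreases with increasing $Q$ as long as $b_0 > 0$, that is as long as the gluon (antiscreening) term $33$ exceeds the quark (screening) term $2 n_f$. The position of the divergence depends strongly on the order of the calculation and on the treatment of flavor thresholds, so the one-loop number printed here is only indicative.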
6.4.5 Asymptotic Freedom and Confinement
When quarks are very close to each
other, they behave almost as free particles. This is the famous
“asymptotic freedom” of QCD. As a
consequence, perturbation theory becomes accurate at higher
energies (Eq. 6.337). Conversely, the potential grows at
large distances.
In addition, the evolution of
$\alpha_s$ with energy must make it comparable
to the electromagnetic and weak couplings at some (large) energy,
which, according to our present extrapolations, may lie at some
$10^{15}$–$10^{17}$ GeV, but such “unification” might
happen at lower energies if new, yet undiscovered, particles
generate large corrections to the evolution. After this point, we
do not know how the further evolution could behave.
At a scale
$$\Lambda \sim 200\ \mathrm{MeV} \tag{6.339}$$
the perturbative coupling (6.337) starts diverging; this is called the
Landau pole. Note however that
Eq. 6.337 is perturbative, and more terms are
needed near the Landau pole: strong interactions indeed do not
exhibit a divergence for $Q \to \Lambda$.
6.4.5.1 Quark–Gluon Plasma
Asymptotic freedom entails that at
extremely high temperature and/or density, a new phase of matter
should appear due to QCD. In this phase, called quark–gluon plasma (QGP), quarks and
gluons become free: the color charges of partons are screened. It
is believed that during the first few microseconds after the Big Bang the
Universe was in a QGP state, and flavors were equiprobable.
QGP should be formed when temperatures
are close to 200 MeV and density is large enough. This makes the
ion–ion colliders the ideal place to reproduce this state.
One characteristic of QGP should be
that jets are “quenched”: the high density of particles in the
“fireball” which is formed after the collision absorbs jets in such
a way that in the end no jet or just one jet appears.
Many experiments at hadron colliders
tried to create this new state of matter in the 1980s and 1990s,
and CERN announced indirect evidence for QGP in 2000. Current
experiments at the Relativistic Heavy Ion Collider (RHIC) at BNL
and at CERN’s LHC are continuing this effort, by colliding
relativistically accelerated gold (at RHIC) or lead (at LHC) ions.
RHIC experiments have also claimed to have created a QGP with a
temperature of $4 \times 10^{12}$ K (about 350 MeV).
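The correspondence between a temperature in kelvin and an energy in MeV is just a multiplication by the Boltzmann constant. A minimal sketch, assuming $k_B = 8.617333 \times 10^{-5}$ eV/K (the function names are hypothetical):

```python
K_B_EV_PER_K = 8.617333e-5  # Boltzmann constant in eV/K


def kelvin_to_mev(temperature_k):
    """Thermal energy k_B * T in MeV for a temperature in kelvin."""
    return temperature_k * K_B_EV_PER_K * 1e-6


def mev_to_kelvin(energy_mev):
    """Temperature in kelvin corresponding to k_B * T = energy_mev."""
    return energy_mev * 1e6 / K_B_EV_PER_K


if __name__ == "__main__":
    print(f"4e12 K  -> {kelvin_to_mev(4e12):.0f} MeV")
    print(f"200 MeV -> {mev_to_kelvin(200.0):.2e} K")
```

With this conversion, $4 \times 10^{12}$ K corresponds to roughly 345 MeV, consistent with the "about 350 MeV" quoted above, and the 200 MeV formation temperature corresponds to a little more than $2 \times 10^{12}$ K.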
The observation and the study of the
QGP at the LHC are discussed in more detail in Sect. 6.4.7.3.
6.4.6 Hadronization; Final States from Hadronic Interactions
Hadronization is the process by which a
set of colored partons becomes a set of color-singlet hadrons.
At large energies, QCD processes can be
described directly by the QCD Lagrangian. Quarks radiate gluons,
which branch into gluons or generate $q\bar{q}$ pairs, and so on. This is a parton
shower, quite similar in concept to the electromagnetic showers
described by QED.
However, at a certain hadronization scale $Q_{\mathrm{had}}$ we are no longer able to perform
perturbative calculations. We must turn to QCD-inspired
phenomenological models to describe a transition of colored partons
into colorless states, and the further branchings.
The
problem of hadron generation from a high-energy collision is thus
modeled through four steps
(Fig. 6.53):
1.
Evolution of partons through a parton
shower.
2a.
Grouping of the partons onto high-mass
color-neutral states. Depending on the model these states are
called “strings” or “clusters”—the difference is not relevant for
the purpose of this book; we shall describe in larger detail the
“string” model in the following.
2b.
Map of strings/clusters onto a set of
primary hadrons (via string break or cluster splitting).
3.
Sequential decays of the unstable
hadrons into secondaries (e.g., , , , ...).
The physics governing steps 2a and 2b
is nonperturbative, and pertains to hadronization; some properties
are anyway bound by the QCD Lagrangian.
Fig.
6.53
The creation of a multihadronic final state
from the decay of a Z boson or
from a virtual photon state generated in an $e^+e^-$ collision
An important result in lattice
QCD,$^{12}$ confirmed by quarkonium spectroscopy, is
that the potential of the color-dipole field between a charge and
an anticharge at distances $r \gtrsim 1$ fm can be approximated as
$V(r) \sim \kappa r$ (Fig. 6.54). This is called
“linear confinement,” and it justifies the string model of hadronization, discussed
below in Sect. 6.4.6.1.
6.4.6.1
String Model
The Lund string model, implemented
in the Pythia [F6.10] simulation
software, is nowadays commonly used to model hadronic interactions.
We shall now briefly describe the main characteristics of this
model; many of the basic concepts are shared by any string-inspired
method. A more complete discussion can be found in the book by
Andersson [F6.9].
Fig.
6.54
The QCD effective potential
Consider the production of a
$q\bar{q}$ pair, for instance in the process
$e^+e^- \to Z/\gamma^* \to q\bar{q}$. As the quarks move apart, a potential
$$V(r) = \kappa r \tag{6.340}$$
is stretched among them (at short distances, a Coulomb term
proportional to $1/r$
should be added). Such a potential describes a string with energy
per unit length $\kappa$, which has been determined from
hadron spectroscopy and from fits to simulations to have the value
$\kappa \sim 1$ GeV/fm $\sim 0.2$ GeV$^2$ (Fig. 6.54). The color flow in
a string stores energy (Fig. 6.55).
Fig.
6.55
The color flow in a string stores energy in a
tube.
Adapted from a lecture by T. Sjöstrand
A soft gluon possibly emitted does
not affect very much the string evolution (string fragmentation is
“infrared safe” with respect to the emission of soft and collinear
gluons). A hard gluon, instead, can store enough energy that the
$qg$ and the $g\bar{q}$ elements operate as two different
strings (Fig. 6.56). The quark fragmentation is different
from the gluon fragmentation since quarks are only connected to a
single string, while gluons have one on either side; the energy
transferred to strings by gluons is thus roughly double compared to
quarks.
Fig.
6.56
Illustration of a $q\bar{q}g$ system. Color conservation entails
the fact that the color string goes from quarks to gluons and vice
versa rather than from quark to antiquark
As the string endpoints move apart,
their kinetic energy is converted into potential energy stored in
the string itself (Eq. 6.340). This process continues until by
quantum fluctuation a quark–antiquark pair emerges transforming
energy from the string into mass. The original endpoint partons are
now screened from each other, and the string is broken in two
separate color-singlet pieces, $(q\bar{q}) \to (q\bar{q}') + (q'\bar{q})$, as shown in Fig. 6.57. This process then
continues until only final state hadrons remain, as described in
the following.
Fig.
6.57
String breaking by quark pair creation in the
string field; time evolution goes from bottom to top
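The individual string breaks are
modeled from quantum mechanical tunneling, which leads to a
suppression of transverse energies and masses:
$$\mathrm{Prob}(m_q^2, p_{\perp q}^2) \propto \exp\left(-\frac{\pi m_q^2}{\kappa}\right) \exp\left(-\frac{\pi p_{\perp q}^2}{\kappa}\right) \tag{6.341}$$
where $m_q$ is the mass of the produced quark and
$p_{\perp q}$ is the transverse momentum with
respect to the string. The $p_\perp$ spectrum of the quarks is thus
independent of the quark flavor, and
$$\langle p_{\perp q}^2 \rangle = \sigma^2 = \frac{\kappa}{\pi} \sim (0.25~\mathrm{GeV})^2 \tag{6.342}$$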
The mass suppression implied by Eq. 6.341 is such that the
strangeness suppression with respect to the creation of u or d, $s/u \sim s/d$, is 0.2–0.3. This suppression is
consistent with experimental measurements, e.g., of the
$K/\pi$ ratio in the final states from
Z decays.
By inserting the charm quark mass in
Eq. 6.341, one obtains a relative suppression of
charm of the order of $10^{-11}$. Heavy quarks can therefore be
produced only in the perturbative stage and not during
fragmentation.
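As a rough numerical check of the suppression quoted above, the following sketch evaluates the mass-dependent tunneling weight, taken here to be $\exp(-\pi m_q^2/\kappa)$, and the width $\sqrt{\kappa/\pi}$ of the transverse momentum spectrum. The string tension $\kappa = 0.2$ GeV$^2$ (about 1 GeV/fm) and the charm mass of 1.27 GeV are assumptions chosen for illustration. Light-quark ratios are not computed, since they depend on effective quark masses that are not well defined, which is why they are tuned on data.

```python
import math

KAPPA = 0.2  # assumed string tension in GeV^2 (about 1 GeV/fm)

def tunneling_factor(mass_gev, kappa=KAPPA):
    """Mass-dependent tunneling weight exp(-pi * m^2 / kappa)."""
    return math.exp(-math.pi * mass_gev ** 2 / kappa)

def pt_width(kappa=KAPPA):
    """Width sigma = sqrt(kappa / pi) of the transverse momentum spectrum, in GeV."""
    return math.sqrt(kappa / math.pi)

M_CHARM = 1.27  # assumed charm quark mass in GeV (illustrative)

charm_suppression = tunneling_factor(M_CHARM)  # relative to a massless quark
sigma_pt = pt_width()

print(f"charm suppression ~ {charm_suppression:.1e}")
print(f"sigma(p_T) ~ {sigma_pt:.3f} GeV")
```

With these inputs the charm factor comes out at the $10^{-11}$ level and the width close to 0.25 GeV; both change if a different $\kappa$ or mass is assumed.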
Baryon production can be incorporated
in the same picture if string breaks
occur also by the production of pairs of diquarks, bound states of two quarks in a
$\bar{3}$ color representation (e.g.,
“red + blue = antigreen”). The relative probability
of diquark–antidiquark to quark–antiquark production is extracted
from experimental measurements, e.g., of the $p/\pi$ ratio.
The creation of excited states (e.g.,
hadrons with nonzero orbital momentum between quarks) is modeled by
a probability that such events occur; this probability is again
tuned on the final multiplicities measured for particles in hard
collisions.
With $p_\perp$ and $m$ in the simulation of the
fragmentation fixed from the extraction of random numbers distributed as in Eq. 6.341, the final
step is to model the fraction,
z, of the initial quark’s
longitudinal momentum that is carried by the final hadron; in first
approximation, this should scale with energy for large enough
energies. The form of the probability density for z used in the Lund model, the so-called
fragmentation function f(z),
is
$$f(z) \propto \frac{1}{z} (1-z)^{a} \exp\left(-\frac{b\,(m_h^2 + p_{\perp h}^2)}{z}\right) \tag{6.343}$$
which is known as the Lund symmetric
fragmentation function (normalized to unit integral); $m_h$ and $p_{\perp h}$ are the mass and
transverse momentum of the hadron, and $a$ and $b$ are free parameters. These
functions can be flavor dependent, and they are tuned from the
experimental data. The mass dependence in f(z)
suggests a harder fragmentation function for heavier quarks
(Fig. 6.58): this means that charm and beauty primary
hadrons take most of the energy.
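The hardening of the fragmentation function with mass can be illustrated numerically. The sketch below assumes the Lund symmetric form $f(z) \propto z^{-1}(1-z)^a \exp(-b\,m_\perp^2/z)$, with $m_\perp^2 = m_h^2 + p_{\perp h}^2$, and computes the mean $z$ for three transverse masses. The parameter values $a = 0.5$ and $b = 1$ GeV$^{-2}$ and the three $m_\perp^2$ values are arbitrary illustrative choices, not a tuned set.

```python
import math

def lund_f(z, m_perp2, a=0.5, b=1.0):
    """Unnormalized Lund symmetric function; a and b (GeV^-2) are assumed values."""
    return (1.0 - z) ** a * math.exp(-b * m_perp2 / z) / z

def mean_z(m_perp2, n=20000, a=0.5, b=1.0):
    """Mean of z under f(z), using a midpoint rule on the open interval (0, 1)."""
    zs = [(i + 0.5) / n for i in range(n)]
    weights = [lund_f(z, m_perp2, a, b) for z in zs]
    return sum(z * w for z, w in zip(zs, weights)) / sum(weights)

# Illustrative squared transverse masses in GeV^2 (light, charm-like, beauty-like).
mean_light = mean_z(0.25)
mean_charm = mean_z(4.0)
mean_beauty = mean_z(28.0)

print(mean_light, mean_charm, mean_beauty)
```

The ordering of the three means reflects the statement in the text: the heavier the hadron, the larger the share of momentum it tends to carry.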
Fig.
6.58
Fragmentation function in the Lund
parametrization for quark–antiquark strings. Curves from left to
right correspond to higher masses.
Adapted from a lecture by T. Sjöstrand
The process of iterative selection of
flavors, transverse momenta, and z values for $q\bar{q}$ pairs breaking a string is
illustrated in Fig. 6.59. A quark u produced in a hard process at high
energy emerges from the parton shower, and lies at one extreme of a
string. A $d\bar{d}$ pair is created from the vacuum; the $\bar{d}$
combines with the u and forms a $\pi^+$, which carries a fraction $z_1$
of the total momentum $p_+$. The next hadron takes a fraction $z_2$
of the remaining momentum, etc. The $z_i$
are random numbers generated
according to a probability density function corresponding to the
Lund fragmentation function.
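The iterative momentum sharing described above can be sketched in a few lines. This is a toy illustration only: flavors and transverse momenta are ignored, the Lund-like density and its parameters ($a = 0.5$, $b = 1$ GeV$^{-2}$, $m_\perp^2 = 0.25$ GeV$^2$) are assumptions, and the stopping threshold `p_min` is a hypothetical cutoff introduced to end the loop.

```python
import math
import random

def make_lund_sampler(m_perp2, a=0.5, b=1.0, n_grid=2000):
    """Return an accept-reject sampler for f(z) ~ (1-z)^a exp(-b m_perp2 / z) / z."""
    def f(z):
        return (1.0 - z) ** a * math.exp(-b * m_perp2 / z) / z
    # Envelope: maximum on a grid, with a 10% safety margin.
    f_max = 1.1 * max(f((i + 0.5) / n_grid) for i in range(n_grid))
    def sample(rng):
        while True:
            z = rng.random()
            if z == 0.0:
                continue  # avoid division by zero at the endpoint
            if rng.random() * f_max <= f(z):
                return z
    return sample

def fragment_momentum(p_total, sample_z, p_min=1.0, rng=None):
    """Each hadron takes a fraction z of the momentum that is still available."""
    rng = rng or random.Random(1)
    remaining, hadrons = p_total, []
    while remaining > p_min:
        z = sample_z(rng)
        hadrons.append(z * remaining)
        remaining *= 1.0 - z
    hadrons.append(remaining)  # the leftover is assigned to the last hadron
    return hadrons

sampler = make_lund_sampler(0.25)
momenta = fragment_momentum(100.0, sampler, p_min=1.0, rng=random.Random(7))
print(len(momenta), sum(momenta))
```

By construction the hadron momenta add up to the initial momentum; a realistic generator would also pick flavors and transverse momenta at every break.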
Fig.
6.59
Iterative selection of flavors and momenta in
the Lund string fragmentation model.
Average multiplicity is one of the
basic observables characterizing hadronic final states. It is
extensively studied both theoretically and experimentally at
several center-of-mass energies. Experimentally, since the
detection of charged particles is simpler than the detection of
neutrals, one studies the average charged particle multiplicity. In
the limit of large energies, most of the particles in the final
state are pions, and one can assume, by isospin symmetry, that the
number of neutral pions is half the number of charged pions (pions
are an isospin triplet).
In order to define the number of
particles, one has to define what a stable hadron is. Typically,
multiplicity is computed at a time $\Delta t = 10^{-12}$ s after the collision; this interval
is larger than the typical lifetime of hadronically decaying
particles, $10^{-23}$ s, but shorter than the typical
weak decay lifetimes.
The problem of the energy dependence
of the multiplicity was already studied by Fermi and Landau in the
1950s. With simple thermodynamical arguments, they concluded
that the multiplicity from a hard interaction should be
proportional to the square root of the center-of-mass energy:
$$\langle n(E_{CM}) \rangle \propto \sqrt{E_{CM}} \tag{6.344}$$
A more precise expression has been obtained from QCD. The
expression including leading- and next-to-leading order calculation
is:
$$\langle n(E_{CM}) \rangle = a\,[\alpha_s(E_{CM})]^{b}\, e^{c/\sqrt{\alpha_s(E_{CM})}} \left(1 + \mathcal{O}\left(\sqrt{\alpha_s(E_{CM})}\right)\right) \tag{6.345}$$
where a is a parameter (not
calculable from perturbation theory) whose value should be fitted
from the data. The constants $b = 0.49$ and $c = 2.27$ are calculated from the theory.
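To get a feeling for the energy dependence predicted by QCD, the sketch below evaluates the form $\langle n \rangle = a\,\alpha_s^{b}\,e^{c/\sqrt{\alpha_s}}$ with $b = 0.49$ and $c = 2.27$ (the values obtained for five flavors). Several ingredients are assumptions made for illustration: a one-loop running coupling with $\alpha_s(M_Z) = 0.118$ and five active flavors, and a normalization $a$ fixed by requiring about 21 charged particles at the Z pole.

```python
import math

B, C = 0.49, 2.27               # exponents for five flavors
N_F = 5                         # assumed number of active flavors
ALPHA_S_MZ, M_Z = 0.118, 91.2   # assumed reference coupling and scale (GeV)

def alpha_s(q_gev):
    """One-loop running coupling (an approximation chosen for this sketch)."""
    beta0 = 11.0 - 2.0 * N_F / 3.0
    log_term = math.log(q_gev ** 2 / M_Z ** 2)
    return ALPHA_S_MZ / (1.0 + ALPHA_S_MZ * beta0 / (4.0 * math.pi) * log_term)

def shape(e_cm):
    """Energy dependence alpha_s^b * exp(c / sqrt(alpha_s)), without normalization."""
    a_s = alpha_s(e_cm)
    return a_s ** B * math.exp(C / math.sqrt(a_s))

# Normalization fixed by hand so that the Z-pole value is 21.
A_NORM = 21.0 / shape(M_Z)

def n_charged(e_cm):
    """Mean charged multiplicity at center-of-mass energy e_cm (GeV)."""
    return A_NORM * shape(e_cm)

for energy in (30.0, 91.2, 200.0):
    print(energy, round(n_charged(energy), 1))
```

A real analysis would fit $a$ (and possibly an additive constant) to data over the full energy range instead of pinning it to a single point.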
The summary of the experimental data is
shown in Fig. 6.60; a plot comparing the charged multiplicity
in $e^+e^-$ annihilations with expression
6.345 in a
wide range of energies will be discussed in larger detail in the
next chapter (Fig. 7.18). The charged particle
multiplicity at the Z pole,
91.2 GeV, is about 21 (the total multiplicity including
$\pi^0$ before their decays is about 30).
The thermodynamical model by Fermi
and Landau predicts that the multiplicity of a particle of mass
m is asymptotically
proportional to $1/m^2$.
Fig.
6.60
Charged particle multiplicity in $e^+e^-$ and $p\bar{p}$ collisions, pp and ep collisions versus the center-of-mass
energy.
From K.A. Olive et al. (Particle Data Group),
Chin. Phys. C 38 (2014) 090001
6.4.6.3 Jets in Electron–Positron Annihilation
In the quark–antiquark fragmentation
into hadrons at low energies, the dominant feature is the
production of resonances.
When energy increases, however, primary
quarks and antiquarks start carrying a relevant momentum, large
enough to allow string breakings. The fragmentation, as seen in the
previous section, is essentially a soft process for what is related
to the generation of transverse momenta. The phenomenological
consequence is the materialization of jets of particles along the
direction of the primary quark and antiquark
(Fig. 6.61, left).
Since transverse momenta are almost
independent of the collision energy while longitudinal momenta are
of the order of half the center-of-mass energy, the collimation of
jets increases as energy increases.
The angular distribution of jet axes
in a blob of energy generated by $e^+e^-$ annihilation follows the dependence
$d\sigma/d\cos\theta \propto 1 + \cos^2\theta$ expected for spin 1/2 objects.
Some characteristics of quarks can be
seen also by the ratio $R$ of the cross section into hadrons to the
cross section into $\mu^+\mu^-$ pairs, as discussed in
Sect. 5.4.2. QED predicts that this ratio
should be equal to the sum of squared charges of the charged
hadronic particles produced; due to the nature of QCD, the sum has
to be extended over quarks and over colors. For $2m_b < \sqrt{s} < 2m_t$,
$R = 3\left(\frac{4}{9} + \frac{1}{9} + \frac{1}{9} + \frac{4}{9} + \frac{1}{9}\right) = \frac{11}{3}$.
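The counting behind the hadron-to-muon ratio can be made explicit. The sketch below computes $R = N_c \sum_q Q_q^2$ at lowest order for different sets of kinematically accessible quark flavors, using exact fractions; higher-order QCD corrections and threshold effects are ignored.

```python
from fractions import Fraction

# Quark electric charges in units of the elementary charge.
CHARGES = {
    "u": Fraction(2, 3), "d": Fraction(-1, 3), "s": Fraction(-1, 3),
    "c": Fraction(2, 3), "b": Fraction(-1, 3), "t": Fraction(2, 3),
}

def r_ratio(flavors, n_colors=3):
    """Lowest-order R: number of colors times the sum of squared quark charges."""
    return n_colors * sum(CHARGES[q] ** 2 for q in flavors)

r_uds = r_ratio("uds")      # below the charm threshold
r_udsc = r_ratio("udsc")    # above charm, below beauty
r_udscb = r_ratio("udscb")  # above beauty, below top

print(r_uds, r_udsc, r_udscb)
```

Setting `n_colors=1` gives values three times smaller, which is how the measurement of $R$ supports the existence of three colors.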
The process $e^+e^- \to q\bar{q}g$ (Fig. 6.56) can give events
with three jets (Fig. 6.61, right). Notice that, as one can see from
Fig. 6.56, one expects an excess of particles in the
direction of the gluon jet, with respect to the opposite direction,
since this is where most of the color field is. This effect is
called the string effect and has been
observed by the LEP experiments at CERN in the 1990s; we shall
discuss it in the next chapter. This is evident also from the
comparison of the color factors, as well as from considerations
based on color conservation.
Fig.
6.61
A two-jet event (left) and a three-jet event
(right) observed by the ALEPH experiment at LEP. Source: CERN
Jet production was first observed at
$e^+e^-$ colliders only in 1975. It was not an
easy observation, and the reason is that the question “how many
jets are there in an event,” which at first sight seems to be
trivial, is in itself meaningless, because there is arbitrariness
in the definition of jets. A jet is a bunch of particles flying
into similar directions in space; the number of jets in a final
state of a collision depends on the clustering criteria which
define two particles as belonging to the same bunch.
6.4.6.4 Jets in Hadron–Hadron Collisions
The situation is more complicated
when final state hadrons come from a hadron–hadron interaction. On
top of the interaction between the two partons responsible for a
hard scattering, there are in general additional interactions
between the beam remnant partons; the results of such interaction
are called the “underlying event”
(Fig. 6.62).
Fig.
6.62
Pictorial representation of
a hadron–hadron interaction.
From J.M. Campbell et al. Rept. Prog. Phys.
70 (2007) 89
Usually, the underlying event comes
from a soft interaction involving low momentum transfer; therefore,
perturbative QCD cannot be applied and it has to be described by
models. Contributions to the final energy may come from additional
gluon radiation from the initial state or from the final state
partons; typically, the products have small transverse momentum
with respect to the direction of the collision (in the
center-of-mass system). In particular, in a collision at
accelerators, many final products of the collision will be lost in
the beam pipe.
To characterize QCD interactions, a
useful quantity is the so-called rapidity $y$ of a particle:
$$y = \frac{1}{2} \ln \frac{E + p_z}{E - p_z} \tag{6.346}$$
where z is the common direction
of the colliding hadrons in the center-of-mass$^{13}$ (the “beam”
axis).
Under a boost in the z direction, rapidity transforms by the
addition of a fixed quantity. This means that rapidity differences
between pairs of particles are invariant with respect to Lorentz
boosts along z.
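The additive behavior of rapidity under longitudinal boosts can be checked numerically. The sketch below uses the definition $y = \frac{1}{2}\ln\frac{E+p_z}{E-p_z}$ and a standard Lorentz boost along z with velocity $\beta$ (units with $c = 1$); the particle masses and momenta are arbitrary example values.

```python
import math

def energy(mass, pt, pz):
    """Energy of a particle with the given mass, transverse and longitudinal momentum."""
    return math.sqrt(mass ** 2 + pt ** 2 + pz ** 2)

def rapidity(e, pz):
    """Rapidity y = 0.5 * ln((E + pz) / (E - pz))."""
    return 0.5 * math.log((e + pz) / (e - pz))

def boost_z(e, pz, beta):
    """Lorentz boost along z; returns the transformed (E, pz)."""
    gamma = 1.0 / math.sqrt(1.0 - beta ** 2)
    return gamma * (e + beta * pz), gamma * (pz + beta * e)

def rapidity_shift(mass, pt, pz, beta):
    """Change in rapidity of one particle caused by the boost."""
    e = energy(mass, pt, pz)
    e_boosted, pz_boosted = boost_z(e, pz, beta)
    return rapidity(e_boosted, pz_boosted) - rapidity(e, pz)

BETA = 0.6
shift_pion = rapidity_shift(0.14, 0.3, 2.0, BETA)      # pion-like particle
shift_proton = rapidity_shift(0.938, 1.0, -5.0, BETA)  # proton-like particle

print(shift_pion, shift_proton, math.atanh(BETA))
```

Both particles are shifted by the same amount, $\tanh^{-1}\beta$, so the rapidity difference between them is unchanged by the boost.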
In most collisions in high-energy
hadronic scattering, the distribution of final state hadrons is
approximately uniform in rapidity, within kinematic limits: the
distribution of final state hadrons is approximately invariant
under boosts in the z
direction. Thus, detector elements should be approximately
uniformly spaced in rapidity—indeed they are.
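For a nonrelativistic particle,
rapidity is the same as velocity along the z-axis (in units with $c = 1$):
$$y = \frac{1}{2} \ln \frac{E + p_z}{E - p_z} \simeq \frac{1}{2} \ln \frac{m + m v_z}{m - m v_z} \simeq v_z \tag{6.347}$$
Note that nonrelativistic velocities also transform additively
under boosts (as guaranteed by the Galilei transformation).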
The rapidity of a particle is not
easy to measure, since one should know its mass. We thus define a
variable easier to measure: the pseudorapidity $\eta$
$$\eta = -\ln \tan \frac{\theta}{2} \tag{6.348}$$
where $\theta$ is the angle of the momentum of the
particle relative to the $+z$ axis. One can derive an expression
for rapidity in terms of pseudorapidity and transverse momentum:
$$y = \ln \frac{\sqrt{m^2 + p_T^2 \cosh^2 \eta} + p_T \sinh \eta}{\sqrt{m^2 + p_T^2}} \tag{6.349}$$
in the limit $m \ll p_T$, $y \to \eta$. This explains the name
“pseudorapidity.” Angles, and hence pseudorapidity, are easy to
measure, but it is really the rapidity that is of physical
significance.
To make the distinction between
rapidity and pseudorapidity clear, let us examine the limit on the
rapidities of the produced particles of a given mass at a given
c.m. energy. There is clearly a limit on rapidity, but there is no
limit on pseudorapidity, since a particle can be physically
produced at zero angle (or at $180^\circ$), where pseudorapidity is infinite.
The particles for which the distinction is very significant are
those for which the transverse momentum is substantially less than
the mass. Note that $|y| \leq |\eta|$ always.
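The difference between rapidity and pseudorapidity can be seen with a short calculation. The sketch below uses $\eta = -\ln\tan(\theta/2)$ and the exact relation $y = \ln\left[\left(\sqrt{m^2 + p_T^2\cosh^2\eta} + p_T\sinh\eta\right)/\sqrt{m^2 + p_T^2}\right]$; the masses and momenta are arbitrary example values.

```python
import math

def pseudorapidity(theta):
    """Pseudorapidity from the polar angle theta (radians) with respect to the beam."""
    return -math.log(math.tan(theta / 2.0))

def rapidity_from_eta(eta, pt, mass):
    """Rapidity of a particle with pseudorapidity eta, transverse momentum pt, and mass."""
    m_t = math.sqrt(mass ** 2 + pt ** 2)  # transverse mass
    e = math.sqrt(mass ** 2 + pt ** 2 * math.cosh(eta) ** 2)
    pz = pt * math.sinh(eta)
    return math.log((e + pz) / m_t)

eta_90 = pseudorapidity(math.pi / 2)                # particle emitted at 90 degrees
y_massless = rapidity_from_eta(2.0, 10.0, 0.0)      # massless: y equals eta
y_soft_proton = rapidity_from_eta(2.0, 0.3, 0.938)  # p_T well below the mass

print(eta_90, y_massless, y_soft_proton)
```

For the soft proton the rapidity is far smaller than the pseudorapidity, which is the regime singled out in the text (transverse momentum substantially less than the mass).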
6.4.7
Hadronic Cross Section
The two extreme limits of QCD,
asymptotic freedom (perturbative) and
confinement (nonperturbative), translate
into two radically different strategies in the computation of the cross
sections of the hadronic processes. At large momentum transfer
(hard processes), cross sections can be computed as the convolution
of the partonic (quarks and gluons) elementary cross sections over
the parton distribution functions (PDFs). At low momentum transfer
(soft interactions), cross sections must be computed using
phenomenological models that describe the distribution of matter
inside hadrons and whose parameters must be determined from data.
The soft processes are dominant. At the LHC for instance
(Fig. 6.63), the total proton–proton cross section is
of the order of 100 millibarn while the Higgs production cross
section is of the order of tens of picobarn (a difference of 10
orders of magnitude!).
Fig.
6.63
Proton–(anti)proton cross sections at high
energies. Cross-sectional values for several important processes
are given. The right vertical axis reports the number of events for
a luminosity value $L = 10^{33}$ cm$^{-2}$ s$^{-1}$.
From N. Cartiglia, arXiv:1305.6131
[hep-ex]
At high momentum transfer, the number
of partons, mostly gluons, at small x, increases very fast as shown in
Fig. 5.25. This fast rise, responsible
for the increase of the total cross sections, can be explained by
the possibility, at these energies, that gluons radiated by the
valence quarks themselves radiate new gluons, forming gluonic
cascades. However, at higher energies, the gluons in the cascades
interact with each other suppressing the emission of new soft
gluons and a saturation state often described as the Color Glass Condensate (CGC) is
reached. In high-energy, heavy-ion collisions, high densities may
be accessible over extended regions and a Quark–Gluon Plasma (QGP) may be formed.
Fig.
6.64
First-order representation of a hadronic hard
interaction producing a final state X
6.4.7.1 Hard Processes
In hadronic hard processes the
factorization assumption, tested first in the deep inelastic
scattering, holds. The time scale of the elementary interaction
between partons (or as in case of deep inelastic scattering between
the virtual photon and the quarks) is basically given by the
inverse of the transferred momentum Q
$$\tau_{\mathrm{int}} \sim Q^{-1} \tag{6.350}$$
while the hadron timescale is given by the inverse of the QCD
nonperturbative scale
$$\tau_{\mathrm{had}} \sim \Lambda_{QCD}^{-1} \tag{6.351}$$
Hence, whenever $Q \gg \Lambda_{QCD}$ the processes at each timescale can
be considered independent. Thus in the production of the final
state X (for instance a
dilepton, or a multijet system, or a
Higgs boson, ...) by the collision of two hadrons $h_1$ and $h_2$ with, respectively, four-momenta
$p_1$ and $p_2$:
$$h_1(p_1)\, h_2(p_2) \to X + \ldots \tag{6.352}$$
the inclusive cross section can be given in leading order (LO) by
(see Fig. 6.64):
$$\sigma_{h_1 h_2 \to X} = \sum_{i,j} \int_0^1 dx_1 \int_0^1 dx_2\, f^{i}_{h_1}(x_1, Q)\, f^{j}_{h_2}(x_2, Q)\, \hat{\sigma}_{ij \to X}(\hat{s}) \tag{6.353}$$
where $f^{i}_{h_1}$ and $f^{j}_{h_2}$ are the parton distribution functions
evaluated at the scale Q,
$x_1$ and $x_2$ are the fractions of momentum
carried, respectively, by the partons i and j, and $\hat{\sigma}_{ij \to X}$ is the partonic cross section
evaluated at an effective squared c.m. energy
$$\hat{s} = x_1 x_2 s \tag{6.354}$$
s being the square of the c.m.
energy of the hadronic collision.
The scale Q is usually set to the effective c.m.
energy $\sqrt{\hat{s}}$ (if X is a resonance, its mass) or to half of
the jet transverse energy for high $p_T$ processes. The exact value of this
scale is somewhat arbitrary. If one were able to compute the diagrams
of all orders involved in a given process, then the final result would
not depend on this particular choice. However, in practice, it is
important to set the right scale so that the corrections from
higher-order diagrams give a small contribution.
Lower-order diagrams give the right
order of magnitude, but to match the present experimental accuracy
(in particular at the LHC) higher-order diagrams are needed.
Next-to-leading-order (NLO), one-loop calculations have been available
for many processes for many years, and nowadays
predictions at two-loop level, next-to-next-to-leading-order
(NNLO), are already available for several processes, as for
instance the Higgs boson production at the LHC.
The partons not involved in the hard
scattering (spectator partons) carry a non-negligible fraction of
the total energy and may be involved in interactions with small
momentum transfer. These interactions contribute to the so-called
underlying event.
Drell–Yan Processes. The production of
dileptons in the collision of two hadrons (known as the
Drell–Yan process) was first interpreted
in terms of quark–antiquark annihilation by Sidney Drell and
Tung-Mow Yan in 1970. Its leading-order diagram
(Fig. 6.65) follows the factorization scheme
discussed above where the annihilation cross section $\hat{\sigma}(q\bar{q} \to \ell^+\ell^-)$ is a pure QED process given by:
$$\hat{\sigma}(q\bar{q} \to \ell^+\ell^-) = \frac{1}{N_c}\, Q_q^2\, \frac{4\pi\alpha^2}{3\hat{s}} \tag{6.355}$$
$Q_q$ is the quark charge, and $\hat{s}$ is the square of the c.m. energy of
the system of the colliding quark–antiquark pair (i.e., the square
of the invariant mass $M$ of the dilepton system). $\hat{s}$ is thus given by
$$\hat{s} = M^2 = x_1 x_2 s \tag{6.356}$$
Fig.
6.65
Leading-order diagram of the Drell–Yan
process.
By user:E2m [public domain], via Wikimedia
Commons
Finally note that, as it was already
discussed in Sect. 5.4.2, the color factor $N_c$
appears in the denominator (average
over the incoming colors) in contrast with what happens in the
reverse process $e^+e^- \to q\bar{q}$ (sum over outgoing colors) whose
cross section is given by
$$\sigma(e^+e^- \to q\bar{q}) = N_c\, Q_q^2\, \frac{4\pi\alpha^2}{3 s} \tag{6.357}$$
There is a net topological difference between the final states of
the $e^+e^-$ and $q\bar{q}$ processes. While in $e^+e^-$ interactions, the scattering into two
leptons or two jets implies a back-to-back topology, in the
Drell–Yan the topology is back-to-back in the plane transverse to
the beam axis but, since each quark or antiquark carries an
arbitrary fraction of the momentum of the parent hadron, the system
has in general nonzero momentum component along the beam
axis.
It is then important to observe that
the rapidity of the dilepton system is by energy–momentum
conservation equal to the rapidity of the quark–antiquark system,
$$y = y_{q\bar{q}} \tag{6.358}$$
Neglecting the transverse momentum, the rapidity is given by
$$y = \frac{1}{2} \ln \frac{E + p_z}{E - p_z} = \frac{1}{2} \ln \frac{x_1}{x_2} \tag{6.359}$$
Then, if the mass M and the
rapidity y of the dilepton are
measured, the momentum fractions of the quark and antiquark can, in
this particular case, be directly accessed. In fact, inverting the
equations relating M and
y with $x_1$, $x_2$ one obtains:
$$x_1 = \frac{M}{\sqrt{s}}\, e^{y}, \qquad x_2 = \frac{M}{\sqrt{s}}\, e^{-y} \tag{6.360}$$
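The inversion from the measured dilepton variables to the parton momentum fractions is easy to check numerically. The sketch below assumes leading-order kinematics with negligible transverse momentum, $M^2 = x_1 x_2 s$ and $y = \frac{1}{2}\ln(x_1/x_2)$; the collision energy of 13 TeV and the example mass and rapidity are illustrative choices.

```python
import math

def momentum_fractions(mass, y, sqrt_s):
    """Parton momentum fractions (x1, x2) from the dilepton mass and rapidity."""
    tau = mass / sqrt_s
    return tau * math.exp(y), tau * math.exp(-y)

def max_rapidity(mass, sqrt_s):
    """Kinematic limit on |y|, reached when one of the fractions equals 1."""
    return math.log(sqrt_s / mass)

SQRT_S = 13000.0  # assumed hadronic c.m. energy in GeV
x1, x2 = momentum_fractions(91.2, 1.0, SQRT_S)

# Round trip back to the measured quantities.
mass_back = math.sqrt(x1 * x2) * SQRT_S
y_back = 0.5 * math.log(x1 / x2)

print(x1, x2, mass_back, y_back, max_rapidity(91.2, SQRT_S))
```

A dilepton at the Z mass produced at central rapidity therefore probes momentum fractions of a few per mille at this energy, and forward production probes one large and one very small fraction.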
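The Drell–Yan differential cross section can now be written in
terms of M and y. Computing the Jacobian of the change
of the variables from $(x_1, x_2)$ to $(M, y)$,
$$\left| \frac{\partial(M, y)}{\partial(x_1, x_2)} \right| = \frac{s}{2M} \tag{6.361}$$
It can be easily shown that the differential Drell–Yan cross
section for the collision of two hadrons is just:
$$\frac{d\sigma}{dM\, dy} = \frac{8\pi\alpha^2}{9 M s}\, f(x_1; x_2) \tag{6.362}$$
where $f(x_1; x_2)$ is the combined PDF for the fractions
of momentum carried by the colliding quark and antiquark weighted
by the square of the quark charge. For instance, in the case of
proton–antiproton scattering one has, assuming that the quark PDFs
in the proton are identical to the antiquark PDFs in the antiproton
and neglecting the contributions of the antiquark (quark) of the
proton (antiproton) and of other quarks than u and d:
$$f(x_1; x_2) = \frac{4}{9}\, u(x_1)\, u(x_2) + \frac{1}{9}\, d(x_1)\, d(x_2) \tag{6.363}$$
where
$$u(x) = u^{p}(x) = \bar{u}^{\bar{p}}(x) \tag{6.364}$$
$$d(x) = d^{p}(x) = \bar{d}^{\bar{p}}(x) \tag{6.365}$$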
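The step from the factorized cross section to the differential form in the dilepton mass and rapidity can be written out explicitly. The derivation below is a sketch in a notation assumed here: $x_1$ and $x_2$ are the momentum fractions of the quark and the antiquark, $N_c = 3$, and $f(x_1; x_2)$ denotes the charge-weighted product of PDFs.

```latex
\begin{aligned}
M^2 &= x_1 x_2 s, \qquad y = \tfrac{1}{2}\ln\frac{x_1}{x_2}
\;\;\Rightarrow\;\;
\left|\frac{\partial(M, y)}{\partial(x_1, x_2)}\right|
  = \left|\frac{M}{2x_1}\left(-\frac{1}{2x_2}\right) - \frac{M}{2x_2}\,\frac{1}{2x_1}\right|
  = \frac{M}{2 x_1 x_2} = \frac{s}{2M}, \\[4pt]
dx_1\, dx_2 &= \frac{2M}{s}\, dM\, dy, \\[4pt]
\frac{d\sigma}{dM\, dy} &= \frac{2M}{s}\;\frac{1}{3}\,\frac{4\pi\alpha^2}{3M^2}\; f(x_1; x_2)
  = \frac{8\pi\alpha^2}{9\,M\,s}\; f(x_1; x_2).
\end{aligned}
```

The factor $1/3$ in the last line is the color average over the incoming quark and antiquark, and the quark charges $Q_q^2$ have been absorbed into $f(x_1; x_2)$.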
In proton–proton collisions at the LHC, the antiquark must come
from the sea. Anyhow, to have a good description of the dilepton
data (see Fig. 6.66) it is not enough to consider the
leading-order diagram discussed above. In fact, the peak observed
around $M \simeq 91$ GeV corresponds to the Z resonance, not accounted for in the naïve
Drell–Yan model, and next-to-next-to-leading-order (NNLO) diagrams are
needed to have a good agreement between data and theory.
Fig.
6.66
Dilepton cross section measured by CMS.
From V. Khachatryan et al. (CMS
Collaboration), The European Physical Journal C75 (2015) 147
Multijet Production. Multijet events in hadronic interactions at high
energies are an important background for all the hard physics
channels with final hadronic states, in particular for the searches
for new physics; the calculation of their characteristics is a
direct test of QCD. At large transferred momentum, their cross
section may be computed following the factorization scheme
discussed above but involving at LO already a large number of
elementary two-parton diagrams ($qq \to qq$, $qg \to qg$, $gg \to gg$, $gg \to q\bar{q}$, ...).
Fig.
6.67
Inclusive jet cross section measured by
CMS.
From S. Chatrchyan et al. (CMS
Collaboration), Phys. Rev. Lett. 107 (2011) 132001
The transverse momentum ($p_T$) of the jets is, in these processes,
a key final state variable and together with the jet rapidities
($y$) has to be related to the partonic
variables in order that a comparison between data and theory may be possible.
For instance, in the production of two jets from the t-channel gluon exchange, the elementary
LO cross section is given by
(6.366)
and the following relations can be established between the partonic
and the final state variables:
(6.367)
(6.368)
(6.369)
In practice, such calculations are performed numerically using
sophisticated computer programs. However, the comparison of the
prediction from this calculation with the LHC data provides a
powerful test of QCD which spans many orders of magnitude (see
Fig. 6.67).
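The link between jet observables and partonic variables can be illustrated with the standard kinematics of a massless two-to-two scattering. The relations used below, $x_{1,2} = \frac{p_T}{\sqrt{s}}\left(e^{\pm y_3} + e^{\pm y_4}\right)$ and $\cos\theta^* = \tanh\frac{y_3 - y_4}{2}$, are textbook leading-order formulas assumed for this sketch; they are not necessarily written in the same form as the relations of this section. The labels `y3`, `y4` for the two jet rapidities and all numerical inputs are illustrative.

```python
import math

def dijet_partons(pt, y3, y4, sqrt_s):
    """Partonic variables for two massless back-to-back jets (LO kinematics)."""
    x1 = pt / sqrt_s * (math.exp(y3) + math.exp(y4))
    x2 = pt / sqrt_s * (math.exp(-y3) + math.exp(-y4))
    s_hat = x1 * x2 * sqrt_s ** 2
    cos_theta = math.tanh(0.5 * (y3 - y4))  # scattering angle in the parton c.m. frame
    t_hat = -0.5 * s_hat * (1.0 - cos_theta)
    u_hat = -0.5 * s_hat * (1.0 + cos_theta)
    return x1, x2, s_hat, t_hat, u_hat

PT, Y3, Y4, SQRT_S = 100.0, 0.5, -1.0, 13000.0  # GeV and rapidities, example values
x1, x2, s_hat, t_hat, u_hat = dijet_partons(PT, Y3, Y4, SQRT_S)

print(x1, x2, math.sqrt(s_hat), t_hat, u_hat)
```

The Mandelstam variables returned satisfy $\hat{s} + \hat{t} + \hat{u} = 0$ and $p_T^2 = \hat{t}\hat{u}/\hat{s}$, which provides a quick consistency check of any implementation.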
6.4.7.2 Soft Processes
At low
momentum transfer, the factorization assumption breaks down.
Therefore, it is no longer possible to compute the cross sections
by adding up perturbative interactions between partons, with the
nonperturbative aspects of the hadrons “frozen” in the parton
distribution functions. The interaction between hadrons is thus
described by phenomenological models.
A strategy is to use optical models
and their application to quantum mechanics (for an extended
treatment see Ref. [F6.4]). The interaction of a particle with
momentum $\vec{p} = \hbar\vec{k}$ with a target may be seen as the
scattering of a plane wave by a diffusion center (see
Fig. 6.68). The final state at large distance from
the collision point can then be described by the superposition of
the incoming plane wave with an outgoing spherical scattered wave:
$$\Psi(\vec{r}) \sim e^{ikz} + F(E, \theta)\, \frac{e^{ikr}}{r} \tag{6.370}$$
where z is the coordinate along
the beam axis, $\theta$ is the scattering angle, E the energy, and $F(E, \theta)$ is called the elastic
scattering amplitude.
Fig.
6.68
Plane wave scattering by a diffusion center
having as result an outgoing spherical wave
The elastic differential cross
section can be shown to be
$$\frac{d\sigma_{el}}{d\Omega} = |F(E, \theta)|^2 \tag{6.371}$$
In the forward region ($\theta \simeq 0$), the interference between the
incident and the scattered waves is non-negligible. In fact, this
term has a net effect on the reduction of the incident flux that
can be seen as a kind of “shadow” created by the diffusion center.
An important theorem, the Optical
Theorem, connects the total cross
section with the imaginary part of the forward elastic scattering
amplitude:
$$\sigma_{tot}(E) = \frac{4\pi}{k}\, \mathrm{Im}\, F(E, 0) \tag{6.372}$$
The elastic cross section is just the integral of the elastic
differential cross section,
$$\sigma_{el}(E) = \int |F(E, \theta)|^2\, d\Omega \tag{6.373}$$
and the inelastic cross section just the difference of the two
cross sections
$$\sigma_{inel}(E) = \sigma_{tot}(E) - \sigma_{el}(E) \tag{6.374}$$
It is often useful to decompose the elastic scattering amplitude in
terms of angular quantum number l (for spinless particles scattering the
angular momentum $\vec{L}$ is conserved; in the case of
particles with spin the good quantity will be the total angular
momentum $\vec{J} = \vec{L} + \vec{S}$):
$$F(E, \theta) = \frac{1}{k} \sum_{l=0}^{\infty} (2l + 1)\, a_l(E)\, P_l(\cos\theta) \tag{6.375}$$
where the functions $a_l(E)$ are the partial wave amplitudes and $P_l$ are the Legendre polynomials
which form an orthogonal basis.
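Cross sections can be also written as
a function of the partial wave
amplitudes:
$$\sigma_{el}(E) = \frac{4\pi}{k^2} \sum_{l=0}^{\infty} (2l + 1)\, |a_l(E)|^2 \tag{6.376}$$
$$\sigma_{tot}(E) = \frac{4\pi}{k^2} \sum_{l=0}^{\infty} (2l + 1)\, \mathrm{Im}\, a_l(E) \tag{6.377}$$
and again $\sigma_{inel}$ is simply the difference between
$\sigma_{tot}$ and $\sigma_{el}$.
The optical theorem applied now at
each partial wave imposes the following relation (unitarity
condition):
$$\mathrm{Im}\, a_l(E) \geq |a_l(E)|^2 \tag{6.378}$$
Noting that
$$|a_l|^2 = (\mathrm{Re}\, a_l)^2 + (\mathrm{Im}\, a_l)^2 \tag{6.379}$$
this condition can be expressed as
$$(\mathrm{Re}\, a_l)^2 + \left(\mathrm{Im}\, a_l - \frac{1}{2}\right)^2 \leq \frac{1}{4} \tag{6.380}$$
This relation is automatically satisfied if the partial wave
amplitude is written as
$$a_l = \frac{i}{2} \left(1 - e^{2i\delta_l}\right) \tag{6.381}$$
$\delta_l$ being a complex number with $\mathrm{Im}\, \delta_l \geq 0$.
Whenever $\delta_l$ is a pure real number
$$\mathrm{Im}\, a_l = |a_l|^2 \tag{6.382}$$
and the scattering is totally elastic (the inelastic cross section
is zero).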
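The role of the unitarity condition can be made concrete with a few partial waves. The sketch below assumes the parametrization $a_l = \frac{i}{2}\left(1 - e^{2i\delta_l}\right)$ with complex phase shifts $\delta_l$ (non-negative imaginary part) and the sums $\sigma_{el} = \frac{4\pi}{k^2}\sum_l (2l+1)|a_l|^2$, $\sigma_{tot} = \frac{4\pi}{k^2}\sum_l (2l+1)\,\mathrm{Im}\,a_l$. The phase shift values and the wave number are arbitrary examples.

```python
import cmath
import math

def partial_wave(delta):
    """Partial wave amplitude a_l = (i/2) * (1 - exp(2 i delta_l))."""
    return 0.5j * (1.0 - cmath.exp(2j * delta))

def cross_sections(k, deltas):
    """Return (sigma_el, sigma_tot, sigma_inel) for phase shifts delta_0, delta_1, ..."""
    prefactor = 4.0 * math.pi / k ** 2
    amplitudes = [partial_wave(d) for d in deltas]
    sigma_el = prefactor * sum((2 * l + 1) * abs(a) ** 2 for l, a in enumerate(amplitudes))
    sigma_tot = prefactor * sum((2 * l + 1) * a.imag for l, a in enumerate(amplitudes))
    return sigma_el, sigma_tot, sigma_tot - sigma_el

K = 2.0  # wave number in arbitrary inverse-length units (example value)

# Real phase shifts: purely elastic scattering.
el_real, tot_real, inel_real = cross_sections(K, [0.8, 0.4, 0.1])

# Complex phase shifts with positive imaginary part: absorption is present.
absorptive = [0.8 + 0.3j, 0.4 + 0.2j, 0.1 + 0.05j]
el_abs, tot_abs, inel_abs = cross_sections(K, absorptive)

print(inel_real, inel_abs)
```

The imaginary part of each phase shift plays the role of an absorption: it lowers $|e^{2i\delta_l}|$ below one and opens the inelastic channel.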
On the other hand, if the wavelength
associated with the beam particle is much smaller than the target
region,
$$\lambda \sim \frac{1}{k} \ll R \tag{6.383}$$
a description in terms of the classical impact parameter b (Fig. 6.69) is appropriate.
Fig.
6.69
Impact parameter definition in the scattering
of a particle with momentum $\vec{p}$ over a target region with radius
R
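Defining
$$b = \frac{1}{k} \left(l + \frac{1}{2}\right) \tag{6.384}$$
the elastic scattering amplitude can then be expressed as
$$F(E, \theta) = 2k \sum_{l=0}^{\infty} b\, \Delta b\; a_l(E)\, P_l(\cos\theta), \qquad l = bk - \tfrac{1}{2} \tag{6.385}$$
with $\Delta b = 1/k$ which is the granularity of the
sum.
In the limit $k \to \infty$, $\Delta b \to 0$, and the sum can be approximated by
an integral
$$F(E, \theta) = 2k \int_0^{\infty} b\, db\; a(b, E)\, P_{bk - 1/2}(\cos\theta) \tag{6.386}$$
where the Legendre polynomials $P_l(\cos\theta)$ were replaced by the Legendre
functions $P_\nu(\cos\theta)$, $\nu$ being a real positive number, and the
partial wave amplitudes $a_l(E)$ were interpolated giving rise to the
scattering amplitude $a(b, E)$.
For small scattering angles, the
Legendre functions may be approximated by a zeroth-order Bessel
function $J_0(kb\theta)$ and finally one can write
$$F(E, \theta) = 2k \int_0^{\infty} b\, db\; a(b, E)\, J_0(kb\theta) \tag{6.387}$$
The scattering amplitude $a(b, E)$ is thus related to the elastic wave
amplitude $F(E, \theta)$ discussed above basically by a Bessel–Fourier
transform.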
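Following a similar strategy to
ensure automatically unitarity, $a(b, E)$ may be parametrized as
$$a(b, E) = \frac{i}{2} \left(1 - e^{i\chi(b, E)}\right) \tag{6.388}$$
where
$$\chi(b, E) = \chi_R(b, E) + i\,\chi_I(b, E) \tag{6.389}$$
is called the eikonal function.
It can be shown that the cross
sections are related to the eikonal by the following expressions:
$$\sigma_{el} = \int d^2b\, \left|1 - e^{i\chi(b, E)}\right|^2 \tag{6.390}$$
$$\sigma_{tot} = 2 \int d^2b\, \left(1 - \mathrm{Re}\, e^{i\chi(b, E)}\right) \tag{6.391}$$
$$\sigma_{inel} = \int d^2b\, \left(1 - \left|e^{i\chi(b, E)}\right|^2\right) \tag{6.392}$$
(the integrations run over the target region with a radius
R).
Note that:
if $\chi_I = 0$ then $\sigma_{inel} = 0$ and all the interactions are
elastic;
if $\chi_R = 0$ and $\chi_I \to \infty$ for $b \leq R$, then $\sigma_{inel} = \pi R^2$ and $\sigma_{tot} = 2\pi R^2$. This is the so-called black
disk limit.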
In a first approximation, hadrons may
be described by gray disks with mean
radius R and $\chi(b) = i\Omega$ for $b \leq R$ and 0 otherwise. The opacity $\Omega$
is a real number ($0 < \Omega < \infty$). In fact, the main features of
proton–proton cross sections can be reproduced in such a simple
model (Fig. 6.70). In the high energy limit, the gray disk
tends asymptotically to a black disk and thus thereafter the
increase of the cross section, limited by the Froissart bound to
$\ln^2(s)$, is just determined by the
increase of the mean radius.
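The gray disk picture leads to closed expressions that are easy to evaluate. The sketch below assumes a purely imaginary eikonal $\chi(b) = i\Omega$ inside a disk of radius $R$ and zero outside, together with the eikonal formulas $\sigma_{el} = \int d^2b\,|1 - e^{i\chi}|^2$, $\sigma_{tot} = 2\int d^2b\,(1 - \mathrm{Re}\,e^{i\chi})$, $\sigma_{inel} = \int d^2b\,(1 - |e^{i\chi}|^2)$. The radius of 1 fm and the opacity values are arbitrary examples, not fitted to data.

```python
import math

FM2_TO_MB = 10.0  # 1 fm^2 = 10 mb

def gray_disk(radius_fm, opacity):
    """Return (sigma_el, sigma_tot, sigma_inel) in mb for a gray disk."""
    area = math.pi * radius_fm ** 2 * FM2_TO_MB
    transmission = math.exp(-opacity)  # value of exp(i*chi) inside the disk
    sigma_el = area * (1.0 - transmission) ** 2
    sigma_tot = 2.0 * area * (1.0 - transmission)
    sigma_inel = area * (1.0 - transmission ** 2)
    return sigma_el, sigma_tot, sigma_inel

el_gray, tot_gray, inel_gray = gray_disk(1.0, 0.5)      # partially transparent disk
el_black, tot_black, inel_black = gray_disk(1.0, 50.0)  # practically black disk

print(el_gray / tot_gray, el_black / tot_black)
```

The ratio $\sigma_{el}/\sigma_{tot} = (1 - e^{-\Omega})/2$ grows with the opacity and saturates at 1/2 in the black disk limit, independently of the radius.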
Fig.
6.70
The total cross section (left) and the ratio
of the elastic and total cross sections in proton–proton
interactions as a function of the c.m. energy. Points are
experimental data and the lines are coming from a fit using a gray
disk model.
From R. Conceição et al. Nuclear Physics A
888 (2012) 58
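The eikonal has no dimensions: it is
just a complex number and it is a function of the impact parameter.
Using a semiclassical argument, its imaginary part can be
associated with the mean number of parton–parton collisions
$\langle n(b, E) \rangle$. In fact, if such collisions were
independent (no correlation means no diffraction), the probability
to have n collisions at an
impact parameter b would follow
a Poisson distribution around the average:
$$P(n, \langle n \rangle) = \frac{\langle n \rangle^{n}\, e^{-\langle n \rangle}}{n!} \tag{6.393}$$
The probability to have at least one collision is given by
$$P(n \geq 1) = 1 - e^{-\langle n(b, E) \rangle} \tag{6.394}$$
and thus
$$\sigma_{inel} = \int d^2b\, \left(1 - e^{-\langle n(b, E) \rangle}\right) \tag{6.395}$$
Hence in this approximation
$$\chi_I(b, E) = \frac{1}{2}\, \langle n(b, E) \rangle \tag{6.396}$$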
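The semiclassical argument can be checked numerically: summing the Poisson probabilities for one or more collisions gives $1 - e^{-\langle n \rangle}$, and integrating this over the impact-parameter plane gives the inelastic cross section. In the sketch below the profile of the mean number of collisions (a uniform disk of unit radius with a mean of 2 collisions) is a hypothetical choice made only because the integral is then known in closed form.

```python
import math

def poisson(n, mean):
    """Poisson probability of exactly n collisions."""
    return mean ** n * math.exp(-mean) / math.factorial(n)

def prob_at_least_one(mean):
    """Probability of one or more collisions."""
    return 1.0 - math.exp(-mean)

def sigma_inel(mean_n_of_b, b_max, steps=20000):
    """Integrate 1 - exp(-<n(b)>) over the impact-parameter plane (midpoint rule)."""
    db = b_max / steps
    total = 0.0
    for i in range(steps):
        b = (i + 0.5) * db
        total += 2.0 * math.pi * b * prob_at_least_one(mean_n_of_b(b)) * db
    return total

def uniform_disk(b):
    """Hypothetical profile: mean of 2 collisions inside unit radius, none outside."""
    return 2.0 if b < 1.0 else 0.0

explicit_sum = sum(poisson(n, 2.5) for n in range(1, 61))
sigma_disk = sigma_inel(uniform_disk, b_max=2.0)

print(explicit_sum, prob_at_least_one(2.5), sigma_disk)
```

For the uniform disk the result reduces to $\pi R^2\,(1 - e^{-\langle n \rangle})$, which is the gray disk inelastic cross section with opacity $\Omega = \langle n \rangle / 2$.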
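$\chi_I(b, E)$ is often computed as the sum of the
different kinds of parton–parton interactions, factorizing each term
into a transverse density function and the corresponding cross
section:
$$\chi_I(b, E) = \sum_i G_i(b)\, \sigma_i(E) \tag{6.397}$$
For instance,
$$\chi_I(b, E) = G_{qq}(b)\, \sigma_{qq}(E) + G_{qg}(b)\, \sigma_{qg}(E) + G_{gg}(b)\, \sigma_{gg}(E) \tag{6.398}$$
where qq, qg, gg stand respectively for the quark–quark,
quark–gluon, and gluon–gluon interactions.
On the other hand, there are models
where $\chi_I$ is divided in perturbative (hard) and
nonperturbative (soft) terms:
$$\chi_I(b, E) = G_{\mathrm{soft}}(b)\, \sigma_{\mathrm{soft}}(E) + G_{\mathrm{hard}}(b)\, \sigma_{\mathrm{hard}}(E) \tag{6.399}$$
The transverse density functions $G(b)$ must take into account the overlap of
the two hadrons and can be computed as the convolution of the
Fourier transform of the form factors of the two hadrons.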
This strategy can be extended to
nucleus–nucleus interactions which are then seen as an independent
sum of nucleon–nucleon interactions. This approximation, known as
the Glauber$^{14}$ model, can be
written as:
(6.400)
The overlap function now takes into account the
geometrical overlap of the two nuclei and indicates the probability
per unit of area of finding simultaneously one nucleon in each
nucleus at a given impact parameter.
6.4.7.3 High Density, High Energy; Quark–Gluon Plasma
At high density and high energy new
phenomena may appear.
At high density, whenever one is able
to pack densely hadronic matter, as for instance in the core of
dense neutron stars, in the first seconds of the Universe (the Big
Bang), or in heavy-ion collisions at high energy (the little
bangs), we can expect that some kind of color screening occurs and
partons become asymptotically free. The confinement scale is
basically set by the size of hadrons, with an energy density
$\varepsilon$ of the order of 1 GeV/fm$^3$; thus, if in larger space regions
such an energy density is attained, a free gas of quarks and gluons
may be formed. That order of magnitude, which corresponds to a
transition temperature of around 170–190 MeV, is confirmed by
nonperturbative QCD calculations using lattices (see
Fig. 6.71). At this temperature, following a
simplified Stefan–Boltzmann law for a relativistic free gas, there
should be a fast increase of the energy density corresponding to
the increase of the effective number of internal degrees of freedom $g$
from a free gas of pions
($g = 3$) to a new state of matter where
quarks and gluons are asymptotically free ($g = 37$, considering two quark flavors). This
new state of matter is usually dubbed the Quark–Gluon Plasma (QGP).
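The jump in the number of degrees of freedom can be turned into numbers with the Stefan–Boltzmann law for an ideal relativistic gas, $\varepsilon = g\,\frac{\pi^2}{30}\,T^4$ (natural units). The sketch below counts the degrees of freedom for a pion gas and for a gas of gluons and two massless quark flavors, with the usual factor 7/8 for fermions, and converts the energy density to GeV/fm$^3$. Treating both phases as ideal massless gases and taking $T = 170$ MeV are simplifying assumptions.

```python
import math

HBARC = 0.1973  # GeV fm, conversion constant

def energy_density(g, temperature_gev):
    """Stefan-Boltzmann energy density g * pi^2/30 * T^4, in GeV/fm^3."""
    return g * math.pi ** 2 / 30.0 * temperature_gev ** 4 / HBARC ** 3

G_PIONS = 3                      # three pion charge states
G_GLUONS = 8 * 2                 # 8 colors x 2 polarizations
G_QUARKS = 2 * 2 * 3 * 2         # 2 flavors x 2 spins x 3 colors x (quark, antiquark)
G_QGP = G_GLUONS + 7.0 / 8.0 * G_QUARKS

T_C = 0.170  # assumed transition temperature in GeV

eps_pions = energy_density(G_PIONS, T_C)
eps_qgp = energy_density(G_QGP, T_C)

print(G_QGP, eps_pions, eps_qgp)
```

With these assumptions the plasma value comes out slightly above 1 GeV/fm$^3$, in line with the order of magnitude quoted in the text, while the pion gas value is about twelve times smaller.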
Fig.
6.71
Energy density of hadronic state of matter,
with baryonic number zero, according to a lattice calculation. A
sharp rise is observed near the critical temperature $T_c \sim 170$–190 MeV.
From C. Bernard et al. hep-lat/0610017
The phase transition between hadronic
and QGP states depends also strongly on the net baryon contents of
the system. At the core of dense neutron stars, QGP may occur at
very low temperatures. The precise QCD phase diagram is therefore
complex and still controversial. A simplified sketch is presented
in Fig. 6.72 where the existence of a possible critical
point is represented.
Fig.
6.72
Schematic representation of the QCD phase
diagram as a function of the temperature and of the baryonic
potential (a measure of the difference in the quark and antiquark
contents of the system).
In Pb–Pb collisions at the LHC, c.m.
energies per nucleon of $\sqrt{s_{NN}} = 2.76$ TeV, corresponding to an energy density
for central events (head-on collisions, low-impact parameters)
above 15 GeV/fm$^3$, have been attained. The multiplicity
of such events is huge with thousands of particles detected
(Fig. 6.73). Such events are an ideal laboratory to
study the formation and the characteristics of the QGP. Both global
observables, such as the asymmetry of the flow of the final state
particles, and hard probes like high transverse momentum particles,
di-jet events, and specific heavy hadrons, are under intense
scrutiny.
Fig.
6.73
First lead-lead event recorded by ALICE
detector at LHC at c.m. energy per nucleon of 2.76 TeV. Thousands
of charged particles were recorded by the time-projection chamber.
Source: CERN
Fig. 6.74 Artistic representation of a heavy-ion
collision. The reaction plane is defined by the momentum vectors of
the two ions, and the shape of the interaction region is due to the
sharp pressure gradients.
Fig. 6.75 Shear viscosity to entropy density ratio for
several fluids. $T_c$ is the critical temperature at which the
transition occurs (deconfinement in the case of QCD).
From S. Cremonini et al. JHEP 1208 (2012)
An asymmetry of the flow of the final
state particles can be predicted as a consequence of the
anisotropies in the pressure gradients due to the shape and
structure of the nucleus–nucleus interaction region
(Fig. 6.74). In fact, more and faster particles are
expected and seen in the region of the interaction plane (defined
by the directions of the two nuclei in the c.m. reference frame)
where compression is higher. Although the in-out modulation
(elliptic flow) is qualitatively in agreement with the predictions,
quantitatively the effect is smaller than expected under the
assumption of a QGP formed by a free gas of quarks and gluons. Some
kind of collective phenomenon must exist. In fact, the QGP
behaves rather like a strongly coupled liquid with low viscosity.
The measured ratio of its shear (dynamic) viscosity to its entropy
density ($\eta/s$) is lower than in ordinary liquids
and is close to the ideal hydrodynamic limit (Fig. 6.75). This surprising
behavior was first discovered at the RHIC collider, at energies
lower than those of the LHC.
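As an illustration of how the in-out modulation is usually quantified, the sketch below estimates an elliptic flow coefficient from a toy sample of azimuthal angles. It assumes the common Fourier parametrization of the azimuthal distribution relative to the reaction plane, proportional to $1 + 2 v_2 \cos 2\phi$, which is not written out in the text above; the value of $v_2$ used and the function names are hypothetical.

```python
import math
import random


def sample_azimuth(v2: float, rng: random.Random) -> float:
    """Draw an angle (relative to the reaction plane) from a density
    proportional to 1 + 2*v2*cos(2*phi), using accept-reject sampling.
    Requires |v2| <= 0.5 so that the density is never negative."""
    peak = 1.0 + 2.0 * abs(v2)
    while True:
        phi = rng.uniform(-math.pi, math.pi)
        if rng.uniform(0.0, peak) <= 1.0 + 2.0 * v2 * math.cos(2.0 * phi):
            return phi


def estimate_v2(angles) -> float:
    """Estimate v2 as the sample average of cos(2*phi)."""
    return sum(math.cos(2.0 * a) for a in angles) / len(angles)


if __name__ == "__main__":
    rng = random.Random(1)
    true_v2 = 0.08  # hypothetical input value, for illustration only
    angles = [sample_azimuth(true_v2, rng) for _ in range(200_000)]
    print(f"input v2 = {true_v2}, estimated v2 = {estimate_v2(angles):.3f}")
```

In a real analysis the reaction plane is not known a priori and has to be estimated event by event, which this toy example ignores.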
The study at the LHC of two-particle
correlation functions for pairs of charged particles also showed
unexpected features, such as a “ridge”-like structure at $\Delta\phi \approx 0$ extending over several units of $\Delta\eta$ (Fig. 6.76).
Fig. 6.76 2-D two-particle correlation function for
high-multiplicity p–Pb
collision events at 5.02 TeV for pairs of charged particles. The
sharp near-side peaks from jet correlations were truncated to
better visualize the “ridge”-like structure.
From CMS Collaboration, Phys. Lett. B718 (2013) 795
Fig. 6.77 Display of an unbalanced di-jet event
recorded by the CMS experiment at the LHC in lead–lead collisions
at a c.m. energy of 2.76 TeV per nucleon. The plot shows the sum of
the electromagnetic and hadronic transverse energies as a function
of the pseudorapidity and the azimuthal angle. The two identified
jets are highlighted.
From S. Chatrchyan et al. (CMS Collaboration), Phys. Rev. C84 (2011) 024906
Partons resulting from elementary
hard processes inside the QGP have to cross a very dense medium and
thus may suffer significant energy losses or even be absorbed, in
what is generically called “quenching”.
The most spectacular observation of such phenomena is in di-jet
events, where one of the high-$p_T$ jets loses a large fraction of its
energy (Fig. 6.77). This “extinction” of jets is usually
quantified in terms of the nuclear
suppression factor $R_{AA}$, defined as
the ratio between differential distributions in nucleus–nucleus and
in proton–proton collisions:
$$R_{AA} = \frac{d^2N_{AA}/dy\,dp_T}{\langle N_{coll}\rangle \, d^2N_{pp}/dy\,dp_T} \qquad (6.401)$$
where $\langle N_{coll}\rangle$ is the average number of
nucleon–nucleon collisions at each specific rapidity bin.
In the absence of “medium effects,”
$R_{AA}$ may reflect a possible modification
of the PDFs in nuclei as compared to those in free nucleons, but
should not be far from unity. The measurement at the LHC
(Fig. 6.78, left) showed, however, a clear suppression,
demonstrating significant energy losses in the medium; in this
way it can provide information on the dynamical properties of the
medium, such as its density.
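The ratio of Eq. (6.401) is straightforward to evaluate bin by bin once the two spectra and the average number of nucleon–nucleon collisions are known. The following is a minimal sketch of that arithmetic; the function name is hypothetical and all the yields and the number of collisions are invented numbers, used only to show the calculation.

```python
def nuclear_modification_factor(yield_aa, yield_pp, n_coll):
    """Return R_AA for each bin: the nucleus-nucleus yield divided by
    the proton-proton yield scaled by the average number of
    nucleon-nucleon collisions."""
    if n_coll <= 0:
        raise ValueError("n_coll must be positive")
    if len(yield_aa) != len(yield_pp):
        raise ValueError("both spectra must use the same binning")
    return [aa / (n_coll * pp) for aa, pp in zip(yield_aa, yield_pp)]


if __name__ == "__main__":
    # Hypothetical per-event yields in three transverse momentum bins.
    pp_yield = [1.0e-2, 1.0e-4, 1.0e-6]
    aa_yield = [4.0, 2.0e-2, 4.0e-4]
    n_coll = 1600.0  # hypothetical average number of binary collisions
    for r in nuclear_modification_factor(aa_yield, pp_yield, n_coll):
        print(f"R_AA = {r:.3f}")
```

Values well below one, as in this invented example, correspond to the suppression discussed in the text, while values close to one would indicate that the nucleus–nucleus collision behaves like a simple superposition of nucleon–nucleon collisions.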
Energy loss is not the only process that may occur in
the presence of a hot and dense medium (QGP). The production of
high-energy quarkonia (bound states of heavy quark–antiquark pairs)
may also be suppressed whenever QGP is formed, as initially proposed
in a seminal paper in 1986 by Matsui and Satz for the case of
J/$\psi$ ($c\bar{c}$ pair) production in high-energy
heavy-ion collisions. The underlying mechanism proposed was a color
analog of Debye screening, which describes the screening of
electrical charges in a plasma. Evidence of such suppression was
soon reported at CERN in fixed-target oxygen–uranium collisions at
200 GeV per nucleon by the NA38
collaboration. Many other results were published in the following
years, and a long discussion was held on whether the observed
suppression was due to the absorption of these fragile states by the surrounding nuclear
matter or to the possible existence of the QGP. In 2007 the NA60
Collaboration reported, in indium–indium fixed-target collisions at
158 GeV per nucleon, the existence of an
anomalous J/$\psi$ suppression not compatible with nuclear
absorption effects. However, this anomalous suppression did not
increase at higher c.m. energies, and more recently it showed a clear
decrease at the LHC (Fig. 6.78, right). Meanwhile, the possible
(re)combination of charm and anticharm quarks at the boundaries of
the QGP region was proposed as a production enhancement mechanism,
and such a mechanism seems to be able to describe the present data.
Fig. 6.78 Left: The nuclear modification factor
$R_{AA}$ as a function of $p_T$, measured by the ATLAS experiment at
the LHC at a c.m. energy per nucleon of 5.02 TeV, for five centrality
intervals. From ATLAS-CONF-2017-012J. Right: $R_{AA}$ for inclusive J/$\psi$ production at mid rapidity as
reported by the PHENIX (RHIC) and ALICE (LHC) experiments at c.m.
energies per nucleon of 0.2 and 2.76 TeV,
respectively.
The study of J/$\psi$ production, as well as that of other
quarkonium states, is extremely important for studying the QGP, as it allows
for a thermal spectroscopy of the QGP evolution. The
dissociation/association of these pairs is intrinsically related to the
QGP temperature: as this medium expands and cools down,
these pairs may recombine, and each flavor has a different
recombination temperature. However, the competition between the
dissociation and association effects is not trivial, and so far it
has not been experimentally assessed.
The process of formation of the
QGP in high-energy heavy-ion collisions
is theoretically challenging. It is generally accepted that in the
first moments of the collision, the two nuclei have already reached
the saturation state described by the color glass condensate (CGC)
referred to at the beginning of Sect. 6.4.7. Then a fast
thermalization process occurs, ending in the formation of a QGP state
described by relativistic hydrodynamic models. The intermediate
stage, not experimentally accessible and not theoretically well
established, is designated as glasma.
Finally, the QGP “freezes out” into a gas of hadrons. This scheme
is pictured in an artistic representation in
Fig. 6.79.
Fig. 6.79 An artistic representation of the time–space
diagram of the evolution of the states created in heavy-ion
collisions.
In ultrahigh-energy cosmic ray
experiments (see Chap. 10), events with c.m. energies well
above those presently attainable in human-made accelerators are
detected. Higher Q and thus smaller scale ranges can
then be explored, opening a possible new window to test hadronic
interactions.
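To give a feeling for the energies involved, the c.m. energy of a cosmic ray hitting a nucleon at rest follows from the usual fixed-target invariant mass formula. The sketch below assumes, for illustration only, that the primary is a proton of $10^{20}$ eV striking a proton at rest; the function name and the chosen energy are hypothetical.

```python
import math

PROTON_MASS_GEV = 0.938272  # proton rest energy in GeV


def cm_energy_fixed_target(e_lab_gev: float,
                           m_proj: float = PROTON_MASS_GEV,
                           m_target: float = PROTON_MASS_GEV) -> float:
    """Return sqrt(s) in GeV for a projectile of total energy e_lab_gev
    hitting a target at rest: s = m_proj^2 + m_target^2 + 2*E*m_target."""
    s = m_proj ** 2 + m_target ** 2 + 2.0 * e_lab_gev * m_target
    return math.sqrt(s)


if __name__ == "__main__":
    e_lab = 1.0e11  # 10^20 eV expressed in GeV (illustrative choice)
    sqrt_s_tev = cm_energy_fixed_target(e_lab) / 1.0e3
    print(f"sqrt(s) = {sqrt_s_tev:.0f} TeV")
```

Under these assumptions the c.m. energy is of the order of a few hundred TeV, to be compared with about 13 TeV in proton–proton collisions at the LHC.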
Further Reading
[F6.1] M. Thomson, “Modern Particle Physics,”
Cambridge University Press 2013. A recent, pedagogical and rigorous
book covering the main aspects of particle physics at advanced
undergraduate and early graduate level.
[F6.2] A. Bettini, “Introduction to Elementary
Particle Physics” (second edition), Cambridge University Press
2014. A very good introduction to particle physics at the
undergraduate level, starting from the experimental aspects and
discussing relevant experiments in depth.
[F6.3] D. Griffiths, “Introduction to
Elementary Particles” (second edition), Wiley-VCH 2008. A reference
book at the undergraduate level with many proposed problems at the
end of each chapter; rather oriented toward the theoretical
aspects.
[F6.4] S. Gasiorowicz, “Quantum Physics”
(third edition), Wiley 2003. Provides a concise and solid
introduction to quantum mechanics. It is very useful for students
who have already been exposed to the subject.
[F6.5] I.J.R. Aitchison, A.J.G. Hey, “Gauge
Theories in Particle Physics: A Practical Introduction” (fourth
edition, 2 volumes), CRC Press, 2012. Provides a pedagogical and
complete discussion of gauge field theories in the Standard Model
of Particle Physics, from QED (vol. 1) to electroweak theory and QCD
(vol. 2).
[F6.6] F. Halzen, A.D. Martin, “Quarks and
Leptons: An Introductory Course in Modern Particle Physics”, Wiley
1984. A book at early graduate level presenting the theories of
modern particle physics clearly, with a how-to approach that teaches
readers how to do calculations.
[F6.7] M. Merk, W. Hulsbergen, I. van Vulpen,
“Particle Physics 1”, Nikhef 2016. Concise and clear lecture notes
at master level, covering topics from QED to electroweak symmetry
breaking.
[F6.8] J. Romão, “Particle Physics”, 2014,
http://porthos.ist.utl.pt/Public/textos/fp.
Lecture notes for a one-semester master course in theoretical
particle physics; also a very good introduction to quantum field
theory.
[F6.9] B. Andersson, “The Lund Model”,
Cambridge University Press, 2005. The physics behind the
Pythia/Lund model.
[F6.10] T. Sjöstrand et al. “An Introduction to
PYTHIA 8.2”, Computer Physics Communications 191 (2015) 159. A
technical explanation of the reference Monte Carlo code for the
simulation of hadronic processes, with links to the underlying
physics.
Exercises
1. Spinless particles interaction. Determine, in the high-energy limit,
the electromagnetic differential cross section between two spinless
charged nonidentical particles.
2. Dirac equation invariance. Show that the Dirac equation written
using the covariant derivative is gauge-invariant.
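3. Bilinear covariants. Show that
(a) $\bar{\psi}\psi$ is a scalar;
(b) $\bar{\psi}\gamma^5\psi$ is a pseudoscalar;
(c) $\bar{\psi}\gamma^\mu\psi$ is a four-vector;
(d) $\bar{\psi}\gamma^\mu\gamma^5\psi$ is a pseudo four-vector.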
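4. Chirality and helicity. Show that the
right helicity eigenstate $u_\uparrow$ can be decomposed into the right
($u_R$) and left ($u_L$) chiral states as follows:
$$u_\uparrow = \frac{1}{2}\left(1 + \frac{p}{E+m}\right) u_R + \frac{1}{2}\left(1 - \frac{p}{E+m}\right) u_L$$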
5. Running electromagnetic coupling. Calculate for GeV.
6. $\nu_\mu$ beams. Consider a beam of $\nu_\mu$ produced through the decay of a
primary beam containing pions (90%) and kaons (10%). The primary
beam has a momentum of 10 GeV and an intensity of 10 s.
(a) Determine the number of pions and kaons
that will decay in a tunnel 100 m long.
(b) Determine the energy spectrum of the
decay products.
(c) Calculate the contamination of the $\nu_\mu$
beam, i.e., the fraction of $\nu_e$
present in that beam.
7. $\nu_\mu$ semileptonic interaction. Considering the
process :
(a) Discuss what X could be (start by computing the
available energy in the center of mass).
(b) Write the amplitude at lowest order for
the process for the interaction of the $\nu_\mu$ with the valence quark d ($\nu_\mu d \rightarrow \mu^- u$).
(c) Compute the effective energy in the
center of mass for this process, supposing that the energy of the
$\nu_\mu$ is 10 GeV and the produced muon takes
5 GeV and is detected at an angle of 10° with the beam.
(d) Write the cross section of the process
as a function of the elementary cross
section .
8. Neutrino and antineutrino deep inelastic
scattering. Determine, in the framework of the quark parton
model, the ratio:
where N stands for an isoscalar
(same number of protons and neutrons) nucleus. Consider that the
involved energies are much higher than the particle masses. Take
into account only diagrams with valence quarks.
9. Feynman rules. What is the lowest-order diagram for the process
?
10. Bhabha scattering. Draw the QED Feynman diagrams at lowest (leading)
order for elastic $e^+e^-$ scattering and discuss why the Bhabha
scattering measurements at LEP are done at very small polar
angles.
11. Bhabha scattering: higher orders. Draw the QED Feynman diagrams at
next-to-leading order for Bhabha scattering.
12. Compton scattering and Feynman rules. Draw the leading-order Feynman
diagram(s) for Compton scattering and compute the amplitude for the
process.
13. Top pair production. Consider the pair production of top/antitop quarks
at a proton–antiproton collider. Draw the dominant first-order
Feynman diagram for this reaction and estimate the
minimal beam energy the collider needs for the process to occur.
Discuss which channels have a clear experimental signature.
14. c quark decay. Consider the decay of the c quark. Draw the dominant first-order
Feynman diagrams of this decay and express the corresponding decay
rates as a function of the muon decay rate and of the Cabibbo
angle. Estimate the c quark lifetime, knowing that the muon
lifetime is about 2.2 μs.
15. Gray disk model in proton–proton interactions. Determine, in the
framework of the gray disk model, the mean radius and the opacity
of the proton as a function of the c.m. energy (you can use
Fig. 6.70 to extract the total and the elastic
proton–proton cross sections).