© Springer International Publishing AG, part of Springer Nature 2018
Alessandro De Angelis and Mário Pimenta, Introduction to Particle and Astroparticle Physics, Undergraduate Lecture Notes in Physics, https://doi.org/10.1007/978-3-319-78181-5_6

6. Interactions and Field Theories

Alessandro De Angelis (1, 2) and Mário Pimenta (3)
(1) Department of Mathematics, Physics and Computer Science, University of Udine, Udine, Italy
(2) INFN Padova and INAF, Padua, Italy
(3) Laboratório de Instrumentação e Física de Partículas, IST, University of Lisbon, Lisbon, Portugal

The structure and the dynamics of the Universe are determined by the so-called fundamental interactions: gravitational, electromagnetic, weak, and strong. In their absence, the Universe would be an immense space filled with ideal gases of structureless particles. In relativistic quantum physics, interactions between “matter” particles (fermions) are associated with the exchange of “wave” particles (bosons); note that bosons can also interact among themselves. Such a picture can be visualized (and observables related to the process can be computed) using the schematic diagrams invented in 1948 by Richard Feynman: the Feynman diagrams (Fig. 6.1), which we briefly presented in Chap. 1.

Each Feynman diagram corresponds to a specific term of a perturbative expansion of the scattering amplitude. It is a symbolic graph, where initial and final state particles are represented by incoming and outgoing lines (which are not space–time trajectories), and the internal lines represent the exchange of virtual particles (the term “virtual” meaning that their energy and momentum are not necessarily related through the relativistic equation $$E^2 = p^2+M^2$$; if they are not, the particles are said to be off the mass shell). Solid straight lines are associated with fermions, while wavy, curly, or broken lines are associated with bosons. Arrows indicate the time flow of the external particles and antiparticles (in the plot, time usually runs from left to right, but having it run from bottom to top is also a possible convention). A particle (antiparticle) moving backward in time is equivalent to its antiparticle (particle) moving forward in time.

At the lowest order, the two initial state particles exchange only a particle mediating the interaction (for instance a photon). Associated with each vertex (a point where at least three lines meet) is a number, the coupling parameter1 (in the case of electromagnetic interaction $$z\sqrt{\alpha }=ze/\sqrt{4\pi }$$ for a particle with electrical charge z), which indicates the probability of the emission/absorption of the field particle and thus the strength of the interaction. Energy–momentum, as well as quantum numbers, is conserved at each vertex.

At higher orders, more than one field particle can be exchanged (second diagram from the left in Fig. 6.1) and there is an infinite number of possibilities (terms in the perturbative expansion), for which amplitudes and probabilities are proportional to increasing powers of the coupling parameters. Although the probability of the process is proportional to the squared modulus of the sum of all the terms, if the coupling parameters are small enough, just the first diagrams will be relevant. However, even low-order diagrams can give an infinite contribution. Indeed, in the second diagram there is a loop of internal particles, and an integration over the exchanged energy–momentum has to be carried out. Since this integration is performed over virtual particles, it is not bounded and therefore it might, in principle, diverge. Curing divergent integrals (or, in jargon, “canceling infinities”) became the central problem of quantum field theory in the middle of the twentieth century (classically, the electrostatic self-energy of a point charged particle is also infinite), and it was successfully solved in the case of the electromagnetic interaction within the renormalization scheme, as will be briefly discussed in Sect. 6.2.12.
Fig. 6.1

Feynman diagrams

The quantum equations for “matter” (Schrödinger, Klein–Gordon, Dirac equations) must be modified to incorporate explicitly the couplings with the interaction fields. The introduction of these new terms makes the equations invariant under a combined local (space–time dependent) transformation of the matter and of the interaction fields (the phase of the fermion wave function and the gauge degree of freedom of the four-potential in the case of the electromagnetic interaction). Conversely, requiring that the “matter” quantum equations be invariant with respect to local transformations within some internal symmetry group implies the existence of well-defined interaction fields, the gauge fields. These ideas, developed in particular by Feynman and by Yang and Mills in the 1950s, were applied to the field theories of the electromagnetic, weak, and strong interactions; they provided the framework for the unification of the electromagnetic and weak interactions (electroweak interactions), which has been extensively tested with impressive success (see next chapter), and may lead to further unification involving the strong interaction (GUTs, Grand Unified Theories) and even gravity (ToE, Theories of Everything). One could think that we are close to the “end of physics.” However, the experimental discovery that most of the energy of the Universe cannot be explained by the known physical objects quickly dismissed such a claim: dark matter and dark energy represent around 95% of the total energy budget of the Universe, and they are not explained by present theories.

6.1 The Lagrangian Representation of a Dynamical System

In the quantum world, we usually find it convenient to use the Lagrangian or the Hamiltonian representation of a system to compute the equations of motion. The Lagrangian L of a system of particles is defined as
$$\begin{aligned} L = K - V \end{aligned}$$
(6.1)
where K is the total kinetic energy of the system and V its total potential energy.
Any system with n degrees of freedom is fully described by n generalized coordinates $$q_j$$ and n generalized velocities $$\dot{q}_j$$. The equations of motion of the system are the so-called Euler–Lagrange equations
$$\begin{aligned} \frac{{d}}{{d}t} \left( \frac{\partial L}{\partial \dot{q}_j} \right) = \frac{\partial L}{\partial q_j} \end{aligned}$$
(6.2)
where the index $$ j = 1, 2,\ldots , n$$ runs over the degrees of freedom. For example, in the case of a single particle in a conservative field in one dimension, x, one can write
$$\begin{aligned} L = \frac{1}{2}mv^2 - V(x) \end{aligned}$$
(6.3)
and applying the Euler–Lagrange equations
$$\begin{aligned} \frac{{d}}{{d}t}mv = - \frac{{d}}{{d}x}V = F \Longrightarrow F = ma \, \end{aligned}$$
(Newton’s law).

Although the mathematics required for Lagrange’s equations might seem more complicated than Newton’s law, Lagrange’s equations often make the solution easier, since the generalized coordinates can be conveniently chosen to exploit symmetries in the system, and constraint forces are incorporated in the geometry of the problem.

The Lagrangian is of course not unique: you can multiply it by a constant factor, for example, or add a constant, and the equations will not change. You can also add the four-divergence of an arbitrary vector function: it will cancel when you apply the Euler–Lagrange equations, and thus the dynamical equations are not affected.

The so-called Hamiltonian representation uses instead the Hamiltonian function $$H(p_j, q_j, t)$$:
$$\begin{aligned} H = K + V \, . \end{aligned}$$
(6.4)
We briefly discussed this function in the previous chapter; it represents the total energy in terms of the generalized coordinates $$q_j$$ and of the generalized momenta
$$\begin{aligned} p_j = \frac{\partial {L}}{\partial \dot{q}_j} \, . \end{aligned}$$
(6.5)
The time evolution of the system is obtained from Hamilton’s equations:
$$\begin{aligned} \frac{d{p_j}}{dt} = -\frac{\partial {H}}{\partial {q_j}} \; ; \; \frac{d{q_j}}{dt} = \frac{\partial {H}}{\partial {p_j}} \, . \end{aligned}$$
(6.6)
The two representations, Lagrangian and Hamiltonian, are equivalent. For example, in the case of a single particle in a conservative field in one dimension,
$$\begin{aligned} H = \frac{p^2}{2m} + V \end{aligned}$$
(6.7)
and Hamilton’s equations become
$$\begin{aligned} \frac{d{p}}{dt} = -\frac{d{V}}{d{x}} = F \; ; \; \frac{d{x}}{dt} = \frac{{p}}{m} \, . \end{aligned}$$
(6.8)
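As a numerical illustration of Eqs. (6.7) and (6.8), the following minimal sketch integrates Hamilton's equations for a single particle in one dimension. The harmonic potential $$V = kx^2/2$$, the values of `MASS` and `SPRING_K`, and the leapfrog integration scheme are assumptions chosen for the example; they do not come from the text.

```python
import math

MASS = 1.0      # illustrative value, arbitrary units
SPRING_K = 4.0  # illustrative value: V(x) = k x^2 / 2


def potential(x):
    return 0.5 * SPRING_K * x * x


def force(x):
    # F = -dV/dx, the right-hand side of dp/dt in Eq. (6.8)
    return -SPRING_K * x


def hamiltonian(x, p):
    # H = p^2 / (2m) + V, Eq. (6.7)
    return p * p / (2.0 * MASS) + potential(x)


def evolve(x, p, dt, steps):
    """Integrate Hamilton's equations with the leapfrog (kick-drift-kick) scheme."""
    for _ in range(steps):
        p += 0.5 * dt * force(x)   # half step of dp/dt = -dV/dx
        x += dt * p / MASS         # full step of dx/dt = p/m
        p += 0.5 * dt * force(x)   # second half step of dp/dt
    return x, p


x0, p0 = 1.0, 0.0
period = 2.0 * math.pi * math.sqrt(MASS / SPRING_K)
steps = 10000
x1, p1 = evolve(x0, p0, period / steps, steps)

print("energy at start:", hamiltonian(x0, p0))
print("energy after one period:", hamiltonian(x1, p1))
print("final (x, p):", x1, p1)
```

Within the accuracy of the integrator, the particle returns to its starting point after one period and the value of H stays constant, as expected for a time-independent Hamiltonian.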
We shall more frequently use Lagrangian mechanics. Let us now see how Lagrangian mechanics simplifies the description of a complex system.

6.1.1 The Lagrangian and the Noether Theorem

Noether’s theorem is particularly simple when the Lagrangian representation is used. If the Lagrangian does not depend on the variable $$q_i$$, the Euler–Lagrange equation related to this coordinate becomes
$$\begin{aligned} \frac{{d}}{{d}t} \left( \frac{\partial L}{\partial \dot{q}_i} \right) = 0 \end{aligned}$$
(6.9)
and thus the quantity
$$\begin{aligned} \left( \frac{\partial L}{\partial \dot{q}_i} \right) = p_i \end{aligned}$$
(6.10)
is conserved. For example, the invariance under space translations implies that linear momentum is conserved. By a similar approach, we could see that the invariance under rotations implies that angular momentum is conserved.
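The link between rotational invariance and angular momentum can be checked numerically. The sketch below follows a particle in a plane under two potentials chosen purely for illustration (they are not from the text): $$V=-1/r$$, which is rotation invariant, and $$V=(x^2+4y^2)/2$$, which is not. The leapfrog integrator and the initial conditions are likewise assumptions.

```python
def angular_momentum(x, y, px, py):
    # z component of r x p for motion in the plane
    return x * py - y * px


def evolve(state, force, dt, steps, mass=1.0):
    """Leapfrog integration of planar motion under force(x, y) -> (fx, fy)."""
    x, y, px, py = state
    for _ in range(steps):
        fx, fy = force(x, y)
        px += 0.5 * dt * fx
        py += 0.5 * dt * fy
        x += dt * px / mass
        y += dt * py / mass
        fx, fy = force(x, y)
        px += 0.5 * dt * fx
        py += 0.5 * dt * fy
    return x, y, px, py


def central_force(x, y):
    # From V = -1/r, which depends only on r (rotation invariant)
    r3 = (x * x + y * y) ** 1.5
    return -x / r3, -y / r3


def anisotropic_force(x, y):
    # From V = (x^2 + 4 y^2) / 2, which is not rotation invariant
    return -x, -4.0 * y


start = (1.0, 0.0, 0.0, 0.8)
l_start = angular_momentum(*start)
l_central = angular_momentum(*evolve(start, central_force, 1e-3, 5000))
l_anisotropic = angular_momentum(*evolve(start, anisotropic_force, 1e-3, 5000))

print("initial L:", l_start)
print("central potential, final L:", l_central)
print("anisotropic potential, final L:", l_anisotropic)
```

In the central case the angular momentum keeps its initial value up to rounding errors, while in the anisotropic case it changes with time.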

6.1.2 Lagrangians and Fields; Lagrangian Density

The Euler–Lagrange equations are derived imposing the stationarity of an action S defined as $$S = \int dt \, L$$; such a form, giving a special role to time, does not allow a relativistically covariant Lagrangian L.

We can recover relativistic covariance using instead of the Lagrangian a “Lagrangian density” $$\mathcal {L}$$, such that the Lagrangian will be the integral of $$\mathcal {L}$$ over all space,
$$\begin{aligned} L = \int d^3x \, \mathcal {L} \, . \end{aligned}$$
(6.11)
Now we can write
$$\begin{aligned} S = \int dt \, L = \int d^4 x \, \mathcal {L} \, . \end{aligned}$$
(6.12)
In a quantum mechanical world $$\mathcal {L}$$ can depend, instead of on coordinates and velocities, on fields, $$\phi (\mathbf {r}, t) = \phi (x^{\mu })$$, which are meaningful quantities in the four-dimensional space of relativity. Quantum mechanics guarantees the invariance of physics with respect to a global rotation of the wave function in complex space, i.e., the multiplication by a constant phase factor: $$\phi \rightarrow \phi e^{i\theta }$$. This means that, in general, a Lagrangian will be a combination of functions of $$|\phi |^2$$ or $$|\partial \phi |^2$$. The latter are called, with obvious meaning, kinetic terms.
The same argument leading to the Euler–Lagrange equations leads now to generalized Euler–Lagrange equations
$$\begin{aligned} \partial _{\mu } \bigg ({\frac{\partial \mathcal {L}}{\partial (\partial _{\mu } \phi _i)}}\bigg ) - \frac{\partial \mathcal {L}}{ \partial \phi _i} = 0 \end{aligned}$$
(6.13)
for fields $$\phi _i$$ ($$i=1,\ldots , n$$).
Noether’s theorem guarantees that, if the Lagrangian density does not depend explicitly on the field $$\phi $$, we have a four-current
$$\begin{aligned} j^{\mu } \equiv {\partial \mathcal {L} \over \partial (\partial _{\mu }\phi )} \delta \phi \end{aligned}$$
(6.14)
subject to the continuity condition
$$\begin{aligned} \partial _{\mu }j^{\mu } = 0 \Rightarrow {\partial j^0 \over \partial t} + \mathbf {\nabla } \cdot \mathbf {j} = 0 \, , \end{aligned}$$
(6.15)
where $$j^0$$ is the charge density and $$\mathbf {j}$$ is the current density. The total (conserved) charge will be
$$\begin{aligned} Q = \int _\mathrm{{all\;space}} d^3x \, j^0 \, . \end{aligned}$$
(6.16)
Hamilton’s formalism can also be extended to relativistic quantum fields.

In the rest of the book, we shall in general make use of Lagrangian densities $$\mathcal {L}$$, but unless otherwise specified we shall refer to the Lagrangian densities simply as Lagrangians.

6.1.3 Lagrangian Density and Mass

A Lagrangian is in general composed of generalized coordinates and of their derivatives (or of fields and their derivatives).

We shall show later that a nonzero mass—i.e., a positive energy for a state at rest—is associated in field theory to an expression quadratic in the field; for instance, in the case of a scalar field,
$$\begin{aligned} \mathcal {L}_K = \frac{1}{2} m^2 |\phi |^2 \, . \end{aligned}$$
(6.17)
The dimension of the Lagrangian density is [energy$$^4$$] since the action (6.12) is dimensionless; the scalar field $$\phi $$ has thus the dimension of an energy.

6.2 Quantum Electrodynamics (QED)

Electromagnetic effects have been known since antiquity, but only during the nineteenth century was the (classical) theory of electromagnetic interactions firmly established. In the twentieth century, the marriage between electrodynamics and quantum mechanics (Maxwell’s equations were already relativistic even before the formulation of Einstein’s relativity) gave birth to the theory of Quantum Electrodynamics (QED), which is the most accurate theory ever formulated. QED describes the interactions between electrically charged particles mediated by a quantized electromagnetic field.

6.2.1 Electrodynamics

In 1864, James Clerk Maxwell accomplished the “second great unification in Physics” (the first one was realized by Isaac Newton) formulating the theory of electromagnetic field and summarizing it in a set of coupled differential equations. Maxwell’s equations can be written using the vector notation introduced by Heaviside and following the Lorentz–Heaviside convention for units (see Chap. 2) as
$$\begin{aligned} \mathbf {\nabla } \cdot \mathbf {{\mathcal{{E}}}}= & {} \rho \end{aligned}$$
(6.18)
$$\begin{aligned} \mathbf {\nabla } \cdot \mathbf {B}= & {} 0 \end{aligned}$$
(6.19)
$$\begin{aligned} \mathbf {\nabla }\times \mathbf {{\mathcal{{E}}}}= & {} -\frac{\partial \mathbf {B}}{\partial t} \end{aligned}$$
(6.20)
$$\begin{aligned} \mathbf {\nabla }\times \mathbf {B}= & {} \mathbf {j}+\frac{\partial \mathbf {{\mathcal{{E}}}}}{\partial t} \, . \end{aligned}$$
(6.21)
A scalar potential $$\phi $$ and a vector potential $$\mathbf {A}$$ can be introduced such that
$$\begin{aligned} \mathbf {{\mathcal{{E}}}}= & {} -\mathbf {\nabla }\phi -\frac{\partial \mathbf {A}}{\partial t} \end{aligned}$$
(6.22)
$$\begin{aligned} \mathbf {B}= & {} \mathbf {\nabla }\times \mathbf {A} \, . \end{aligned}$$
(6.23)
Then two of the Maxwell equations are automatically satisfied:
$$\begin{aligned} \mathbf {\nabla }\cdot \mathbf {B}=\mathbf {\nabla }\cdot \left( \mathbf {\nabla }\times \mathbf {A}\right) =0 \end{aligned}$$
(6.24)
$$\begin{aligned} \mathbf {\nabla }\times \mathbf {{\mathcal{{E}}}}=\mathbf {\nabla }\times \left( -\mathbf {\nabla }\phi -\frac{\partial \mathbf {A}}{\partial t}\right) =-\frac{\partial \mathbf {B}}{\partial t} \end{aligned}$$
(6.25)
and the other two can be written as:
$$\begin{aligned} \mathbf {\nabla } \cdot \mathbf {{\mathcal{{E}}}}=\mathbf {\nabla }\cdot \left( -\mathbf {\nabla }\phi -\frac{\partial \mathbf {A}}{\partial t}\right) =\rho \end{aligned}$$
(6.26)
$$\begin{aligned} \mathbf {\nabla }\times \mathbf {B}=\mathbf {\nabla }\times \left( \mathbf {\nabla }\times \mathbf {A}\right) =\mathbf {j}+\frac{\partial }{\partial t}\left( -\mathbf {\nabla }\phi -\frac{\partial \mathbf {A}}{\partial t}\right) \, . \end{aligned}$$
(6.27)
However, the potential fields $$(\phi ,\mathbf {A})$$ are not totally determined, having a local degree of freedom. In fact, if $$\chi \left( t,\mathbf {x}\right) $$ is a scalar function of the time and space coordinates, then the potentials $$(\phi ',\mathbf {A}')$$ defined as
$$\begin{aligned} \phi '= & {} \phi -\frac{\partial \chi }{\partial t} \end{aligned}$$
(6.28)
$$\begin{aligned} {\mathbf {A}}'= & {} \mathbf {A}+\mathbf {\nabla }\chi \end{aligned}$$
(6.29)
give rise to the same $$\mathbf {{\mathcal{{E}}}}$$ and $$\mathbf {B}$$ fields. These transformations are designated as gauge transformations and generalize the freedom that exists in electrostatics in the definition of the space points where the electric potential is zero (the electrostatic field is invariant under a global transformation of the electrostatic potential, but the electromagnetic field is invariant under a joint local transformation of the scalar and vector potential).
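The invariance of the fields under the transformations (6.28) and (6.29) can be checked numerically. In the sketch below the potentials, the gauge function $$\chi = xy\sin t + z^2 t$$, the evaluation point, and the finite-difference step are all arbitrary illustrative choices; only the definitions (6.22) and (6.23) of the fields are taken from the text.

```python
import math

STEP = 1e-4  # finite-difference step (illustrative)


def partial(f, point, axis):
    """Central difference of f(t, x, y, z) along axis (0 = t, 1 = x, 2 = y, 3 = z)."""
    up, down = list(point), list(point)
    up[axis] += STEP
    down[axis] -= STEP
    return (f(*up) - f(*down)) / (2.0 * STEP)


def fields(phi, vec_a, point):
    """Return (E, B) with E = -grad(phi) - dA/dt and B = curl(A)."""
    e_field = tuple(-partial(phi, point, i + 1) - partial(vec_a[i], point, 0)
                    for i in range(3))
    b_field = (partial(vec_a[2], point, 2) - partial(vec_a[1], point, 3),
               partial(vec_a[0], point, 3) - partial(vec_a[2], point, 1),
               partial(vec_a[1], point, 1) - partial(vec_a[0], point, 2))
    return e_field, b_field


# Arbitrary illustrative potentials
def phi(t, x, y, z):
    return x * x * t + y * z


def a_x(t, x, y, z):
    return y * z * t


def a_y(t, x, y, z):
    return math.sin(x) * t


def a_z(t, x, y, z):
    return x * y


# Gauge function chi = x y sin(t) + z^2 t, entered through its analytic derivatives
def phi_new(t, x, y, z):
    return phi(t, x, y, z) - (x * y * math.cos(t) + z * z)   # phi - d(chi)/dt


def a_x_new(t, x, y, z):
    return a_x(t, x, y, z) + y * math.sin(t)                 # A_x + d(chi)/dx


def a_y_new(t, x, y, z):
    return a_y(t, x, y, z) + x * math.sin(t)                 # A_y + d(chi)/dy


def a_z_new(t, x, y, z):
    return a_z(t, x, y, z) + 2.0 * z * t                     # A_z + d(chi)/dz


point = (0.3, 0.5, -0.2, 0.7)
e_old, b_old = fields(phi, (a_x, a_y, a_z), point)
e_new, b_new = fields(phi_new, (a_x_new, a_y_new, a_z_new), point)
max_diff = max(abs(u - v) for u, v in zip(e_old + b_old, e_new + b_new))

print("E:", e_old)
print("B:", b_old)
print("largest change under the gauge transformation:", max_diff)
```

The fields computed from the two sets of potentials agree up to the finite-difference error.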
The arbitrariness of these transformations can be used to write the Maxwell equations in a simpler way. What we are going to do is to use our choice to fix things so that the equations for $$\mathbf {A}$$ and for $$\phi $$ are separated but have the same form. We can do this by taking (this is called the Lorenz gauge):
$$\begin{aligned} \mathbf {\nabla } \cdot \mathbf {A} = - \frac{\partial \phi }{\partial t} \, . \end{aligned}$$
(6.30)
Thus
$$\begin{aligned} \frac{{\partial }^2\phi }{{\partial t}^2}-{\nabla }^{{2}}\phi= & {} \rho \end{aligned}$$
(6.31)
$$\begin{aligned} \frac{{\partial }^2\mathbf {A}}{{\partial t}^2}-{\nabla }^{{2}}\mathbf {A}= & {} \mathbf {j} \, . \end{aligned}$$
(6.32)
The last two equations can be written in an extremely compact way if four-vectors $$\ A^{\mu }$$ and $$j^{\mu }$$ are introduced and if the D’Alembert operator $$\Box \equiv {\partial }^{\mu }{\partial }_{\mu }$$ is used. Defining
$$\begin{aligned} A^{\mu }=\left( \phi ,\mathbf {A}\right) \; ; \; j^{\mu }=\left( \rho ,\mathbf {j}\right) \end{aligned}$$
(6.33)
(notice that the Lorenz gauge condition $$\partial _\mu A^\mu = 0$$ is covariant), the two equations are summarized by
$$\begin{aligned} {\Box A}^{\mu } =j^{\mu } \, . \end{aligned}$$
(6.34)
In the absence of charges and currents (free electromagnetic field)
$$\begin{aligned} {\Box A}^{\mu }=0 \, . \end{aligned}$$
(6.35)
This equation is similar to the Klein–Gordon equation for a particle with $$m=0$$ (see Sects. 3.​2.​1 and 6.2.5) but with spin 1. $$A^{\mu }$$ is identified with the wave function of a free photon, and the solution of the above equation is, up to some normalization factor:
$$\begin{aligned} A^{\mu }=\epsilon ^{\mu }\left( q\right) e^{-iqx} \end{aligned}$$
(6.36)
where q is the four-momentum of the photon and $$\epsilon ^{\mu }$$ is its polarization four-vector. The four components of $$\epsilon ^{\mu }$$ are not independent. The Lorenz condition imposes one constraint, reducing the number of independent components to three. However, even after imposing the Lorenz condition, there is still the possibility, if $${\partial }^2\chi \equiv \Box \chi =0$$, of a further gauge transformation
$$\begin{aligned} A^{\mu }\rightarrow A^{\mu }+{\partial }^{\mu }\chi \, . \end{aligned}$$
(6.37)
This extra gauge transformation can be used to set the time component of the polarization four-vector to zero $$\left( \epsilon ^0=0\right) $$, thus converting the Lorenz condition into
$$\begin{aligned} \varvec{\epsilon }\cdot \mathbf {q}=0 \, . \end{aligned}$$
(6.38)
This choice is known as the Coulomb gauge, and it makes clear that there are just two degrees of freedom left for the polarization which is the case of mass zero spin 1 particles $$\left( m_s=\pm 1\right) $$.

6.2.1.1 Modification for a Nonzero Mass: The Proca Equation

In the case of a photon with a tiny mass $${\mu }_{\gamma }$$:
$$\begin{aligned} \left( \Box +{{\mu }_{\gamma }}^2\right) A^{\mu }=j^{\mu } \, , \end{aligned}$$
(6.39)
Maxwell equations would be transformed into the Proca2 equations:
$$\begin{aligned} \mathbf {\nabla } \cdot \mathbf {{\mathcal{{E}}}}= & {} \rho -{{\mu }_{\gamma }}^2\phi \end{aligned}$$
(6.40)
$$\begin{aligned} \mathbf {\nabla } \cdot \mathbf {B}= & {} 0\end{aligned}$$
(6.41)
$$\begin{aligned} \mathbf {\nabla }\times \mathbf {{\mathcal{{E}}}}= & {} -\frac{\partial \mathbf {B}}{\partial t}\end{aligned}$$
(6.42)
$$\begin{aligned} \mathbf {\nabla }\times \mathbf {B}= & {} \mathbf {j}+\frac{\partial \mathbf {{\mathcal{{E}}}}}{\partial t}\,-\,{{\mu }_{\gamma }}^2\mathbf {A}\,. \end{aligned}$$
(6.43)
In this scenario, the electrostatic field would show a Yukawa-type exponential attenuation, $$e^{-{\mu }_{\gamma }r}$$. Experimental tests of the validity of the Coulomb inverse square law have been performed for many years using different techniques, leading to stringent limits: $${\mu }_{\gamma }<{10}^{-18}$$ eV $$\sim 10^{-51}$$ g. Stronger limits ($${\mu }_{\gamma }<{10}^{-26}$$ eV) are reported from the analyses of astronomical data, but are model dependent.
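The orders of magnitude quoted above can be reproduced with a few lines of arithmetic. The sketch below converts a photon mass limit from eV to grams and computes the corresponding attenuation length $$1/\mu_\gamma$$ in meters; the function names are hypothetical, and the conversion constants are standard values of $$\hbar c$$ and of 1 eV/$$c^2$$.

```python
HBAR_C_EV_M = 1.973269804e-7   # hbar * c in eV * m
EV_TO_GRAM = 1.78266192e-33    # 1 eV/c^2 expressed in grams


def mass_in_grams(mass_ev):
    """Convert a mass given in eV (natural units) to grams."""
    return mass_ev * EV_TO_GRAM


def attenuation_length_m(mass_ev):
    """Length scale 1/mu of the Yukawa factor exp(-mu r), in meters."""
    return HBAR_C_EV_M / mass_ev


limit_ev = 1e-18  # laboratory limit quoted in the text
print("mass limit in grams:", mass_in_grams(limit_ev))
print("attenuation length in meters:", attenuation_length_m(limit_ev))
```

For the laboratory limit the attenuation length is of the order of $$10^{11}$$ m, comparable to the Earth-Sun distance, which shows why such tests require either extremely precise laboratory experiments or observations over astronomical scales.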

6.2.2 Minimal Coupling

Classically, the coupling between a particle with charge e and the electromagnetic field is given by the Lorentz force:
$$\begin{aligned} \mathbf {{F}}=e\left( \mathbf {{\mathcal{{E}}}}+\mathbf {v}\times \mathbf {B}\right) \end{aligned}$$
(6.44)
which can be written in terms of scalar and vector potential as
$$ \frac{d\mathbf {p}}{dt} =e\left( -\mathbf {\nabla }\phi -\frac{\partial \mathbf {A}}{\partial t}+\mathbf {v}\times \left( \mathbf {\nabla } \times \mathbf {A}\right) \right) = e\left( -\mathbf {\nabla }\phi -\frac{\partial \mathbf {A}}{\partial t}+\mathbf {\nabla }\ \left( \mathbf {v}\cdot \mathbf {A}\right) -\left( \mathbf {v}\cdot \mathbf {\nabla }\right) \ \mathbf {A}\right) = $$
$$ =e\left( -\ \mathbf {\nabla }\ \left( \phi -\mathbf {v}\cdot \mathbf {A}\right) -\frac{\partial \mathbf {A}}{\partial t} -\left( \mathbf {v}\cdot \mathbf {\nabla }\right) \ \mathbf {A} \right) = e\left( -\ \mathbf {\nabla }\ \left( \phi -\mathbf {v}\cdot \mathbf {A}\right) -\frac{d \mathbf {A}}{d t} \right) $$
$$\begin{aligned} \Longrightarrow \frac{d}{dt}(\mathbf {p}+e\mathbf {A}) = e \left( -\ \mathbf {\nabla }\ \left( \phi -\mathbf {v}\cdot \mathbf {A}\right) \right) \, . \end{aligned}$$
Referring to the Euler–Lagrange equations:
$$\begin{aligned} \frac{d}{d t}\frac{\partial L}{\partial \dot{x}_i}= \frac{\partial L}{\partial x_i} \end{aligned}$$
with the nonrelativistic Lagrangian L defined as
$$\begin{aligned} { L}=\sum _i{\frac{1}{2}m{\dot{x}_i}^2}-U\left( x_i,\dot{x}_i, t\right) \end{aligned}$$
(6.45)
a generalized potential $$U(x_i,\dot{x}_i, t)$$ for this dynamics is
$$\begin{aligned} U=e \left( \phi -\dot{\mathbf {x}}\cdot \mathbf {A}\right) \, . \end{aligned}$$
(6.46)
The momentum being given by
$$\begin{aligned} p_i=\frac{\partial { L}}{\partial \dot{x}_i} \end{aligned}$$
one has for $$\mathbf {p}$$ and for the Hamiltonian H
$$\begin{aligned} \mathbf {p} = m \dot{\mathbf {x}} + e \mathbf {A} \end{aligned}$$
(6.47)
$$\begin{aligned} H=\frac{1}{2m}{\left( \mathbf {p}-e\mathbf {A}\right) }^2+e\phi \, . \end{aligned}$$
(6.48)
Then the free-particle equation
$$\begin{aligned} E=\frac{{\mathbf {p}}^{\, 2}}{2m} \end{aligned}$$
is transformed in the case of the coupling with the electromagnetic field in:
$$\begin{aligned} E-e\phi =\frac{1}{2m}{\left( \mathbf {p}-e\mathbf {A}\right) }^2 \, . \end{aligned}$$
(6.49)
This is equivalent to the following replacements for the free-particle energy and momentum:
$$\begin{aligned} E\rightarrow E-e\phi \; ; \; \mathbf {p}\rightarrow \mathbf {p}-e\mathbf {A} \end{aligned}$$
(6.50)
i.e., in terms of the relativistic energy–momentum four-vector:
$$\begin{aligned} p^{\mu }\rightarrow p^{\mu }-eA^{\mu } \end{aligned}$$
(6.51)
or, in the operator view ($$p^{\mu }\rightarrow i \hbar {\partial }^{\mu }$$):
$$\begin{aligned} {\partial }^{\mu }\rightarrow D^{\mu }\,\equiv \,{\partial }^{\mu }+ieA^{\mu } \, . \end{aligned}$$
(6.52)
The operator $$D^{\mu }$$ is designated the covariant derivative.

The replacement $${\partial }^{\mu }\rightarrow D^{\mu }$$ is called the minimal coupling prescription. This prescription involves only the charge distribution and is able to account for all electromagnetic interactions.
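As a cross-check that the Hamiltonian (6.48) reproduces the Lorentz force, the sketch below integrates Hamilton's equations for a charge in a uniform magnetic field along z, with $$\phi = 0$$. The symmetric-gauge potential $$\mathbf{A} = (B/2)(-y, x, 0)$$, the numerical values, and the Runge-Kutta integrator are assumptions made for the example. The Lorentz force predicts a circular orbit with period $$2\pi m/(eB)$$ and constant speed.

```python
import math

E_CHARGE, MASS, B_FIELD = 1.0, 1.0, 2.0   # illustrative values, arbitrary units


def vector_potential(x, y):
    """Symmetric-gauge potential A = (B/2)(-y, x), whose curl is B along z."""
    return -0.5 * B_FIELD * y, 0.5 * B_FIELD * x


def velocity(state):
    # dx/dt = dH/dp = (p - eA)/m
    x, y, px, py = state
    ax, ay = vector_potential(x, y)
    return (px - E_CHARGE * ax) / MASS, (py - E_CHARGE * ay) / MASS


def hamilton_rhs(state):
    # dp/dt = -dH/dx for H = (p - eA)^2 / (2m), using dA_y/dx = B/2, dA_x/dy = -B/2
    vx, vy = velocity(state)
    return (vx, vy,
            0.5 * E_CHARGE * B_FIELD * vy,
            -0.5 * E_CHARGE * B_FIELD * vx)


def rk4_step(state, dt):
    """One fourth-order Runge-Kutta step for the state (x, y, px, py)."""
    def shifted(k, factor):
        return tuple(s + factor * ki for s, ki in zip(state, k))
    k1 = hamilton_rhs(state)
    k2 = hamilton_rhs(shifted(k1, 0.5 * dt))
    k3 = hamilton_rhs(shifted(k2, 0.5 * dt))
    k4 = hamilton_rhs(shifted(k3, dt))
    return tuple(s + dt * (a + 2.0 * b + 2.0 * c + d) / 6.0
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))


initial = (0.0, 0.0, 1.0, 0.0)   # at the origin A = 0, so p = m v
period = 2.0 * math.pi * MASS / (E_CHARGE * B_FIELD)
steps = 2000
final = initial
for _ in range(steps):
    final = rk4_step(final, period / steps)

print("position after one cyclotron period:", final[0], final[1])
print("speed after one cyclotron period:", math.hypot(*velocity(final)))
```

Note that the canonical momentum $$\mathbf{p}$$ depends on the gauge chosen for $$\mathbf{A}$$, whereas the velocity $$(\mathbf{p}-e\mathbf{A})/m$$ and the trajectory do not.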

Wave equations can now be generalized to account for the coupling with the electromagnetic field using the minimal coupling prescription.

For instance, the free-particle Schrödinger equation
$$\begin{aligned} i\hbar \frac{\partial }{\partial t}\varPsi =\frac{1}{2m}{\left( -i\hbar \mathbf {\nabla }\right) }^2\varPsi \end{aligned}$$
(6.53)
becomes under such a replacement
$$\begin{aligned} \left( i\hbar \frac{\partial }{\partial t}-e\phi \right) \varPsi =\frac{1}{2m}{\left( -i\hbar \mathbf {\nabla }-e\mathbf {A}\right) }^2\varPsi \,. \end{aligned}$$
(6.54)
The Schrödinger equation couples directly to the scalar and vector potential and not to the force, and quantum effects not foreseen in classical physics appear. One of them is the well-known Bohm–Aharonov effect, predicted in 1959 by David Bohm and his student Yakir Aharonov.3 Whenever a particle is confined in a region where the electric and the magnetic field are zero but the potential four-vector is not, its wave function changes its phase.
This is the case of particles crossing a region outside an infinitely long, thin solenoid (Fig. 6.2, left). In this region, the magnetic field $$\mathbf {B}$$ is zero but the vector potential $$\mathbf {A}$$ is not:
$$\begin{aligned} \mathbf {\nabla }\times \mathbf {A}=\mathbf {B} \end{aligned}$$
$$\begin{aligned} \oint {\mathbf {A}}\cdot d\mathbf {l}=\ \int _S{\mathbf {B}}\cdot d\mathbf {s} \, . \end{aligned}$$
The line integral of the vector potential $$\mathbf {A}$$ around a closed loop is equal to the magnetic flux through the area enclosed by the loop. As $$\mathbf {B}$$ inside the solenoid is not zero, the flux through any loop encircling the solenoid is also not zero, and therefore $$\mathbf {A}$$ cannot vanish everywhere outside it.
This effect was verified experimentally by observing a shift in the interference pattern depending on whether the current in a microscopic solenoid placed between the two slits is turned on or off (Fig. 6.2, right).
Fig. 6.2

Left: Vector potential in the region outside an infinite solenoid. Right: Double-slit experiment demonstrating the Bohm–Aharonov effect.

From D. Griffiths, “Introduction to quantum mechanics,” second edition, Pearson 2004
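The statement that the line integral of $$\mathbf{A}$$ equals the enclosed flux, even where $$\mathbf{B}=0$$, can be illustrated numerically. The sketch below assumes the standard vector potential outside an ideal solenoid on the z axis, $$\mathbf{A} = \frac{\varPhi}{2\pi r^2}(-y, x, 0)$$, and integrates it along two square loops. It also evaluates the phase shift $$e\varPhi/\hbar$$ between paths passing on opposite sides of the solenoid; this expression is written in SI units, which is an assumption of the example (the text uses natural units), and the loop geometry is arbitrary.

```python
import math

E_CHARGE = 1.602176634e-19   # elementary charge in C
HBAR = 1.054571817e-34       # reduced Planck constant in J s
PLANCK_H = 6.62607015e-34    # Planck constant in J s


def vector_potential(x, y, flux):
    """A outside an ideal solenoid on the z axis carrying the given flux."""
    r2 = x * x + y * y
    return (-flux * y / (2.0 * math.pi * r2), flux * x / (2.0 * math.pi * r2))


def loop_integral(corners, flux, n=2000):
    """Midpoint-rule line integral of A along a closed polygon."""
    total = 0.0
    for (x0, y0), (x1, y1) in zip(corners, corners[1:] + corners[:1]):
        dx, dy = (x1 - x0) / n, (y1 - y0) / n
        for k in range(n):
            ax, ay = vector_potential(x0 + (k + 0.5) * dx, y0 + (k + 0.5) * dy, flux)
            total += ax * dx + ay * dy
    return total


def phase_shift(flux):
    """Phase difference e * flux / hbar between the two paths (SI units)."""
    return E_CHARGE * flux / HBAR


flux = 1.0  # illustrative value
around_solenoid = [(1.0, 1.0), (-1.0, 1.0), (-1.0, -1.0), (1.0, -1.0)]
away_from_solenoid = [(4.0, 1.0), (2.0, 1.0), (2.0, -1.0), (4.0, -1.0)]

print("loop enclosing the solenoid:", loop_integral(around_solenoid, flux))
print("loop not enclosing it:", loop_integral(away_from_solenoid, flux))
print("phase shift for flux h/(2e):", phase_shift(PLANCK_H / (2.0 * E_CHARGE)))
```

Only the loop that encircles the solenoid picks up the flux; a flux of $$h/(2e)$$ corresponds to a phase shift of $$\pi$$, which exchanges the maxima and minima of the interference pattern.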

6.2.3 Gauge Invariance

We have seen that physical observables connected to a wave function $$\varPsi $$ are invariant under a global change in the phase of the wave function itself
$$\begin{aligned} \varPsi \left( \mathbf {x},t\right) \rightarrow \varPsi \left( \mathbf {x}, t\right) \ e^{iq\alpha } \end{aligned}$$
(6.55)
where $$\alpha $$ is a real number.
The free-particle Schrödinger equation in particular is invariant with respect to a global change in the phase of the wave function. It is easy, however, to verify that this does not apply, in general, to a local change
$$\begin{aligned} \varPsi \left( \mathbf {x},t\right) \rightarrow \varPsi \left( \mathbf {x},t\right) \ e^{iq\alpha (\mathbf {x}, t)} \, . \end{aligned}$$
(6.56)
On the other hand, the electromagnetic field is, as it was discussed in Sect. 6.2.1, invariant under a combined local transformation of the scalar and vector potential:
$$\begin{aligned} \phi \rightarrow \phi -\frac{\partial \chi }{\partial t}\end{aligned}$$
(6.57)
$$\begin{aligned} \mathbf {A}\ \rightarrow \mathbf {A}+\mathbf {\nabla }\chi \end{aligned}$$
(6.58)
where $$\chi \left( t,\mathbf {x}\right) $$ is a scalar function of the time and space coordinates.
Remarkably, the Schrödinger equation modified using the minimal coupling prescription is invariant under a joint local transformation both of the phase of the wave function and of the electromagnetic four-potential:
$$\begin{aligned} \varPsi \left( \mathbf {x},t\right) \rightarrow \varPsi \left( \mathbf {x}, t\right) \ e^{ie\alpha \left( \mathbf {x}\right) } \end{aligned}$$
(6.59)
$$\begin{aligned} A^{\mu }\ \rightarrow A^{\mu }-{\partial }^{\mu }\ \alpha \left( \mathbf {x}\right) \, . \end{aligned}$$
(6.60)
Applying the minimal coupling prescription to the relativistic wave equations (Klein–Gordon and Dirac equations), these equations become also invariant under local gauge transformations, as we shall verify later.
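The joint invariance can be made concrete in one space dimension. The spatial part of $$A^{\mu }\rightarrow A^{\mu }-\partial ^{\mu }\alpha $$ reads $$\mathbf {A}\rightarrow \mathbf {A}+\mathbf {\nabla }\alpha $$, and the operator $$-i\mathbf {\nabla }-e\mathbf {A}$$ (with $$\hbar = 1$$) is proportional to $$\mathbf {\nabla }-ie\mathbf {A}$$. The sketch below checks numerically that this covariant derivative of the transformed wave function equals the phase factor times the covariant derivative of the original one, provided the potential is transformed as well. The functions `psi`, `a`, `alpha`, and the value of `CHARGE` are arbitrary illustrative choices.

```python
import cmath
import math

CHARGE = 0.3   # illustrative coupling e
STEP = 1e-5    # finite-difference step


def derivative(f, x):
    return (f(x + STEP) - f(x - STEP)) / (2.0 * STEP)


def covariant_derivative(psi, a, x):
    """One-dimensional spatial covariant derivative (d/dx - i e A) psi."""
    return derivative(psi, x) - 1j * CHARGE * a(x) * psi(x)


def psi(x):
    return cmath.exp(2j * x) * math.exp(-x * x)


def a(x):
    return math.sin(x)


def alpha(x):
    return x ** 3 - math.cos(x)


def grad_alpha(x):
    return 3.0 * x * x + math.sin(x)


def psi_gauged(x):
    return psi(x) * cmath.exp(1j * CHARGE * alpha(x))   # psi -> psi exp(i e alpha)


def a_gauged(x):
    return a(x) + grad_alpha(x)                          # A -> A + grad(alpha)


x0 = 0.4
phase = cmath.exp(1j * CHARGE * alpha(x0))
reference = phase * covariant_derivative(psi, a, x0)
joint = abs(covariant_derivative(psi_gauged, a_gauged, x0) - reference)
phase_only = abs(covariant_derivative(psi_gauged, a, x0) - reference)

print("joint transformation, mismatch:", joint)
print("phase change alone, mismatch:", phase_only)
```

The mismatch vanishes (up to the finite-difference error) only when the phase of the wave function and the potential are transformed together.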

Conversely, imposing the invariance under a local gauge transformation of the free-particle wave equations implies the introduction of a gauge field.

The gauge transformation of the wave functions can be written in a more general form as
$$\begin{aligned} {\varPsi \left( \mathbf {x},t\right) \rightarrow \varPsi \left( \mathbf {x}, t\right) {\ }\exp \left( i\alpha \left( \mathbf {x}\right) \hat{A}\right) \ } \end{aligned}$$
(6.61)
where $$\alpha \left( \mathbf {x}\right) $$ is a real function of the space coordinates and $$\hat{A}$$ a Hermitian operator, so that the transformation is unitary (see Sect. 5.3.3).

In the case of QED, Hermann Weyl, Vladimir Fock, and Fritz London found in the late 1920s that the invariance of a Lagrangian including fermion and field terms with respect to transformations associated with the U(1) group, corresponding to local rotations by $$\alpha \left( \mathbf {x}\right) $$ of the wave function phase, requires (and provides) the interaction term with the electromagnetic field, whose quantum is the photon.

The generalization of this symmetry to non-Abelian groups was introduced in 1954 by Chen Yang and Robert Mills.4 Indeed we shall see that:
  • The weak interaction is modeled by a “weak isospin” symmetry linking “weak isospin up” particles (identified, e.g., with the u-type quarks and with the neutrinos) and “weak isospin down” particles (identified, e.g., with the d-type quarks and with the charged leptons). We have seen that SU(2) is the minimal representation for such a symmetry. If $$\hat{A}$$ is chosen to be one of the generators of the SU(2) group, then the associated gauge transformation corresponds to a local rotation in a spinor space. The gauge fields needed to ensure the invariance of the wave equations under such transformations are the weak fields, which imply the existence of the $$W^{\pm }$$ and Z mediators (see Sect. 6.3).

  • The strong interaction is modeled by QCD, a theory exploiting the invariance of the strong interaction with respect to a rotation in color space. We shall see that SU(3) is the minimal representation for such a symmetry. If $$\hat{A}$$ is chosen to be one of the generators of the SU(3) group, then the associated gauge transformation corresponds to a local rotation in a complex three-dimensional vector space, which represents the color space. The gauge fields needed to assure the invariance of the wave equations under such transformations are the strong fields whose quanta are called gluons (see Sect. 6.4).

Figure 6.3 shows schematic representations of such transformations.

6.2.4 Dirac Equation Revisited

The Dirac equation was briefly introduced in Sect. 3.2.1. It is a linear equation, of first order in the space and time derivatives, describing free relativistic particles with spin 1/2 (electrons and positrons, for instance); being of first order, it overcomes some difficulties coming from the second-order character of the Klein–Gordon equation, which was the translation in quantum mechanical form of the relativistic Hamiltonian
$$\begin{aligned} H^2 = p^2 + m^2 \end{aligned}$$
replacing the Hamiltonian itself and the momentum with the appropriate operators:
$$\begin{aligned} \hat{H}^2 = \hat{p}^2 + m^2 \Longrightarrow - \frac{\partial ^2 \psi }{\partial t^2} = - \mathbf {\nabla }^2 \psi + m^2 \psi \, . \end{aligned}$$
(6.62)
Fig. 6.3

Schematic representations of U(1), SU(2), and SU(3) transformations applied to the models of QED, weak, and strong interactions

Dirac searched for an alternative relativistic equation starting from the generic form describing the evolution of a wave function, in the familiar form:
$$\begin{aligned} i \frac{\partial \psi }{\partial t} = \hat{H} \psi \, \end{aligned}$$
(6.63)
with a Hamiltonian operator linear in the momentum operator $$\hat{\mathbf {p}}$$ (Lorentz invariance requires that if the Hamiltonian has first derivatives with respect to time also the spatial derivatives should be of first order):
$$\begin{aligned} \hat{H} = \varvec{\alpha } \cdot \mathbf {p}+ \beta m \, . \end{aligned}$$
(6.64)
This must be compatible with the Klein–Gordon equation, and thus
$$\begin{aligned} \alpha ^2_i&= 1 \; ; \; \beta ^2 = 1 \nonumber \\ \alpha _i \beta + \beta \alpha _i&= 0 \nonumber \\ \alpha _i\alpha _j + \alpha _j\alpha _i&= 0 \quad (i \ne j) \, . \end{aligned}$$
(6.65)
Therefore, the parameters $$\varvec{\alpha }$$ and $$\beta $$ cannot be numbers. However, things work if they are matrices (and if these matrices are Hermitian it is guaranteed that the Hamiltonian is also Hermitian). It can be demonstrated that their lowest possible dimension is 4, i.e., they must be at least $$4\times 4$$ matrices.
Using the explicit form of the momentum operator $$\mathbf {p}= -i \mathbf {\nabla }$$, the Dirac equation can be written as
$$\begin{aligned} i \frac{\partial \psi }{\partial t} = \left( -i \varvec{\alpha } \cdot \mathbf {\nabla } + \beta m \right) \psi \, . \end{aligned}$$
(6.66)
The wave functions $$\psi $$ must thus be of the form:
$$\begin{aligned} \psi (\mathbf {r}, t) = \left( \begin{array}{c} \psi _1(x) \\ \psi _2(x) \\ \psi _3(x) \\ \psi _4(x) \end{array} \right) . \end{aligned}$$
(6.67)
We arrived at an interpretation of the Dirac equation as a four-dimensional matrix equation in which the solutions are four-component wavefunctions called bi-spinors. Plane wave solutions are
$$\begin{aligned} \psi (x) = u(\mathbf {p}) e^{i(\mathbf {p}\cdot \mathbf {r}- Et)} \end{aligned}$$
(6.68)
where $$u(\mathbf {p})$$ is also a four-component bi-spinor satisfying the eigenvalue equation
$$\begin{aligned} \left( \varvec{\alpha } \cdot \mathbf {p}+ \beta m \right) u(\mathbf {p}) = E u(\mathbf {p}) \, . \end{aligned}$$
(6.69)
This equation has four solutions: two with positive energy $$E =+E_p$$ and two with negative energy $$E =-E_p$$. We will discuss later the interpretation of the negative energy solutions. The Dirac equation accounts “for free” for the existence of two spin states, which had to be inserted by hand in the Schrödinger equation of nonrelativistic quantum mechanics, and therefore explains the magnetic moment of point-like fermions. In addition, since spin is embedded in the equation, the Dirac equation allows one to compute correctly the energy splitting of atomic levels with the same quantum numbers due to the spin–orbit and spin–spin interactions in atoms (fine and hyperfine splitting).

We shall now write the free-particle Dirac equation in a more compact form, from which relativistic covariance is immediately visible. This requires the introduction of a new set of important 4$$\,\times \,$$4 matrices, the $${\gamma }^{\mu }$$ matrices, which replace the $$\alpha _i$$ and $$\beta $$ matrices discussed before. To account for electromagnetic interactions, the minimal coupling prescription can once again be used.

A possible choice, the Dirac-Pauli representation, for $${\alpha }_i$$ and $$\beta \ $$ satisfying the conditions (6.65) is the set of matrices:
$$\begin{aligned} {\alpha }_i =\left( \begin{array}{cc} 0 &{} {\sigma }_i \\ {\sigma }_i &{} 0 \end{array} \right) \; ; \; \beta =\left( \begin{array}{cc} I &{} 0 \\ 0 &{} -I \end{array} \right) \end{aligned}$$
(6.70)
where $${\sigma }_i$$ are the $$2 \times 2$$ Pauli matrices (see Sect. 5.7.2) and I is the $$2\times 2$$ unit matrix.
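The conditions (6.65) can be checked numerically for the Dirac–Pauli matrices of Eq. (6.70). The following is a minimal sketch using NumPy; the momentum and mass values are hypothetical numbers chosen only for illustration (natural units), and the variable names are not part of the text. It also confirms that $$\hat{H}=\varvec{\alpha }\cdot \mathbf {p}+\beta m$$ has the eigenvalues $$\pm E_p$$, each appearing twice.

```python
import numpy as np

# Pauli matrices and 2x2 building blocks
sigma = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]
I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
I4 = np.eye(4, dtype=complex)

# Dirac-Pauli representation of Eq. (6.70)
alpha = [np.block([[Z2, s], [s, Z2]]) for s in sigma]
beta = np.block([[I2, Z2], [Z2, -I2]])


def anticommutator(a, b):
    """Return ab + ba."""
    return a @ b + b @ a


# Conditions (6.65): unit squares and vanishing anticommutators
squares_ok = all(np.allclose(x @ x, I4) for x in alpha + [beta])
alpha_beta_ok = all(np.allclose(anticommutator(a, beta), 0) for a in alpha)
alpha_alpha_ok = all(
    np.allclose(anticommutator(alpha[i], alpha[j]), 0)
    for i in range(3) for j in range(3) if i != j
)

# Hypothetical momentum and mass, chosen only for illustration
p = np.array([0.3, -0.4, 1.2])
m = 0.5
H = p[0] * alpha[0] + p[1] * alpha[1] + p[2] * alpha[2] + m * beta
E_p = np.sqrt(p @ p + m**2)

# H is Hermitian, so eigvalsh applies; eigenvalues are returned in ascending order
energies = np.linalg.eigvalsh(H)

print(squares_ok, alpha_beta_ok, alpha_alpha_ok)
print(energies, E_p)
```

Because of the anticommutation relations, $$\hat{H}^2$$ is proportional to the unit matrix with coefficient $$\mathbf {p}^2+m^2$$, which is the compatibility with the Klein–Gordon equation required above.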
Multiplying the Dirac equation (6.66) by $$\beta $$ one has
$$\begin{aligned} i\beta \frac{\partial \psi }{\partial t}=\left( -i\beta \varvec{\alpha }\cdot \mathbf {\nabla }+m\right) \psi \, , \end{aligned}$$
and introducing the Pauli–Dirac $${\gamma }^{\mu }$$ matrices defined as
$$\begin{aligned} {\gamma }^0=\beta \; ; \; \varvec{\gamma }=\beta \varvec{\alpha } \end{aligned}$$
(6.71)
$$\begin{aligned} \gamma ^0 = \left( \begin{array}{c@{\quad }c@{\quad }c@{\quad }c}1 &{} 0 &{} 0 &{} 0 \\ 0 &{} 1 &{} 0 &{} 0 \\ 0 &{} 0 &{} -1 &{} 0 \\ 0 &{} 0 &{} 0 &{} -1 \end{array}\right) ; \gamma ^1 \!=\! \left( \begin{array}{c@{\quad }c@{\quad }c@{\quad }c} 0 &{} 0 &{} 0 &{} 1 \\ 0 &{} 0 &{} 1 &{} 0 \\ 0 &{} -1 &{} 0 &{} 0 \\ -1 &{} 0 &{} 0 &{} 0 \end{array}\right) ;\\ \gamma ^2 \!=\! \left( \begin{array}{c@{\quad }c@{\quad }c@{\quad }c} 0 &{} 0 &{} 0 &{} -i \\ 0 &{} 0 &{} i &{} 0 \\ 0 &{} i &{} 0 &{} 0 \\ -i &{} 0 &{} 0 &{} 0 \end{array}\right) ; \gamma ^3 \!=\! \left( \begin{array}{c@{\quad }c@{\quad }c@{\quad }c} 0 &{} 0 &{} 1 &{} 0 \\ 0 &{} 0 &{} 0 &{} -1 \\ -1 &{} 0 &{} 0 &{} 0 \\ 0 &{} 1 &{} 0 &{} 0 \end{array}\right) \end{aligned}$$
then:
$$\begin{aligned} \left[ i\left( {\gamma }^0\frac{\partial }{\partial x^0}+{\gamma }^i\frac{\partial }{\partial x^i}\right) -m\right] \psi =0 \, . \end{aligned}$$
(6.72)
If we use a four-vector notation
$$\begin{aligned} \gamma ^{\mu } = (\beta ,\beta \varvec{\alpha }) \, , \end{aligned}$$
(6.73)
taking into account that
$$\begin{aligned} {\partial }_{\mu }=\left( \frac{\partial }{\partial t},\ \mathbf {\nabla }\right) \, , \end{aligned}$$
(6.74)
the Dirac equation can be finally written as:
$$\begin{aligned} (i{\gamma }^{\mu }{\partial }_{\mu }-m) \psi =0 \, . \end{aligned}$$
(6.75)
This is an extremely compact form of writing a set of four differential equations applied to a four-component vector $$\psi $$ (often called a bi-spinor). We call it the covariant form of the Dirac equation (its form is preserved in all the inertial frames).
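The algebra behind the covariant form can be verified numerically. The sketch below (NumPy; variable names are illustrative) builds the $$\gamma ^{\mu }$$ matrices from Eq. (6.71) and checks the relation $$\gamma ^{\mu }\gamma ^{\nu }+\gamma ^{\nu }\gamma ^{\mu }=2g^{\mu \nu }$$, assuming the metric signature $$(+,-,-,-)$$. This relation is not written explicitly in the text but follows from the conditions (6.65).

```python
import numpy as np

sigma = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]
I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)

beta = np.block([[I2, Z2], [Z2, -I2]])
alpha = [np.block([[Z2, s], [s, Z2]]) for s in sigma]

# Eq. (6.71): gamma^0 = beta, gamma^i = beta alpha_i
gamma = [beta] + [beta @ a for a in alpha]

# Assumed metric g^{mu nu} with signature (+,-,-,-)
metric = np.diag([1.0, -1.0, -1.0, -1.0])

clifford_ok = all(
    np.allclose(
        gamma[mu] @ gamma[nu] + gamma[nu] @ gamma[mu],
        2 * metric[mu, nu] * np.eye(4),
    )
    for mu in range(4) for nu in range(4)
)

# gamma^1 as printed explicitly in the text, for comparison
gamma1_text = np.array(
    [[0, 0, 0, 1], [0, 0, 1, 0], [0, -1, 0, 0], [-1, 0, 0, 0]], dtype=complex
)
gamma1_ok = np.allclose(gamma[1], gamma1_text)

print(clifford_ok, gamma1_ok)
```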

Let us examine now the solutions of the Dirac equation in some particular cases.

6.2.4.1 Particle at Rest

Particles at rest have $$\mathbf {p}=0$$ and thus
$$\begin{aligned} \left( i{\gamma }^0\frac{\partial }{\partial t}-m\right) \psi =0 \end{aligned}$$
(6.76)
$$\begin{aligned} \left( \begin{array}{cc} I &{} 0 \\ 0 &{} -I \end{array} \right) \left( \begin{array}{c} \frac{\partial }{\partial t}{\psi }_A \\ {\frac{\partial }{\partial t}\psi }_B \end{array} \right) =-im\left( \begin{array}{c} {\psi }_A \\ {\psi }_B \end{array} \right) \end{aligned}$$
(6.77)
where $${\psi }_A$$ and $${\psi }_B$$ are two-component spinors:
$$\begin{aligned} {\psi }_A{=}\left( \begin{array}{c} {\psi }_1 \\ {\psi }_2 \end{array} \right) \end{aligned}$$
(6.78)
$$\begin{aligned} {\psi }_B{=}\left( \begin{array}{c} {\psi }_3 \\ {\psi }_4 \end{array} \right) \, . \end{aligned}$$
(6.79)
In this simple case, the two spinors are subject to two independent differential equations:
$$\begin{aligned} \frac{\partial }{\partial t}{\psi }_A=-im{\psi }_A \end{aligned}$$
(6.80)
$$\begin{aligned} \frac{\partial }{\partial t}{\psi }_B=im{\psi }_B \end{aligned}$$
(6.81)
which have as solution (up to some normalization factor):
  • $${\psi }_A=\ e^{-imt}{\psi }_A\left( 0\right) $$ with energy $$E=m>0$$;

  • $${\psi }_B=\ e^{imt}{\psi }_B\left( 0\right) $$ with energy $$E=-m<0$$

or in terms of each component of the wavefunction vector
$$\begin{aligned} {\psi }_1= e^{-imt}\left( \begin{array}{c} 1 \\ 0 \\ 0 \\ 0 \end{array} \right) \; \; {\psi }_2= e^{-imt}\left( \begin{array}{c} 0 \\ 1 \\ 0 \\ 0 \end{array} \right) \end{aligned}$$
(6.82)
$$\begin{aligned} {\psi }_3= e^{imt}\left( \begin{array}{c} 0 \\ 0 \\ 1 \\ 0 \end{array} \right) \; \; {\psi }_4= e^{imt}\left( \begin{array}{c} 0 \\ 0 \\ 0 \\ 1 \end{array} \right) . \end{aligned}$$
(6.83)
There are then four solutions which can accommodate a spin 1 / 2 particle or antiparticle. The positive energy solutions $${\psi }_1$$ and $${\psi }_2\ $$correspond to fermions (electrons for instance) with spin up and down, respectively, while the negative energy solutions $${\psi }_3$$ and $${\psi }_4\ $$correspond to antifermions (positrons for instance) with spin up and down.

6.2.4.2 Free Particle

Free particles have $$\mathbf {p}=\mathrm {constant}$$ and their wave function is a plane wave of the form:
$$\begin{aligned} \psi \left( \mathbf {x},t\right) =\ u\left( \mathbf {p}, p_0\right) e^{-i\left( p_0t-\mathbf {p}\cdot \mathbf {x}\right) } \end{aligned}$$
(6.84)
where
$$u\left( \mathbf {p}, p_0\right) ={N}\left( \begin{array}{c} \phi \\ \chi \end{array} \right) $$
is a bi-spinor ($$\phi $$, $$\chi $$ are spinors) and N a normalization factor.
The Dirac equation can be written as a function of the energy–momentum operators as
$$\begin{aligned} \left( \left( {\gamma }^0p_0-\varvec{\gamma }\cdot \mathbf {p}\right) -m\right) \psi =0 \end{aligned}$$
(6.85)
Inserting the equation of a plane wave as a trial solution and using the Pauli–Dirac representation of the $$\gamma $$ matrices:
$$\begin{aligned} \left( \begin{array}{cc} (p_0-m)I &{} -\varvec{\sigma }\cdot \mathbf {p} \\ \varvec{\sigma }\cdot \mathbf {p} &{} ({-p}_0-m)I \end{array} \right) \left( \begin{array}{c} \phi \\ \chi \end{array} \right) =0 \, . \end{aligned}$$
(6.86)
I is again the $$2\times 2$$ unit matrix, which is often omitted when writing the equations, and
$$\begin{aligned} \varvec{\sigma } \cdot \mathbf {p} = \left( \begin{array}{cc} p_z &{} p_x - i p_y \\ p_x + i p_y &{} - p_z \end{array} \right) \, . \end{aligned}$$
(6.87)
For $$\mathbf {p}=0$$, the “particle at rest” solution discussed above is recovered. Otherwise, there are two coupled equations for the spinors $$\phi $$ and $$\chi $$:
$$\begin{aligned} \phi =\,\frac{\varvec{\sigma }\cdot \mathbf {p}}{{ E-m}}\chi \end{aligned}$$
(6.88)
$$\begin{aligned} \chi =\,\frac{\varvec{\sigma }\cdot \mathbf {p}}{{ E+m}}\phi \end{aligned}$$
(6.89)
and then the u bi-spinor can be written either in terms of the spinor $$\phi $$ or in terms of the spinor $$\chi $$:
$$\begin{aligned} u_1={N}\left( \begin{array}{c} \phi \\ \frac{\varvec{\sigma }\cdot \mathbf {p}}{{ E\,+\, m}}\phi \end{array} \right) \end{aligned}$$
(6.90)
$$\begin{aligned} u_2={N}\left( \begin{array}{c} \frac{{ -}\varvec{\sigma }\cdot \mathbf {p}}{{ -}{ E\,+\, m}}\chi \\ \chi \end{array} \right) \, . \end{aligned}$$
(6.91)
The first solution corresponds to states with $${ E}>0$$ (particles) and the second to states with $${ E}<0$$ (antiparticles) as can be seen by going to the $$\mathbf {p}=0\ $$limit. These last states can be rewritten changing the sign of E and $$\mathbf {p}$$ and labeling the bi-spinor $$u_2$$ as v ($$u_1$$ is then labeled just as u).
$$\begin{aligned} v={N}\left( \begin{array}{c} \frac{\varvec{\sigma }\cdot \mathbf {p}}{{ E+m}}\chi \\ \chi \end{array} \right) . \end{aligned}$$
(6.92)
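As a consistency check (a short worked step added here for illustration), one can insert (6.89) into (6.88) and use the identity $$(\varvec{\sigma }\cdot \mathbf {p})^2=|\mathbf {p}|^2 I$$, which follows from the anticommutation properties of the Pauli matrices:
$$\begin{aligned} \phi =\frac{(\varvec{\sigma }\cdot \mathbf {p})^2}{(E-m)(E+m)}\,\phi =\frac{|\mathbf {p}|^2}{E^2-m^2}\,\phi \; \Longrightarrow \; E^2=|\mathbf {p}|^2+m^2 \, . \end{aligned}$$
Nontrivial solutions therefore exist only for $$E=\pm \sqrt{|\mathbf {p}|^2+m^2}$$, the two signs corresponding to the two classes of solutions discussed above.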
Both $$\phi $$ and $$\chi $$ can be written in a basis of unit vectors $${\chi }_s$$ with
$$\begin{aligned} {\chi }_{s=1}=\left( \begin{array}{c} { 1} \\ 0 \end{array} \right) \end{aligned}$$
(6.93)
$$\begin{aligned} {\chi }_{s=2}=\left( \begin{array}{c} 0 \\ 1 \end{array} \right) . \end{aligned}$$
(6.94)
Finally, we have then again four solutions: two for the particle states and two for the antiparticle states.
The normalization factor N is often defined as
$$\begin{aligned} {N}{ =}\frac{\sqrt{{ E+m}}}{\sqrt{{ V}}} \end{aligned}$$
(6.95)
ensuring a standard relativistic normalization convention of 2E particles per box of volume V. In fact, introducing the Hermitian conjugates $$u^{\dagger }$$ and $$v^{\dagger }$$ of the bi-spinors, one finds
$$\begin{aligned} u^{\dagger }u=v^{\dagger }v=2E/V \, . \end{aligned}$$
(6.96)
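The free-particle bi-spinors and their normalization can be checked numerically. The sketch below (NumPy) assumes $$V=1$$, uses hypothetical values of momentum and mass, and introduces helper names (`sigma_dot`, `u_spinor`, `v_spinor`) purely for illustration. It verifies that u in (6.90) is an eigenvector of $$\varvec{\alpha }\cdot \mathbf {p}+\beta m$$ with eigenvalue $$+E$$, that v in (6.92) is an eigenvector of the same operator evaluated at $$-\mathbf {p}$$ with eigenvalue $$-E$$, and that $$u^{\dagger }u=v^{\dagger }v=2E$$.

```python
import numpy as np

SIGMA = np.array([
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
], dtype=complex)


def sigma_dot(p):
    """Return the 2x2 matrix sigma . p."""
    return np.einsum("i,ijk->jk", p, SIGMA)


def dirac_hamiltonian(p, m):
    """H = alpha . p + beta m in the Dirac-Pauli representation."""
    sp = sigma_dot(p)
    return np.block([[m * np.eye(2), sp], [sp, -m * np.eye(2)]])


def u_spinor(p, m, s):
    """Bi-spinor of Eq. (6.90) with N = sqrt(E + m), i.e. V = 1 (assumption)."""
    E = np.sqrt(p @ p + m**2)
    phi = np.eye(2, dtype=complex)[s]
    return np.sqrt(E + m) * np.concatenate([phi, sigma_dot(p) @ phi / (E + m)])


def v_spinor(p, m, s):
    """Bi-spinor of Eq. (6.92) with the same normalization."""
    E = np.sqrt(p @ p + m**2)
    chi = np.eye(2, dtype=complex)[s]
    return np.sqrt(E + m) * np.concatenate([sigma_dot(p) @ chi / (E + m), chi])


# Hypothetical momentum and mass, chosen only for illustration
p = np.array([0.3, -0.4, 1.2])
m = 0.5
E = np.sqrt(p @ p + m**2)

u_residuals, v_residuals, norms = [], [], []
for s in (0, 1):
    u, v = u_spinor(p, m, s), v_spinor(p, m, s)
    # u solves H(p) u = +E u; v solves H(-p) v = -E v
    u_residuals.append(np.linalg.norm(dirac_hamiltonian(p, m) @ u - E * u))
    v_residuals.append(np.linalg.norm(dirac_hamiltonian(-p, m) @ v + E * v))
    norms += [np.vdot(u, u).real, np.vdot(v, v).real]

print(u_residuals, v_residuals)
print(norms, 2 * E)
```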

6.2.4.3 Helicity

The spin operator $$\mathbf {S}$$ introduced in Sect. 5.​7.​2 can now be generalized in this bi-spinor space as
$$\begin{aligned} \mathbf {S}=\frac{1}{2}\mathbf {\Sigma } \end{aligned}$$
(6.97)
where
$$\begin{aligned} \mathbf {\Sigma }\,=\,\left( \begin{array}{cc} \varvec{\sigma } &{} 0 \\ 0 &{} \varvec{\sigma } \end{array} \right) \, . \end{aligned}$$
(6.98)
More generally, defining the helicity operator h as the projection of the spin over the momentum direction:
$$\begin{aligned} \mathrm{{h}} =\frac{1}{2}\frac{\varvec{\sigma }\cdot \mathbf {p}}{\left| \mathbf {p}\right| } \end{aligned}$$
(6.99)
there are always four eigenstates of this operator. Indeed, using spherical polar coordinates $$(\theta ,\phi )$$:
$$\begin{aligned} \mathbf {p} = |\mathbf {p}| (\sin \theta \cos \phi \mathbf {e}_x + \sin \theta \sin \phi \mathbf {e}_y + \cos \theta \mathbf {e}_z) \, , \end{aligned}$$
(6.100)
and the helicity operator, acting on each two-component spinor, is given by
$$\begin{aligned} \mathrm{{h}}\,=\,\frac{1}{2}\left( \begin{array}{cc} \cos \theta &{} \sin \theta e^{-i \phi } \\ \sin \theta e^{i \phi } &{} -\cos \theta \end{array} \right) \, . \end{aligned}$$
(6.101)
The eigenstates of the operator h can also be written as
$$\begin{aligned} u_{\uparrow }\,= \sqrt{{E+m}} \left( \begin{array}{c} \cos \left( \frac{\theta }{2} \right) \\ \sin \left( \frac{\theta }{2} \right) e^{i \phi } \\ \frac{p}{E+m} \cos \left( \frac{\theta }{2} \right) \\ \frac{p}{E+m} \sin \left( \frac{\theta }{2} \right) e^{i \phi } \end{array} \right) ; u_{\downarrow }\,= \sqrt{{E+m}} \left( \begin{array}{c} - \sin \left( \frac{\theta }{2} \right) \\ \cos \left( \frac{\theta }{2} \right) e^{i \phi } \\ \frac{p}{E+m} \sin \left( \frac{\theta }{2} \right) \\ - \frac{p}{E+m} \cos \left( \frac{\theta }{2} \right) e^{i \phi } \end{array} \right) \end{aligned}$$
(6.102)
$$\begin{aligned} v_{\uparrow }\,= \sqrt{{E+m}} \left( \begin{array}{c} \frac{p}{E+m} \sin \left( \frac{\theta }{2} \right) \\ - \frac{p}{E+m} \cos \left( \frac{\theta }{2} \right) e^{i \phi } \\ - \sin \left( \frac{\theta }{2} \right) \\ \cos \left( \frac{\theta }{2} \right) e^{i \phi } \end{array} \right) ; v_{\downarrow }\,= \sqrt{{E+m}} \left( \begin{array}{c} \frac{p}{E+m} \cos \left( \frac{\theta }{2} \right) \\ \frac{p}{E+m} \sin \left( \frac{\theta }{2} \right) e^{i \phi }\\ \cos \left( \frac{\theta }{2} \right) \\ \sin \left( \frac{\theta }{2} \right) e^{i \phi } \end{array} \right) . \end{aligned}$$
(6.103)
Note that helicity is Lorentz invariant only in the case of massless particles (otherwise the direction of $$\mathbf {p}$$ can be inverted choosing an appropriate reference frame).
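The eigenstates (6.102) and (6.103) can be tested numerically. The following sketch (NumPy) assumes that the helicity operator acts on bi-spinors as $$\frac{1}{2}\mathbf {\Sigma }\cdot \mathbf {p}/|\mathbf {p}|$$, with $$\mathbf {\Sigma }$$ from (6.98); the function names and the numerical values of E, m, $$\theta $$, and $$\phi $$ are hypothetical. For the v bi-spinors the eigenvalue of this operator has the opposite sign with respect to the arrow label, which is usually interpreted as the label referring to the physical antiparticle; this interpretation is a convention assumed here and not spelled out in the text.

```python
import numpy as np


def helicity_operator(theta, phi):
    """4x4 operator (1/2) Sigma . p / |p| for a momentum along (theta, phi)."""
    sn = np.array([
        [np.cos(theta), np.sin(theta) * np.exp(-1j * phi)],
        [np.sin(theta) * np.exp(1j * phi), -np.cos(theta)],
    ])
    return 0.5 * np.kron(np.eye(2), sn)


def helicity_spinors(E, m, theta, phi):
    """Return (u_up, u_down, v_up, v_down) as in Eqs. (6.102) and (6.103)."""
    p = np.sqrt(E**2 - m**2)
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    ph = np.exp(1j * phi)
    k = p / (E + m)
    n = np.sqrt(E + m)
    u_up = n * np.array([c, s * ph, k * c, k * s * ph])
    u_down = n * np.array([-s, c * ph, k * s, -k * c * ph])
    v_up = n * np.array([k * s, -k * c * ph, -s, c * ph])
    v_down = n * np.array([k * c, k * s * ph, c, s * ph])
    return u_up, u_down, v_up, v_down


# Hypothetical kinematics, chosen only for illustration
E, m, theta, phi = 2.0, 0.5, 0.7, 1.9
h = helicity_operator(theta, phi)
u_up, u_down, v_up, v_down = helicity_spinors(E, m, theta, phi)

# Expected operator eigenvalues: +1/2, -1/2 for u; -1/2, +1/2 for v
checks = {
    "u_up": np.allclose(h @ u_up, 0.5 * u_up),
    "u_down": np.allclose(h @ u_down, -0.5 * u_down),
    "v_up": np.allclose(h @ v_up, -0.5 * v_up),
    "v_down": np.allclose(h @ v_down, 0.5 * v_down),
}
print(checks)
```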

6.2.4.4 Dirac Adjoint, the $${\varvec{\gamma }}^\mathbf{{5}}$$ Matrix, and Bilinear Covariants

The Dirac bi-spinors are not real four-vectors, and it can be shown that the product $${\psi }^{\dagger }\psi $$ is not a Lorentz invariant (a scalar). The product $$\overline{\psi }\psi $$, on the contrary, is a Lorentz invariant, where $$\overline{\psi }$$ is the adjoint Dirac spinor, defined as:
$$\begin{aligned} \overline{\psi }={\psi }^{\dagger }{\gamma }^0 = \left( {\psi }^*_1,{\psi }^*_2,{\psi }^*_3,{\psi }^*_4\right) \ \left( \begin{array}{cccc} 1 &{} 0 &{} 0 &{} 0 \\ 0 &{} 1 &{} 0 &{} 0 \\ 0 &{} 0 &{} -1 &{} 0 \\ 0 &{} 0 &{} 0 &{} -1 \\ \end{array} \right) = \left( {\psi }^*_1,{\psi }^*_2,-{\psi }^*_3,-{\psi }^*_4\right) . \end{aligned}$$
(6.104)
The parity operator P in the Dirac bi-spinor space is just the matrix $${\gamma }^0$$ (it reverses the sign of the terms which are functions of $$\mathbf {p}$$), and
$$\begin{aligned} P\left( \overline{\psi }\psi \right) ={\psi }^{\dagger }{\gamma }^0{\gamma }^0{\gamma }^0\psi = \overline{\psi }\psi \end{aligned}$$
(6.105)
as $${\left( {\gamma }^0\right) }^2=1$$.
Other quantities can be constructed using $$\psi $$ and $$\overline{\psi }$$ (bilinear covariants). In particular introducing $${\gamma }^5$$ as
$$\begin{aligned} {\gamma }^5=i{\gamma }^0{\gamma }^1{\gamma }^2{\gamma }^3=\left( \begin{array}{cccc} 0 &{} 0 &{} 1 &{} 0 \\ 0 &{} 0 &{} 0 &{} 1 \\ 1 &{} 0 &{} 0 &{} 0 \\ 0 &{} 1 &{} 0 &{} 0 \\ \end{array} \right) : \end{aligned}$$
(6.106)
  • $$\overline{\psi }{\gamma }^5\psi $$ is a pseudoscalar.

  • $$\overline{\psi }{\gamma }^{\mu }\psi $$ is a four-vector.

  • $$\overline{\psi }{\gamma }^{\mu }{\gamma }^5\psi $$ is a pseudo four-vector.

  • $$\left( \overline{\psi }{\ \sigma }^{\mu \nu }\psi \right) $$, where $${\ \sigma }^{\mu \nu }=\frac{i}{2}\left( {\gamma }^{\mu }{\gamma }^\nu -{\gamma }^\nu {\gamma }^{\mu }\right) $$, is an antisymmetric tensor.
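The matrix $${\gamma }^5$$ can be obtained by direct multiplication of the Pauli–Dirac $$\gamma $$ matrices. The sketch below (NumPy; variable names are illustrative) evaluates the product and checks three standard properties that are not listed in the text: $$({\gamma }^5)^2=1$$, $${\gamma }^5$$ is Hermitian, and $${\gamma }^5$$ anticommutes with every $${\gamma }^{\mu }$$. The last property is what distinguishes the pseudoscalar and pseudo four-vector bilinears from the scalar and four-vector ones under parity.

```python
import numpy as np

sigma = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]
I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)

# Pauli-Dirac representation: gamma^0 = beta, gamma^i = beta alpha_i
gamma0 = np.block([[I2, Z2], [Z2, -I2]])
gamma = [gamma0] + [np.block([[Z2, s], [-s, Z2]]) for s in sigma]

gamma5 = 1j * gamma[0] @ gamma[1] @ gamma[2] @ gamma[3]

# In this representation the product is block off-diagonal
gamma5_block = np.block([[Z2, I2], [I2, Z2]])

square_ok = np.allclose(gamma5 @ gamma5, np.eye(4))
hermitian_ok = np.allclose(gamma5.conj().T, gamma5)
anticommute_ok = all(np.allclose(gamma5 @ g + g @ gamma5, 0) for g in gamma)

print(gamma5.real.astype(int))
print(square_ok, hermitian_ok, anticommute_ok)
```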

6.2.4.5 Dirac Equation in the Presence of an Electromagnetic Field

The Dirac equation in the presence of an electromagnetic field can be obtained applying the minimal coupling prescription discussed in Sect. 6.2.2. In practice this is obtained by replacing the $${\partial }^{\mu }$$ derivatives by the covariant derivative $$D^{\mu }$$:
$$\begin{aligned} {\partial }^{\mu }\rightarrow D^{\mu }\,\equiv \,{\partial }^{\mu }+ieA^{\mu } \, . \end{aligned}$$
(6.107)
Then
$$\begin{aligned} \left( i{\gamma }^{\mu }D_{\mu }-m\right) \psi =0 \end{aligned}$$
(6.108)
$$\begin{aligned} \left( i{\gamma }^{\mu }{\partial }_{\mu }-e\ {\gamma }^{\mu }A_{\mu }-m\right) \psi =0 \, . \end{aligned}$$
(6.109)
The interaction with a magnetic field can be then described introducing the two spinors $$\phi $$ and $$\chi $$ and using the Pauli–Dirac representation of the $$\gamma $$ matrices:
$$\begin{aligned} \left( \begin{array}{cc} p_0-m - eA_0 &{} -\varvec{\sigma }\cdot \left( -i\mathbf {\nabla }-e\mathbf {A}\right) \\ \varvec{\sigma }\cdot \left( -i\mathbf {\nabla }-e\mathbf {A}\right) &{} {-p}_0-m + eA_0 \end{array} \right) \left( \begin{array}{c} \phi \\ \chi \end{array} \right) =0 \, . \end{aligned}$$
(6.110)
In the nonrelativistic limit ($$E\approx m$$) and for a pure magnetic field ($$A_0=0$$), the Dirac equation reduces to
$$\begin{aligned} \frac{1}{2m}{\left| \mathbf {{ p}}-e\mathbf {A}\right| }^2\psi -\frac{e\mathbf {B}\cdot \mathbf {\Sigma }}{2m}\psi =(E-m)\,\psi \end{aligned}$$
(6.111)
where the magnetic field $$\mathbf {B}=\mathbf {\nabla }\times \mathbf {A}$$ has been reintroduced.
There is thus a coupling of the form $$\ -\varvec{\mu }\cdot \mathbf {B}$$ between the magnetic field and the spin of a point-like charged particle (the electron or the muon for instance), and the quantity
$$\begin{aligned} \varvec{\mu }={ \ }{\varvec{\mu }}_S=\frac{e}{m}\frac{1}{2}\mathbf {\Sigma }=\frac{e}{m}\mathbf {S} \end{aligned}$$
(6.112)
can be identified with the intrinsic magnetic moment of a charged particle with spin $$\mathbf {S}$$.
Defining the gyromagnetic ratio g as the ratio between $${\varvec{\mu }}_S$$ and the classical magnetic moment $${\varvec{\mu }}_L=\frac{e}{2m}\mathbf {L}$$ of a charged particle with an angular momentum $$\mathbf {L}=\mathbf {S}$$, one obtains:
$$\begin{aligned} g=\frac{{\varvec{\mu }}_S}{{\varvec{\mu }}_L}=2 \, . \end{aligned}$$
(6.113)

6.2.4.6 $${g}-2$$

The value of the coupling between the magnetic field and the spin of the point charged particle is however modified by higher-order corrections, which can be translated into successive Feynman diagrams such as the ones we have seen in Fig. 6.1. In second order, the main correction is the vertex correction described by the diagram in Fig. 6.4, computed in 1948 by Schwinger, which leads to a deviation of g from 2 (anomalous magnetic moment) with magnitude:
$$\begin{aligned} a_e=\frac{g-2}{2}\simeq \frac{\alpha }{2\pi } \simeq 0.0011614 \, . \end{aligned}$$
(6.114)
Fig. 6.4

Second-order vertex correction to g

Nowadays, the theoretical corrections are completely computed up to the eighth-order (891 diagrams) and the most significant tenth-order terms as well as electroweak and hadronic corrections are also computed. There is a remarkable agreement with the present experimental value of:
$$\begin{aligned} a^\mathrm{exp}_e=0.00115965218076\pm 0.00000000000027 \, . \end{aligned}$$
(6.115)
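The size of the second-order term can be reproduced with a few lines of arithmetic. The sketch below assumes $$1/\alpha \approx 137.035999$$, a value not quoted in the text, and compares the Schwinger term with the experimental value (6.115).

```python
import math

# Assumed value of the inverse fine-structure constant (not given in the text)
ALPHA_INV = 137.035999

alpha = 1.0 / ALPHA_INV
a_schwinger = alpha / (2.0 * math.pi)    # second-order term, Eq. (6.114)

a_exp = 0.00115965218076                 # experimental value, Eq. (6.115)
relative_gap = (a_schwinger - a_exp) / a_exp

print(f"alpha/(2 pi) = {a_schwinger:.7f}")
print(f"relative difference from experiment = {relative_gap:.2%}")
```

The second-order term alone already agrees with the measurement at the level of about two parts per thousand; the remaining difference is what the higher-order terms account for.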
Historically, the first high precision $$g-2$$ measurements were accomplished by H. Richard Crane and his group in the years 1950–1967 at the University of Michigan, USA. A beam of electrons is first polarized and then trapped in a magnetic bottle for a (long) time T. After this time, the beam is extracted and the polarization is measured (Fig. 6.5).
Fig. 6.5

Schematic drawing of the g – 2 experiment from H. Richard Crane

Under the influence of the magnetic field B in the box, the spin of the electron precesses with angular velocity
$$\begin{aligned} \omega _p=\frac{g\ e\ B}{2\ m} \end{aligned}$$
(6.116)
while the electron follows a helicoidal trajectory with an angular velocity of
$$\begin{aligned} \omega _{rot}=\frac{\ e\ B}{\ m} \, . \end{aligned}$$
(6.117)
The polarization of the outgoing beam is thus proportional to the ratio
$$\begin{aligned} \frac{\omega _p}{\omega _{rot}}=\frac{g}{2} \, . \end{aligned}$$
(6.118)
Nowadays, Penning traps are used to keep electrons (and positrons) confined for months. Such a device, invented by H. Dehmelt in the 1980s, uses a homogeneous static magnetic field and a spatially inhomogeneous static electric field to trap charged particles (Fig. 6.6).
Fig. 6.6

Schematic representation of the electric and magnetic fields inside a Penning trap.

By Arian Kriesch Akriesch 23:40, [own work, GFDL http://​www.​gnu.​org/​copyleft/​fdl.​html, CC-BY-SA-3.0], via Wikimedia Commons

The muon and electron magnetic moments are equal at first order. However, the loop corrections are proportional to the square of the respective masses and thus those of the muon are much larger $$\left( {m_{\mu }}^2/{m_e}^2\sim 4\times {10}^4\right) $$. In particular, the sensitivity to loops involving hypothetical new particles (see Chap. 7 for a survey) is much higher, and a precise measurement of the muon anomalous magnetic moment $$a_{\mu }$$ may be used as a test of the standard model.
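The enhancement factor quoted above can be checked directly. The mass values below (in MeV) are assumed inputs, not taken from the text.

```python
# Assumed lepton masses in MeV (not quoted in the text)
M_MUON = 105.658
M_ELECTRON = 0.510999

ratio = (M_MUON / M_ELECTRON) ** 2
print(f"(m_mu / m_e)^2 = {ratio:.3e}")
```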

The most precise measurement of $$a_{\mu }$$ so far was done by the experiment E821 at Brookhaven National Laboratory (BNL). A beam of polarized muons circulates in a storage ring with a diameter of $$\sim $$14 m under the influence of a uniform magnetic field (Fig. 6.7). The muon spin precesses, and the polarization of the beam is a function of time. After many turns, muons decay to electrons (and neutrinos) whose momentum is basically aligned with the direction of the muon spin (see Sect. 6.3). The measured value is
$$\begin{aligned} a^\mathrm{exp}_{\mu }=0.00116592083\pm 0.00000000063 \, . \end{aligned}$$
(6.119)
This result is more than $$3\ \sigma $$ away from the expected one, which has led to a wide discussion both on the accuracy of the theoretical computation (in particular of the hadronic contribution) and on the possibility of an indication of new physics (SUSY particles, dark photon, extra dimensions, additional Higgs bosons, ...). Meanwhile, the E821 storage ring has been moved to Fermilab, where it is presently used by the E989 experiment, which aims to improve the precision by a factor of four. Results are expected in a few years (2018–2020).
Fig. 6.7

The E821 storage ring.

From Brookhaven National Laboratory

6.2.4.7 The Lagrangian Density Corresponding to the Dirac Equation

Consider the Lagrangian density
$$\begin{aligned} \mathcal {L} = i \bar{\psi } \gamma ^\mu \partial _\mu \psi - m \bar{\psi } \psi \end{aligned}$$
(6.120)
and apply the Euler–Lagrange equations to $$\bar{\psi }$$. One finds
$$\begin{aligned} \frac{\partial \mathcal {L}}{\partial \bar{\psi }} = i \gamma ^\mu \partial _\mu \psi - m \psi = 0 \, , \end{aligned}$$
which is indeed the Dirac equation for a free particle. Notice that:
  • the mass (i.e., the energy associated with rest—whatever this can mean in quantum mechanics) is associated with a term quadratic in the field
    $$\begin{aligned} m \bar{\psi } \psi \, ; \end{aligned}$$
  • the dimension of the field $$\psi $$ is [energy$$^{3/2}$$] (the action term $$m \bar{\psi } \psi \, d^4x$$ is dimensionless).

6.2.5 Klein–Gordon Equation Revisited

The Klein–Gordon equation was briefly introduced in Sect. 3.​2.​1. It describes free relativistic particles with spin 0 (scalars or pseudoscalars). With the introduction of the four-vector notation, it can be written in a covariant form. To account for electromagnetic interactions, the minimal coupling prescription can be used.

6.2.5.1 Covariant Form of the Klein–Gordon Equation

In Sect. 5.​7.​2, the Klein–Gordon equation was written as
$$\begin{aligned} \left( \frac{{\partial }^2}{{\partial t}^2}-{\mathbf {\nabla }}^2+m^2\right) \phi (x)=0 \end{aligned}$$
where $$\phi (x)$$ is a scalar wave function.
Remembering that
$$\begin{aligned} \Box ={\partial }_{\mu }{\partial }^{\mu }=\frac{{\partial }^2}{{\partial t}^2}-{\mathbf {\nabla }}^2 \end{aligned}$$
the Klein–Gordon equation can be written in a covariant form:
$$\begin{aligned} \left( {\partial }_{\mu }{\partial }^{\mu }+m^2\right) \phi (x)=0 \, . \end{aligned}$$
(6.121)
The solutions are, as it was discussed before, plane waves
$$\begin{aligned} \phi (x)=N{\ e}^{i(\mathbf {p} \cdot \mathbf {r}-Et)} \end{aligned}$$
(6.122)
with
$$\begin{aligned} { E=\pm }\sqrt{{\mathbf {p}}^2+m^2} \end{aligned}$$
(6.123)
(the positive solutions correspond to particles and the negative ones to antiparticles).
Doing some arithmetic with the Klein–Gordon equation and its conjugate, a continuity equation can also be obtained for a particle with charge e:
$$\begin{aligned} \mathbf {\nabla }\ \cdot \mathbf {j}=-\frac{\partial \rho }{\partial t} \end{aligned}$$
(6.124)
where
$$\rho (x)=ie\left( {\phi }^*\partial _t\phi \,-\,\phi \partial _t{\phi }^*\right) \; ; \; \mathbf {j}(x)=\ -ie\left( {\phi }^*\nabla \phi \,-\,\phi \nabla {\phi }^*\right) $$
or in terms of four-vectors:
$$\begin{aligned} {\partial }^{\mu }j_{\mu }=0 \end{aligned}$$
(6.125)
where
$$\begin{aligned} j_{\mu }(x)=\ ie\left( {\phi }^*{\partial }_{\mu }\phi \,-\,\phi {\partial }_{\mu }{\phi }^*\right) \, . \end{aligned}$$
(6.126)
In the case of plane waves:
$$\begin{aligned} j_{\mu }(x)=2e{\left| N\right| }^2p_{\mu } \, . \end{aligned}$$
(6.127)
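The plane-wave results can be verified with finite differences. The sketch below works in one space dimension, in natural units, and with hypothetical values for m, p, e, and N. It checks that the plane wave (6.122) satisfies the Klein–Gordon equation and that the charge and current densities defined above reduce to $$2e|N|^2E$$ and $$2e|N|^2p$$, respectively.

```python
import cmath
import math

# Hypothetical parameters in natural units, chosen only for illustration
m, p, e, N = 1.0, 0.75, 1.0, 1.3
E = math.sqrt(p * p + m * m)


def phi(t, x):
    """Plane wave of Eq. (6.122) in one space dimension."""
    return N * cmath.exp(1j * (p * x - E * t))


def d1(f, s, h=1e-5):
    """Central first derivative."""
    return (f(s + h) - f(s - h)) / (2 * h)


def d2(f, s, h=1e-4):
    """Central second derivative."""
    return (f(s + h) - 2 * f(s) + f(s - h)) / (h * h)


t0, x0 = 0.3, -1.1
f0 = phi(t0, x0)
dt_phi = d1(lambda t: phi(t, x0), t0)
dx_phi = d1(lambda x: phi(t0, x), x0)

# Klein-Gordon operator applied to phi: should vanish up to discretization error
kg_residual = d2(lambda t: phi(t, x0), t0) - d2(lambda x: phi(t0, x), x0) + m * m * f0

# Charge and current densities as defined in the text
rho = (1j * e * (f0.conjugate() * dt_phi - f0 * dt_phi.conjugate())).real
j_x = (-1j * e * (f0.conjugate() * dx_phi - f0 * dx_phi.conjugate())).real

print(abs(kg_residual))
print(rho, 2 * e * N**2 * E)
print(j_x, 2 * e * N**2 * p)
```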

6.2.5.2 Klein–Gordon Equation in Presence of an Electromagnetic Field

In the presence of an electromagnetic field, the Klein–Gordon equation can be modified applying, as it was done previously for the Schrödinger and the Dirac equations, the minimal coupling prescription. The normal derivatives are replaced by the covariant derivatives:
$$\begin{aligned} {\partial }^{\mu }\rightarrow D^{\mu }\equiv {\partial }^{\mu }+ieA^{\mu } \end{aligned}$$
(6.128)
and thus
$$\begin{aligned} \left( \left( {\partial }_{\mu }+ieA_{\mu }\right) \left( {\partial }^{\mu }+ieA^{\mu }\right) +m^2\right) \phi (x)=0 \end{aligned}$$
$$\begin{aligned} \left( {\partial }_{\mu }{\partial }^{\mu }+m^2+ie\left( {\partial }_{\mu }A^{\mu }+A_{\mu }{\partial }^{\mu }\right) -e^2A_{\mu }A^{\mu }\right) \phi (x)=0 \, . \end{aligned}$$
The $$e^2$$ term is of second order and can be neglected. Then the Klein–Gordon equation in presence of an electromagnetic field can be written at first order as
$$\begin{aligned} \left( {\partial }_{\mu }{\partial }^{\mu }+V(x)+m^2\right) \phi (x)=0 \end{aligned}$$
(6.129)
where
$$\begin{aligned} V(x)=ie\left( {\partial }_{\mu }A^{\mu }\,+\,{A}_{\mu }{\partial }^{\mu }\right) \end{aligned}$$
(6.130)
is the potential.

6.2.5.3 The Lagrangian Density Corresponding to the Klein–Gordon Equation

Consider the Lagrangian density
$$\begin{aligned} \mathcal {L} = \frac{1}{2} (\partial _\mu \phi ) (\partial ^\mu \phi ) - \frac{1}{2} m^2 \phi ^2 \end{aligned}$$
(6.131)
and apply the Euler–Lagrange equations to $$\phi $$. We find
$$\begin{aligned} \partial _\mu \left( \frac{\partial \mathcal {L}}{\partial (\partial _\mu \phi )}\right) -\frac{\partial \mathcal {L}}{\partial \phi } = \partial _\mu \partial ^\mu \phi + m^2 \phi = 0 \, , \end{aligned}$$
which is indeed the Klein–Gordon equation for a free scalar field.
Notice that:
  • the mass (i.e., the energy associated with rest—or better, in a quantum mechanical language, to the ground state) is associated with a term quadratic in the field
    $$\begin{aligned} \frac{1}{2} m^2 \phi ^2 \, ; \end{aligned}$$
  • the dimension of the field $$\phi $$ is [energy] (the action term $$m^2 \phi ^2 d^4x$$ is dimensionless).

6.2.6 The Lagrangian for a Charged Fermion in an Electromagnetic Field: Electromagnetism as a Field Theory

Let us build a field theory equivalent to the Dirac equation in the presence of an external field.

We already wrote a Lagrangian density equivalent to the Dirac equation for a free particle (Eq. 6.120):
$$\begin{aligned} \mathcal {L}=\bar{\psi }(i\gamma ^\mu \partial _\mu -m)\psi \, . \end{aligned}$$
(6.132)
Electromagnetism can be translated into the quantum world by assuming a Lagrangian density
$$\begin{aligned} \mathcal {L}=\bar{\psi }(i\gamma ^\mu D_\mu -m)\psi -\frac{1}{4}F_{\mu \nu }F^{\mu \nu } \end{aligned}$$
(6.133)
where $$D_\mu \equiv \partial _\mu +ieA_\mu $$ is called the covariant derivative (recall the minimal coupling prescription), and $$A_\mu $$ is the four-potential of the electromagnetic field; $$F_{\mu \nu } = \partial _\mu A_\nu - \partial _\nu A_\mu $$ is the electromagnetic field tensor (see Sect. 2.9.8).
If the field $$A^\mu $$ transforms under a local gauge transformation as
$$\begin{aligned} A^\mu \rightarrow A^\mu - \frac{1}{e}\partial ^\mu \theta (x) \end{aligned}$$
(6.134)
the Lagrangian is invariant with respect to a local U(1) gauge transformation $$\psi \rightarrow \psi e^{i\theta (x)}$$.
Substituting the definition of D into the Lagrangian gives us
$$\begin{aligned} \mathcal {L} = i \bar{\psi } \gamma ^\mu \partial _\mu \psi - e\bar{\psi }\gamma _\mu A^\mu \psi -m \bar{\psi } \psi - \frac{1}{4}F_{\mu \nu }F^{\mu \nu } \, . \end{aligned}$$
(6.135)
Differentiating with respect to $$\bar{\psi }$$, one finds
$$\begin{aligned} i \gamma ^\mu \partial _\mu \psi - m \psi = e \gamma _\mu A^\mu \psi \, . \end{aligned}$$
(6.136)
This is the Dirac equation including electrodynamics, as we have seen when discussing the minimal coupling prescription.
Let us now apply the Euler–Lagrange equations this time to the field $$A_\mu $$ in the Lagrangian (6.133):
$$\begin{aligned} \partial _\nu \left( \frac{\partial \mathcal {L}}{\partial ( \partial _\nu A_\mu )} \right) - \frac{\partial \mathcal {L}}{\partial A_\mu } = 0\,. \end{aligned}$$
(6.137)
We find
$$ \partial _\nu \left( \frac{\partial \mathcal {L}}{\partial ( \partial _\nu A_\mu )} \right) = \partial _\nu \left( \partial ^\mu A^\nu - \partial ^\nu A^\mu \right) \; ; \; \frac{\partial \mathcal {L}}{\partial A_\mu } = -e\bar{\psi } \gamma ^\mu \psi $$
and substituting these two terms into (6.137) gives:
$$\begin{aligned} \partial _\nu F^{\nu \mu } = e \bar{\psi } \gamma ^\mu \psi \, . \end{aligned}$$
(6.138)
For the spinor matter fields, the current takes the simple form:
$$\begin{aligned} j^{\mu }(x) = \sum _i q_i \bar{\psi }_i(x)\gamma ^{\mu }\psi _i(x) \end{aligned}$$
(6.139)
where $$q_i$$ is the charge of the field $$\psi _i$$ in units of e. The equation
$$\begin{aligned} \partial _\nu F^{\nu \mu } = j^\mu \end{aligned}$$
(6.140)
is equivalent, as we discussed in Chap. 2, to the nonhomogeneous Maxwell equations. Notice that the two homogeneous Maxwell equations
$$\begin{aligned} \epsilon ^{\mu \nu \rho \sigma } \partial _\nu F_{\rho \sigma } = 0 \end{aligned}$$
are automatically satisfied, in any gauge, due to the definition of the tensor $$F_{\mu \nu }= \partial _\mu A_\nu - \partial _\nu A_\mu $$.
If we impose the Lorenz gauge $$\partial _{\mu } A^\mu = 0$$, Eq. 6.138 becomes
$$\begin{aligned} \Box A^{\mu }=e\bar{\psi } \gamma ^{\mu } \psi \, , \end{aligned}$$
(6.141)
which is a wave equation for the four-potential—the QED version of the classical Maxwell equations in the Lorenz gauge.
Notice that the Lagrangian (6.133) of QED, based on a local gauge invariance, contains all the physics of electromagnetism. It reflects also some remarkable properties, confirmed by the experiments:
  • The interaction conserves P, C, and T separately.

  • The current is diagonal in flavor space (i.e., it does not change the flavors of the particles).

We can see how the massless electromagnetic field $$A^\mu $$ “appears” thanks to gauge invariance. This is the basis of QED, quantum electrodynamics.

If a mass $$m \ne 0$$ were associated with A, this new field would enter in the Lagrangian with a Proca term
$$\begin{aligned} -\frac{1}{4} F^{\mu \nu }F_{\mu \nu } + \frac{1}{2} m^2 A^\mu A_\mu \end{aligned}$$
(6.142)
which is not invariant under local phase transformation. The field must, thus, be massless.

Summarizing, the requirement of local phase invariance under U(1), applied to the free Dirac Lagrangian, generates all of electrodynamics and specifies the electromagnetic current associated to Dirac particles; moreover, it introduces a massless field which can be interpreted as the photon. This is QED.

Notice that introducing local phase transformations just implies a simple difference in the calculation of the derivatives: we pick up an extra piece involving $$A^\mu $$. We replace the derivative with the covariant derivative
$$\begin{aligned} \partial ^\mu \rightarrow D^\mu = \partial ^\mu + iqA^\mu \end{aligned}$$
(6.143)
and the invariance of the Lagrangian is restored. Substituting $$\partial ^\mu $$ with $$D^\mu $$ transforms a globally invariant Lagrangian into a locally invariant one.
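The role of the extra piece involving $$A^\mu $$ can be illustrated numerically in one dimension. The following is a minimal sketch, not part of the original derivation: the test functions `psi`, `A`, `alpha`, the unit charge `Q`, and the sign convention of the local transformation ($$\psi \rightarrow e^{-iq\alpha }\psi $$, $$A \rightarrow A + \partial \alpha $$) are assumptions chosen only for illustration.

```python
import cmath
import math

Q = 1.0  # charge q in D = d/dx + i q A (unit value assumed for illustration)

def deriv(f, x, h=1e-5):
    """Central finite-difference derivative of f at x."""
    return (f(x + h) - f(x - h)) / (2 * h)

def covariant(psi, A, x):
    """One-dimensional analogue of D psi = (d/dx + i q A) psi."""
    return deriv(psi, x) + 1j * Q * A(x) * psi(x)

# Arbitrary smooth test functions (hypothetical, chosen only for the check).
def psi(x):
    return cmath.exp(2j * x) * math.exp(-x * x)

def A(x):
    return math.sin(x)

def alpha(x):
    return 0.3 * x * x + math.cos(x)

# Assumed local transformation: psi -> exp(-i q alpha) psi, A -> A + d(alpha)/dx.
def psi_gauged(x):
    return cmath.exp(-1j * Q * alpha(x)) * psi(x)

def A_gauged(x):
    return A(x) + deriv(alpha, x)

def covariance_defect(x):
    """|D' psi' - exp(-i q alpha) D psi|, expected to vanish up to numerical error."""
    phase = cmath.exp(-1j * Q * alpha(x))
    return abs(covariant(psi_gauged, A_gauged, x) - phase * covariant(psi, A, x))

def plain_defect(x):
    """Same comparison with the ordinary derivative; not expected to vanish."""
    phase = cmath.exp(-1j * Q * alpha(x))
    return abs(deriv(psi_gauged, x) - phase * deriv(psi, x))

if __name__ == "__main__":
    print("covariant derivative defect:", covariance_defect(0.4))
    print("ordinary derivative defect: ", plain_defect(0.4))
```

The covariant derivative of the transformed field differs from that of the original field only by the local phase, while the ordinary derivative picks up a term proportional to the derivative of `alpha`.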

6.2.7 An Introduction to Feynman Diagrams: Electromagnetic Interactions Between Charged Spinless Particles

Electrons and muons have spin 1/2; but, for a moment, let us see how to compute transition probabilities in QED in the case of hypothetical spinless charged point particles, since the computation of the electromagnetic scattering amplitudes between charged spinless particles is much simpler.
Fig. 6.8

Left: Schematic representation of the first-order interaction of a particle in a field. Right: Schematic representation (Feynman diagram) of the first-order elastic scattering of two charged nonidentical particles

6.2.7.1 Spinless Particles in an Electromagnetic Field

The scattering of a particle due to an interaction that acts only in a finite time interval can be described, as discussed in Sect. 2.7, as the transition between initial and final stationary states characterized by well-defined momenta. The first-order amplitude for such a transition is written, in relativistic perturbative quantum mechanics, as (see Fig. 6.8, left):
$$\begin{aligned} H'_{if}=-i\int {{\phi }^*_f(x)}V(x){ \ }{{\phi }_i(x)d}^{4}{x} \, . \end{aligned}$$
(6.144)
In the case of the electromagnetic field, the potential is given by (see Eq. 6.130) $$V(x)=ie\left( {\partial }_{\mu }A^{\mu }\,+\,{A}_{\mu }{\partial }^{\mu }\right) $$ and
$$\begin{aligned} H'_{if}=\ e\int {{\phi }^*_f(x)}\left( {\partial }_{\mu }A^{\mu }{+A}_{\mu }{\partial }^{\mu }\right) { \ }{{\phi }_i(x)d}^{4}{x}\, . \end{aligned}$$
(6.145)
Integrating by parts assuming that the field $$A^{\mu }$$ vanishes at $$t \rightarrow \pm \infty $$ or $$x \rightarrow \pm \infty $$
$$\begin{aligned} \int {{\phi }^*_f(x){\partial }_{\mu }\left( A^{\mu } \phi _i \right) d^4 x} = - \int { {\partial }_{\mu }\left( \phi _f^* \right) A^{\mu } \phi _i d^4 x} \end{aligned}$$
and introducing a “transition” current $$j^{if}_{\mu }$$ between the initial and final states defined as:
$$\begin{aligned} j^{if}_{\mu }{ \ }=\ ie\left( {\phi }^*_f\left( {\partial }_{\mu }{\phi }_i\right) { -}\left( {\partial }_{\mu }{\phi }^*_f\right) {\phi }_i\right) \, , \end{aligned}$$
this amplitude can be transformed into:
$$\begin{aligned} H'_{if}=\ -i\int {j^{if}_{\mu }}A^{\mu }{ \ }d^4{ x} \, . \end{aligned}$$
(6.146)
In the case of plane waves describing particles with charge e, the current $$j^{if}_{\mu }$$ can be written as:
$$\begin{aligned} j^{if}_{\mu }{ \ }=eN_iN_f{\left( p_i+p_f\right) }_{\mu }e^{i\left( p_f-p_i\right) x}\,. \end{aligned}$$
(6.147)
Considering now, as an example, the classical case of the Rutherford scattering (i.e., the elastic scattering of a spin-0 positive particle with charge e by a Coulomb potential originated by a static point particle (infinite mass) with a charge Ze in the origin), we have:
$$\begin{aligned} A^{\mu }={\left( V, 0\right) } \end{aligned}$$
with
$$\begin{aligned} V(r)=\frac{1}{4\pi } \frac{Ze}{r}\, . \end{aligned}$$
Then
$$\begin{aligned} H'_{if}=\ -i\int {N_iN_f{\left( E_i+E_f\right) }e^{i\left( p_f-p_i\right) x}}\frac{1}{4\pi } \frac{Ze^2}{r}{ \ }d^4{ x} \, . \end{aligned}$$
(6.148)
Factorizing the integrals in time and space and remarking that $$r=|\mathbf {x}|$$
$$\begin{aligned} H'_{if}=\ -iN_iN_fZe^2{\left( E_i+E_f\right) } \int {e^{i\left( E_f-E_i\right) t}dt} \int {e^{i\left( \mathbf {p_f}-\mathbf {p_i}\right) \cdot {\mathbf {r}}}\frac{1}{4\pi r}{ \ }d^{3}x}\, . \end{aligned}$$
(6.149)
The first integral is in fact a $$\delta $$ function which ensures energy conservation (there is no recoil of the scattering point particle and therefore no energy transfer),
$$\begin{aligned} \int {e^{i\left( E_f-E_i\right) t}dt} = 2\pi \delta \left( E_f-E_i\right) \, , \end{aligned}$$
(6.150)
while the second integral gives
$$\begin{aligned} \int {e^{iq\cdot {r}} \frac{1}{4\pi r}d^{3}x}= \frac{1}{\mathbf {q}^2}, \end{aligned}$$
(6.151)
where
$$\begin{aligned} \mathbf {q}=\mathbf {p_f}-\mathbf {p_i} \end{aligned}$$
is the transferred momentum.
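The spatial integral (6.151) can be checked numerically. The sketch below is an illustration under stated assumptions: a screening factor $$e^{-\mu r}$$ (not present in the text) is introduced so that the radial integral converges, in which case the angular integration reduces the transform to $$\frac{1}{|\mathbf {q}|}\int _0^\infty \sin (|\mathbf {q}|r)\, e^{-\mu r}dr = 1/(\mathbf {q}^2+\mu ^2)$$, which tends to $$1/\mathbf {q}^2$$ as $$\mu \rightarrow 0$$. The function name and its parameters are hypothetical.

```python
import math

def screened_coulomb_ft(q, mu, r_max=80.0, n=40000):
    """Simpson's rule for (1/q) * integral_0^r_max sin(q r) exp(-mu r) dr.

    q     : modulus of the transferred three-momentum
    mu    : screening parameter (regulator, an assumption of this sketch)
    r_max : upper cutoff, chosen so that exp(-mu * r_max) is negligible
    n     : number of Simpson intervals (must be even)
    """
    h = r_max / n

    def integrand(r):
        return math.sin(q * r) * math.exp(-mu * r)

    total = integrand(0.0) + integrand(r_max)
    for k in range(1, n):
        weight = 4.0 if k % 2 else 2.0
        total += weight * integrand(k * h)
    return total * h / 3.0 / q

if __name__ == "__main__":
    q, mu = 2.0, 0.5
    print(screened_coulomb_ft(q, mu), 1.0 / (q * q + mu * mu))
```

Reducing `mu` (with a correspondingly larger `r_max` and `n`) moves the result toward the unscreened value $$1/\mathbf {q}^2$$.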
The transition amplitude for the Rutherford scattering is, in this way, given by:
$$\begin{aligned} H'_{if}=\ -iN_iN_f 2\pi \delta \left( E_f-E_i\right) {\left( E_i+E_f\right) } \frac{Ze^2}{\mathbf {q}^2} { \ }\, . \end{aligned}$$
(6.152)
The corresponding differential cross section can now be computed applying the relativistic Fermi golden rule discussed in Chap. 2:
$$\begin{aligned} d\sigma = \frac{1}{flux} \frac{2\pi }{\hbar }\frac{|H'_{if}|^2}{\prod _{i=1}^{n_i} 2 E_i} \rho _{n_f}\, . \end{aligned}$$
(6.153)
Taking into account the convention adopted for:
  • the invariant wave function normalization factor:
    $$\begin{aligned} N_i=N_f=\frac{1}{\sqrt{2E}} \, , \end{aligned}$$
  • the invariant phase space:
    $$\begin{aligned} \rho _{n_f} = \prod _{i=1}^{n_f} \frac{1}{\left( 2\pi \right) ^3} \frac{d^3\mathbf {p_f}}{2E_f}\, , \end{aligned}$$
  • the incident flux for a single incident particle:
    $$\begin{aligned} flux= |\mathbf {v_i}|2 E_i=2|\mathbf {p_i}| \, , \end{aligned}$$
then
$$\begin{aligned} d\sigma = \frac{2\pi \delta \left( E_f-E_i\right) }{2|\mathbf {p_i}|} {\left( \frac{{\left( E_i+E_f\right) }Ze^2}{\mathbf {q}^2}\right) }^2 \frac{1}{\left( 2\pi \right) ^3} \frac{d^3\mathbf {p_f}}{2E_f} \, . \end{aligned}$$
(6.154)
Since
$$\begin{aligned} d^3\mathbf {p_f}= {p_f}^2 d{p_f} d\varOmega \, , \end{aligned}$$
$$\begin{aligned} {p_f} d{p_f}= E_f dE \, , \end{aligned}$$
$$\begin{aligned} {q}^2= 4 {p_i}^2 \sin ^2{\frac{\theta }{2} } \, , \end{aligned}$$
we find again the Rutherford differential cross section, previously obtained in the classical mechanics and nonrelativistic quantum mechanics frameworks (Chap. 2):
$$\begin{aligned} \frac{d\sigma }{d\varOmega }= \frac{Z^2e^4}{64 \pi ^2 {E_i}^2 \sin ^4{\frac{\theta }{2} } } \, . \end{aligned}$$
(6.155)

6.2.7.2 Elastic Scattering of Two Nonidentical Charged Spinless Particles

The interaction of two charged particles can be treated as the interaction of one of the particles with the field created by the other (which thus acts as the source of the field).

The initial and final states of particle 1 are labeled as the states A and C, respectively, while for the particle 2 (taken as the source of the field) the corresponding labels are B and D (see Fig. 6.8, right). Let us assume that particles 1 and 2 are not of the same type (otherwise they would be indistinguishable) and have charge e. Then:
$$\begin{aligned} H'_{if}=\ -i\int {j^{AC}_{\mu }}A^{\mu }{ \ }d^4{ x} \end{aligned}$$
(6.156)
with
$$\begin{aligned} j^{AC}_{\mu }{ \ }=eN_AN_C{\left( p_A+p_C\right) }_{\mu }e^{i\left( p_C-p_A\right) x} \, . \end{aligned}$$
(6.157)
The field $$A^{\mu }$$ is generated by the current associated with particle 2 (see Sect. 6.2.1), so that
$$\begin{aligned} {\Box A}^{\mu }=j^{\mu }_{BD} \end{aligned}$$
(6.158)
with
$$\begin{aligned} j^{\mu }_{BD}{ \ }=eN_BN_D{\left( p_B+p_D\right) }^{\mu }e^{i\left( p_D-p_B\right) x} \, , \end{aligned}$$
(6.159)
defining the exchanged four-momentum q as:
$$\begin{aligned} q=\left( p_D-p_B\right) \ =\ \left( p_A-p_C\right) \end{aligned}$$
and since
$$\begin{aligned} {\Box } e^{i q.x}= - q^2 e^{i q.x} \end{aligned}$$
(6.160)
the field $$A^{\mu }$$ is given by
$$\begin{aligned} A^{\mu }={-\frac{1}{q^2}j^{\mu }_{BD}} . \end{aligned}$$
(6.161)
Therefore
$$\begin{aligned} H'_{if}=\ -i\int {j^{AC}_{\mu }}{\left( -\frac{1}{q^2}\right) j^{\mu }_{BD}}{ \ }d^4{ x} \, = -i\int {j^{\mu }_{ AC}}{\left( -\frac{g_{\mu \nu }}{q^2}\right) }j^\nu _{BD}{ \ }d^4{x}. \end{aligned}$$
(6.162)
Solving the integral ($$\int { e^{i x \left( p_C+p_D-p_A-p_B\right) } \ d^4{ x} }= {\left( 2\pi \right) }^4{\delta }^4\left( p_A+p_B-p_C-p_D\right) $$):
$$\begin{aligned} H'_{if}=\ -i\ N_AN_BN_CN_D{\left( 2\pi \right) }^4{\delta }^4\left( p_A+p_B-p_C-p_D\right) \ {\mathcal M} \, \end{aligned}$$
(6.163)
where $$\delta ^4()$$ ensures the conservation of energy–momentum, and the amplitude $${\mathcal M}$$ is defined as
$$\begin{aligned} -i{\mathcal M}=\left( ie{\left( p_A+p_C\right) }^{\mu }\right) \left( \frac{-ig_{\mu \nu }}{q^2}\right) \left( ie{\left( p_B+p_D\right) }^\nu \right) \, . \end{aligned}$$
(6.164)
Fig. 6.9

Scattering of two charged particles in the center-of-mass reference frame

With $$\theta $$ the scattering angle in the center-of-mass (c.m.) reference frame (see Fig. 6.9) and p the modulus of the momentum, also in the c.m. frame, the four-vectors of the initial and final states at high energy $$\left( E\gg m\ \right) $$ can be written as
$$\begin{aligned} p_A= & {} (p,p, 0,0)\\ p_B= & {} (p,-p, 0,0)\\ p_C= & {} (p,p\, \cos \theta {,}\,p\, {\sin \theta }{,}\, 0)\\ p_D= & {} (p,-p\, \cos \theta {,}\,-p\, {\sin \theta }{,}\, 0)\, . \end{aligned}$$
Then:
$$\begin{aligned} \left( p_A+p_C\right)= & {} (2p, p\, (1+\cos \theta ),p\, {\sin \theta },\, 0)\\ \left( p_B+p_D\right)= & {} (2p,-p\, (1+\cos \theta ),-p\, {\sin \theta },\, 0)\\ q=\left( p_D-p_B\right)= & {} (0,p\, \left( 1-{\cos \theta }\right) ,-p\, {\sin \theta },\, 0) \end{aligned}$$
and
$$\begin{aligned} {\mathcal M}=-e^2\frac{1}{q^2}\, \left( {\left( p_A+p_C\right) }^0{\ \left( p_B+p_D\right) }^0-\sum ^3_{i=1}{{\left( p_A+p_C\right) }^i{\ \left( p_B+p_D\right) }^i}\right) \end{aligned}$$
$$\begin{aligned} {\mathcal M}=-e^2\frac{1}{p^2\, (1-\cos \theta )^2+p^2\, {\sin ^2 \theta \ }}\left( 4p^2+p^2\, (1+\cos \theta )^2+p^2\, {{\sin ^2 \theta \ }} \right) \end{aligned}$$
$$\begin{aligned} {\mathcal M}=-e^2\frac{(3+ \cos \theta )}{(1- \cos \theta )} \, . \end{aligned}$$
(6.165)
On the other hand, the differential cross section of an elastic two-body scattering between spinless nonidentical particles in the c.m. frame is given by (see Sect. 2.​9.​7):
$$\begin{aligned} \frac{d\sigma }{d\varOmega }=\frac{{\ \left| {\mathcal M}\right| }^2}{64{\pi }^2s} \end{aligned}$$
where $${s=\left( E_A+E_B\right) }^2$$ is the square of the c.m. energy (s is one of the Mandelstam variables, see Sect. 2.​9.​6).
Thus:
$$\begin{aligned} \frac{d\sigma }{d\varOmega }=\frac{{\alpha }^2}{4s}\ \frac{{\left( 3{{\,+\cos } \theta }\right) }^2}{{\left( 1-{\cos \theta }\right) }^2} \end{aligned}$$
(6.166)
where
$$\begin{aligned} \alpha =\frac{e^2}{4\pi } \end{aligned}$$
(6.167)
is the fine structure constant.
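The algebra leading from the c.m. four-vectors to Eqs. (6.165) and (6.166) can be reproduced numerically. The following sketch assumes natural units and massless kinematics as in the text; the function names are illustrative. Since only $$|{\mathcal M}|^2$$ enters the cross section, the sketch works with the absolute value of the amplitude.

```python
import math

def minkowski_dot(a, b):
    """Scalar product with metric (+, -, -, -)."""
    return a[0] * b[0] - a[1] * b[1] - a[2] * b[2] - a[3] * b[3]

def amplitude_magnitude(p, theta, e=1.0):
    """|M| for spinless scattering, built from the c.m. four-vectors (E >> m)."""
    c, s = math.cos(theta), math.sin(theta)
    p_a = (p, p, 0.0, 0.0)
    p_b = (p, -p, 0.0, 0.0)
    p_c = (p, p * c, p * s, 0.0)
    p_d = (p, -p * c, -p * s, 0.0)
    sum_ac = [x + y for x, y in zip(p_a, p_c)]
    sum_bd = [x + y for x, y in zip(p_b, p_d)]
    q = [x - y for x, y in zip(p_d, p_b)]  # exchanged four-momentum
    return abs(e ** 2 * minkowski_dot(sum_ac, sum_bd) / minkowski_dot(q, q))

def dsigma_domega(p, theta, e=1.0):
    """|M|^2 / (64 pi^2 s), with s = (2p)^2 in the c.m. frame."""
    s_mandelstam = (2.0 * p) ** 2
    return amplitude_magnitude(p, theta, e) ** 2 / (64.0 * math.pi ** 2 * s_mandelstam)

def dsigma_domega_closed(p, theta, e=1.0):
    """Closed form of Eq. (6.166), with alpha = e^2 / (4 pi)."""
    alpha = e ** 2 / (4.0 * math.pi)
    c = math.cos(theta)
    return alpha ** 2 / (4.0 * (2.0 * p) ** 2) * (3.0 + c) ** 2 / (1.0 - c) ** 2

if __name__ == "__main__":
    p, theta = 5.0, 1.0
    print(amplitude_magnitude(p, theta), (3 + math.cos(theta)) / (1 - math.cos(theta)))
    print(dsigma_domega(p, theta), dsigma_domega_closed(p, theta))
```

The amplitude does not depend on `p`, as expected for a dimensionless quantity built from massless kinematics, while the cross section scales as $$1/s$$.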

Note that when $${\cos \theta \ }\rightarrow 1$$ the cross section diverges. This fact is a consequence of the infinite range of the electromagnetic interactions, translated into the fact that photons are massless.

6.2.7.3 Feynman Diagram Rules

The invariant amplitude computed in the previous subsection,
$$\begin{aligned} -i{\mathcal M}=\left( ie{\left( p_A+p_C\right) }^{\mu }\right) \left( \frac{-ig_{\mu \nu }}{q^2}\right) \left( ie{\left( p_B+p_D\right) }^\nu \right) \, , \end{aligned}$$
can be obtained directly from the Feynman diagram (Fig. 6.8, right) using appropriate “Feynman rules.”
In particular, for this simple case, the different factors present in the amplitude are:
  • the vertex factors: $$\left( ie{\left( p_A+p_C\right) }^{\mu }\right) $$, corresponding to the vertex A-C-photon, and $$\left( ie{\left( p_B+p_D\right) }^\nu \right) $$, corresponding to the vertex B-D-photon;

  • the propagator factor: $$\left( {-ig_{\mu \nu }}/{q^2}\right) $$, corresponding to the only internal line, the exchanged photon, existing in the diagram.

Energy–momentum is conserved at each vertex, which is trivially ensured by the definition of $$q$$.

6.2.8 Electron–Muon Elastic Scattering ($$e^- \mu ^- \rightarrow e^- \mu ^- $$)

Electron and muon have spin 1/2 and are thus described by Dirac bi-spinors (see Sect. 6.2.4). The computation of the scattering amplitudes is more complex than the one discussed in the previous subsection for the case of spinless particles but the main steps, summarized hereafter, are similar.
Fig. 6.10

Lowest-order Feynman diagram for electron–muon scattering

The Dirac equation in presence of an electromagnetic field is written as
$$\begin{aligned} \left( i{\gamma }^{\mu }{\partial }_{\mu }-e\ {\gamma }^{\mu }A_{\mu }-m\right) { \ }\psi =0. \end{aligned}$$
(6.168)
The corresponding current is
$$\begin{aligned} j_{\mu }(x)=\ -e \overline{\psi }{\ \gamma }_{\mu }\psi \, . \end{aligned}$$
(6.169)
The transition amplitude for the electron (states A and C)/muon (states B and D) scattering can then be written as (Fig. 6.10):
$$\begin{aligned} H'_{if}=\ -i\int {j^{elect}_{\mu }}{\left( -\frac{1}{q^2}\right) }j^{\mu }_{muon}{ \ }d^4{ x}= -i\int {j^{\mu }_{elect}}{\left( -\frac{g_{\mu \nu }}{q^2}\right) }j^\nu _{muon}{ \ }d^4{ x} \end{aligned}$$
(6.170)
where
$$\begin{aligned} j^{elect}_{\mu }= & {} -e\left( {\bar{u}}_C{\gamma }_{\mu }u_A\right) e^{-iqx} \end{aligned}$$
(6.171)
$$\begin{aligned} j^{\mu }_{muon}= & {} -e\left( {\bar{u}}_D{\gamma }^{\mu }u_B\right) e^{iqx} \end{aligned}$$
(6.172)
with
$$\begin{aligned} q=\left( p_D-p_B\right) \ =\ \left( p_A-p_C\right) \, . \end{aligned}$$
Solving the integral,
$$\begin{aligned} H'_{if}=\ -i\ N_A N_B N_C N_D{\left( 2\pi \right) }^4{\delta }^4\left( p_A+p_B-p_C-p_D\right) \ {\mathcal M} \end{aligned}$$
(6.173)
where the amplitude $${\mathcal M}$$ is given by
$$\begin{aligned} -i{\mathcal M}=\left( ie\left( {\bar{u}}_C{\gamma }^{\mu }u_A\right) \right) \left( \frac{-ig_{\mu \nu }}{q^2}\right) \left( ie\left( {\bar{u}}_D{\gamma }^\nu u_B\right) \right) \, . \end{aligned}$$
(6.174)
The cross section is proportional to the square of the transition amplitude $${\left| {\mathcal M}\right| }^2$$ (see the Fermi golden rule, Chap. 2). However, the amplitude written above depends on the initial and final spin configurations. In fact, as there are four possible initial configurations (two for the electron and two for the muon) and also four possible final configurations, there are sixteen such amplitudes to be computed. Using the orthogonal helicity state basis (Sect. 6.2.4.3), each of these amplitudes is independent (there is no interference between the corresponding processes) and can be labeled according to the helicities of the corresponding initial and final states. For instance, if all the states have Right (positive) helicity the amplitude is labeled as $$\mathcal M_{RR \rightarrow RR} $$.
In the case of an experiment with unpolarized beams (all the initial helicity configurations are equiprobable) and in which no polarization measurements of the helicities of the final states are made, the corresponding cross section must be obtained by averaging over the initial configurations and summing over the final ones. A mean squared amplitude is then defined as:
$$\begin{aligned} {<\left| {\mathcal M}\right| }^2>= \frac{1}{4} ( \mathcal M_{RR \rightarrow RR}^2 + \mathcal M_{RR \rightarrow RL}^2+ ...+\mathcal M_{LL \rightarrow LL}^2) \end{aligned}$$
(6.175)
Luckily, in the limit of high energies (whenever the electron and the muon masses can be neglected), many of these amplitudes are equal to zero. Taking for example $$\mathcal M_{RR \rightarrow RL}$$,
$$\begin{aligned} -i\mathcal M_{RR \rightarrow RL}=\left( ie\left( {\bar{u}_{\uparrow C}}{\gamma }^{\mu }{u_{\uparrow A}}\right) \right) \left( \frac{-ig_{\mu \nu }}{q^2}\right) \left( ie\left( {\bar{u}_{\downarrow D}}{\gamma }^{\nu }{u_{\uparrow B}}\right) \right) \, ; \end{aligned}$$
(6.176)
the last factor corresponding to the muonic current is equal to zero,
$$\begin{aligned} ie\left( {\bar{u}_{\downarrow D}}{\gamma }^{\nu }{u_{\uparrow B}}\right) =0 \, . \end{aligned}$$
(6.177)
Indeed, remembering the definitions of the helicity eigenvectors (Eqs. 6.102, Sect. 6.2.4.3), and of the $$\gamma ^0$$ matrix (Sect. 6.2.4) and working in the c.m. frame ($$\theta ^*_B=\pi , \phi ^*_B=\pi $$), ($$\theta ^*_D=(\pi -\theta ^*) , \phi ^*_D=\pi $$), $$(E^*=E_A^*=E_B^*=E_C^*=E_D^*)$$:
$$\begin{aligned} {u_{\uparrow B}}\,= \sqrt{{E^*}} \left( \begin{array}{c} 1 \\ 0 \\ 1 \\ 0 \end{array} \right) \; ; \; {u_{\downarrow D}}\,= \sqrt{{E^*}} \left( \begin{array}{c} - \cos ({\theta ^*}/{2}) \\ -\sin ({\theta ^*}/{2}) \\ \cos ({\theta ^*}/{2}) \\ \sin ({\theta ^*}/{2}) \end{array} \right) \end{aligned}$$
(6.178)
and since
$$\begin{aligned} {\bar{u}_{\downarrow D}}=({u_{\downarrow }^T}_D)^*{\gamma }^{0} \end{aligned}$$
(6.179)
$$\begin{aligned} {\bar{u}_{\downarrow D}}=\sqrt{{E^*}} \left( - \cos \frac{\theta ^*}{2},-\sin \frac{\theta ^*}{2}, -\cos \frac{\theta ^*}{2},-\sin \frac{\theta ^*}{2} \right) \end{aligned}$$
(6.180)
then
$$\begin{aligned} {\bar{u}_{\downarrow D}}{\gamma }^{0 }{u_{\uparrow B}}= {\bar{u}_{\downarrow D}}{\gamma }^{1 }{u_{\uparrow B}}={\bar{u}_{\downarrow D}}{\gamma }^{2 }{u_{\uparrow B}}= {\bar{u}_{\downarrow D}}{\gamma }^{3 }{u_{\uparrow B}}=0 \, . \end{aligned}$$
(6.181)
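The vanishing of the helicity-flipping current can be verified by explicit matrix multiplication. The sketch below is illustrative and rests on assumptions: the $$\gamma $$ matrices are taken in the Dirac-Pauli representation, the spinors $$u_{\uparrow B}$$ and $$u_{\downarrow D}$$ are copied from Eq. (6.178), and the forms used for $$u_{\uparrow A}$$ and $$u_{\uparrow C}$$ (massless helicity spinors for a particle moving along the axis and at an angle $$\theta ^*$$, respectively) are assumed rather than quoted from the text. Function names are hypothetical.

```python
import math

# Gamma matrices in the Dirac-Pauli representation (assumed convention).
G0 = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]
G1 = [[0, 0, 0, 1], [0, 0, 1, 0], [0, -1, 0, 0], [-1, 0, 0, 0]]
G2 = [[0, 0, 0, -1j], [0, 0, 1j, 0], [0, 1j, 0, 0], [-1j, 0, 0, 0]]
G3 = [[0, 0, 1, 0], [0, 0, 0, -1], [-1, 0, 0, 0], [0, 1, 0, 0]]
GAMMAS = [G0, G1, G2, G3]

def mat_vec(m, v):
    """Product of a 4x4 matrix with a 4-component column vector."""
    return [sum(m[r][k] * v[k] for k in range(4)) for r in range(4)]

def current(u_final, u_initial):
    """Four components of ubar_final gamma^mu u_initial, with ubar = u^dagger gamma^0."""
    components = []
    for gamma in GAMMAS:
        w = mat_vec(G0, mat_vec(gamma, u_initial))
        components.append(sum(complex(a).conjugate() * b for a, b in zip(u_final, w)))
    return components

def spinors(energy, theta):
    """High-energy helicity spinors used in the check (see assumptions above)."""
    c, s = math.cos(theta / 2.0), math.sin(theta / 2.0)
    n = math.sqrt(energy)
    return {
        "up_A": [n, 0.0, n, 0.0],                    # assumed form
        "up_B": [n, 0.0, n, 0.0],                    # as printed in Eq. (6.178)
        "up_C": [n * c, n * s, n * c, n * s],        # assumed form
        "down_D": [-n * c, -n * s, n * c, n * s],    # as printed in Eq. (6.178)
    }

if __name__ == "__main__":
    sp = spinors(energy=2.0, theta=0.7)
    print("flip current:    ", current(sp["down_D"], sp["up_B"]))
    print("non-flip current:", current(sp["up_C"], sp["up_A"]))
```

With these inputs the four components of the helicity-flipping muonic current vanish, while the helicity-conserving electronic current is nonzero.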
The only amplitudes that are nonzero are those where the helicity of the electron and the helicity of the muon are conserved, i.e.,
$$ \mathcal M_{RR \rightarrow RR} ; \mathcal M_{RL \rightarrow RL};\mathcal M_{LR \rightarrow LR};\mathcal M_{LL \rightarrow LL} \, . $$
This fact is a direct consequence of the conservation of chirality in the QED vertices and of the fact that, in the limit of high energies, chirality and helicity coincide (see Sect. 6.3.4). If the fermion masses cannot be neglected, all the currents are nonzero, but the total angular momentum of the interaction is still conserved, as it should be. In this case, the computation of the amplitudes is more complex, but the sum over all internal indices and products of $$\gamma $$ matrices can be considerably simplified using the so-called trace theorems (for a pedagogical introduction see for instance the books of Thomson [F6.1] and of Halzen and Martin [F6.6]).
In the case of unpolarized beams, of no polarization measurements of the helicities of the final states and whenever masses can be neglected, the mean squared amplitude is thus:
$$\begin{aligned} {<{\mathcal M}}^2>= \frac{1}{4} ( \mathcal M_{RR \rightarrow RR}^2 + \mathcal M_{RL \rightarrow RL}^2+ \mathcal M_{LR \rightarrow LR}^2+\mathcal M_{LL \rightarrow LL}^2) \end{aligned}$$
(6.182)
Each of the individual amplitudes is expressed as a function of the electronic and muonic currents, which can be computed following a procedure similar to the one sketched above for the computation of $$\left( ie\left( {\bar{u}_{\downarrow D}}{\gamma }^\nu {u_{\uparrow B}}\right) \right) $$. The relevant four-vector currents are:
$$\begin{aligned} {\bar{u}_{\uparrow C}}{\gamma }^\nu {u_{\uparrow }}_A= & {} 2 E^*\left( \cos \frac{\theta ^*}{2},\, \sin \frac{\theta ^*}{2},\, i\sin \frac{\theta ^*}{2},\, \cos \frac{\theta ^*}{2}\right) \end{aligned}$$
(6.183)
$$\begin{aligned} {\bar{u}_{\downarrow C}}{\gamma }^\nu {u_{\downarrow }}_A= & {} 2 E^*\left( \cos \frac{\theta ^*}{2},\, \sin \frac{\theta ^*}{2},\, -i\sin \frac{\theta ^*}{2},\, \cos \frac{\theta ^*}{2}\right) \end{aligned}$$
(6.184)
$$\begin{aligned} {\bar{u}_{\uparrow D}}{\gamma }^\nu {u_{\uparrow B}}= & {} 2 E^*\left( \cos \frac{\theta ^*}{2},\, -\sin \frac{\theta ^*}{2},\, i\sin \frac{\theta ^*}{2},\, -\cos \frac{\theta ^*}{2}\right) \end{aligned}$$
(6.185)
$$\begin{aligned} {\bar{u}_{\downarrow D}}{\gamma }^\nu {u_{\downarrow B}}= & {} 2 E^*\left( \cos \frac{\theta ^*}{2},\, -\sin \frac{\theta ^*}{2},\, -i\sin \frac{\theta ^*}{2},\, -\cos \frac{\theta ^*}{2}\right) \end{aligned}$$
(6.186)
and the amplitudes are given by:
$$\begin{aligned} \mathcal M_{RR \rightarrow RR}= & {} \left( ie\left( {\bar{u}_{\uparrow C}}{\gamma }^\mu {u_{\uparrow }}_A\right) \right) \frac{-ig_{\mu \nu }}{q^2}\left( ie\left( {\bar{u}_{\uparrow D}}{\gamma }^\nu {u_{\uparrow B}}\right) \right) = -\frac{4 e^2}{(1- \cos \theta ^*)}\qquad \end{aligned}$$
(6.187)
$$\begin{aligned} \mathcal M_{RL \rightarrow RL}= & {} \left( ie\left( {\bar{u}_{\uparrow C}}{\gamma }^\mu {u_{\uparrow }}_A\right) \right) \frac{-ig_{\mu \nu }}{q^2}\left( ie\left( {\bar{u}_{\downarrow D}}{\gamma }^\nu {u_{\downarrow B}}\right) \right) = - 2 e^2 \left( \frac{1 + \cos \theta ^*}{1 - \cos \theta ^*} \right) \qquad \end{aligned}$$
(6.188)
$$\begin{aligned} \mathcal M_{LR \rightarrow LR}= & {} \left( ie\left( {\bar{u}_{\downarrow C}}{\gamma }^\mu {u_{\downarrow }}_A\right) \right) \frac{-ig_{\mu \nu }}{q^2}\left( ie\left( {\bar{u}_{\uparrow D}}{\gamma }^\nu {u_{\uparrow B}}\right) \right) = - 2 e^2 \left( \frac{1 + \cos \theta ^*}{1 - \cos \theta ^*} \right) \qquad \end{aligned}$$
(6.189)
$$\begin{aligned} \mathcal M_{LL \rightarrow LL}= & {} \left( ie\left( {\bar{u}_{\downarrow C}}{\gamma }^\mu {u_{\downarrow }}_A\right) \right) \frac{-ig_{\mu \nu }}{q^2}\left( ie\left( {\bar{u}_{\downarrow D}}{\gamma }^\nu {u_{\downarrow B}}\right) \right) = -\frac{4 e^2}{(1- \cos \theta ^*)} \, . \end{aligned}$$
(6.190)
The angular dependence of the denominators reflects the t channel character of this interaction ($${q^2}=t \propto {(1- \cos \theta ^*)}$$) while the angular dependence of the numerators reflects the total angular momentum of the initial and final states ($$\mathcal M_{RR \rightarrow RR}$$ and $$\mathcal M_{LL \rightarrow LL} $$ correspond to initial and final states with a total angular momentum $$J=0$$, the other two amplitudes correspond to initial and final states with a total angular momentum $$J=1$$).
The mean squared amplitude (6.182) is now easily computed to be:
$$\begin{aligned} {<{\mathcal M}}^2>= 2 e^4 \frac{4+{(1+\cos \theta ^*)}^2}{(1-\cos \theta ^*)^2}. \end{aligned}$$
(6.191)
This amplitude is often expressed in terms of the Mandelstam variables stu, as:
$$\begin{aligned} {<{\mathcal M}}^2>= 2 e^4 \frac{s^2+u^2}{t^2}, \end{aligned}$$
(6.192)
since, in this case, $$s= 4 {E^*}^2$$, $$t= -2 {E^*}^2(1-\cos \theta ^*)$$ and $$u= -2 {E^*}^2(1+\cos \theta ^*)$$.
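The spin average can be checked directly from the four nonzero helicity amplitudes (6.187)–(6.190) and compared with the Mandelstam form (6.192). The sketch below uses natural units and massless kinematics as in the text; the function names are illustrative.

```python
import math

def helicity_amplitudes(cos_theta, e=1.0):
    """The four nonzero e-mu helicity amplitudes, Eqs. (6.187)-(6.190)."""
    rr = -4.0 * e ** 2 / (1.0 - cos_theta)
    rl = -2.0 * e ** 2 * (1.0 + cos_theta) / (1.0 - cos_theta)
    return {"RR->RR": rr, "RL->RL": rl, "LR->LR": rl, "LL->LL": rr}

def mean_squared_from_helicities(cos_theta, e=1.0):
    """Average over the 4 initial configurations, sum over the final ones."""
    amps = helicity_amplitudes(cos_theta, e)
    return sum(m * m for m in amps.values()) / 4.0

def mean_squared_mandelstam(energy, cos_theta, e=1.0):
    """2 e^4 (s^2 + u^2) / t^2 with s, t, u expressed through E* and cos(theta*)."""
    s = 4.0 * energy ** 2
    t = -2.0 * energy ** 2 * (1.0 - cos_theta)
    u = -2.0 * energy ** 2 * (1.0 + cos_theta)
    return 2.0 * e ** 4 * (s * s + u * u) / (t * t)

if __name__ == "__main__":
    for cos_theta in (-0.5, 0.0, 0.3):
        print(cos_theta,
              mean_squared_from_helicities(cos_theta),
              mean_squared_mandelstam(2.0, cos_theta))
```

The two columns agree for any scattering angle and any c.m. energy, since the energy dependence cancels in the ratio $$(s^2+u^2)/t^2$$.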
Remembering once again the Fermi golden rule for the differential cross section of two body elastic scattering discussed in Chap. 2, we have then in the c.m. reference frame:
$$\begin{aligned} \frac{d\sigma }{d\varOmega }= \frac{1}{64\pi ^2} \frac{1}{s} {<{\mathcal M}}^2>= \frac{{\alpha }^2}{2s}\ \frac{{{ 1\,+\,} {{\cos }}^4\left( \theta ^* /2\right) \ }}{{{\sin }}^4\left( \theta ^*/2\right) } \end{aligned}$$
(6.193)
which in the laboratory reference frame (muon at rest) is converted to:
$$\begin{aligned} \frac{d\sigma }{d\varOmega }=\ \ \frac{{\alpha }^2{\cos }^2\left( \frac{\theta }{2}\right) }{4\ E^2{\sin }^4\left( \frac{\theta }{2}\right) }\frac{E'}{{ E}}\left( 1-\frac{q^2}{2m^2_{\mu }}\ \tan ^2 \frac{\theta }{2} \right) . \end{aligned}$$
(6.194)
This is the Rosenbluth formula referred to in Sect. 5.5.1.

6.2.9 Feynman Diagram Rules for QED

The invariant amplitude computed in the previous subsection,
$$\begin{aligned} -i{\mathcal M}=\left( ie\left( {\bar{u}}_C{\gamma }^{\mu }u_A\right) \right) \left( \frac{-ig_{\mu \nu }}{q^2}\right) \left( ie\left( {\bar{u}}_D{\gamma }^\nu u_B\right) \right) \, , \end{aligned}$$
can be obtained directly from the Feynman diagram (Fig. 6.10) using appropriate “Feynman rules.”

The Feynman rules consist in drawing all topologically distinct and connected Feynman diagrams for a given process and making the product of appropriate multiplicative factors associated with the various elements of each diagram.

In particular the different factors present in the amplitude computed in the previous subsection are:
  • the vertex factors: $$ie{\gamma }^{\mu }$$;

  • the propagator factor: $$\left( {-ig_{\mu \nu }}/{q^2}\right) $$, corresponding to the only internal line, the exchanged photon;

  • the external lines factors: for the initial particles A and B, the spinors $$u_A$$ and $$\ u_B$$; for the final particles C and D, the adjoint spinors $${\bar{u}}_C$$ and $${\bar{u}}_D$$,

and again energy–momentum conservation is imposed at each vertex.
The Dirac currents (e.g., $$\left( ie\left( {\bar{u}}_C{\gamma }^{\mu }u_A\right) \right) $$) involve both the electric and magnetic interactions of the charged spin 1/2 particles. This can be explicitly shown using the so-called Gordon decomposition of the vectorial current,
$$\begin{aligned} ie\left( {\bar{u}}_C{\gamma }^{\mu }u_A\right) =\frac{ie}{2m}{\bar{u}}_C\left( {\left( p_A+p_C\right) }^{\mu }+i\sigma ^{\mu \nu }{\left( p_C-p_A\right) }_{\nu }\right) u_A \end{aligned}$$
(6.195)
where the tensor $$\sigma ^{\mu \nu }$$ is defined as
$$\begin{aligned} \sigma ^{\mu \nu }=\frac{i}{2}{\left( \gamma ^\mu \gamma ^\nu -\gamma ^\nu \gamma ^\mu \right) } \, . \end{aligned}$$
(6.196)
Higher-order terms correspond to more complex diagrams which may have internal loops and fermion internal lines (see Fig. 6.14). In this case, the factor associated with each internal fermion line is
$$\begin{aligned} \left( \frac{i\left( {\gamma }^{\mu }p_{\mu }+m\right) }{p^2-m^2}\right) \end{aligned}$$
and one should not forget that every internal four-momentum loop has to be integrated over the full momentum range.

The complete set of Feynman diagram rules for QED should thus involve all the possible particles and antiparticles (spin 0, spin 1/2, and spin 1) in the external and internal lines.

The multiplicative factors associated with each element of Feynman diagrams in the Feynman rules are summarized in Table 6.1 (from Ref. [F6.6]).
Table 6.1

Feynman rules for $$-i\mathcal {M}$$ (element: multiplicative factor)

$$\bullet $$ External Lines
    Spin-0 boson: 1
    Spin-$$\frac{1}{2}$$ fermion (in, out): $$u,\ \overline{u}$$
    Spin-$$\frac{1}{2}$$ antifermion (in, out): $$\overline{v}, \ v$$
    Spin-1 photon (in, out): $$\epsilon _\mu ,\ \epsilon _\mu ^*$$

$$\bullet $$ Internal Lines − Propagators
    Spin-0 boson: $$\frac{i}{p^2- m^2}$$
    Spin-$$\frac{1}{2}$$ fermion: $$\frac{i(\not p + m)}{p^2- m^2}$$
    Massive spin-1 boson: $$\frac{-i(g_{\mu \nu } - p_\mu p_\nu /M^2)}{p^2- M^2}$$
    Massless spin-1 boson (Feynman gauge): $$\frac{-i g_{\mu \nu }}{p^2}$$

$$\bullet $$ Vertex Factors
    Photon−spin-0 (charge e): $$-ie (p + p')^\mu $$
    Photon−spin-$$\frac{1}{2}$$ (charge e): $$-ie \gamma ^\mu $$

$$\bullet $$ Loops: $$\int d^4k/(2 \pi )^4$$ over loop momentum; include $$-1$$ if fermion loop and take the trace of associated $$\gamma $$-matrices

$$\bullet $$ Identical fermions: $$-1$$ between diagrams which differ only in $$e^- \leftrightarrow e^-$$ or initial $$e^- \leftrightarrow $$ final $$e^+$$

The total amplitude at a given order is then obtained adding up the amplitudes corresponding to all the diagrams that can be drawn up to that order. Minus signs (antisymmetrization) must be included between diagrams that differ only in the interchange of two incoming or outgoing fermions (or antifermions), or of an incoming fermion with an outgoing antifermion (or vice versa).

Some applications follow in the next subsections.

6.2.10 Muon Pair Production from $$e^- e^+$$ Annihilation ($$e^- e^+ \rightarrow \mu ^- \mu ^+ $$)

Applying directly the Feynman diagram rules discussed above, the invariant amplitude for $$e^- e^+ \rightarrow \mu ^- \mu ^+ $$ (see Fig.  6.11) is:
$$\begin{aligned} -i{\mathcal M}=\left( ie\left( {\bar{v}}_B{\gamma }^{\mu }u_A\right) \right) \left( \frac{-ig_{\mu \nu }}{q^2}\right) \left( ie\left( {\bar{u}}_D{\gamma }^\nu v_C\right) \right) \, , \end{aligned}$$
(6.197)
where the spinors v are used to describe the antiparticles.
Fig. 6.11

Lowest-order Feynman diagram for electron–positron annihilation into a muon pair

As we already know, this amplitude depends on the initial and final spin configurations, and each configuration can be computed independently. In the limit where masses can be neglected it can be shown, similarly to the case of the $$e^- \mu ^- \rightarrow e^- \mu ^-$$ channel discussed above, that only four helicity combinations give a nonzero result. These configurations correspond to initial and final states with total angular momentum $$J=1$$ and projection $$\pm 1$$ along the respective direction of motion, and:
$$\begin{aligned} \mathcal M_{RL \rightarrow RL}= & {} - e^2 {\left( 1+ \cos \theta ^*\right) }\\ \mathcal M_{RL \rightarrow LR}= & {} e^2 {\left( 1- \cos \theta ^*\right) }\\ \mathcal M_{LR\rightarrow RL}= & {} e^2 {(1- \cos \theta ^*)}\\ \mathcal M_{LR \rightarrow LR}= & {} - e^2 {\left( 1+ \cos \theta ^*\right) } \, , \end{aligned}$$
where $$\theta ^*$$ is the angle in the c.m. reference frame between the electron and the muon.

The angular dependence of these amplitudes could have been predicted by observing the total angular momentum of the initial and final states. In fact, these amplitudes correspond, as stated before, to initial and final states with a total angular momentum $$J=1$$ and projection $$\pm 1$$ along the respective direction of motion. Projecting the final-state angular momentum onto the beam direction ($$J_Z$$) then implies, according to the spin-1 rotation matrices of quantum mechanics, the factor $$(1\pm \cos \theta ^*)$$.

Once again, in the case of an experiment with unpolarized beams and in which no polarization measurements of the helicities of the final states are made, the cross section is obtained averaging over the initial configurations and summing over the final ones. The mean squared amplitude is therefore defined as:
$$\begin{aligned} {<\left| {\mathcal M}\right| }^2>= & {} \frac{1}{4} ( \mathcal M_{RL \rightarrow RL}^2 + \mathcal M_{RL \rightarrow LR}^2+ \mathcal M_{LR\rightarrow RL}^2 + \mathcal M_{LR\rightarrow LR}^2)=\nonumber \\= & {} \frac{1}{4} e^4 {[2{(1+\cos \theta ^*)}^2} +{2(1-\cos \theta ^*)^2 ]} = e^4 {(1+{\cos ^2 \theta ^*}) }.\qquad \qquad \end{aligned}$$
(6.198)
The differential cross section in the c.m. reference frame is then given by:
$$\begin{aligned} \frac{d\sigma }{d\varOmega }= \frac{1}{64\pi ^2} \frac{1}{s} {<{\mathcal M}}^2>= \frac{{\alpha }^2}{4 s} {(1+{\cos ^2 \theta ^*}) } \, . \end{aligned}$$
(6.199)
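The chain from the four helicity amplitudes to Eq. (6.199) can be reproduced numerically, and integrating over the solid angle gives the familiar total cross section $$\sigma = 4\pi \alpha ^2/(3s)$$ (a standard result, not derived in the text above). The sketch below works in natural units, so $$\sigma $$ is in units of inverse energy squared; the numerical value of $$\alpha $$ and the function names are assumptions of this illustration.

```python
import math

ALPHA = 1.0 / 137.036  # approximate low-energy fine structure constant

def mean_squared(cos_theta, e2):
    """Spin-averaged |M|^2 from the four nonzero helicity amplitudes."""
    amplitudes = [
        -e2 * (1.0 + cos_theta),  # RL -> RL
        e2 * (1.0 - cos_theta),   # RL -> LR
        e2 * (1.0 - cos_theta),   # LR -> RL
        -e2 * (1.0 + cos_theta),  # LR -> LR
    ]
    return sum(a * a for a in amplitudes) / 4.0

def dsigma_domega(cos_theta, s, alpha=ALPHA):
    """<|M|^2> / (64 pi^2 s), with e^2 = 4 pi alpha."""
    e2 = 4.0 * math.pi * alpha
    return mean_squared(cos_theta, e2) / (64.0 * math.pi ** 2 * s)

def total_cross_section(s, alpha=ALPHA, n=1000):
    """2 pi times the Simpson integral of dsigma/dOmega over cos(theta) in [-1, 1]."""
    h = 2.0 / n  # n must be even
    total = dsigma_domega(-1.0, s, alpha) + dsigma_domega(1.0, s, alpha)
    for k in range(1, n):
        weight = 4.0 if k % 2 else 2.0
        total += weight * dsigma_domega(-1.0 + k * h, s, alpha)
    return 2.0 * math.pi * total * h / 3.0

if __name__ == "__main__":
    s = 100.0  # c.m. energy squared, arbitrary illustrative value
    print(dsigma_domega(0.3, s), ALPHA ** 2 / (4.0 * s) * (1.0 + 0.3 ** 2))
    print(total_cross_section(s), 4.0 * math.pi * ALPHA ** 2 / (3.0 * s))
```

Because the integrand is quadratic in $$\cos \theta ^*$$, Simpson's rule is exact here up to rounding, so the numerical integral and the closed form coincide.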
Finally, one should note that the mean squared amplitude obtained above can also be expressed in terms of the Mandelstam variables stu, as:
$$\begin{aligned} {<{\mathcal M}}^2>= 2 e^4 \frac{t^2+u^2}{s^2} \, . \end{aligned}$$
(6.200)
This formula is equivalent to the one obtained in the case of the elastic scattering of $$e^-$$ and $$ \mu ^- $$ (see Eq. 6.192) if one makes the following correspondences between the Mandelstam variables computed in the two channels:
$$\begin{aligned} {s^{pair}\rightarrow t^{scatt}} \; ; \; {t^{pair}\rightarrow u^{scatt}} \; ; \; {u^{pair}\rightarrow s^{scatt}}. \end{aligned}$$
(6.201)
In fact, the scattering (t channel) and the pair production (s channel) Feynman diagrams can be transformed into each other just by exchanging an incoming (in) external line with an outgoing (out) external line and, in this operation, transforming the corresponding particle into its antiparticle with opposite momentum and helicity (and vice versa). These exchanges translate into exchanging the four-momenta as follows:
$$\begin{aligned} P^{scatt}_{in(e^-) }\rightarrow & {} P^{pair}_{in(e^-)} ; \\ P^{scatt}_{out(e^-)}\rightarrow & {} - P^{pair}_{in(e^+)};\\ P^{scatt}_{out(\mu ^-)}\rightarrow & {} P^{pair}_{out(\mu ^-)};\\ P^{scatt}_{in(\mu ^-)}\rightarrow & {} - P^{pair}_{out(\mu ^+)} \, . \end{aligned}$$
Such relations between amplitudes corresponding to similar Feynman diagrams are called Crossing Symmetries.
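The correspondence (6.201) between the two channels can be made concrete with a short check. The sketch below simply evaluates the pair-production expression (6.200) with the Mandelstam variables relabeled according to (6.201) and compares it with the scattering expression (6.192); function names and the sample kinematic point are illustrative.

```python
def pair_mean_squared(s, t, u, e=1.0):
    """Eq. (6.200): spin-averaged |M|^2 for e- e+ -> mu- mu+."""
    return 2.0 * e ** 4 * (t * t + u * u) / (s * s)

def scattering_mean_squared(s, t, u, e=1.0):
    """Eq. (6.192): spin-averaged |M|^2 for e- mu- -> e- mu-."""
    return 2.0 * e ** 4 * (s * s + u * u) / (t * t)

def crossed_pair(s_scatt, t_scatt, u_scatt, e=1.0):
    """Pair formula with s_pair -> t_scatt, t_pair -> u_scatt, u_pair -> s_scatt."""
    return pair_mean_squared(s=t_scatt, t=u_scatt, u=s_scatt, e=e)

if __name__ == "__main__":
    # Massless kinematics: s + t + u = 0 (sample point chosen for illustration).
    s, t, u = 16.0, -4.0, -12.0
    print(crossed_pair(s, t, u), scattering_mean_squared(s, t, u))
```

The two printed values coincide, which is the statement that the two amplitudes are the same function evaluated in different kinematic regions.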
Fig. 6.12

Feynman diagrams contributing at first order to the Bhabha cross section

6.2.11 Bhabha Scattering ($$e^- e^+ \rightarrow e^- e^+ $$)

Two first-order (tree level) diagrams (Fig.  6.12) contribute to this process:
  • The first diagram corresponds to the exchange of a photon in the s channel and is, if masses are neglected, identical to the $$e^- e^+ \rightarrow \mu ^- \mu ^+$$ diagram we computed above:
    $$\begin{aligned} -i{\mathcal M_s}=\left( ie\left( {\bar{v}}_B{\gamma }^{\mu }u_A\right) \right) \left( \frac{-ig_{\mu \nu }}{q^2}\right) \left( ie\left( {\bar{u}}_D{\gamma }^\nu v_C\right) \right) \, . \end{aligned}$$
    (6.202)
  • The second diagram corresponds to the exchange of a photon in the t channel and is, if masses are neglected, similar (just exchanging a particle by an antiparticle) to the $$e^- \mu ^- \rightarrow e^- \mu ^-$$ diagram computed above:
    $$\begin{aligned} -i{\mathcal M_t}=\left( ie\left( {\bar{u}}_C{\gamma }^{\mu }u_A\right) \right) \left( \frac{-ig_{\mu \nu }}{q^2}\right) \left( ie\left( {\bar{v}}_B{\gamma }^\nu v_D\right) \right) \, . \end{aligned}$$
    (6.203)
The total amplitude is the sum of these two amplitudes:
$$\begin{aligned} {\mathcal M}= {\mathcal M^s}-{\mathcal M^t}. \end{aligned}$$
(6.204)
The minus sign comes from the antisymmetrization imposed by the Fermi statistics, and it is included in the Feynman rules (see Sect. 6.2.9).
Remembering the amplitudes computed before for the s and t channels, the nonzero spin configuration amplitudes are:
$$\begin{aligned} \mathcal M_{RR \rightarrow RR}= & {} -\mathcal M^t_{RR \rightarrow RR};\\ \mathcal M_{RL \rightarrow RL}= & {} \mathcal M^s_{RL\rightarrow RL}-\mathcal M^t_{RL\rightarrow RL};\\ \mathcal M_{RL\rightarrow LR}= & {} \mathcal M^s_{RL \rightarrow LR};\\ \mathcal M_{LR\rightarrow LR}= & {} \mathcal M^s_{LR\rightarrow LR}-\mathcal M^t_{LR\rightarrow LR};\\ \mathcal M_{LR \rightarrow RL}= & {} \mathcal M^s_{LR \rightarrow RL};\\ \mathcal M_{LL\rightarrow LL}= & {} -\mathcal M^t_{LL \rightarrow LL}. \end{aligned}$$
One should note that the $$\mathcal M_{RL \rightarrow RL}$$ and $$\mathcal M_{LR \rightarrow LR}$$ amplitudes are the sum of two amplitudes corresponding to the s and t channels and therefore when squaring them interference terms will appear.
In an experiment with unpolarized beams, in which the helicities of the final-state particles are not measured, the mean squared amplitude is obtained by averaging over the four initial helicity configurations and summing over the final ones:
$$\begin{aligned}&<\left| {\mathcal M}\right| ^2>= \frac{1}{4} \left( \left| \mathcal M_{RR \rightarrow RR}\right| ^2 + \left| \mathcal M_{RL \rightarrow RL}\right| ^2 + \left| \mathcal M_{RL \rightarrow LR}\right| ^2 \right. \nonumber \\ {}&\left. + \left| \mathcal M_{LR\rightarrow RL}\right| ^2 + \left| \mathcal M_{LR\rightarrow LR}\right| ^2+\left| \mathcal M_{LL \rightarrow LL}\right| ^2 \right) \end{aligned}$$
(6.205)
that, using the Mandelstam variables, gives (for a more detailed calculation see reference [F6.8]):
$$\begin{aligned} {<{\mathcal M}}^2>= 2 e^4 \left( \frac{t^2+{(s+t)}^2}{s^2}+\frac{s^2+{(s+t)}^2}{t^2}+2\frac{{(s+t)}^2}{st}\right) , \end{aligned}$$
(6.206)
or
$$\begin{aligned} {<{\mathcal M}}^2>= 2 e^4 \left( \frac{t^2+u^2}{s^2}+\frac{s^2+u^2}{t^2}+\frac{2u^2}{st}\right) . \end{aligned}$$
(6.207)
The first and second terms correspond to the mean squared amplitudes obtained, respectively, for the s and the t channels and the third is the contribution from the interference terms discussed above.
Since, in the center-of-mass reference frame,
$$\begin{aligned} t= - \frac{s}{2} (1-\cos \theta )=- s \sin ^2(\theta /2) \end{aligned}$$
and
$$\begin{aligned} u= - \frac{s}{2} (1+\cos \theta )=- s \cos ^2(\theta /2) \, , \end{aligned}$$
the mean squared amplitude can be expressed as:
$$\begin{aligned} {<{\mathcal M}}^2>= 2 e^4 \left( \frac{1+\cos ^2(\theta )}{2}+\frac{1+\cos ^4(\theta /2)}{\sin ^4(\theta /2)}-\frac{2\cos ^4(\theta /2)}{\sin ^2(\theta /2)}\right) \, . \end{aligned}$$
(6.208)
Finally the differential cross section in the c.m. reference frame is:
$$\begin{aligned} \frac{d\sigma }{d\varOmega }&= \frac{1}{64\pi ^2} \frac{1}{s} {<{\mathcal M}}^2> =\frac{\alpha ^2}{2s} \left( \frac{1+\cos ^2(\theta )}{2}+\frac{1+\cos ^4(\theta /2)}{\sin ^4(\theta /2)}-\frac{2\cos ^4(\theta /2)}{\sin ^2(\theta /2)}\right) = \nonumber \\&= \frac{\alpha ^2}{4s} \left( \frac{3+\cos ^2\theta }{1-\cos \theta }\right) ^2 \, . \end{aligned}$$
(6.209)
This differential cross section is highly peaked forward (in the limit of massless fermions it diverges).
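The chain of equalities from Eq. 6.207 to Eq. 6.209 can be checked numerically. The following is a minimal sketch, not part of the original derivation: it works in units of $$e^4$$, sets $$s=1$$, and assumes the assignment $$t=-s\sin ^2(\theta /2)$$, $$u=-s\cos ^2(\theta /2)$$, which is the one that reproduces Eq. 6.208. The function names are illustrative.

```python
import math

def msq_mandelstam(s, t, u):
    """Spin-averaged |M|^2 of Eq. 6.207, in units of e^4."""
    return 2.0 * ((t**2 + u**2) / s**2 + (s**2 + u**2) / t**2 + 2.0 * u**2 / (s * t))

def msq_angle(theta):
    """Spin-averaged |M|^2 of Eq. 6.208, in units of e^4."""
    c4 = math.cos(theta / 2) ** 4
    s2 = math.sin(theta / 2) ** 2
    return 2.0 * ((1 + math.cos(theta) ** 2) / 2 + (1 + c4) / s2**2 - 2 * c4 / s2)

def msq_compact(theta):
    """Compact form implied by the last line of Eq. 6.209, in units of e^4."""
    c = math.cos(theta)
    return ((3 + c**2) / (1 - c)) ** 2

S = 1.0  # the value of s drops out of the dimensionless ratios
for deg in (10, 45, 90, 135, 170):
    theta = math.radians(deg)
    t = -S * math.sin(theta / 2) ** 2  # assumed assignment, see the text above
    u = -S * math.cos(theta / 2) ** 2
    print(f"{deg:3d} deg: {msq_mandelstam(S, t, u):.6g} "
          f"{msq_angle(theta):.6g} {msq_compact(theta):.6g}")
```

The three columns should agree at every angle, and they grow rapidly toward small $$\theta $$, which is the forward peak mentioned above.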
images/304327_2_En_6_Chapter/304327_2_En_6_Fig13_HTML.gif
Fig. 6.13

Differential Bhabha cross section measured by the L3 Collaboration at $$\sqrt{s}= 198$$ GeV.

From L3 Collaboration, Phys. Lett. B623 (2005) 26

The agreement between the QED predictions (including higher-order diagrams) and the experimental measurements is so remarkable (Fig.  6.13) that this process was used at LEP to determine the beam luminosity thanks to small but precise calorimeters installed at low angles.
images/304327_2_En_6_Chapter/304327_2_En_6_Fig14_HTML.gif
Fig. 6.14

A higher-order diagram with a fermion loop

6.2.12 Renormalization and Vacuum Polarization

High-order diagrams often involve closed loops where integration over momentum should be performed (see Fig. 6.14). As these loops are virtual, they represent phenomena that occur in timescales compatible with the Heisenberg uncertainty relations. Since there is no limit on the range of the integration and on the number of diagrams, the probabilities may a priori diverge to infinity. We shall see, however, that the effect of higher-order diagrams is the redefinition of some quantities; for example, the “bare” (naked) charge of the electron becomes a new quantity e that we measure in experiments. A theory with such characteristics—i.e., a theory for which the series of the contributions from all diagrams converges—is said to be renormalizable.

To avoid confusion in what follows, we shall now call $$g_e$$ the “pure” electromagnetic coupling.

Following the example of the amplitude corresponding to the diagram represented in Fig. 6.14, the photon propagator is modified by the introduction of the integration over the virtual fermion/antifermion loop leading to
$$\begin{aligned} {{\mathcal M}}_2\sim \frac{-g_e^4}{q^4}\left( {\bar{u}}_C{\gamma }^{\mu }u_A\right) \left( {\bar{u}}_D{\gamma }_{\mu }u_B\right) \left( \int ^{\infty }_0{\frac{\left( \dots \right) }{\left( k^2-m^2\right) \left( {\left( k-q\right) }^2-m^2\right) }d^4k}\right) \end{aligned}$$
where $$g_e$$ is the “bare” coupling parameter ($$g_e=\sqrt{4\pi \alpha _{0}}$$ in the case of QED, where $$\alpha _0$$ is the “bare” coupling, without renormalization).
The integral can be computed by setting some energy cutoff M and taking the limit $$M\rightarrow \infty $$ at the end of the calculation. It can then be shown that
$$ \lim _{M\rightarrow \infty }\left( \int ^M_0{\frac{\left( {\ldots }\right) }{\left( k^2-m^2\right) \left( {\left( k-q\right) }^2-m^2\right) }d^4k}\right) \sim \frac{q^2}{12{\pi }^2}\left[ \ln \left( \frac{M^2}{m^2}\right) -f\left( \frac{{-q}^2}{m^2}\right) \right] \, , $$
where $$\left( {\dots }\right) $$ has dimensions of [m$$^2$$], and
$$\begin{aligned} {{\mathcal M}}_2\sim \frac{-g^2_e}{q^2}\left( {\dots }\right) \left( {\dots }\right) \left( 1-\frac{g_e^2}{12{\pi }^2}\left[ \ln \left( \frac{M^2}{m^2}\right) -f\left( \frac{{-q}^2}{m^2}\right) \right] \right) . \end{aligned}$$
The divergence is now logarithmic but it is still present.
images/304327_2_En_6_Chapter/304327_2_En_6_Fig15_HTML.gif
Fig. 6.15

Higher-order diagrams with a fermion loop leading to the renormalization of the fermion mass (left) and of the magnetic moment (right)

The “renormalization miracle” consists in absorbing the infinity in the definition of the coupling parameter. Defining
$$\begin{aligned} g_R\equiv g_e\sqrt{1-\frac{g_e^2}{12{\pi }^2}\ln \left( \frac{M^2}{m^2}\right) } \end{aligned}$$
(6.210)
and neglecting terms of order $$g_e^6$$ (to include them, many other diagrams would have to be summed, but their contribution is expected to be negligible), one obtains
$$\begin{aligned} {{\mathcal M}}_2\sim \frac{-{g_R}^2}{q^2}\left( {\dots }\right) \left( {\dots }\right) \left[ 1+\frac{{g_R}^2}{12{\pi }^2}f\left( \frac{{-q}^2}{m^2}\right) \right] \, . \end{aligned}$$
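The intermediate step can be made explicit. As a sketch, using the shorthand (introduced here only for compactness) $$L\equiv \ln (M^2/m^2)$$ and $$f\equiv f(-q^2/m^2)$$, the definition in Eq. 6.210 can be inverted order by order and inserted into the cutoff-dependent amplitude:
$$\begin{aligned} g_R^2&= g_e^2\left( 1-\frac{g_e^2}{12{\pi }^2}L\right) \quad \Rightarrow \quad g_e^2 = g_R^2\left( 1+\frac{g_R^2}{12{\pi }^2}L\right) + O(g_R^6) \, ; \\ g_e^2\left[ 1-\frac{g_e^2}{12{\pi }^2}\left( L-f\right) \right]&= g_R^2\left( 1+\frac{g_R^2}{12{\pi }^2}L\right) \left[ 1-\frac{g_R^2}{12{\pi }^2}\left( L-f\right) \right] + O(g_R^6) = g_R^2\left[ 1+\frac{g_R^2}{12{\pi }^2}f\right] + O(g_R^6) \, . \end{aligned}$$
The two terms proportional to $$L$$ cancel at order $$g_R^4$$, so the cutoff M no longer appears at this order.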
$${{\mathcal M}}_2\ $$ is no longer divergent, but the coupling parameter $$g_R$$ (the electric charge) is now a function of $$q^2$$:
$$\begin{aligned} g_R\left( q^2\right) =g_R\left( q^2_0\right) \sqrt{1+\frac{g_R^2\left( q^2_0\right) }{12{\pi }^2}f\left( \frac{{-q}^2}{m^2}\right) }\ \, . \end{aligned}$$
(6.211)
Other diagrams, such as those represented in Fig. 6.15, lead to the renormalization of other fundamental constants. In the left diagram, the “emission” and “absorption” of a virtual photon by one of the fermion external lines contribute to the renormalization of the fermion mass, while in the one on the right, the “emission” and “absorption” of a virtual photon between the fermion external lines of the same vertex contribute to the renormalization of the fermion magnetic moment and are thus central in the calculation of $$(g-2)$$, as discussed in Sect. 6.2.4.6. The contributions of these kinds of diagrams to the renormalization of the charge cancel out, ensuring that the electron and the muon charges remain the same.
The result in Eq. 6.211 can be written at first order as
$$\begin{aligned} \alpha (q^2) \simeq \alpha (\mu ^2) \frac{1}{1- \frac{\alpha (\mu ^2)}{3\pi } \ln \frac{q^2}{\mu ^2}} \, . \end{aligned}$$
(6.212)
The electromagnetic coupling can be obtained by an appropriate renormalization of the electron charge defined at an arbitrary scale $$\mu ^2$$. The electric charge, and the electromagnetic coupling parameter, “run” and increase with $$q^2$$. At momentum transfers close to the electron mass $$\alpha \simeq 1/137$$, while close to the Z mass $$\alpha \sim 1/128$$. The “running” behavior of the coupling parameters is not a mathematical artifact: it is experimentally well established that the strength of the electromagnetic interaction between two charged particles increases as the center-of-mass energy of the collision increases (Fig. 6.16).
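To get a feeling for the size of the effect, Eq. 6.212 can be evaluated numerically. The sketch below is only illustrative: it assumes a single fermion loop (the electron), takes the electron mass as the reference scale $$\mu $$, and uses rounded values for the constants. The function name `alpha_running` is a choice made here.

```python
import math

ALPHA_LOW = 1 / 137.035999  # coupling at momentum transfers near the electron mass
M_E = 0.51099895e-3         # electron mass in GeV (assumed reference scale mu)
M_Z = 91.1876               # Z boson mass in GeV

def alpha_running(q2, mu2, alpha_mu):
    """One-loop running coupling of Eq. 6.212 with a single fermion loop."""
    return alpha_mu / (1 - alpha_mu / (3 * math.pi) * math.log(q2 / mu2))

alpha_mz = alpha_running(M_Z**2, M_E**2, ALPHA_LOW)
print(f"1/alpha at the Z mass (electron loop only): {1 / alpha_mz:.2f}")
```

With the electron loop alone the inverse coupling decreases from about 137 to about 134.5. The value of about 128 quoted in the text requires the loops of all charged fermions, each weighted by its squared charge (and by the number of colors for quarks).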
images/304327_2_En_6_Chapter/304327_2_En_6_Fig16_HTML.gif
Fig. 6.16

Evolution of the QED effective coupling parameter with momentum transfer. The theoretical curve is compared with measurements at the Z mass at CERN’s LEP $$e^+e^-$$ collider.

From CERN Courier, August 2001

images/304327_2_En_6_Chapter/304327_2_En_6_Fig17_HTML.gif
Fig. 6.17

Left: Artistic representation of the screening of a charge by its own cloud of virtual charged particle–antiparticle pairs. Right: Artistic view of the Casimir effect.

From the Scientific American blog of Jennifer Ouellette, April 19, 2012

Such an effect can be qualitatively described by the polarization of the cloud of virtual fermion/antifermion pairs (mainly electron/positron pairs) by the “bare” charge, which is at the same time the source of the electromagnetic field (Fig. 6.17, left). The bare charge is screened by this polarized medium, so the effective charge decreases with the distance from it (and, equivalently, increases with the square of the transferred momentum).

Even in the absence of any “real” matter particle (i.e., in the vacuum), there is no empty space in quantum field theory. A rich spectrum of virtual wave particles (e.g., photons) can be created and destroyed under the protection of the Heisenberg uncertainty relations and, within their limits, be transformed into fermion/antifermion pairs. Space is thus full of electromagnetic waves, and the energy of its ground state (the zero-point energy) is, like the ground-state energy of any harmonic oscillator, different from zero. The integral of this ground-state energy over all space is infinite, which poses an enormous challenge to theoretical physicists: how is this effect related to the nonzero cosmological constant that may explain the accelerated expansion of the Universe observed in recent years, as discussed in Sect. 8.1?

A spectacular consequence is the attraction experienced by two neutral conducting plates placed face to face at very short distances, typically of the order of a micrometer (see Fig. 6.17, right). This effect is known as the Casimir effect, since it was predicted by Hendrik Casimir5 in 1948 and later demonstrated experimentally. The two plates impose boundary conditions on the electromagnetic waves originating from the vacuum fluctuations, and the total energy decreases as the plates get closer, so that the net result is a very small but measurable attractive force.

A theory is said to be renormalizable if (as in QED) all the divergences at all orders can be absorbed into physical constants; corrections are then finite at any order of the perturbative expansion. The present theory of the so-called standard model of particle physics was proven to be renormalizable. In contrast, the quantization of general relativity leads easily to non-renormalizable terms and this is one of the strong motivations for alternative theories (see Chap. 7). Nevertheless, the fact that a theory is not renormalizable does not mean that it is useless: it might just be an effective theory that works only up to some physical scale.

6.3 Weak Interactions

Weak interactions have a short range and, contrary to the other interactions, do not bind particles together. Their existence was first revealed in $$\beta $$ decay, and their universality was the object of many controversies until it was finally established in the second half of the twentieth century. All fermions have weak charges and are thus subject to their subtle or dramatic effects. The structure of the weak interactions was found to be similar to the structure of QED, and this fact is at the basis of one of the most important and beautiful pieces of theoretical work of the twentieth century: the Glashow–Weinberg–Salam model of electroweak interactions, which, together with the theory of strong interactions (QCD), constitutes the standard model (SM) of particle physics, which will be discussed in the next chapter.

There are however striking differences between QED and weak interactions: parity is conserved, as it was expected, in QED, but not in weak interactions; the relevant symmetry group in weak interactions is SU(2) (fermions are grouped in left doublets and right singlets) while in QED the symmetry group is U(1); in QED there is only one massless vector boson, the photon, while weak interactions are mediated by three massive vector bosons, the $$W^{\pm }$$ and the Z .

6.3.1 The Fermi Model of Weak Interactions

The $$\beta $$ decay had been known for a long time when Enrico Fermi realized, in 1933, that the associated transition amplitude could be written in a way similar to QED (see Sect. 6.2.8). Assuming time reversal symmetry (see the discussion on crossing symmetries at the end of Sect. 6.2.10), one can see that the transition amplitude for $$\beta $$ decay,
$$\begin{aligned} n\rightarrow \ p\ e^{-\ }{\overline{\nu }}_e \, , \end{aligned}$$
(6.213)
is, for instance, the same as:
$$\begin{aligned} \begin{aligned} \ {\nu }_e\ n\rightarrow \ p\ e^{-\ };&\\ e^{-\ }p\ \rightarrow \ n\ {\nu }_e \text { (K capture)};&\\ {\overline{\nu }}_e\ p\ \rightarrow \ n\ e^{+\ } \text {(inverse }\beta \text { decay)} \, .&\end{aligned} \end{aligned}$$
(6.214)
The transition amplitude can then be seen as the interaction of a hadronic and a leptonic current (Fig. 6.18) and may be written, in analogy to the electron–muon elastic scattering discussed before (Fig. 6.10), as
$$\begin{aligned} {\mathcal M}=G_F\left( \left( {\bar{u}}_p{\gamma }^{\mu }u_n\right) \right) \left( \left( {\bar{u}}_e{\gamma }_{\mu }u_{{\nu }_e}\right) \right) \, . \end{aligned}$$
(6.215)
images/304327_2_En_6_Chapter/304327_2_En_6_Fig18_HTML.gif
Fig. 6.18

Current–current description of the $$\beta $$ decay in the Fermi model

Contrary to QED, in the Fermi model of weak interactions fermions change their identity in the interaction ($$n\rightarrow \ p$$; $${\nu }_e\ \rightarrow \ \ e^-$$), currents mix different charges (the electric charges of the initial states are not the same as those of the final states), and there is no propagator (the currents meet at a single point: this is a contact interaction).

The coupling parameter $$G_F$$, known nowadays as the Fermi constant, replaces the $$e^2/q^2$$ factor present in the QED amplitudes and thus has dimensions $$E^{-2}$$ (GeV$$^{-2}$$ in natural units). Its order of magnitude, deduced from the measurements of the $$\beta $$ decay rates, is $$G_F \sim (300\,\mathrm{GeV})^{-2} \sim 10^{-5}\, \mathrm{GeV^{-2}}$$ (see Sect. 6.3.3). Assuming point-like interactions has striking consequences: the Fermi weak interaction cross sections diverge at high energies. On a dimensional basis, one can deduce for instance that the neutrino–nucleon cross section behaves like:
$$\begin{aligned} \sigma \sim {G_F}^2\ \ E^2\ \, . \end{aligned}$$
(6.216)
The cross section grows with the square of the center-of-mass energy, and this behavior is indeed observed in low-energy neutrino scattering experiments.
However, from quantum mechanics, it is well known that a cross section can be decomposed into a sum over all the possible angular momenta l and then
$$\begin{aligned} \sigma \le \ \frac{4\pi }{k^2}\ \sum ^{\infty }_{l=0}{\left( 2l+1\right) } \, . \end{aligned}$$
(6.217)
Since $$\lambda =1/k$$, this relation just means that the contribution of each partial wave is bounded and that its scale is given by the area ($$\pi {\lambda }^2$$) “seen” by the incident particle. In a contact interaction, the impact parameter is zero and so the only possible contribution is the S wave ($$l=0$$). Thus, the neutrino–nucleon cross section cannot increase forever. Given the magnitude of the Fermi constant $$G_F$$, the Fermi model of weak interactions cannot be valid for center-of-mass energies above a few hundred GeV (this bound is commonly said to be imposed by unitarity, in the sense that the probability of an interaction cannot be larger than 1).
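The energy scale at which the Fermi model must fail can be estimated by equating the dimensional cross section of Eq. 6.216 with the S-wave term of Eq. 6.217. The sketch below is an order-of-magnitude estimate only: it assumes $$k \sim E$$ and drops all numerical factors of order one in the cross section, so only the scale of the result is meaningful.

```python
import math

G_F = 1.1663787e-5  # Fermi constant in GeV^-2

def fermi_cross_section(energy):
    """Dimensional estimate sigma ~ G_F^2 E^2 (Eq. 6.216), in GeV^-2."""
    return G_F**2 * energy**2

def s_wave_bound(energy):
    """l = 0 term of Eq. 6.217 with k ~ E (an assumption), in GeV^-2."""
    return 4 * math.pi / energy**2

# Solving G_F^2 E^2 = 4 pi / E^2 for E gives the energy where the bound is saturated.
e_max = (4 * math.pi) ** 0.25 / math.sqrt(G_F)
print(f"Fermi model saturates the S-wave bound near E ~ {e_max:.0f} GeV")
```

The estimate lands at a few hundred GeV, in line with the statement above.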
In 1938 Oskar Klein suggested that the weak interactions may be mediated by a new short-range field, the weak field, whose massive charged bosons (the $$W^{\pm }$$) enter the amplitudes through their propagators. In practice (see Sect. 6.3.5),
$$\begin{aligned} G_F \rightarrow \frac{g^2_w}{q^2-M_W^2} \,. \end{aligned}$$
(6.218)
Within this framework the weak cross sections no longer diverge, and the Fermi model is a low-energy approximation, valid whenever the center-of-mass energy $$\sqrt{s}\ll m_W$$ ($$m_W \sim 80$$ GeV).
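How good the contact approximation is can be quantified from Eq. 6.218. For a spacelike momentum transfer $$q^2=-Q^2$$, the ratio between the propagator factor and its $$q^2\rightarrow 0$$ limit is $$M_W^2/(Q^2+M_W^2)$$. The following sketch evaluates this ratio; the value of $$M_W$$ is rounded and the function name is illustrative.

```python
M_W = 80.4  # W boson mass in GeV (rounded)

def propagator_ratio(big_q):
    """Ratio of 1/(q^2 - M_W^2) to its q^2 -> 0 limit, for q^2 = -Q^2 (Q in GeV)."""
    return M_W**2 / (big_q**2 + M_W**2)

for big_q in (0.001, 1.0, 10.0, 80.4, 500.0):
    print(f"Q = {big_q:7.3f} GeV  ->  ratio = {propagator_ratio(big_q):.6f}")
```

At the MeV scale of $$\beta $$ decay the ratio is indistinguishable from 1, which is why the Fermi model works so well there, while at $$Q \sim M_W$$ the amplitude is already reduced by a factor of two.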
The discovery of the muon extended the applicability of the Fermi model of weak interactions. Bruno Pontecorvo realized in the late 1940s that the capture of a muon by a nucleus,
$$\begin{aligned} {\mu }^-p\rightarrow n\ {\nu }_{\mu } \end{aligned}$$
as well as its weak decay (Fig. 6.19)
$$\begin{aligned} {\mu }^-\rightarrow e^-\ {\nu }_{\mu }{\overline{\nu }}_e \end{aligned}$$
may be described by the Fermi model6 as
$$\begin{aligned} {\mathcal M}=G_F\left( \left( {\bar{u}}_{\nu _\mu }{\gamma }^{\rho }u_\mu \right) \right) \left( \left( {\bar{u}}_e{\gamma }_{\rho }u_{{\nu }_e}\right) \right) \, . \end{aligned}$$
(6.219)
images/304327_2_En_6_Chapter/304327_2_En_6_Fig19_HTML.gif
Fig. 6.19

Current–current description of the muon decay in the Fermi model

Although $$\beta $$ and $$\mu $$ decays are due to the same type of interaction, their phenomenology is different:
  • the neutron lifetime is $$\sim $$900 s while the muon lifetime is $$\sim $$2.2 $$\upmu $$s;

  • the energy spectrum of the decay electron is continuous in both cases (three-body decay), but its shape is quite different (Fig. 6.20). While in $$\beta $$ decay it vanishes at the endpoint, in $$\mu $$ decay it is clearly nonzero.

These striking differences are basically a reflection of the decay kinematics.
images/304327_2_En_6_Chapter/304327_2_En_6_Fig20_HTML.gif
Fig. 6.20

Electron energy spectrum in $$\beta $$ decay of thallium 206 (left) and in $$\mu $$ decay (right). Sources: F.A. Scott, Phys. Rev. 48 (1935) 391; ICARUS Collaboration (S. Amoruso et al.), Eur. Phys. J. C33 (2004) 233

Using once again dimensional arguments, the decay width of these particles should behave as
$$\begin{aligned} \varGamma \sim {G_F}^2\ \ {{\varDelta } E}^5\ \, \end{aligned}$$
(6.220)
where $${\varDelta }E$$ is the energy released in the decay. In the case of the $$\beta $$ decay:
$$\begin{aligned} \varDelta E_n \sim \left( m_n-m_p\right) \sim 1.29\ \mathrm {MeV} \end{aligned}$$
while in the $$\mu $$ decay
$$\begin{aligned} \varDelta E_\mu \sim m_{\mu }\sim 105\ \mathrm {MeV} \end{aligned}$$
and therefore
$$\begin{aligned} \varDelta E_n ^5 \ll \varDelta E_\mu ^5 \, . \end{aligned}$$
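The scaling of Eq. 6.220 can be compared with the measured lifetimes. The sketch below is a rough dimensional check, not a calculation of either lifetime: it ignores phase-space factors, form factors, and any other numerical coefficients, and uses the approximate lifetimes quoted above.

```python
DELTA_E_N = 1.293     # MeV, neutron-proton mass difference
DELTA_E_MU = 105.66   # MeV, muon mass
TAU_N = 900.0         # s, approximate neutron lifetime quoted above
TAU_MU = 2.2e-6       # s, approximate muon lifetime quoted above

# Gamma ~ G_F^2 (Delta E)^5 implies tau_n / tau_mu ~ (Delta E_mu / Delta E_n)^5
predicted_ratio = (DELTA_E_MU / DELTA_E_N) ** 5
observed_ratio = TAU_N / TAU_MU

print(f"predicted tau_n/tau_mu ~ {predicted_ratio:.2e}")
print(f"observed  tau_n/tau_mu ~ {observed_ratio:.2e}")
```

The fifth-power law alone accounts for about nine of the roughly nine orders of magnitude separating the two lifetimes; the remaining factor of order ten comes from the numerical coefficients neglected here.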
On the other hand, the shape of the electron energy spectrum at the endpoint is determined by the available phase space. At the endpoint, the electron is aligned against the other two decay products but, while in the $$\beta $$ decay the proton is basically at rest (or remains “imprisoned” inside the nucleus) and there is only one possible configuration in the final state, in the case of $$\mu $$ decay, as neutrinos have negligible mass, the number of endpoint configurations is quite large reflecting the different ways to share the remaining energy between the neutrino and the antineutrino.

6.3.2 Parity Violation

The conservation of parity (see Sect. 5.​3.​6) was a dogma for physicists until the 1950s. Then, a puzzle appeared: apparently two strange mesons, denominated $${\theta }^+$$ and $${\tau }^+$$ (we know nowadays that $${\theta }^+$$ and $${\tau }^+$$ are the same particle: the $$K^+$$ meson), had the same mass, the same lifetime but different parities according to their decay modes:
$$\begin{aligned} \theta ^+&\rightarrow \ {\pi }^+{\pi }^0 \; \;\qquad (\mathrm{{even\;parity}})\end{aligned}$$
(6.221)
$$\begin{aligned} \tau ^+&\rightarrow \ {\pi }^+{{\pi }^+\pi }^- \quad (\mathrm{{odd\;parity}}). \end{aligned}$$
(6.222)
At the 1956 Rochester conference, the conservation of parity in weak decays was questioned by Feynman, reporting a suggestion of Martin Block. A few months later, Lee and Yang reviewed all the past experimental data and found that there was no evidence of parity conservation in weak interactions, and they proposed new experimental tests based on the measurement of observables depending on axial vectors.
images/304327_2_En_6_Chapter/304327_2_En_6_Fig21_HTML.gif
Fig. 6.21

Conceptual (left) and schematic (right) diagram of the experimental apparatus used by Wu et al. (1957) to detect the violation of the parity symmetry in $$\beta $$ decay. The green arrow in the left panel indicates the direction of the electron flow through the solenoid coils. The left plot comes from Wikimedia commons; the right plot from the original article by Wu et al. Physical Review 105 (1957) 1413

C. S. Wu (known as “Madame Wu”) was able, in a few months, to design and perform a $$\beta $$ decay experiment where nuclei of $${}^{\mathrm{60}}{\mathrm{Co\ }}$$ (with total angular momentum $$\mathrm{J}=5$$) decay into an excited state $${}^{\mathrm{60}}{{\mathrm{Ni}}^{{**}}}$$ (with total angular momentum $$\mathrm{J}=4$$):
$$\begin{aligned} \mathrm{\ }{}^{\mathrm{60}}{\mathrm{Co\ }}\rightarrow \ {}^{\mathrm{60}}{{\mathrm{Ni}}^{{**}}{{ e}}^{{ -}}{\overline{\nu }}_{\mathrm{e}}} \end{aligned}$$
(6.223)
The $${}^{\mathrm{60}}{\mathrm{Co\ }}$$ was polarized (a strong magnetic field was used, and the temperatures were as low as a few mK) and the number of decay electrons emitted in the direction of (or opposite to) the polarization field was measured (Fig. 6.21). The observed angle $$\theta $$ between the electron and the polarization direction followed a distribution of the form:
$$\begin{aligned} N\left( \theta \right) \sim 1-P\ \beta \cos \theta \ \end{aligned}$$
(6.224)
where P is the degree of polarization of the nuclei and $$\beta $$ is the speed of the electron normalized to the speed of light.
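The size of the asymmetry implied by Eq. 6.224 can be illustrated by integrating the distribution over the two hemispheres. This is a minimal sketch: the values of P and $$\beta $$ used below are purely illustrative (they are not the values of the Wu experiment), and the solid-angle weight $$\sin \theta $$ is included explicitly.

```python
import math

def angular_rate(theta, polarization, beta):
    """Angular distribution of Eq. 6.224 (up to normalization)."""
    return 1 - polarization * beta * math.cos(theta)

def hemisphere_count(polarization, beta, theta_min, theta_max, steps=10000):
    """Midpoint-rule integral of N(theta) sin(theta) between two polar angles."""
    width = (theta_max - theta_min) / steps
    total = 0.0
    for i in range(steps):
        theta = theta_min + (i + 0.5) * width
        total += angular_rate(theta, polarization, beta) * math.sin(theta)
    return total * width

def asymmetry(polarization, beta):
    """(along - against) / (along + against) with respect to the polarization."""
    along = hemisphere_count(polarization, beta, 0.0, math.pi / 2)
    against = hemisphere_count(polarization, beta, math.pi / 2, math.pi)
    return (along - against) / (along + against)

print(asymmetry(0.6, 0.6))  # illustrative values of P and beta
```

Analytically the asymmetry is $$-P\beta /2$$: it is negative (more electrons opposite to the polarization) and vanishes for unpolarized nuclei.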
The electrons were emitted preferentially in the direction opposite to the polarization of the nuclei, thus violating parity conservation. In fact under a parity transformation, the momentum of the electron (a vector) reverses its direction while the magnetic field (an axial vector) does not (Fig. 6.22). Pauli placed a bet: “I don’t believe that the Lord is a weak left-hander, and I am ready to bet a very high sum that the experiment will give a symmetric angular distributions of electrons”—and lost.
images/304327_2_En_6_Chapter/304327_2_En_6_Fig22_HTML.gif
Fig. 6.22

Parity transformation of electron and magnetic field direction. The Wu experiment preferred the right side of the mirror to the left one

6.3.3 V-A Theory

The universality of the Fermi model of weak interactions was questioned long before the Wu experiment. In the original Fermi model, only $$\beta $$ decays in which there was no angular momentum change in the nucleus (Fermi transitions) were allowed, while the existence of $$\beta $$ decays in which the spin of the nucleus changed by one unit (the Gamow–Teller transitions) was already well established. The Fermi model had to be generalized.

In the most general way, the currents involved in the weak interactions could be written as a sum of Scalar (S), Pseudoscalar (P), Vector (V), Axial (A), or Tensor (T) terms, following the Dirac bilinear forms referred to in Sect. 6.2.4:
$$\begin{aligned} J_{1,2}=\sum _i{C_i\left( {\bar{u}}_1{\varGamma }_iu_2\right) } \end{aligned}$$
(6.225)
where $$C_i$$ are arbitrary complex constants and the $${\varGamma }_i\ $$ are S, P, V, A, T operators.
At the end of 1956, George Sudarshan, a young Indian Ph.D. student working at the University of Rochester under the supervision of Robert Marshak, realized that the results on the electron–neutrino angular correlation reported by several experiments were not consistent. Sudarshan suggested that the weak interaction had a V-A structure. This structure was (in Feynman's own words) “publicized by Feynman and Gell-Mann” in 1958 in a widely cited article.
Each vectorial current in the Fermi model is, in the (V-A) theory, replaced by a vectorial minus an axial-vectorial current. For instance, the neutrino–electron vectorial current present in the $$\beta $$ decay and in the muon decay amplitudes (Eqs. 6.215 and 6.219; Figs. 6.18 and 6.19):
$$\begin{aligned} \left( {\bar{u}}_e{\gamma }_{\mu }u_{{\nu }_e}\right) \end{aligned}$$
(6.226)
is replaced by
$$\begin{aligned} \left( {\bar{u}}_e{\gamma }_{\mu } (1-\gamma ^5) u_{{\nu }_e}\right) . \end{aligned}$$
(6.227)
In terms of the Feynman diagrams, the factor associated with the vertex becomes
$$\begin{aligned} {\gamma }^{\mu } (1-\gamma ^5) \, . \end{aligned}$$
(6.228)
Within the (V-A) theory, the transition amplitude of the muon decay, which is a golden example of a leptonic weak interaction, can then be written as:
$$\begin{aligned} {\mathcal M}=\frac{G_F}{\sqrt{2}}\left( {\bar{u}}_{{\nu }_{\mu }}{\gamma }^{\rho }(1-{\gamma }^5)u_{\mu }\right) \left( {\bar{u}}_e{\gamma }_{\rho }(1-{\gamma }^5)u_{{\nu }_e}\right) . \end{aligned}$$
(6.229)
The factor $$\sqrt{2}$$ is introduced in order that $$G_F$$ keeps the same numerical value. The only relevant change in relation to the Fermi model is the replacement:
$$\begin{aligned} {{\gamma }^{\mu }\rightarrow \gamma }^{\mu }(1-{\gamma }^5) \, . \end{aligned}$$
The muon lifetime can now be computed using the Fermi golden rule. This detailed computation, which is beyond the scope of the present text, leads to:
$$\begin{aligned} {\tau }_{\mu }=\frac{192\ {\pi }^3}{{G_F}^2\ \ {m_{\mu }}^5}\, \end{aligned}$$
(6.230)
showing the $${{\mathrm{m}}_{\mu }}^{-5}$$ dependence anticipated in Sect. 6.3.1 based just on dimensional arguments.
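Equation 6.230 is written in natural units; to obtain a lifetime in seconds one divides $$\hbar $$ by the decay width. The sketch below does this conversion with rounded constants, as a consistency check of the formula rather than a precision calculation (radiative corrections and the electron mass are neglected, as in Eq. 6.230).

```python
import math

G_F = 1.1663787e-5      # Fermi constant in GeV^-2
M_MU = 0.1056583755     # muon mass in GeV
HBAR = 6.582119569e-25  # reduced Planck constant in GeV s

def muon_width():
    """Decay width in GeV, i.e. the inverse of Eq. 6.230 in natural units."""
    return G_F**2 * M_MU**5 / (192 * math.pi**3)

tau_mu = HBAR / muon_width()
print(f"tau_mu ~ {tau_mu * 1e6:.3f} microseconds")
```

The result is about 2.19 $$\upmu $$s, consistent with the measured lifetime of about 2.2 $$\upmu $$s quoted in Sect. 6.3.1. This is not an independent prediction, since $$G_F$$ is itself extracted from the muon lifetime.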
In practice, it is the measurement of the muon lifetime which is used to derive the value of the Fermi constant:
$$\begin{aligned} G_F=1.166\,378\,7(6) \times 10^{-5}\ \mathrm {GeV}^{-2}\simeq \frac{1}{{\left( 300\ \mathrm {GeV} \right) }^2}\, . \end{aligned}$$
(6.231)
The transition amplitude of the $$\beta $$ decay can, in an analogous way, be written as
$$\begin{aligned} {\mathcal M}=\frac{G^*_F}{\sqrt{2}}\left( {\bar{u}}_p{\gamma }^{\mu }{(C_V-{C_A\gamma }^5)u}_n\right) \left( {\bar{u}}_e{\gamma }_{\mu }(1-{\gamma }^5)u_{{\nu }_e}\right) . \end{aligned}$$
(6.232)
The $$C_V$$ and $$C_A$$ constants reflect the fact that the neutron and the proton are not point-like particles, and thus form factors may lead to a change in their weak charges. Experimentally, the measurement of many nuclear $$\beta $$ decays is compatible with the preservation of the value of the “vector weak charge” and a 25% change in the axial charge:
$$\begin{aligned} C_V=1.000 \end{aligned}$$
$$\begin{aligned} C_A=1.255\pm \mathrm{0.006} \, . \end{aligned}$$
The value of $$G^*_F$$ was found to be slightly lower (by about 2%) than the one obtained from the muon decay. This “discrepancy” was cured with the introduction of the Cabibbo angle, as will be discussed in Sect. 6.3.6.

6.3.4 “Left” and “Right” Chiral Particle States

The violation of parity in weak interactions observed in the Wu experiment and embedded in the (V-A) structure can be translated in terms of interactions between particles with well-defined states of chirality.

“Chiral” states are eigenstates of $${\gamma }^5$$, and they coincide with the helicity states for massless particles; however, no such particles (massless 4-spinors) appear to exist, to our present knowledge—neutrinos have very tiny mass. The operators $$\frac{1}{2}\left( 1+{\gamma }^5\right) $$ and $$\frac{1}{2}(1-{\gamma }^5)$$, when applied to a generic particle bi-spinor u, (Sect. 6.2.4) project, respectively, on eigenstates with chirality $$+$$1 (R—Right) and −1 (L—Left). Chiral particle spinors can thus be defined as
$$\begin{aligned} u_L=\frac{1}{2}\left( 1-{\gamma }^5\right) \ u \; ; \; u_R=\frac{1}{2}\left( 1+{\gamma }^5\right) \ u \end{aligned}$$
(6.233)
with $$u = u_L + u_R$$. The adjoint spinors are given by
$$\begin{aligned} {\bar{u}}_L=\bar{u}\frac{1}{2}\left( 1+{\gamma }^5\right) \ \; ; \; {\bar{u}}_R=\bar{u}\frac{1}{2}\left( 1-{\gamma }^5\right) . \end{aligned}$$
(6.234)
For antiparticles
$$\begin{aligned} v_L=\,\frac{1}{2}\left( 1+{\gamma }^5\right) \ v \;&; \; v_R=\,\frac{1}{2}\left( 1-{\gamma }^5\right) \ v \end{aligned}$$
(6.235)
$$\begin{aligned} {\bar{v}}_L=\bar{v}\frac{1}{2}\left( 1-{\gamma }^5\right) \;&; \; {\bar{v}}_R=\bar{v}\frac{1}{2}\left( 1+{\gamma }^5\right) . \end{aligned}$$
(6.236)
Chiral states are closely related to helicity states but they are not identical. In fact, applying the chiral projection operators defined above to the helicity eigenstates (Sect. 6.2.4) one obtains, for instance, for the right helicity eigenstate:
$$\begin{aligned} u_\uparrow = \left( \frac{1}{2}(1-\gamma ^5) + \frac{1}{2} (1+\gamma ^5) \right) u_\uparrow = \frac{1}{2} \left( 1 + \frac{p}{E+m} \right) u_R + \frac{1}{2} \left( 1 - \frac{p}{E+m} \right) u_L \, . \end{aligned}$$
(6.237)
In the limit $$m \rightarrow 0$$ or $$p \rightarrow \infty $$, right helicity and right chiral eigenstates coincide, otherwise not.
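The approach to this limit can be made quantitative by evaluating the two coefficients of Eq. 6.237 for an electron at different energies. The sketch below only computes the coefficients multiplying $$u_R$$ and $$u_L$$ as they appear in that equation; the chosen energies are illustrative.

```python
import math

M_E = 0.51099895  # electron mass in MeV

def chiral_coefficients(energy, mass):
    """Coefficients of u_R and u_L in the right-helicity spinor (Eq. 6.237)."""
    momentum = math.sqrt(energy**2 - mass**2)
    kappa = momentum / (energy + mass)
    return 0.5 * (1 + kappa), 0.5 * (1 - kappa)

for energy in (M_E, 1.0, 10.0, 1000.0):  # total energy in MeV
    coeff_right, coeff_left = chiral_coefficients(energy, M_E)
    print(f"E = {energy:8.3f} MeV: u_R coefficient = {coeff_right:.5f}, "
          f"u_L coefficient = {coeff_left:.5f}")
```

At rest the two chiral components enter with equal weight, while for $$E \gg m$$ the left-chiral admixture falls off approximately as $$m/(2E)$$.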

There is also a subtle but important difference: helicity is not Lorentz invariant but it is time invariant $$([h, H] = 0)$$, while chirality is Lorentz invariant but it is not time invariant $$([\gamma ^5,H] \propto m)$$. The above relation is basically valid for $$t \sim 0$$.

Now, since
$$\begin{aligned} {\gamma }_{\mu }\left( \frac{1-{\gamma }^5}{2}\right) ={\left( \frac{1+{\gamma }^5}{2}\right) \gamma _\mu }\left( \frac{1-{\gamma }^5}{2}\right) \end{aligned}$$
(6.238)
the weak (V-A) neutrino–electron current (Eq. 6.227) can be written as:
$$\begin{aligned} {\bar{u}}_e{\gamma }_{\mu }(1\,-\,{\gamma }^5)u_{{\nu }_e} =2\left[ {\bar{u}}_e{\left( \frac{1+{\gamma }^5}{2}\right) \gamma _\mu }\left( \frac{1-{\gamma }^5}{2}\right) u_{{\nu }_e}\right] =2\left( {\bar{u}}_{eL}{\gamma }_{\mu }u_{{\nu }_{eL}}\right) :\end{aligned}$$
(6.239)
the weak charged leptonic current involves then only chiral left particles (and right chiral antiparticles).
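The identity in Eq. 6.238 relies only on the projector property of $$\frac{1}{2}(1\pm {\gamma }^5)$$ and on the anticommutation of $${\gamma }^5$$ with the $${\gamma }^{\mu }$$ matrices. It can be verified with explicit matrices; the sketch below assumes the Dirac representation of the $$\gamma $$ matrices and uses plain Python lists, with helper names chosen here for illustration.

```python
def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def combine(a, b, coeff_a, coeff_b):
    """Linear combination coeff_a * a + coeff_b * b."""
    n = len(a)
    return [[coeff_a * a[i][j] + coeff_b * b[i][j] for j in range(n)]
            for i in range(n)]

def is_close(a, b, tol=1e-12):
    n = len(a)
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(n) for j in range(n))

def block(a, b, c, d):
    """Assemble a 4x4 matrix from four 2x2 blocks [[a, b], [c, d]]."""
    return [a[0] + b[0], a[1] + b[1], c[0] + d[0], c[1] + d[1]]

def neg(a):
    return [[-x for x in row] for row in a]

I2 = [[1, 0], [0, 1]]
Z2 = [[0, 0], [0, 0]]
PAULI = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]

# Dirac representation (an assumption of this sketch)
I4 = block(I2, Z2, Z2, I2)
ZERO4 = block(Z2, Z2, Z2, Z2)
GAMMAS = [block(I2, Z2, Z2, neg(I2))] + [block(Z2, s, neg(s), Z2) for s in PAULI]
GAMMA5 = block(Z2, I2, I2, Z2)

P_LEFT = combine(I4, GAMMA5, 0.5, -0.5)   # (1 - gamma5) / 2
P_RIGHT = combine(I4, GAMMA5, 0.5, 0.5)   # (1 + gamma5) / 2

projectors_ok = (is_close(matmul(P_LEFT, P_LEFT), P_LEFT)
                 and is_close(matmul(P_RIGHT, P_RIGHT), P_RIGHT)
                 and is_close(matmul(P_LEFT, P_RIGHT), ZERO4))
identity_ok = all(is_close(matmul(g, P_LEFT), matmul(P_RIGHT, matmul(g, P_LEFT)))
                  for g in GAMMAS)
print(projectors_ok, identity_ok)
```

The check is done with upper-index matrices; lowering the index only changes the sign of the spatial ones, so Eq. 6.238 holds in the same way.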
In the case of the $${}^{\mathrm{{60}}}{{\mathrm{{Co}}}}\,\beta $$ decay (the Wu experiment), the electron and antineutrino masses can be neglected, and so the antineutrino must have right helicity and the electron left helicity. Thus, as the spins of the electron and of the antineutrino have to add up to compensate the change by one unit in the spin of the nucleus, the electron is preferentially emitted in the direction opposite to the polarization of the nucleus (Fig. 6.23).
images/304327_2_En_6_Chapter/304327_2_En_6_Fig23_HTML.gif
Fig. 6.23

Schematic representation of the spin alignment in the $${}^{\mathrm{60}}{\mathrm{Co\ }}$$ $$\beta $$ decay

The confirmation of the negative helicity of neutrinos came from a sophisticated and elegant experiment by M. Goldhaber, L. Grodzins, and A. Sunyar in 1957, studying neutrinos produced in a K capture process ($$e^{-\ }p\ \rightarrow \ n\ {\nu }_e$$). A source emits europium nuclei ($${}^{152}\mathrm{{Eu}}$$, J $$=$$ 0) on a polarized electron target producing excited $${\mathrm{{Sm}}^*}$$ (J $$=$$ 1) and a neutrino,
$$\begin{aligned} {}^{152}\mathrm{{Eu}}\ \ e^- \rightarrow \ {}^{152}{\mathrm{{Sm}}^*}\ {\nu }_e , \end{aligned}$$
and the $${\mathrm{{Sm}}^*}$$ decays to the ground state $${}^{152}\mathrm{{Sm}}$$ (J $$=$$ 0),
$$\begin{aligned} {}^{152}{\mathrm{{Sm}}^*}\ \rightarrow {}^{152}\mathrm{{Sm}}\ \ \gamma \, . \end{aligned}$$
The longitudinal polarization of the decay photon was then correlated with the helicity of the emitted neutrino in the K capture process. The result was conclusive: neutrinos were indeed left-handed particles.
The accurate calculation of the ratio of the decay width of charged $$\pi ^\pm $$ mesons into electron neutrinos with respect to muon neutrinos was also one of the successes of the (V-A) theory. According to (V-A) theory at first order:
$$\begin{aligned} \frac{BR\left( {\pi }^-\rightarrow e^-\ {\overline{\nu }}_e\right) }{BR\left( {\pi }^-\rightarrow {\mu }^-\ {\overline{\nu }}_\mu \right) } =\frac{m^2_e{\left( m^2_{\pi }-m^2_e\right) }^2}{m^2_{\mu }{\left( m^2_{\pi }-m^2_{\mu }\right) }^2} \simeq 1.28 \times {10}^{-4} \, , \end{aligned}$$
(6.240)
while at the time this ratio was first computed the experimental limit was wrongly much smaller $$\left( {<}{10}^{-6}\right) $$. In fact, the (V-A) theoretical prediction is confirmed by the present experimental determination:
$$\begin{aligned} \frac{BR\left( {\pi }^-\rightarrow e^-\ {\overline{\nu }}_e\right) }{BR\left( {\pi }^-\rightarrow {\mu }^-\ {\overline{\nu }}_\mu \right) } \simeq 1.2 \times {10}^{-4} \, . \end{aligned}$$
(6.241)
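The first-order prediction of Eq. 6.240 can be checked numerically. This is a short sketch; the mass values (in MeV) are assumed rounded inputs and the names are illustrative.

```python
# Assumed rounded masses in MeV (illustrative inputs)
M_E, M_MU, M_PI = 0.511, 105.658, 139.570

def pion_branching_ratio(m_e=M_E, m_mu=M_MU, m_pi=M_PI):
    """Ratio BR(pi -> e nu) / BR(pi -> mu nu) from Eq. 6.240."""
    helicity_factor = (m_e / m_mu) ** 2
    phase_space_factor = ((m_pi**2 - m_e**2) / (m_pi**2 - m_mu**2)) ** 2
    return helicity_factor * phase_space_factor

ratio = pion_branching_ratio()
print(f"helicity suppression (m_e/m_mu)^2 = {(M_E / M_MU) ** 2:.3e}")
print(f"predicted ratio                   = {ratio:.3e}")
```

The helicity factor $$m_e^2/m_\mu ^2$$ alone is of order $$10^{-5}$$; the phase space factor, which favors the electron channel, raises the ratio to about $$1.28 \times 10^{-4}$$.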
In the framework of the (V-A) theory, if leptons were massless these weak decays would be forbidden. In fact, the pion has spin 0 and the antineutrino is a right-handed particle; thus, to conserve angular momentum, the helicity of the electron should be positive (Fig. 6.24), which is impossible for a massless left-handed electron. However, the suppression of the decay into electron neutrino relative to the decay into muon neutrino, contrary to what would be expected from the available decay phase space, is not a proof of the (V-A) theory. It can be shown that a theory with V or A couplings (or any combination of them) would also imply a suppression factor of the order $$m_e^2/m_\mu ^2$$ (for a detailed discussion see Sect. 7.4 of reference [F6.2]).
images/304327_2_En_6_Chapter/304327_2_En_6_Fig24_HTML.gif
Fig. 6.24

Schematic representation of the spin alignment in the $$\pi ^-$$ decay

As a last example, the neutrino and antineutrino handedness is revealed in the observed ratio of the antineutrino and neutrino cross sections on isoscalar nuclei N (with an equal number of protons and neutrons) at GeV energies:
$$\begin{aligned} \frac{\sigma \left( \ {\overline{\nu }}_{\mu \ }N\rightarrow {\mu }^+X\right) }{\sigma \left( \ {\nu }_{\mu \ }N\rightarrow {\mu }^-X\right) }\sim \frac{1}{3} \, . \end{aligned}$$
(6.242)
Note that at these energies, the neutrinos and the antineutrinos interact directly with the quarks and antiquarks the protons and neutrons are made of (similarly to the electrons in the deep inelastic scattering discussed in Sect. 5.​5.​3).
Let us now consider just valence quarks in a first approximation. As electric charge and leptonic number are conserved, a neutrino can just pick up a d quark transforming it into a u quark and emitting a $$\mu ^-$$. Antineutrinos will do the opposite. In these conditions, neglecting masses, all fermions have negative helicity and all antifermions have positive helicity. The total angular momentum is therefore 0 for neutrino interactions and 1 for antineutrino interactions (Fig. 6.25). Thus, the former interaction will be isotropic while the amplitude of the latter will be weighted by a factor $$1/2(1+\cos \theta )$$. Then
$$\begin{aligned} \frac{d\sigma \left( {\overline{\nu }}_{\mu \ }u\rightarrow {\mu }^+d\right) }{d\varOmega }=\frac{d\sigma \left( {\nu }_{\mu \ }d\rightarrow {\mu }^-u\right) }{d\varOmega }\ \frac{(1+\cos \theta )^2}{4} \end{aligned}$$
(6.243)
and integrating over the solid angle
$$\begin{aligned} \frac{\sigma \left( {\overline{\nu }}_{\mu \ }u\rightarrow {\mu }^+d\right) }{\sigma \left( {\nu }_{\mu \ }d\rightarrow {\mu }^-u\right) }=\ \frac{1}{3} \, . \end{aligned}$$
(6.244)
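The integration leading to Eq. 6.244 can be written out explicitly. Assuming, as above, that the neutrino cross section is isotropic, the azimuthal integration cancels in the ratio and only the integral over $$\cos \theta $$ remains:
$$\begin{aligned} \frac{\int _{-1}^{1}\frac{(1+\cos \theta )^2}{4}\, d\cos \theta }{\int _{-1}^{1} d\cos \theta } = \frac{1}{2}\cdot \frac{1}{4}\left[ \frac{(1+\cos \theta )^3}{3}\right] _{-1}^{1} = \frac{1}{2}\cdot \frac{1}{4}\cdot \frac{8}{3} = \frac{1}{3} \, . \end{aligned}$$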
images/304327_2_En_6_Chapter/304327_2_En_6_Fig25_HTML.gif
Fig. 6.25

Schematic representation of the spin alignments in $${\nu }_{\mu \ }d\rightarrow {\mu }^-u$$ (left) and in $${\overline{\nu }}_{\mu \ }u\rightarrow {\mu }^+d$$ (right) interactions

6.3.5 Intermediate Vector Bosons

Four-fermion interaction theories (like the Fermi model, see Sect. 6.3.1) violate unitarity at high energy and are not renormalizable (not all infinities can be absorbed into running physical constants, see Sect. 6.2.12). The path to solving this problem was to construct, in analogy with QED, a gauge theory of weak interactions leading to the introduction of intermediate vector bosons with spin 1: the $$W^{\pm }$$ and the Z. However, in order to model the short range of the weak interactions, such bosons could not have zero mass, and thus would violate the gauge symmetry. The problem was solved by the introduction of spontaneously broken symmetries, which then led to the prediction of the existence of the so-called Higgs boson.

In this section, the modification introduced on the structure of the charged weak currents as well as the discovery of the neutral currents and of the $$W^{\pm }$$ and the Z bosons will be briefly reviewed. The overall discussion on the electroweak unification and its experimental tests will be the object of the next chapter.

6.3.5.1 Charged Weak Currents

The structure of the weak charged and of the electromagnetic interactions became similar with the introduction of the $$W^{\pm }$$ bosons, with the relevant difference that weak-charged interactions couple left-handed fermions (right-handed antifermions) belonging to SU(2) doublets, while electromagnetic interactions couple fermions belonging to U(1) singlets irrespective of chirality.

The muon decay amplitude deduced in (V-A) theory (Eq. 6.229) is now, introducing the massive $$W^{\pm }$$ propagator (Fig. 6.26), written as:
$$\begin{aligned} {\mathcal M}=\frac{g_W}{\sqrt{2}}\left( {\bar{u}}_{{\nu }_{\mu }}\frac{1}{2}{\gamma }^{\mu }{(1-{\gamma }^5)u}_{\mu }\right) \frac{-i\left( g_{\mu \nu }-q_{\mu }q_{\nu }/{M_W}^2\right) }{\left( q^2-{M_W}^2\right) }\frac{g_W}{\sqrt{2}}\left( {\bar{u}}_e\frac{1}{2}{\gamma }^\nu (1-{\gamma }^5)u_{{\nu }_e}\right) \end{aligned}$$
(6.245)
or
$$\begin{aligned} {\mathcal M}=\frac{{g_W}^2}{8}\left( {\bar{u}}_{{\nu }_{\mu }}{\gamma }^{\mu }{(1-{\gamma }^5)u}_{\mu }\right) \frac{-i\left( g_{\mu \nu }-q_{\mu }q_{\nu }/{M_W}^2\right) }{\left( q^2-{M_W}^2\right) }\left( {\bar{u}}_e{\gamma }^\nu (1-{\gamma }^5)u_{{\nu }_e}\right) . \end{aligned}$$
(6.246)
Introducing explicitly the left chiral spinors:
$$\begin{aligned} {\mathcal M}=\frac{{g_W}^2}{2}\ \left( {\bar{u}}_{{\nu }_{\mu L}}{\gamma }^{\mu }u_{{\mu }_L}\right) \frac{-i\left( g_{\mu \nu }-q_{\mu }q_{\nu }/{M_W}^2\right) }{\left( q^2-{M_W}^2\right) }\ \left( {\bar{u}}_{e_L}{\gamma }^\nu u_{{\nu }_{eL}}\right) \, . \end{aligned}$$
(6.247)
The derivation of the expression of the propagator for a massive spin 1 boson is based on the Proca equation (Sect. 6.2.1) and is beyond the scope of the present text. But whenever the term $$(q_{\mu }q_{\nu }/{M_W}^2)$$ can be neglected, a Yukawa-type expression, $${g_{\mu \nu }}/{\left( q^2-{M_W}^2\right) }$$, is recovered. In the low-energy limit $$\left( q^2\ \ll {M_W}^2\right) $$, the two coupling parameters (Eqs. 6.229 and 6.245) are thus related by:
$$\begin{aligned} G_F=\frac{\sqrt{2}}{8}\ \frac{{g_W}^2}{{M_W}^2} \, . \end{aligned}$$
(6.248)
The smallness of $$G_F$$ is thus due to the large mass of the $$W^{\pm }$$ and not to a small coupling: $$g_W$$ is of the same order of magnitude as the electromagnetic coupling g.
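Equation 6.248 can be inverted to estimate $$g_W$$ from measured quantities. The sketch below is illustrative: the value of $$G_F$$, the comparison with $$e=\sqrt{4\pi \alpha }$$ (natural units, $$\alpha \simeq 1/137$$), and the definition $$\alpha _W = g_W^2/4\pi $$ are assumptions introduced here for the comparison.

```python
import math

G_F = 1.1664e-5            # Fermi constant in GeV^-2 (assumed input value)
M_W = 80.385               # W mass in GeV (world average quoted in this chapter)
ALPHA_EM = 1.0 / 137.036   # fine structure constant (assumed input value)

# Invert Eq. 6.248: G_F = sqrt(2)/8 * g_W^2 / M_W^2
g_w_squared = 8.0 * G_F * M_W**2 / math.sqrt(2.0)
g_w = math.sqrt(g_w_squared)

alpha_w = g_w_squared / (4.0 * math.pi)   # weak analogue of alpha (assumed definition)
e_charge = math.sqrt(4.0 * math.pi * ALPHA_EM)

print(f"g_W     = {g_w:.3f}")
print(f"e       = {e_charge:.3f}")
print(f"alpha_W = 1/{1.0 / alpha_w:.1f}")
```

With these inputs $$g_W$$ comes out about twice as large as e: the weak interaction is "weak" at low energy only because of the factor $$1/{M_W}^2$$.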
images/304327_2_En_6_Chapter/304327_2_En_6_Fig26_HTML.gif
Fig. 6.26

First-order Feynman diagram for muon decay

6.3.5.2 Neutral Weak Currents

Neutral weak currents were predicted long before their discovery at CERN in 1973 (N. Kemmer 1937, O. Klein 1938, S. A. Bludman 1958). Indeed the SU(2) structure of charged interactions (leptons organized in weak isospin doublets) suggested the existence of a triplet of weak bosons similarly to the pion triplet responsible for the proton–neutron strong isospin rotations.

However, if the charged components were the $$W^{\pm }$$, the neutral boson could not be the $$\gamma $$, which has no weak charge. Furthermore, in the 1960s it was discovered that strangeness-changing neutral currents (for instance $$K^+\rightarrow {\pi }^+\nu \ \overline{\nu }$$) were highly suppressed, and thus some thought that neutral weak interactions might not exist. Many theorists, however, became enthusiastic about neutral currents around the 1970s, since they were embedded in the work by Glashow, Salam, and Weinberg on electroweak unification (the GSW model, see Sect. 7.2). From the experimental point of view, it was clearly a very difficult issue, and the previous experimental searches for neutral weak processes had led just to upper limits.

Neutrino beams were the key to such searches. In fact, as neutrinos do not have electromagnetic and strong charges, their only possible interaction is the weak one. Neutrino beams are produced in the laboratory (Fig. 6.27, left) by the decay of secondary pions and kaons coming from the interaction of a primary high-energy proton beam on a fixed target. The charge and the momentum range of the pions and kaons can be selected using a sophisticated magnetic focusing system (narrow-band beam) or just loosely selected, maximizing the beam intensity (wide-band beam). The energy spectra of such beams are quite different (Fig. 6.27, right). While the narrow-band beam has an almost flat energy spectrum, the wide-band beam is normally peaked at low energies.
images/304327_2_En_6_Chapter/304327_2_En_6_Fig27_HTML.gif
Fig. 6.27

Left: Neutrino narrow-band beam (top) and wide-band beam (bottom) production. Right: Narrow-band (lower curve) and wide-band (upper curve) neutrino energy spectra. The y-axis represents the number of particles per bunch

In the 1960s, a large heavy liquid bubble chamber (18 tons of freon under a pressure of 10–15 atmospheres, in a magnetic field of 2 T) called Gargamelle was proposed by André Lagarrigue from the École Polytechnique in Paris. The chamber was built in Saclay and installed at CERN. Gargamelle could collect a significant number (one order of magnitude above the previous experiments) of neutrino interactions (Fig. 6.28). Its first physics priority was, in the beginning of the 1970s, the test of the structure of protons and neutrons just revealed in the deep inelastic scattering experiment at SLAC (Sect. 5.​5.​3).
images/304327_2_En_6_Chapter/304327_2_En_6_Fig28_HTML.gif
Fig. 6.28

Technicians at work in the Gargamelle bubble chamber at CERN. Source: CERN

In a batch of about 700 000 photos of neutrino interactions, one event emerged as anomalous. In that photo (Fig. 6.29, left), taken with an antineutrino beam, just an electron was visible (giving rise to a small electromagnetic cascade). This event is a perfect candidate for a $${\overline{\nu }}_{\mu }\ \ e^-\rightarrow {\overline{\nu }}_{\mu }\ \ e^-\ $$interaction (Fig. 6.29, right). The background in the antineutrino beam was estimated to be negligible.
images/304327_2_En_6_Chapter/304327_2_En_6_Fig29_HTML.gif
Fig. 6.29

Left: Gargamelle image (top) and sketch (bottom) of the first observed neutral-current process $${\overline{\nu }}_{\mu }\ \ e^-\rightarrow {\overline{\nu }}_{\mu }\ \ e^-$$. A muon antineutrino coming from the left knocks an electron forward, creating a small shower of electron–positron pairs. Source: CERN. Right: First-order Feynman diagram for the neutral leptonic weak interactions $${\overline{\nu }}_{\mu }\ \ e^-\rightarrow {\overline{\nu }}_{\mu }\ \ e^-$$

Neutral-current interactions should be even more visible in the semileptonic channel. Their signature should be clear: in charged semileptonic weak interactions, an isolated muon and several hadrons could be produced in the final state, while in the interactions mediated by the neutral current there could be no muon (Fig. 6.30).
images/304327_2_En_6_Chapter/304327_2_En_6_Fig30_HTML.gif
Fig. 6.30

First-order Feynman diagrams for the charged (left) and neutral (right) semileptonic weak interactions

However, the background from neutron interactions in the chamber, the neutrons being produced in neutrino interactions upstream of the detector, is not negligible. Careful background estimation had to be performed. The final result, after several months of work and public discussions, was that the number of events without a muon was clearly above the expected number of background events. The existence of the weak neutral currents was finally firmly established.

6.3.5.3 The Discovery of the W and Z Bosons

Neutral currents did exist, and the GSW model proposed a complete and unified framework for electroweak interactions: the intermediate vector bosons should be there (with expected masses around 65 and 80 GeV for the $$W^{\pm }$$ and the Z, respectively, based on the data known at that time). They had to be found.

In 1976, Carlo Rubbia pushed the idea of converting the existing Super Proton Synchrotron accelerator at CERN (or the equivalent machine at Fermilab) into a proton/antiproton collider. It was not necessary to build a new accelerator (protons and antiprotons would travel in opposite directions within the same vacuum tube) but antiprotons had to be produced and kept alive for many hours to be accumulated in an auxiliary storage ring. Another big challenge was to keep the beam focused. Simon van der Meer made this possible by developing an ingenious strategy of beam cooling, to decrease the angular dispersion while maintaining monochromaticity. At the beginning of the 1980s, the CERN SPS collider operating at a center-of-mass energy of 540 GeV was able to produce the first $$W^{\pm }$$ and Z (Fig. 6.31) by quark/antiquark annihilation ($$u\ \bar{u}\rightarrow Z$$; $$d\ \bar{d}\rightarrow Z$$; $$u\ \bar{d}\rightarrow W^+$$; $$d\ \bar{u}\rightarrow W^-$$).
images/304327_2_En_6_Chapter/304327_2_En_6_Fig31_HTML.gif
Fig. 6.31

$$W^{\pm }$$ and Z production in proton/antiproton colliders

The leptonic decay channels with electrons and muons in the final state were the most obvious signatures to detect the long-awaited bosons. The hadronic decay channels, as well as final states with tau leptons, suffer from a huge hadronic background due to the “normal” quark and gluon strong interactions. Priority was then given to searches in the channels:
$$\begin{aligned} p\ \overline{p}\rightarrow ZX\rightarrow e^-e^+X \; ; \; p\ \overline{p}\rightarrow ZX\rightarrow {\mu }^-{\mu }^+X \end{aligned}$$
(6.249)
and
$$\begin{aligned} p\ \overline{p}\rightarrow W^{\pm }\mathrm{\ }X\rightarrow e^{\pm }{\nu }_{e\ }X \; ; \; p\ \overline{p}\rightarrow W^{\pm }\mathrm{\ }X\rightarrow {\mu }^{\pm }{\nu }_{\mu \ }X \, . \end{aligned}$$
(6.250)
Two general-purpose experiments, UA1 and UA2, were built having the usual “onion” structure (a tracking detector surrounded by electromagnetic and hadronic calorimeters, surrounded by an exterior layer of muon detectors). In the case of UA1, the central detector (tracking and electromagnetic calorimeter) was immersed in a 0.7 T magnetic field, perpendicular to the beam line, produced by a magnetic coil (Fig. 6.32); the iron return yoke of the field was instrumented to operate as a hadronic calorimeter. UA1 was designed to be as hermetic as possible.
images/304327_2_En_6_Chapter/304327_2_En_6_Fig32_HTML.gif
Fig. 6.32

Longitudinal cross section of the UA1 detector.

The first $$W^{\pm }$$ and Z events were recorded in 1983. $$Z\rightarrow e^-e^+$$ events were characterized by two isolated high-energy deposits in the cells of the electromagnetic calorimeter (Fig. 6.33 left), while $$W^{\pm }\rightarrow e^{\pm }{\nu }_{e}$$ events were characterized by an isolated high-energy deposit in the cells of the electromagnetic calorimeter and a large missing transverse energy (Fig. 6.33 right).
images/304327_2_En_6_Chapter/304327_2_En_6_Fig33_HTML.gif
Fig. 6.33

Left: Two high-energy deposits from a $$Z\rightarrow e^-e^+$$ event seen in the electromagnetic calorimeter of the UA2 experiment. Right: A high-energy deposit with accompanying missing transverse momentum from a $$W^{\pm }\rightarrow e^{\pm }{\nu }_{e}$$ event

The Z mass in this type of events can be reconstructed just computing the invariant mass of the final state electron and positron:
$$\begin{aligned} m^2_{Z}\cong 4{\ E}_1E_2 \; {\sin }^2\left( \alpha /2\right) , \end{aligned}$$
(6.251)
where $$\alpha $$ is the angle between the electron and positron.
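Equation 6.251 translates directly into a few lines of code. The sketch below neglects the lepton masses, as the equation does; the function name and the example energies are illustrative and are not taken from the UA1 or UA2 data.

```python
import math

def invariant_mass(e1, e2, alpha):
    """Invariant mass of two massless particles (Eq. 6.251).

    e1, e2: energies in GeV; alpha: opening angle in radians.
    """
    return 2.0 * math.sqrt(e1 * e2) * math.sin(alpha / 2.0)

# Illustrative example: a back-to-back pair with 45.6 GeV per lepton
print(f"m = {invariant_mass(45.6, 45.6, math.pi):.1f} GeV")

# A collinear pair has zero invariant mass whatever the energies
print(f"m = {invariant_mass(45.6, 45.6, 0.0):.1f} GeV")
```

For a Z produced at rest the two leptons are back to back and each carries half of the mass; when the Z is boosted, the energies and the opening angle change but the combination of Eq. 6.251 does not.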
The distribution of the measured $$m_{Z}$$ for the first $$Z\rightarrow e^-e^+$$ and $$Z \rightarrow \mu ^+\mu ^-$$ candidate events by UA1 and UA2 is represented in Fig. 6.34. The best-fit value presented by Carlo Rubbia in his Nobel lecture (1984) was $$m_{Z}=(95.6\,{\pm }\, 1.4\,{\pm }\, 2.9)$$ GeV; the present value, after LEP, is $$91.1876\pm 0.0021$$ GeV.
images/304327_2_En_6_Chapter/304327_2_En_6_Fig34_HTML.gif
Fig. 6.34

Invariant mass distribution for the first candidate $$Z \rightarrow e^+e^-$$ and $$Z \rightarrow \mu ^+\mu ^-$$ events recorded by UA1 and UA2 (from the Nobel lecture of Carlo Rubbia, ©The Nobel Foundation). A clear peak of 17 events is visible around 95 GeV

The reconstruction of the $$W^{\pm }$$ mass is more subtle: the missing energy does not allow a full kinematical constraint. The best way is to take it from the shape of the differential $$W^{\pm }$$ cross section as a function of the transverse momentum of the decay electron (the so-called Jacobian peak method). In fact, neglecting the electron and neutrino masses and the transverse momentum of the $$W^{\pm }$$ itself, the transverse momentum of the electron is given by
$$\begin{aligned} P_T\cong \frac{m_W}{2}\sin {\theta }^*, \end{aligned}$$
(6.252)
where $${\theta }^*$$ is the electron emission angle with respect to the beam direction in the $$W^{\pm }$$ rest frame. Then
$$\begin{aligned} \cos {\theta }^*={\sqrt{1-4\ \frac{P^2_T}{m^2_W}}} \end{aligned}$$
(6.253)
and
$$\begin{aligned} \left| \frac{d\cos \theta ^*}{dP_T}\right| =\frac{4P_T/m^2_W}{\sqrt{1-4 \frac{P^2_T}{m^2_W}}} \, . \end{aligned}$$
(6.254)
Writing the differential cross section as
$$\begin{aligned} \frac{d\sigma }{dP_T}=\frac{d\sigma }{d\cos {\theta }^* }\ \left| \frac{d\cos \theta ^*}{dP_T}\right| \end{aligned}$$
(6.255)
it is clear (Fig. 6.35) that a peak is present at
$$\begin{aligned} P_T=\frac{m_W}{2} \, . \end{aligned}$$
(6.256)
The values of $$m_W$$ measured by UA1 and UA2 were, respectively, $$m_W=(82.7\pm 1.0\pm 2.7)$$ GeV and $$m_W=(80.2\pm 0.6\pm 0.5)$$ GeV; the present world average is $$(80.385 \pm 0.015)$$ GeV.
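The origin of the Jacobian peak can be seen by evaluating the magnitude of the factor in Eq. 6.254 for increasing transverse momenta. This is a minimal sketch for an ideal detector; the function name and the sampled values of $$P_T$$ are illustrative assumptions.

```python
import math

def jacobian(pt, m_w):
    """|d cos(theta*) / d P_T| from Eq. 6.254; it diverges as P_T -> m_W / 2."""
    x = 1.0 - 4.0 * pt * pt / (m_w * m_w)
    if x <= 0.0:
        raise ValueError("P_T must be below m_W / 2")
    return (4.0 * pt / (m_w * m_w)) / math.sqrt(x)

M_W = 80.385  # GeV, world average quoted in this chapter

for pt in (10.0, 20.0, 30.0, 38.0, 40.0):
    print(f"P_T = {pt:4.1f} GeV -> Jacobian = {jacobian(pt, M_W):.4f} GeV^-1")
```

The factor grows steeply as $$P_T$$ approaches $$m_W/2$$ and is not defined above it; in a real measurement the finite width of the W, its own transverse momentum, and the detector resolution smear the edge, as sketched in Fig. 6.35.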
images/304327_2_En_6_Chapter/304327_2_En_6_Fig35_HTML.gif
Fig. 6.35

Differential $$W^{\pm }$$ cross section as a function of transverse momentum. The gray (black) line refers to a measurement with an ideal (real) detector

Finally, the V-A character of the charged weak interactions, as well as the fact that the W has spin 1, is revealed by the differential cross section as a function of $$\cos {\theta }^*$$ for the electron produced in the W leptonic decay, which displays a $${\left( 1+\cos {\theta }^*\right) }^2$$ dependence (Fig. 6.36).
images/304327_2_En_6_Chapter/304327_2_En_6_Fig36_HTML.gif
Fig. 6.36

The angular distribution of the electron emission angle $$\theta ^*$$ in the rest frame of the W after correction for experimental acceptance, as measured by the UA1 detector (from the Nobel lecture of Carlo Rubbia, ©The Nobel Foundation)

In fact, at CERN collider energies, neglecting the masses of the quarks and leptons and considering that the $$W^{\pm }$$ are mainly produced by the interaction of valence quarks (from the proton) and valence antiquarks (from the antiproton), the spin of the $$W^{\pm }$$ is aligned along the antiproton beam direction, and thus the electron (positron) is emitted preferentially in the proton (antiproton) beam direction (Fig. 6.37).
images/304327_2_En_6_Chapter/304327_2_En_6_Fig37_HTML.gif
Fig. 6.37

Helicity in the $$W^{\pm }$$ production and decay

6.3.6 The Cabibbo Angle and the GIM Mechanism

The universality of weak interactions established at the end of the 1940s (see Sect. 6.3.1) was questioned when it was discovered that some strange particle decays (as for instance $$K^-\rightarrow {\mu }^-\ {\overline{\nu }}_{\mu }$$ or $${\varLambda } \rightarrow p\ e^{-\ }{\overline{\nu }}_e$$) were suppressed by a factor of around 20 relative to what was expected.

The problem was solved in 1963 by Nicola Cabibbo,7 who suggested that the quark weak and strong eigenstates may not be the same. At that time only the u, d, and s quarks were known (Sect. 5.7.2) and Cabibbo conjectured that the two quarks with electric charge $$-1/3$$ (d and s) mixed into a weak eigenstate $$d'$$ such that:
$$\begin{aligned} d' = d \cos {\theta }_c + s \sin {\theta }_c , \end{aligned}$$
(6.257)
where $${\theta }_c$$ is a mixing angle, designated as the Cabibbo angle.
Then the W couplings involved in the $$\mu $$, n, and $${\varLambda }$$ decays are, respectively, $$g_W$$, $$g_W\cos {\theta }_c$$, and $$g_W\sin {\theta }_c$$ (Fig. 6.38). The value of the Cabibbo angle is not predicted in the theory of electroweak interactions. Its present (PDG 2016) experimental value is $$\sin {\theta }_c = 0.2248 \pm 0.0006$$, which corresponds to an angle of about $${\mathrm{13}}^\circ $$.
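The numbers quoted above can be cross-checked in a few lines. The sketch below assumes that the relative rate of a strangeness-changing decay with respect to a similar strangeness-conserving one is governed only by the ratio of the squared couplings, $$\tan ^2{\theta }_c$$; phase space and form factors are ignored.

```python
import math

SIN_THETA_C = 0.2248  # central value quoted in the text (PDG 2016)

theta_c_deg = math.degrees(math.asin(SIN_THETA_C))

# Squared couplings relative to g_W^2
strange_coupling_sq = SIN_THETA_C**2             # s -> u transitions
non_strange_coupling_sq = 1.0 - SIN_THETA_C**2   # d -> u transitions

suppression = strange_coupling_sq / non_strange_coupling_sq  # tan^2(theta_c)

print(f"theta_c     = {theta_c_deg:.2f} degrees")
print(f"sin^2       = {strange_coupling_sq:.4f}")
print(f"suppression = 1/{1.0 / suppression:.1f}")
```

Under this assumption the suppression factor is close to the value of about 20 that originally put universality in question.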
images/304327_2_En_6_Chapter/304327_2_En_6_Fig38_HTML.gif
Fig. 6.38

Weak decay couplings: Leptonic (top), semileptonic involving (bottom), and not involving (middle) strange quarks

images/304327_2_En_6_Chapter/304327_2_En_6_Fig39_HTML.gif
Fig. 6.39

Possible s and d quark transitions generated by Z (top) and $$W^{\pm }$$ (bottom) couplings (three families)

In the Cabibbo model, transitions between the s and d quarks would happen via both neutral currents (through the Z) and charged currents (through double $$W^{\pm }$$ exchange), as shown in Fig. 6.39. Decays like $$K^0\rightarrow {\mu }^-{\mu }^+$$ would then be allowed (Fig. 6.40), both at leading order and at one loop. However, the experimental branching ratio of the $$K^0\rightarrow {\mu }^-{\mu }^+$$ process is of the order of $${10}^{-9}$$: flavor-changing neutral currents (FCNC) appear to be strongly suppressed, even below what is predicted taking into account only the diagram involving double W exchange.
images/304327_2_En_6_Chapter/304327_2_En_6_Fig40_HTML.gif
Fig. 6.40

$$K^0\rightarrow {\mu }^-{\mu }^+$$ decay diagrams

images/304327_2_En_6_Chapter/304327_2_En_6_Fig41_HTML.gif
Fig. 6.41

FCNC suppression by diagram cancellation

images/304327_2_En_6_Chapter/304327_2_En_6_Fig42_HTML.gif
Fig. 6.42

The two orthogonal combinations of the quarks s and d in the $$d'$$ and $$s'$$ states

Glashow, Iliopoulos, and Maiani proposed in 1970 the introduction of a fourth quark, the charm quark c, to symmetrize the weak currents, organizing the quarks into two SU(2) doublets. This scheme, known as the GIM mechanism, solves the FCNC puzzle and was spectacularly confirmed with the discovery of the J/$$\psi $$ meson (see Sect. 5.4.4). FCNC are in this mechanism suppressed by the cancelation of the two lowest diagrams in Fig. 6.41. In fact, in the limit of equal masses the cancelation would be perfect but, as the c mass is much higher than the u mass, the sum of the diagrams will lead to terms proportional to $$m^2_c/m^2_{Z, W}$$.

There are now two orthogonal combinations of the quarks s and d (Fig. 6.42):
$$\begin{aligned} d'= & {} d \cos \theta _c + s \sin \theta _c\\ s'= & {} - d \sin \theta _c + s \cos \theta _c \end{aligned}$$
which couple, via the $$W^{\pm }$$, respectively to the u and c quarks.
The GIM mechanism can be translated in a matrix form as
$$\begin{aligned} \left( \begin{array}{c} d'\\ s' \end{array}\right) = V_C \left( \begin{array}{c} d \\ s \end{array} \right) =\left( \begin{array}{cc} \cos {\theta }_c &{} \sin {\theta }_c \\ -\sin {\theta }_c &{} \cos {\theta }_c \end{array} \right) \left( \begin{array}{c} d \\ s \end{array} \right) \, \end{aligned}$$
(6.258)
where $$V_C$$ is a 2 $$\times $$ 2 rotation matrix.
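The way the second doublet removes the tree-level flavor-changing term can be illustrated numerically. In the sketch below (function names are illustrative) the neutral current is assumed to couple to the combination $$\bar{d'}d' + \bar{s'}s'$$, whose coefficients in the (d, s) basis are the elements of $$V_C^{T} V_C$$; with $$d'$$ alone, the coefficient of the mixed $$\bar{d}s$$ term is $$\sin {\theta }_c\cos {\theta }_c$$.

```python
import math

def cabibbo_matrix(theta_c):
    """2 x 2 rotation matrix V_C of Eq. 6.258."""
    c, s = math.cos(theta_c), math.sin(theta_c)
    return [[c, s], [-s, c]]

def neutral_current_coefficients(rows):
    """Coefficients of dbar_i d_j summed over the given weak eigenstates."""
    n = len(rows[0])
    return [[sum(r[i] * r[j] for r in rows) for j in range(n)] for i in range(n)]

theta_c = math.asin(0.2248)
v_c = cabibbo_matrix(theta_c)

# Cabibbo model: only d' couples to the neutral current
cabibbo_only = neutral_current_coefficients([v_c[0]])
# GIM: both d' and s' couple, with the same strength
gim = neutral_current_coefficients(v_c)

print(f"d-s term, d' only    : {cabibbo_only[0][1]:.4f}")
print(f"d-s term, d' and s'  : {gim[0][1]:.4f}")
```

The off-diagonal coefficient vanishes identically once $$s'$$ is included, which is just the statement that $$V_C$$ is orthogonal.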

6.3.7 Extension to Three Quark Families: The CKM Matrix

A generic mixing matrix for three families can be written as
$$\begin{aligned} V_{CKM}= \left( \begin{array}{ccc} V_{ud} &{} V_{us} &{} V_{ub} \\ V_{cd} &{} V_{cs} &{} V_{cb} \\ V_{td} &{} V_{ts} &{} V_{tb} \end{array} \right) , \end{aligned}$$
(6.259)
meaning that, for example, the square of the coupling of the b quark to the u quark in the weak transition (which is in turn proportional to the probability of the transition) would be:
$$\begin{aligned} |g_{ub}|^2 = |V_{ub}|^2 g^2_W \, . \end{aligned}$$
(6.260)
The Japanese physicists Makoto Kobayashi and Toshihide Maskawa proposed this form of quark mixing matrix in 1973. Their work built on that of Cabibbo and extended the concept of quark mixing from two to three generations of quarks. It should be noted that, at that time, the third generation had not been observed yet and even the second was not fully established. But, as we shall see, the extension to three families makes it possible to explain qualitatively the violation of the CP symmetry, i.e., of the product of the operations of charge conjugation and parity. In 2008, Kobayashi and Maskawa shared one half of the Nobel Prize in Physics “for the discovery of the origin of the broken symmetry which predicts the existence of at least three families of quarks in nature.”
A priori, since the $$V_{ij}$$ are complex numbers, the CKM matrix might have $$2N^2$$ degrees of freedom; however, the physical constraints reduce the number of free parameters to $$(N-1)^2$$. The physical constraints are:
  • Unitarity. If there are only three quark families, one must have
    $$\begin{aligned} V^\dag V = I , \end{aligned}$$
    (6.261)
    where I is the identity matrix. This will guarantee that in an effective transition each u-type quark will transform into one of the three d-type quarks (i.e., that the current is conserved and no fourth generation is present). This constraint reduces the number of degrees of freedom to $$N^2$$; the six equations underneath can be written explicitly as (the so-called weak invariance):
    $$\begin{aligned} \sum _k |V_{ik}|^2 = 1 \; (i=1,2,3) \, \end{aligned}$$
    (6.262)
    and
    $$\begin{aligned} \sum _k V^*_{jk}V_{ik} = 0 \; (i > j) \, . \end{aligned}$$
    (6.263)
    This last equation is a constraint on three sets of three complex numbers, stating that these numbers form the sides of a triangle in the complex plane. There are three independent choices of i and j, and hence three independent triangles; they are called unitarity triangles, and we shall discuss them later in more detail.
  • Phase invariance. $$2N -1$$ of these parameters leave physics invariant, since one phase can be absorbed into each quark field, and an overall common phase is unobservable. Hence, the total number of free variables is $$N^2 - (2N - 1) = (N - 1)^2.$$

Four independent parameters are thus required to fully define the CKM matrix ($$N=3$$). This implies that the most general 3$$\,\times \,$$3 unitary matrix cannot be constructed using real numbers only: Eq. 6.261 implies that a real matrix has only three degrees of freedom, and thus at least one imaginary parameter is required.
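The counting above can be made explicit for any number of families. The sketch below (the function name is illustrative) uses the fact that a real $$N\times N$$ rotation matrix has $$N(N-1)/2$$ angles, so that the remaining free parameters must be complex phases.

```python
def mixing_parameters(n):
    """Return (angles, phases) of the quark mixing matrix for n families."""
    total = (n - 1) ** 2        # free parameters after unitarity and rephasing
    angles = n * (n - 1) // 2   # parameters of a real rotation matrix
    phases = total - angles     # what is left must be complex phases
    return angles, phases

for n in (2, 3, 4):
    angles, phases = mixing_parameters(n)
    print(f"N = {n}: {angles} angle(s), {phases} phase(s)")
```

For two families the only parameter is the Cabibbo angle and no phase survives; three families is the minimum number for which an irreducible phase appears.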

Many parameterizations have been proposed in the literature. An exact parametrization derived from the original work by Kobayashi and Maskawa (KM) extends the concept of Cabibbo angle; it uses three angles $$\theta _{12}$$, $$\theta _{13}$$, $$\theta _{23}$$, and a phase $$\delta $$:
$$\begin{aligned} V_{KM}= \left( \begin{array}{ccc} c_{12} c_{13} &{} s_{12} c_{13} &{} s_{13} e^{-i\delta } \\ -s_{12}c_{23} - c_{12}s_{23}s_{13} e^{i\delta } &{} c_{12}c_{23} - s_{12}s_{23}s_{13} e^{i\delta } &{} s_{23}c_{13} \\ s_{12}s_{23} - c_{12}c_{23}s_{13} e^{i\delta } &{} -c_{12}s_{23} - s_{12}c_{23}s_{13} e^{i\delta } &{} c_{23}c_{13} \end{array} \right) \, , \end{aligned}$$
(6.264)
with the standard notations $$s_{ij}=\sin \theta _{ij}$$ and $$c_{ij}=\cos \theta _{ij}$$ ($$\theta _{12}$$ is the Cabibbo angle).
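The unitarity of this parametrization can be verified numerically. In the sketch below the factor $$e^{i\delta }$$ multiplies every term containing $$s_{13}$$ in the second and third rows, as unitarity requires; the numerical values of the angles and of the phase are illustrative assumptions, not fitted values.

```python
import cmath
import math

def ckm_matrix(t12, t13, t23, delta):
    """Three-angle, one-phase parametrization of the quark mixing matrix."""
    s12, c12 = math.sin(t12), math.cos(t12)
    s13, c13 = math.sin(t13), math.cos(t13)
    s23, c23 = math.sin(t23), math.cos(t23)
    ep, em = cmath.exp(1j * delta), cmath.exp(-1j * delta)
    return [
        [c12 * c13, s12 * c13, s13 * em],
        [-s12 * c23 - c12 * s23 * s13 * ep, c12 * c23 - s12 * s23 * s13 * ep, s23 * c13],
        [s12 * s23 - c12 * c23 * s13 * ep, -c12 * s23 - s12 * c23 * s13 * ep, c23 * c13],
    ]

def unitarity_deviation(v):
    """Largest |(V^dagger V - I)_ij| over all matrix elements."""
    n = len(v)
    worst = 0.0
    for i in range(n):
        for j in range(n):
            entry = sum(complex(v[k][i]).conjugate() * v[k][j] for k in range(n))
            worst = max(worst, abs(entry - (1.0 if i == j else 0.0)))
    return worst

# Illustrative values (radians), chosen only to respect s13 << s23 << s12
v = ckm_matrix(t12=0.227, t13=0.0037, t23=0.042, delta=1.2)
print(f"max |V^dagger V - I| = {unitarity_deviation(v):.2e}")
```

The deviation is at the level of floating-point rounding for any choice of the four parameters, since the matrix is unitary by construction.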
Another frequently used parametrization of the CKM matrix is the so-called Wolfenstein parametrization. It refers to four free parameters $$\lambda $$, A, $${\rho }$$, and $${\eta }$$, defined as
$$\begin{aligned} \lambda= & {} s_{12} = \frac{|V_{us}|}{\sqrt{|V_{us}|^2 + |V_{ud}|^2}} \end{aligned}$$
(6.265)
$$\begin{aligned} A= & {} s_{23}/\lambda ^2 \end{aligned}$$
(6.266)
$$\begin{aligned} s_{13}e^{i\delta }= & {} A\lambda ^3(\rho +i\eta ) \end{aligned}$$
(6.267)
($$\lambda $$ is the sine of the Cabibbo angle). We can use the experimental fact that $$s_{13} \ll s_{23} \ll s_{12} \ll 1$$ and expand the matrix in powers of $$\lambda $$. Neglecting terms of order $$\lambda ^4$$ and higher, we obtain:
$$\begin{aligned} V_{W} \simeq \left( \begin{array}{ccc} 1-\frac{1}{2}\lambda ^2 &{} \lambda &{} A\lambda ^3(\rho - i\eta ) \\ -\lambda &{} 1-\frac{1}{2}\lambda ^2 &{} A\lambda ^2\\ A\lambda ^3(1-\rho -i\eta ) &{} -A\lambda ^2 &{} 1 \end{array} \right) . \end{aligned}$$
(6.268)
As we shall see in the following, the combination of parameters $$\bar{\rho } = \rho (1-\lambda ^2/2)$$ and $$\bar{\eta } = \eta (1-\lambda ^2/2)$$ can be very useful.
The experimental knowledge of the terms of the CKM matrix comes essentially from the comparative study of the probabilities of transitions between quarks. It is anyway challenging and difficult, since quarks are embedded in hadrons, and form factors for which only numerical QCD calculations are possible play a relevant role. In any case, the present (PDG 2017) experimental knowledge of the CKM matrix can be summarized in terms of the Wolfenstein parameters as:
$$\begin{aligned} \lambda= & {} 0.22506\pm 0.00050\\ A= & {} 0.811 \pm 0.026\\ \bar{\rho } = \rho (1-\lambda ^2/2)= & {} 0.124^{+ 0.019}_{-0.018}\\ \bar{\eta } = \eta (1-\lambda ^2/2)= & {} \, 0.356 \pm 0.011 \, . \end{aligned}$$
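As an illustration, the central values above can be inserted in the approximate matrix of Eq. 6.268. The sketch below ignores the experimental uncertainties; the function names are illustrative, and since the expansion is truncated, unitarity is expected to hold only up to terms of order $$\lambda ^4$$.

```python
LAMBDA, A, RHO_BAR, ETA_BAR = 0.22506, 0.811, 0.124, 0.356  # central values

# Undo the rescaling rho_bar = rho (1 - lambda^2 / 2), and likewise for eta
rho = RHO_BAR / (1.0 - LAMBDA**2 / 2.0)
eta = ETA_BAR / (1.0 - LAMBDA**2 / 2.0)

def wolfenstein_matrix(lam, a, rho, eta):
    """Approximate CKM matrix of Eq. 6.268."""
    return [
        [1.0 - lam**2 / 2.0, lam, a * lam**3 * complex(rho, -eta)],
        [-lam, 1.0 - lam**2 / 2.0, a * lam**2],
        [a * lam**3 * complex(1.0 - rho, -eta), -a * lam**2, 1.0],
    ]

def unitarity_deviation(v):
    """Largest |(V^dagger V - I)_ij| over all matrix elements."""
    n = len(v)
    worst = 0.0
    for i in range(n):
        for j in range(n):
            entry = sum(complex(v[k][i]).conjugate() * v[k][j] for k in range(n))
            worst = max(worst, abs(entry - (1.0 if i == j else 0.0)))
    return worst

v = wolfenstein_matrix(LAMBDA, A, rho, eta)
v_ub, v_cb, v_td = abs(v[0][2]), abs(v[1][2]), abs(v[2][0])
deviation = unitarity_deviation(v)

print(f"|V_ub| = {v_ub:.4f}, |V_cb| = {v_cb:.4f}, |V_td| = {v_td:.4f}")
print(f"max |V^dagger V - I| = {deviation:.1e}  (lambda^4 = {LAMBDA**4:.1e})")
```

The hierarchy of the elements is evident: transitions between the first and the third family are suppressed by $$\lambda ^3$$, and the residual deviation from unitarity is of the size of the neglected $$\lambda ^4$$ terms.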

6.3.8 CP Violation

Weak interactions violate the parity and the charge conjugation symmetries. But, for a while, it was thought that the combined action of the charge conjugation and parity transformations (CP) would restore the harmony physicists like so much. Indeed, a left-handed neutrino transforms under CP into a right-handed antineutrino, and the conjugate CP world still obeys the V-A theory. However, surprisingly, the study of the $$K^{0}-\overline{K}^{0}$$ system revealed in 1964 a small violation of the CP symmetry. At the turn of the century, CP violation was observed in many channels in the B sector. Since then, intense theoretical and experimental work has been carried out on the interpretation of these effects in the framework of the standard model, in particular by the precise determination of the parameters of the CKM matrix and by testing its self-consistency.

6.3.8.1 $$K^{\mathbf {0}}-\overline{K}^{\mathbf {0}}$$ Mixing

Already in 1955, Gell-Mann and Pais had observed that the $$K^{0}(d\bar{s})$$ and the $$\overline{K}^{0}(s\bar{d})$$, which are eigenstates of the strong interaction, could mix through weak box diagrams as those represented in Fig. 6.43.
images/304327_2_En_6_Chapter/304327_2_En_6_Fig43_HTML.gif
Fig. 6.43

Leading box diagrams for the $$K^{0}-\overline{K}^{0}$$ mixing

A pure $$K^{0}$$ ($$\overline{K}^{0}$$) beam will thus develop a $$\overline{K}^{0}$$ ($$K^{0}$$) component, and at each time, a linear combination of the $$K^{0}$$ and of the $$\overline{K}^{0}$$ may be observed. Since CP is conserved in hadronic decays, the combinations which are eigenstates of CP are of particular relevance.

$$K^{0}$$ or $$\overline{K}^{0}$$ are not CP eigenstates: in fact, they are the antiparticle of each other and the action of the CP operator may be, choosing an appropriate phase convention, written as:
$$\begin{aligned} {CP} \left| K^{0}\right\rangle= & {} +\left| \overline{K}^{0}\right\rangle \end{aligned}$$
(6.269)
$$\begin{aligned} {CP} \left| \overline{K}^{0}\right\rangle= & {} +\left| K^{0}\right\rangle \, . \end{aligned}$$
(6.270)
Then the linear combinations
$$\begin{aligned} \left| K_{1}\right\rangle= & {} \frac{1}{\sqrt{2}}\left( \left| K^{0}\right\rangle +\left| \overline{K}^{0}\right\rangle \right) \end{aligned}$$
(6.271)
$$\begin{aligned} \left| K_{2}\right\rangle= & {} \frac{1}{\sqrt{2}}\left( \left| K^{0}\right\rangle -\left| \overline{K}^{0}\right\rangle \right) \end{aligned}$$
(6.272)
are CP eigenstates with eigenvalues $$+1$$ and $$-1$$, respectively.
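The eigenvalue statement can be checked with a two-component representation. The sketch below assumes the basis ($$K^{0}$$, $$\overline{K}^{0}$$) and the phase convention of Eqs. (6.269) and (6.270), in which CP simply swaps the two components; the helper names are hypothetical.

```python
import math

# CP in the basis (K0, K0bar), with the phase convention of Eqs. (6.269)-(6.270)
CP = [[0.0, 1.0], [1.0, 0.0]]


def apply(matrix, vector):
    """Multiply a 2x2 matrix by a 2-component vector."""
    return [sum(matrix[i][j] * vector[j] for j in range(2)) for i in range(2)]


def cp_eigenvalue(state):
    """Return +1.0 or -1.0 if `state` is a CP eigenvector, otherwise None."""
    image = apply(CP, state)
    for eigenvalue in (+1.0, -1.0):
        if all(math.isclose(image[i], eigenvalue * state[i]) for i in range(2)):
            return eigenvalue
    return None


s = 1.0 / math.sqrt(2.0)
K0, K1, K2 = [1.0, 0.0], [s, s], [s, -s]

print("K1:", cp_eigenvalue(K1))
print("K2:", cp_eigenvalue(K2))
print("K0:", cp_eigenvalue(K0))  # a flavor state is not a CP eigenstate
```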

The $$K_{1}$$ can thus decay into a two-pion system (which has CP $$=+1$$ eigenvalue), while the $$K_{2}$$ can, if CP is conserved, only decay into a three-pion system (which has CP $$=-1$$ eigenvalue).

The phase spaces associated with these decay modes are, however, quite different: $$\left( m_{K}-2m_{\pi }\right) $$ $$\sim 220$$ MeV; $$\left( m_{K}-3m_{\pi }\right) \sim 80$$ MeV. Thus, the corresponding lifetimes are also quite different:
$$ \tau \left( K_{1}\rightarrow \pi \pi \right) \ \sim \ 0.1\,\,\mathrm{{ns}} \; ; \; \tau \left( K_{2}\rightarrow \pi \pi \pi \right) \ \sim \ 52\,\,\mathrm{{ns}}. $$
The short and the long lifetime states are usually designated by K-short ($$K_{S}$$) and K-long ($$K_{L}$$), respectively. These states are eigenstates of the free-particle Hamiltonian, which includes weak mixing terms, and if CP were a perfect symmetry, they would coincide with $$|K_1\,{\rangle }$$ and $$|K_2\,{\rangle }$$, respectively. The $$K_S$$ and $$K_L$$ wavefunctions evolve with time, respectively, as
$$\begin{aligned} |K_S(t)\rangle= & {} |K_S(t=0)\rangle e^{- \left( i m_S + \varGamma _S/2 \right) t}\end{aligned}$$
(6.273)
$$\begin{aligned} |K_L(t)\rangle= & {} |K_L(t=0)\rangle e^{- \left( i m_L + \varGamma _L/2 \right) t} \end{aligned}$$
(6.274)
where $$m_S$$ ($$m_L$$) and $$\varGamma _S$$ ($$\varGamma _L$$) are, respectively, the mass and the width of the $$K_S$$ ($$K_L$$) mesons (see Sect. 2.​6).
$$K^0$$ and $$\bar{K}^0$$, being a combination of $$K_S$$ and $$K_L$$,
$$\begin{aligned} \left| K^{0}\right\rangle= & {} \frac{1}{\sqrt{2}}\left( \left| K_S\right\rangle +\left| {K}_L\right\rangle \right) \end{aligned}$$
(6.275)
$$\begin{aligned} \left| \overline{K}^{0}\right\rangle= & {} \frac{1}{\sqrt{2}}\left( \left| K_S\right\rangle -\left| {K}_L\right\rangle \right) \end{aligned}$$
(6.276)
will also evolve in time. Indeed, in a beam of initially pure $$K^{0}$$ with an energy of a few GeV, the large majority of the $$K_{S}$$ component will have decayed after just a few tens of cm, and the beam will become an almost pure $$K_{L}$$ beam. The probability of finding a $$K^{0}$$ in this beam after a time t can be expressed as:
$$\begin{aligned} P_{K^{0}\longrightarrow K^{0}}(t)= & {} \left| \frac{1}{\sqrt{2}}\left( \langle K^0 | K_S(t)\rangle + \langle K^0 | K_L(t)\rangle \right) \right| ^2 \nonumber \\= & {} \frac{1}{4}\left( e^{-{\varGamma }_{S}t}+e^{-{\varGamma }_{L}t}+2e^{-\left( {\varGamma }_{S}+{\varGamma }_{L}\right) t/2\ }\cos \left( \varDelta m\ t\right) \right) \end{aligned}$$
(6.277)
where $${\varGamma }_{S}=1/\tau _{S}$$, $${\varGamma }_{L}=1/\tau _{L}$$, and $$\varDelta m$$ is the difference between the masses of the two eigenstates. The last term, coming from the interference of the two amplitudes, provides a direct measurement of $$\varDelta m$$.
Similarly, the probability to find a $$\bar{K^{0}}$$ in this beam after a time t can be expressed as:
$$\begin{aligned} P_{K^{0}\longrightarrow \bar{K^{0}}}(t)= & {} \left| \frac{1}{\sqrt{2}}\left( \langle \bar{K^0} | K_S(t)\rangle + \langle \bar{K^0} | K_L(t)\rangle \right) \right| ^2 \nonumber \\= & {} \frac{1}{4}\left( e^{-{\varGamma }_{S}t}+e^{-{\varGamma }_{L}t}-2e^{-\left( {\varGamma }_{S}+{\varGamma }_{L}\right) t/2\ }\cos \left( \varDelta m\ t\right) \right) \end{aligned}$$
(6.278)
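A minimal numerical sketch of Eqs. (6.277) and (6.278) is given below. The lifetimes are assumed values, slightly more precise than the rounded ones quoted in the text, and the mass difference is converted to an angular frequency through $$\hbar $$; the function names are illustrative.

```python
import math

HBAR_GEV_NS = 6.582119569e-16  # reduced Planck constant in GeV * ns
DELTA_M_GEV = 3.483e-15        # K_L - K_S mass difference in GeV
TAU_S_NS = 0.08954             # K_S lifetime in ns (assumed value)
TAU_L_NS = 51.16               # K_L lifetime in ns (assumed value)

GAMMA_S = 1.0 / TAU_S_NS
GAMMA_L = 1.0 / TAU_L_NS
DELTA_M = DELTA_M_GEV / HBAR_GEV_NS  # oscillation frequency in 1/ns


def _terms(t_ns):
    """Return the incoherent decay sum and the interference term at time t."""
    decay = math.exp(-GAMMA_S * t_ns) + math.exp(-GAMMA_L * t_ns)
    interference = (2.0 * math.exp(-(GAMMA_S + GAMMA_L) * t_ns / 2.0)
                    * math.cos(DELTA_M * t_ns))
    return decay, interference


def prob_k0(t_ns):
    """Probability of finding a K0 at time t in an initially pure K0 beam, Eq. (6.277)."""
    decay, interference = _terms(t_ns)
    return 0.25 * (decay + interference)


def prob_k0bar(t_ns):
    """Probability of finding a K0bar at time t in the same beam, Eq. (6.278)."""
    decay, interference = _terms(t_ns)
    return 0.25 * (decay - interference)


for n in (0, 1, 2, 5, 10, 20):
    t = n * TAU_S_NS
    print(f"t = {n:2d} tau_S: P(K0) = {prob_k0(t):.4f}, P(K0bar) = {prob_k0bar(t):.4f}")
```

With these inputs the oscillation period $$2\pi /\varDelta m$$ is about 1.2 ns, much longer than $$\tau _S$$, so only a fraction of one oscillation is visible before the interference term is damped away.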
In the limit $${\varGamma }_{S}\longrightarrow 0$$, $${\varGamma }_{L}\longrightarrow 0$$, a pure flavor oscillation between $$K^{0}$$ and $$\overline{K}^{0}$$ would occur:
$$ P_{K^{0}\longrightarrow K^{0}}\left( t\right) =\frac{1}{2}\left( 1+\cos \left( \varDelta m\ t\right) \right) ={\cos }^{2}\left( \varDelta m\ t/2\right) \, , $$
$$ P_{K^{0}\longrightarrow \overline{K^{0}}}\left( t\right) =\frac{1}{2}\left( 1-\cos \left( \varDelta m\ t\right) \right) ={\sin }^{2}\left( \varDelta m\ t/2\right) \, . $$
In the real case, however, the oscillation is damped and the survival probability of both $${K^{0}}$$ and $$\overline{K}^{0}$$ converges quickly to $$\ 1/4\ e^{-{\varGamma }_{L}t}$$.
Measuring the initial oscillation through the study of semileptonic decays, which will be discussed later on in this section, $$\varDelta m$$ was determined to be
$$ \varDelta m\sim \left( 3.483\pm 0.006\right) \times {10}^{-15}\ \mathrm {GeV}\, . $$
$$K_{S}$$ and $$K_{L}$$ have quite different lifetimes but almost the same mass.

6.3.8.2 CP Violation in 2$$\pi $$ Modes

In 1964, Christenson, Cronin, Fitch, and Turlay8 performed the historic experiment (Fig. 6.44) that revealed for the first time the existence of a small fraction of two-pion decays in a $$K_L$$ beam:
$$ R=\frac{\varGamma \left( K_{L}\rightarrow \pi ^{+}\pi ^{-}\right) }{\varGamma \left( K_{L}\rightarrow \mathrm{{all\;charged\;modes}}\right) }=\left( 2.0\pm 0.4\right) \times {10}^{-3} \, . $$
The $$K_{L}$$ beam was produced in a primary target placed $$17.5\,\mathrm {m}$$ upstream of the experiment, and the observed decays occurred in a volume filled with He gas to minimize interactions. Two spectrometers, each composed of two spark chambers separated by a magnet and terminated by a scintillator and a water Cherenkov counter, measured and identified the charged decay products.
images/304327_2_En_6_Chapter/304327_2_En_6_Fig44_HTML.gif
Fig. 6.44

Layout of the Christenson, Cronin, Fitch, and Turlay experiment that demonstrated the existence of the decay $$K_{L}\rightarrow \pi ^{+}\pi ^{-}$$. ©The Nobel Foundation

The presence of two-pion decay modes implied that the long-lived $$K_{L}$$ was not a pure CP eigenstate. The $$K_{S}$$ and $$K_{L}$$ should then have a small component of $$K_{2}$$ and $$K_{1}$$, respectively:
$$\begin{aligned} \left| K_{S}\right\rangle =\frac{1}{\sqrt{1+\left| \varepsilon \right| ^{2}}}\ \left( \left| K_{1}\right\rangle +\varepsilon \left| K_{2}\right\rangle \right) \end{aligned}$$
(6.279)
$$\begin{aligned} \left| K_{L}\right\rangle =\frac{1}{\sqrt{1+\left| \varepsilon \right| ^{2}}}\ \left( \left| K_{2}\right\rangle + \varepsilon \left| K_{1}\right\rangle \right) \end{aligned}$$
(6.280)
where $$\varepsilon $$ is a small complex parameter whose phase $$\phi _{\varepsilon }$$ is given by
$$\begin{aligned} \phi _{\varepsilon }\simeq \tan ^{-1}\frac{2\varDelta m}{\varDelta \varGamma } \end{aligned}$$
(6.281)
and $$\varDelta m$$ and $$\varDelta \varGamma $$ are, respectively, the differences between the masses and the decay widths of the two eigenstates.
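Equation (6.281) can be evaluated directly. The sketch below uses the mass difference quoted earlier in this section and assumed lifetime values for $$K_S$$ and $$K_L$$ (more precise than the rounded ones given above); widths and mass difference are expressed in the same units, ns$$^{-1}$$.

```python
import math

HBAR_GEV_NS = 6.582119569e-16  # reduced Planck constant in GeV * ns
DELTA_M_GEV = 3.483e-15        # mass difference quoted earlier in this section
TAU_S_NS = 0.08954             # K_S lifetime in ns (assumed value)
TAU_L_NS = 51.16               # K_L lifetime in ns (assumed value)

delta_m = DELTA_M_GEV / HBAR_GEV_NS              # in 1/ns
delta_gamma = 1.0 / TAU_S_NS - 1.0 / TAU_L_NS    # Gamma_S - Gamma_L in 1/ns

# Phase of epsilon from Eq. (6.281)
phi_eps_deg = math.degrees(math.atan2(2.0 * delta_m, delta_gamma))
print(f"phi_eps ~ {phi_eps_deg:.1f} degrees")
```

The result is close to 45 degrees because $$2\varDelta m$$ and $$\varDelta \varGamma $$ happen to be of nearly the same size in the neutral kaon system.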
Alternatively, $$K_{S}$$ and $$K_{L}$$ can be expressed as a function of the flavor eigenstates $$K^{0}$$ and $$\overline{K}^{0}$$ as
$$\begin{aligned} \left| K_{S}\right\rangle =\frac{1}{\sqrt{2\left( 1+\left| \varepsilon \right| ^{2}\right) }}\left( \left( 1+\varepsilon \right) \left| K^{0}\right\rangle +(1-\varepsilon )\left| \overline{K}^{0}\right\rangle \right) \end{aligned}$$
(6.282)
$$\begin{aligned} \left| K_{L}\right\rangle =\frac{1}{\sqrt{2\left( 1+\left| \varepsilon \right| ^{2}\right) }}\left( \left( 1+\varepsilon \right) \left| K^{0}\right\rangle -(1-\varepsilon )\left| \overline{K}^{0}\right\rangle \right) \end{aligned}$$
(6.283)
or, inverting the last two equations,
$$\begin{aligned} \left| K^{0}\right\rangle= & {} \frac{1}{1+\varepsilon }\sqrt{\frac{1+\left| \varepsilon \right| ^{2}}{2}}\ \left( \left| K_{S}\right\rangle +\left| K_{L}\right\rangle \right) \end{aligned}$$
(6.284)
$$\begin{aligned} \left| \overline{K}^{0}\right\rangle= & {} \frac{1}{1-\varepsilon }\sqrt{\frac{1+\left| \varepsilon \right| ^{2}}{2}}\ \left( \left| K_{S}\right\rangle -\left| K_{L}\right\rangle \right) \, . \end{aligned}$$
(6.285)
The probability that a state initially produced as a pure $$K^0$$ or $$\bar{K}^0$$ will decay into a $$2\pi $$ system will then evolve in time. A “2$$\pi $$ asymmetry” is usually defined as:
$$\begin{aligned} A_{\pm }(t)=\frac{\varGamma \left( \overline{K}^{0}_{t=0}\longrightarrow \ \pi ^{+}\pi ^{-}_{(t)}\right) \,-\,\varGamma \left( K^{0}_{t=0}\longrightarrow \ \pi ^{+}\pi ^{-}_{(t)}\right) }{\varGamma \left( \overline{K}^{0}_{t=0}\longrightarrow \ \pi ^{+}\pi ^{-}_{(t)}\right) \,+\,\varGamma \left( K^{0}_{t=0}\longrightarrow \ \pi ^{+}\pi ^{-}_{(t)}\right) } \, . \end{aligned}$$
(6.286)
images/304327_2_En_6_Chapter/304327_2_En_6_Fig45_HTML.gif
Fig. 6.45

Asymmetry in 2$$\pi $$ decays between $$K^{0}$$ and $$\overline{K}^{0}$$ tagged events. Time is measured in $$K_{S}$$ lifetimes.

From A. Angelopoulos et al. Physics Reports 374 (2003) 165

This asymmetry depends on $$\varepsilon $$ and $$\varDelta m$$ and was measured as a function of time, for instance, by the CPLEAR experiment at CERN (Fig. 6.45). Fixing $$\varDelta m$$ to the world average, the following values were obtained:
$$ \left| \varepsilon \right| =\left( 2.264\pm 0.023\right) \times {10}^{-3},\quad \phi _{\varepsilon }={\left( 43.19\pm 0.53\right) }^{\circ } \, . $$

6.3.8.3 CP Violation in Semileptonic $${{K}}^{{{0}}}$$, $$\overline{K}^{0}$$ Decays

$$K^{0}$$ and $$\overline{K}^{0}$$ decay also semileptonically through the channels:
$$ K^{0}\rightarrow \pi ^{-}e^{+}\nu _{e}\; ; \; \overline{K}^{0}\rightarrow \pi ^{+}e^{-}{\overline{\nu }}_{e} $$
and thus CP violation can also be tested by measuring the charge asymmetry $$A_{L}$$,
$$\begin{aligned} A_{L}=\frac{\varGamma \left( K_{L}\rightarrow \pi ^{-}l^{+}\nu \right) -\varGamma \left( K_{L}\rightarrow \pi ^{+}l^{-}\overline{\nu }\right) }{\varGamma \left( K_{L}\rightarrow \pi ^{-}l^{+}\nu \right) +\varGamma \left( K_{L}\rightarrow \pi ^{+}l^{-}\overline{\nu }\right) }. \end{aligned}$$
(6.287)
This asymmetry is related to the CP violating parameter $$\varepsilon $$:
$$\begin{aligned} A_{L}=\frac{{\left| 1+\varepsilon \right| }^{2}-{\left| 1-\varepsilon \right| }^{2}}{{\left| 1+\varepsilon \right| }^{2}+{\left| 1-\varepsilon \right| }^{2}}\approx 2\ Re\left( \varepsilon \right) . \end{aligned}$$
(6.288)
The measured value of $$A_{L}$$ is positive, and it is in good agreement with the measurement of $$\varepsilon $$ obtained in the 2$$\pi $$ decay modes. The number of $$K_{L}$$ having an electron among their decay products is slightly smaller (by 0.66%) than the number of $$K_{L}$$ having a positron among their decay products. There is thus an unambiguous way to define what is matter and what is antimatter.
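The size of this effect can be estimated from the values of $$\left| \varepsilon \right| $$ and $$\phi _{\varepsilon }$$ quoted for the 2$$\pi $$ modes. The sketch below takes the squared moduli of the $$K^{0}$$ and $$\overline{K}^{0}$$ coefficients of the $$K_L$$ state, Eq. (6.283), as the relative weights of the two charge states; it is an illustration under that assumption, not a fit to data.

```python
import cmath
import math

EPS_MODULUS = 2.264e-3   # |epsilon| from the 2-pi asymmetry
EPS_PHASE_DEG = 43.19    # phase of epsilon in degrees

eps = cmath.rect(EPS_MODULUS, math.radians(EPS_PHASE_DEG))


def charge_asymmetry(epsilon):
    """Charge asymmetry A_L built from the K0 and K0bar weights of the K_L state."""
    positive = abs(1 + epsilon) ** 2  # weight of K0, giving pi- l+ nu
    negative = abs(1 - epsilon) ** 2  # weight of K0bar, giving pi+ l- nubar
    return (positive - negative) / (positive + negative)


a_l = charge_asymmetry(eps)
# Relative deficit of electrons with respect to positrons: 1 - N(e-)/N(e+)
electron_deficit = 1.0 - (1.0 - a_l) / (1.0 + a_l)

print(f"A_L       = {a_l:.2e}")
print(f"2 Re(eps) = {2 * eps.real:.2e}")
print(f"electron deficit = {100 * electron_deficit:.2f} %")
```

The asymmetry itself is about 0.33%, and the relative deficit of electrons with respect to positrons is about twice that.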

6.3.8.4 Direct CP Violation

So far, CP violation was discussed, in the $$K^{0}-\,\overline{K}^{0}$$ mixing system, in terms of an imperfect identification between the free-particle Hamiltonian eigenstates ($$K_{S}$$, $$K_{L}$$) and the CP eigenstates ($$K_{1}$$, $$K_{2}$$), as expressed in Eqs. (6.279) and (6.280).

In this context, the decays of $$K_{S}$$ and $$K_{L}$$ into 2$$\pi $$ modes are only due to the presence in both states of a $$K_{1}$$ component. It is then expected that the ratio of the decay amplitudes of the $$K_{L}$$ and of the $$K_{S}$$ into 2$$\pi $$ modes should be equal to $$\varepsilon $$ and independent of the charges of the two pions:
$$\begin{aligned} \eta =\frac{A\left( K_{L}\rightarrow \pi \pi \right) }{A\left( K_{S}\rightarrow \pi \pi \right) }=\varepsilon \, . \end{aligned}$$
(6.289)
However, it was experimentally established that
$$\begin{aligned} \eta ^{+-}=\frac{A\left( K_{L}\rightarrow \pi ^{+}\pi ^{-}\right) }{A\left( K_{S}\rightarrow \pi ^{+}\pi ^{-}\right) } \end{aligned}$$
(6.290)
and
$$\begin{aligned} \eta ^{00}=\frac{A\left( K_{L}\rightarrow \pi ^{0}\pi ^{0}\right) }{A\left( K_{S}\rightarrow \pi ^{0}\pi ^{0}\right) } \end{aligned}$$
(6.291)
although both of similar magnitude (about $$2\times 10^{-3}$$), are significantly different. In fact, their present experimental ratio is:
$$\begin{aligned} \left| \frac{\eta ^{00}}{\eta ^{+-}}\right| =0.9950\pm 0.0007 \, . \end{aligned}$$
(6.292)
This difference is interpreted as the existence of a direct CP violation in the $$K_{2}$$ decays. In other words, the decay rate of a meson to a given final state is not equal to the decay rate of its antimeson to the corresponding CP-conjugated final state:
$$\begin{aligned} \varGamma \left( M\rightarrow f\right) \ne \varGamma \left( \overline{M}\rightarrow \overline{f}\right) \, . \end{aligned}$$
(6.293)
The CP violation discussed previously in the mixing of the $$K^{0}\,-\,\overline{K}^{0}$$ system is now called indirect CP violation. This CP violation is related to the observation that the oscillation of a given meson to its antimeson may be different from the inverse oscillation of the antimeson to the meson:
$$\begin{aligned} \varGamma \left( M\rightarrow \overline{M}\right) \ne \varGamma \left( \overline{M}\rightarrow M\right) . \end{aligned}$$
(6.294)
Finally, CP violation may also occur whenever both the meson and its antimeson can decay to a common final state with or without $$M-\overline{M}$$ mixing:
$$\begin{aligned} \varGamma \left( M\rightarrow f\right) \ne \varGamma \left( \overline{M}\rightarrow f\right) . \end{aligned}$$
(6.295)
In this case, both direct and indirect CP violations may be present.
The direct CP violation is usually quantified by a parameter $$\varepsilon '$$. Assuming that this direct CP violation occurs in the K decays into 2$$\pi $$ modes due to the fact that the 2$$\pi $$ system may be formed in different isospin states ($$I=0,\, 2$$) and the corresponding decay amplitudes may interfere, it can be shown that $$\eta ^{+-}$$ and $$\eta ^{00}$$ can be written as
$$\begin{aligned} \eta ^{+-}&=\varepsilon +\varepsilon '\end{aligned}$$
(6.296)
$$\begin{aligned} \eta ^{00}&=\varepsilon -2\,\varepsilon ' \, . \end{aligned}$$
(6.297)
The ratio between the CP violating parameters can also be related to the double ratio of the decay probabilities of $$K_{L}$$ and $$K_{S}$$ into specific 2$$\pi $$ modes:
$$\begin{aligned} Re\left( \frac{\varepsilon ^{'}}{\varepsilon }\right) =\frac{1}{6}\left( 1-\frac{\left| \eta ^{00}\right| ^{2}}{\left| \eta ^{+-}\right| ^{2}}\right) =\frac{1}{6}\ \left( 1-\frac{\varGamma \left( K_{L}\rightarrow \pi ^{0}\pi ^{0}\right) \varGamma \left( K_{S}\rightarrow \pi ^{+}\pi ^{-}\right) }{\varGamma \left( K_{L}\rightarrow \pi ^{+}\pi ^{-}\right) \varGamma \left( K_{S}\rightarrow \pi ^{0}\pi ^{0}\right) }\right) . \end{aligned}$$
(6.298)
The present (PDG 2016) experimental value for this ratio is
$$\begin{aligned} Re\left( \frac{\varepsilon ^{'}}{\varepsilon }\right) \approx \frac{\varepsilon ^{'}}{\varepsilon }=\left( 1.66\pm 0.23\right) \times {10}^{-3} \, . \end{aligned}$$
(6.299)
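As a consistency check, Eq. (6.298) can be applied to the ratio of Eq. (6.292). The sketch below also propagates the quoted uncertainty to first order; treating that uncertainty as the only source of error is an assumption made here for illustration.

```python
ETA_RATIO = 0.9950       # |eta00 / eta+-| from Eq. (6.292)
ETA_RATIO_ERR = 0.0007   # its quoted uncertainty


def re_eps_prime_over_eps(ratio):
    """Re(eps'/eps) from the ratio |eta00/eta+-|, following Eq. (6.298)."""
    return (1.0 - ratio**2) / 6.0


central = re_eps_prime_over_eps(ETA_RATIO)
# First-order error propagation: |d/dr [(1 - r^2)/6]| = r/3
uncertainty = ETA_RATIO * ETA_RATIO_ERR / 3.0

print(f"Re(eps'/eps) = ({central * 1e3:.2f} +- {uncertainty * 1e3:.2f}) x 10^-3")
```

The factor 1/6 follows from Eqs. (6.296) and (6.297): to first order in $$\varepsilon '/\varepsilon $$, $$\left| \eta ^{00}/\eta ^{+-}\right| ^{2}\simeq 1-6\,Re\left( \varepsilon '/\varepsilon \right) $$.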

6.3.8.5 CP Violation in the B Sector

Around 40 years after the discovery of CP violation in the $${K}^{0}-\overline{{K}}^{0}$$ system, a large CP violation in the $${B}^{0}-\overline{{B}}^{0}$$ system was observed. The $${B}^{0}$$ ($$\overline{{B}}^{0}$$) differs at the quark level from the $${K}^{0}$$ ($$\overline{{K}}^{0})$$ just by the replacement of the $$\bar{s}$$ (s) quark by a $$\overline{b}$$ (b) quark. Thus, $${B}^{0}$$ and $$\overline{{B}}^{0}$$ should mix through similar weak box diagrams (Fig. 6.46), and the CP eigenstates should also be a combination of both.
images/304327_2_En_6_Chapter/304327_2_En_6_Fig46_HTML.gif
Fig. 6.46

Leading box diagrams for the $$B^{0}-\overline{B}^{0}$$ mixing.

From S. Braibant, G. Giacomelli, and M. Spurio, “Particles and fundamental interactions”, Springer 2012

However, these CP eigenstates have similar lifetimes since the b quark has a much larger mass than the s quark and thus the decay phase space is large for both CP eigenstates. These eigenstates are called B-Light ($$B_{L}$$) and B-Heavy ($$B_{H}$$) according to their masses, although their mass difference, $${\varDelta m}_{\mathrm{B}^{0}}\sim \left( 3.337\pm 0.033\right) \times {10}^{-13}$$ GeV, is small. The $$B_{L}$$ and $$B_{H}$$ mesons cannot, therefore, be disentangled just by allowing one of them to decay, and thus there are no pure $$B_{L}$$ or $$B_{H}$$ beams. Another strategy has to be followed.

In fact, CP violation in the B sector was first observed by studying the time evolution of the decay rates of the $${B}^{0}$$ and the $$\overline{{B}}^{0}$$ mesons to a common final state ($$\varGamma \left( M\rightarrow f\right) \ne \varGamma \left( \overline{M}\rightarrow f\right) $$), namely to $$J/\psi \ K_{S}$$.

At the BaBar experiment,9 B meson pairs were produced in the reaction
$$ e^{+}e^{-}\rightarrow \varUpsilon \left( 4S\right) \rightarrow B^{0}\overline{B}^{0} \, . $$
The $$B^{0}\overline{B}^{0}$$ states evolved entangled, and therefore, if one of the mesons was observed (“tagged”) at a given time, the other had to be its antiparticle. The “tag” of the flavor of the B mesons could be done through the determination of the charge of the lepton in B semileptonic decays:
$$\begin{aligned} B^{0}\rightarrow D^{-}l^{+}\nu _{l}\ (\overline{b}\rightarrow \bar{c}\, l^{+}\nu _{l}) \; ; \; {\overline{B}}^{0}\rightarrow D^{+}l^{-}{\overline{\nu }}_{l}(b\rightarrow c{\ l}^{-}{\overline{\nu }}_{l}) \, . \end{aligned}$$
(6.300)
images/304327_2_En_6_Chapter/304327_2_En_6_Fig47_HTML.gif
Fig. 6.47

Decay rate to $$J/\psi \ K_{S}$$ as a function of time of each of the B flavor states (top) and the derived time asymmetry (bottom).

From C. Chen (BaBar), Contribution to the 34th International Conference on High-Energy Physics (July 2008)

It was thus possible to determine the decay rate of the untagged B meson to $$J/\psi \ K_{S}$$ as a function of its decay time. This rate is shown, for both “tagged” $$B^{0}$$ and $${\overline{B}}^{0}$$, in Fig. 6.47. The observed asymmetry:
$$\begin{aligned} A_{CP}\left( t\right) =\frac{\varGamma \left( {\overline{B}}^{0}(t)\rightarrow J/\psi \ K_{S}\right) -\varGamma (B^{0}(t)\rightarrow J/\psi \ K_{S})}{\varGamma \left( {\overline{B}}^{0}(t)\rightarrow J/\psi \ K_{S}\right) +\varGamma (B^{0}(t)\rightarrow J/\psi \ K_{S})} \end{aligned}$$
(6.301)
is a clear proof of the CP violation in this channel. This asymmetry can be explained by the fact that the decays can occur with or without mixing. The decay amplitudes for these channels may interfere. In the case of the $$B^{0}$$, the relevant amplitudes are $$A_{1}\left( B^{0}\rightarrow J/\psi \ K_{S}\right) $$ and $$A_{2}\left( B^{0}\rightarrow {\overline{B}}^{0}\rightarrow J/\psi \ K_{S}\right) $$.

Nowadays, after the Belle and BaBar experiments at the B factories at KEK and SLAC, respectively, and after the first years of the LHCb experiment at the LHC, there is already a rich spectrum of B channels where CP violation has been observed at a level above $$5\sigma $$. These results allowed a precise determination of most of the parameters of the CKM matrix and intensive tests of its unitarity, as will be briefly discussed in the next section.

6.3.8.6 CP Violation in the Standard Model

CP violation in weak interactions can be linked to the existence of the complex phase of the CKM matrix which is expressed by the parameters $$\delta $$ and $$\eta $$, respectively, in the KM and in the Wolfenstein parametrizations (see Sect. 6.3.7). As a consequence, a necessary condition for the appearance of the complex phase, and thus for CP violation, is the presence of at least three generations of quarks (this clarifies the power of the intuition by Kobayashi and Maskawa). The reason why a complex phase in the CKM matrix causes CP violation can be seen as follows. Consider a process $$A\rightarrow B$$ and the CP-conjugated $$\overline{A}\rightarrow \overline{B}$$ between their antiparticles, with the appropriate helicity reversal. If there is no CP violation, the amplitudes, let us call them $${\mathcal {M}}$$ and $$\tilde{{\mathcal {M}}}$$, respectively, must be given by the same complex number (except that the CKM terms get conjugated). We can separate the magnitude and phase by writing
$$\begin{aligned} {\mathcal {M}}= & {} \left| {\mathcal {M}}_{1}\right| e^{i\phi _{1}}e^{i\delta _{1}} \end{aligned}$$
(6.302)
$$\begin{aligned} \tilde{{\mathcal {M}}}= & {} \left| {\mathcal {M}}_{1}\right| e^{i\phi _{1}}e^{-i\delta _{1}} \end{aligned}$$
(6.303)
where $$\delta _{1}$$ is the phase term introduced by the CKM matrix (often called the “weak phase”) and $$\phi _{1}$$ is the phase term generated by CP-invariant interactions in the decay (often called the “strong phase”). The exact values of these phases depend on the convention, but the differences between the weak phases and between the strong phases in any two different terms of the decay amplitude are independent of the convention.
Since physically measurable reaction rates are proportional to $${\left| {\mathcal {M}}\right| }^{2}$$, so far nothing is different. However, consider a process for which there are different paths (say for simplicity two paths). Now we have:
$$\begin{aligned} {\mathcal {M}}= & {} \left| {\mathcal {M}}_{1}\right| e^{i\phi _{1}}e^{i\delta _{1}}+\left| {\mathcal {M}}_{2}\right| e^{i\phi _{2}}e^{i\delta _{2}}\end{aligned}$$
(6.304)
$$\begin{aligned} \tilde{{\mathcal {M}}}= & {} \left| {\mathcal {M}}_{1}\right| e^{i\phi _{1}}e^{-i\delta _{1}}+\left| {\mathcal {M}}_{2}\right| e^{i\phi _{2}}e^{-i\delta _{2}} \end{aligned}$$
(6.305)
and in general $${\left| {\mathcal {M}}\right| }^{2}\ne {\left| \tilde{{\mathcal {M}}}\right| }^{2}$$. Thus, a complex phase may give rise to processes that proceed at different rates for particles and antiparticles, and the CP symmetry may be violated. For example, the decay $$B^{0}\rightarrow K^{+}\pi ^{-}$$ is 13% more common than its CP conjugate $${\overline{B}}^{0}\rightarrow K^{-}\pi ^{+}$$.
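The two-path argument can be made quantitative. With both weak phases conjugated in the CP-conjugate amplitude, the rate difference works out to $$-4\left| {\mathcal {M}}_{1}\right| \left| {\mathcal {M}}_{2}\right| \sin (\phi _{1}-\phi _{2})\sin (\delta _{1}-\delta _{2})$$, so it vanishes unless both the strong and the weak phase differences are nonzero. The sketch below checks this numerically; the input magnitudes and phases are arbitrary illustrative numbers, not measured quantities.

```python
import cmath
import math


def amplitudes(m1, m2, phi1, phi2, delta1, delta2):
    """Two-path amplitude and its CP conjugate (weak phases delta change sign)."""
    m = m1 * cmath.exp(1j * (phi1 + delta1)) + m2 * cmath.exp(1j * (phi2 + delta2))
    m_cp = m1 * cmath.exp(1j * (phi1 - delta1)) + m2 * cmath.exp(1j * (phi2 - delta2))
    return m, m_cp


def rate_difference(m1, m2, phi1, phi2, delta1, delta2):
    """|M|^2 - |M~|^2 computed directly from the complex amplitudes."""
    m, m_cp = amplitudes(m1, m2, phi1, phi2, delta1, delta2)
    return abs(m) ** 2 - abs(m_cp) ** 2


def rate_difference_closed_form(m1, m2, phi1, phi2, delta1, delta2):
    """Closed form: -4 |M1| |M2| sin(phi1 - phi2) sin(delta1 - delta2)."""
    return -4.0 * m1 * m2 * math.sin(phi1 - phi2) * math.sin(delta1 - delta2)


# Arbitrary illustrative inputs (magnitudes and phases in radians)
args = (1.0, 0.5, 0.3, 1.1, 0.2, 0.9)
print("direct      :", rate_difference(*args))
print("closed form :", rate_difference_closed_form(*args))
print("single path :", rate_difference(1.0, 0.0, 0.3, 1.1, 0.2, 0.9))
```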
The unitarity of the CKM matrix imposes, as we have discussed in Sect. 6.3.7, three independent orthogonality conditions:
$$ \sum _{k}V_{jk}^{*}V_{ik}=0\ (i > j). $$
These conditions are sums of three complex numbers and thus can be represented in a complex plane as triangles, usually called the unitarity triangles.
In the triangles obtained by taking scalar products of neighboring rows or columns, the modulus of one of the sides is much smaller than the other two. The equation for which the moduli of the three sides are most comparable is
$$\begin{aligned} V_{ud}V_{ub}^{*}+V_{cd}V_{cb}^{*}+V_{td}V_{tb}^{*}=0 \, . \end{aligned}$$
(6.306)
The corresponding triangle is shown in Fig. 6.48.
images/304327_2_En_6_Chapter/304327_2_En_6_Fig48_HTML.gif
Fig. 6.48

One of the six unitarity triangles. The description of the sides in terms of the parameters in the Wolfenstein parametrization is shown

The triangle is represented in the ($$\overline{\rho },\overline{\eta }$$) plane (see the discussion on the Wolfenstein parametrization in Sect. 6.3.7); its sides were divided by $$\left| V_{cd}V_{cb}^{*}\right| $$, which is the best-known term in the sum; and it is rotated so that the side with unit length is aligned along the real ($$\overline{\rho }$$) axis. The apex of the triangle is by construction located at ($$\overline{\rho },\overline{\eta }$$), and the angles can be defined by:
$$\begin{aligned} \alpha \equiv \arg \left( -\frac{V_{td}V_{tb}^{*}}{V_{ud}V_{ub}^{*}}\right) \ ;\;\beta \equiv \arg \left( -\frac{V_{cd}V_{cb}^{*}}{V_{td}V_{tb}^{*}}\right) \ ;\;\gamma \equiv \arg \left( -\frac{V_{ud}V_{ub}^{*}}{V_{cd}V_{cb}^{*}}\right) \, . \end{aligned}$$
(6.307)
It can also be demonstrated that the areas of all unitarity triangles are the same, and they equal half of the so-called Jarlskog invariant (from the Swedish physicist Cecilia Jarlskog), which can be expressed as $$J\simeq A^{2}\lambda ^{6}\eta $$ in the Wolfenstein parametrization.
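For orientation, the angles of the rescaled triangle and the Jarlskog invariant can be estimated from the central values of the Wolfenstein parameters quoted in Sect. 6.3.7. The sketch below assumes the vertices of the rescaled triangle are at (0, 0), (1, 0), and ($$\overline{\rho },\overline{\eta }$$), with $$\gamma $$ at the origin and $$\beta $$ at (1, 0), and uses the leading-order expression for J; uncertainties are ignored.

```python
import math

# Central values of the Wolfenstein parameters (PDG 2017)
LAMBDA, A, RHO_BAR, ETA_BAR = 0.22506, 0.811, 0.124, 0.356

# Angles of the rescaled triangle with vertices (0,0), (1,0), (rho_bar, eta_bar)
gamma = math.atan2(ETA_BAR, RHO_BAR)        # angle at the origin
beta = math.atan2(ETA_BAR, 1.0 - RHO_BAR)   # angle at (1, 0)
alpha = math.pi - beta - gamma              # angle at the apex

# Leading-order Jarlskog invariant, J ~ A^2 lambda^6 eta
eta = ETA_BAR / (1.0 - LAMBDA**2 / 2.0)
jarlskog = A**2 * LAMBDA**6 * eta

# Area of the rescaled triangle (unit base, height eta_bar)
rescaled_area = ETA_BAR / 2.0

for name, angle in (("alpha", alpha), ("beta", beta), ("gamma", gamma)):
    print(f"{name} ~ {math.degrees(angle):.1f} degrees")
print(f"J ~ {jarlskog:.2e}")
print(f"area of rescaled triangle = {rescaled_area:.3f}")
```

Multiplying the rescaled area by $$\left| V_{cd}V_{cb}^{*}\right| ^{2}\simeq A^{2}\lambda ^{6}$$ gives back approximately J/2, as stated above.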

The fact that the Jarlskog invariant is proportional to $$\eta $$ shows that the unitarity triangle is a measure of CP violation: if there is no CP violation, the triangle degenerates into a line. If the three sides do not close into a triangle, this might indicate that the CKM matrix is not unitary, which would imply the existence of new physics, for instance a fourth quark family.

The present (2016) experimental constraints on the CKM unitarity triangle, as well as a global fit to all the existing measurements by the CKMfitter group,10 are shown in Fig. 6.49.
images/304327_2_En_6_Chapter/304327_2_En_6_Fig49_HTML.gif
Fig. 6.49

Unitarity triangle and global CKM fit in the plane ($$\overline{\rho },\overline{\eta }$$). Results from PDG 2017; updated results and plots are available at http://​ckmfitter.​in2p3.​fr

All present results are consistent with the CKM matrix being the only source of CP violation in the standard model. Nevertheless, it is widely believed that the observed matter–antimatter asymmetry in the Universe (see next section) requires the existence of new sources of CP violation that might be revealed either in the quark sector, as small inconsistencies in the CKM matrix, or elsewhere, for instance in precise measurements of neutrino oscillations or of the neutron electric dipole moment. The real nature of CP violation is still to be understood.

6.3.9 Matter–Antimatter Asymmetry

The existence of antimatter, predicted by Dirac in 1930 and discovered by Anderson (see Chap. 3), is still today the object of intense study and speculation: Would the physics of an antimatter-dominated Universe be identical to the physics of the matter-dominated Universe we are living in? Is there any other CP violation process than the tiny ones observed so far? How, in the framework of the Big Bang model, did the Universe become matter dominated?

Antiparticles are currently produced in accelerators and observed in cosmic-ray interactions at a low level (for instance, $$\ \overline{p}/p\sim {10}^{-4}$$) (see Chap. 10). At CERN, the study of antimatter atoms has been pursued for the last 20 years. Antihydrogen atoms have been formed and trapped for periods as long as 16 min, and recently the first antihydrogen beams were produced. The way is open to detailed studies of the antihydrogen hyperfine transitions and to the measurement of the gravitational interactions between matter and antimatter. The electric charge of the antihydrogen atom was found by the ALPHA experiment to be compatible with zero to eight decimal places ($$Q_{\overline{H}}\simeq \left( -1.3\ \pm 1.1\pm 0.4\right) \times {10}^{-8}$$ $$e$$).

No primordial antimatter has been observed so far, while the relative abundance of baryons ($${n}_{{B}}$$) to photons ($${n}_{\gamma }$$) was found to be (see Sect. 8.1.3):
$$\begin{aligned} {\eta }\,=\,\frac{{n}_{{B}}}{{n}_{\gamma }}\sim {5}\times {10}^{{-}{10}}{\ \ }. \end{aligned}$$
(6.308)
Although apparently small, this number is many orders of magnitude higher than what would be expected if there had been an equal number of baryons and antibaryons in the early Universe. Indeed, in such a case the annihilation between baryons and antibaryons would have proceeded until its interaction rate equaled the expansion rate of the Universe (see Sect. 8.1.2), and the expected ratios were computed to be:
$$\begin{aligned} \frac{{n}_{{B}}}{{n}_{\gamma }}\,=\,\frac{{n}_{\overline{{B}}}}{{n}_{\gamma }}\sim {\ }{10}^{{-}{18}} \, . \end{aligned}$$
(6.309)
The excess of matter over antimatter should then have been present before nucleons and antinucleons were formed. On the other hand, inflation (see Sect. 8.3.2) would wipe out any excess of baryonic charge present at the beginning of the Big Bang. Thus, this excess had to be generated by some unknown mechanism (baryogenesis) after inflation and before or during the quark–gluon plasma stage.
In 1967, soon after the discovery of the CMB and of the violation of CP in the $${K}^{0}-\overline{{K}^{0}}$$ system (see Sect. 6.3.8.2), Andrej Sakharov11 modeled the evolution of the Universe from an initial state with baryonic number $${B=0}$$ to the present state with $${B}\ne {0}$$. This model imposed three conditions, nowadays known as the Sakharov conditions:
  1. Baryonic number (B) should be violated.
  2. Charge (C) and Charge and Parity (CP) symmetries should be violated.
  3. Baryon-number violating interactions should have occurred in the early Universe out of thermal equilibrium.

The first condition is obvious. The second is necessary since, if C and CP were conserved, any baryonic charge excess produced in a given reaction would be compensated by the conjugated reaction. The third is more subtle: if the baryon-number violating interactions had occurred in thermal equilibrium, other processes would have restored the symmetry between baryons and antibaryons imposed by the Boltzmann distribution.

Thermal equilibrium may have been broken when symmetry-breaking processes occurred. Whenever two phases are present, the boundary regions between them (for instance the surfaces of bubbles in boiling water) are out of thermal equilibrium. In the framework of the standard model (see Chap. 7), this could in principle have occurred at the electroweak phase transition. However, it was demonstrated analytically and numerically that, for a Higgs boson with a mass like the one observed ($$m_{H}\sim {125\,\mathrm{GeV}}$$), the electroweak phase transition does not provide the thermal instability required for the formation of the present baryon asymmetry in the Universe.

The exact mechanism responsible for the observed matter–antimatter asymmetry in the Universe is still to be discovered. Clearly the standard model is not the end of physics.

6.4 Strong Interactions and QCD

The quark model simplifies the description of hadrons. We saw that deep inelastic scattering evidences a physical reality for quarks, although the interaction between these particles is very peculiar, since no free quarks have been observed up to now. A heuristic form of the potential between quarks with the characteristics needed has been shown.

Within the quark model, we needed to introduce a new quantum number, the color, to explain how bound states of three identical quarks can exist without violating the Pauli exclusion principle. Invariance with respect to color can be described by a symmetry group SU(3)$$_c$$, where the subscript c indicates color.

The theory of quantum chromodynamics (QCD) enhances the concept of color from a role of label to the role of charge and is the basis for the description of the interactions binding quarks in hadrons. The phenomenological description through an effective potential can be seen as a limit of this exact description, and the strong interactions binding nucleons can be explained as van der Waals forces between neutral objects.

QCD has been extensively tested and is very successful. The American physicists David J. Gross, David Politzer, and Frank Wilczek shared the 2004 Nobel Prize in Physics for devising an elegant mathematical framework to express the asymptotic (i.e., in the limit of very short distances, equivalent to the high momentum transfer limit) freedom of quarks in hadrons, leading to the development of QCD.

However, a caveat should be stressed. At very short distances, QCD is essentially a theory of free quarks and gluons with relatively weak interactions, and observables can be calculated perturbatively. At longer wavelengths, of the order of the proton size, $$\sim 1\,\mathrm {fm} = 10^{-15}\,\mathrm {m}$$, the coupling parameter between partons becomes too large to compute observables perturbatively (we recall that exact solutions are in general impossible, and perturbative calculations must be performed): the Lagrangian of QCD, which in principle contains all the physics, becomes de facto of little help in this regime. Parts of QCD can thus be calculated in terms of the fundamental parameters using the full dynamical (Lagrangian) representation, while for other sectors one should use models, guided by the characteristics of the theory, whose effective parameters cannot be calculated but can be constrained by experimental data.

6.4.1 Yang–Mills Theories

Before formulating QCD as a gauge theory, we must extend the formalism shown for the description of electromagnetism (Sect. 6.2.6) to a symmetry group like SU(3). This extension is not trivial, and it was formulated by Yang and Mills in the 1950s.

U(1). Let us first summarize the ingredients of the U(1) gauge theory—which is the prototype of the abelian gauge theories, i.e., of the gauge theories defined by symmetry groups for which the generators commute. We have seen in Sect. 6.2.3 that the requirement that physics is invariant under local U(1) phase transformation implies the existence of the photon gauge field. QED can be derived by requiring the Lagrangian to be invariant under local U(1) transformations of the form $$U = e^{iq\chi (x)I}$$—note the identity operator I, which, in the case of U(1), is just unity. The recipe is:
  • Find the gauge invariance of the theory—in the case of electromagnetism U(1):
    $$\begin{aligned} \psi (x) \rightarrow \psi '(x) = U(x) \psi (x) = \psi (x) e^{iq\chi (x)} \, . \end{aligned}$$
    (6.310)
  • Replace the derivative in the Lagrangian with a covariant derivative
    $$\begin{aligned} \partial _\mu \rightarrow D_\mu = \partial _\mu + iq A_\mu (x) \end{aligned}$$
    (6.311)
    where $$A_\mu $$ transforms as
    $$\begin{aligned} A_\mu \rightarrow A'_\mu = A_\mu - \partial _\mu \chi \, . \end{aligned}$$
    (6.312)
The Lagrangian
$$\begin{aligned} \mathcal {L}_{\mathrm {QED}} = \bar{\psi }(i\gamma ^\mu D_\mu -m)\psi - \frac{1}{4}F_{\mu \nu }F^{\mu \nu } \end{aligned}$$
(6.313)
with
$$\begin{aligned} F_{\mu \nu } = \partial _\mu A_\nu - \partial _\nu A_\mu = \frac{1}{iq} [D_\mu , D_\nu ] \end{aligned}$$
(6.314)
is invariant under the local gauge transformation, and the field $$A_\mu $$ and its interactions with $$\psi $$ are defined by the invariance itself. Note that the Lagrangian can be written as
$$\begin{aligned} \mathcal {L} = \mathcal {L}_\mathrm {loc} + \mathcal {L}_\mathrm {gf} \end{aligned}$$
where $$\mathcal {L}_\mathrm {loc}$$ is the locally invariant Lagrangian for the particle and $$\mathcal {L}_\mathrm {gf}$$ is the field Lagrangian.
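As a short check of the last equality in Eq. 6.314, one can let the commutator of two covariant derivatives act on a generic field $$\psi $$. The steps below are only a sketch that uses the definition (6.311): second derivatives commute, and the terms symmetric in $$\mu $$ and $$\nu $$ (including the one quadratic in the Abelian field) cancel, so that
$$\begin{aligned} [D_\mu , D_\nu ]\,\psi &= (\partial _\mu + iqA_\mu )(\partial _\nu + iqA_\nu )\psi - (\partial _\nu + iqA_\nu )(\partial _\mu + iqA_\mu )\psi \\ &= iq\,(\partial _\mu A_\nu )\,\psi - iq\,(\partial _\nu A_\mu )\,\psi \\ &= iq\, F_{\mu \nu }\,\psi \, . \end{aligned}$$
The field strength is thus the part of the commutator that survives when all derivatives acting on $$\psi $$ have canceled; in the non-Abelian case, discussed next, the term quadratic in the fields no longer cancels.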

What we have seen for U(1) can be trivially extended to symmetries with more than one generator, if the generators commute (Abelian symmetry groups).

Non-Abelian Symmetry Groups and Yang–Mills Theories. When the symmetry group is non-Abelian, i.e., the generators do not commute, the above recipe must be generalized. If the generators of the symmetry are $$T^a$$, with $$a=1, \ldots , n$$, one can write the gauge invariance as
$$\begin{aligned} \psi (x) \rightarrow \psi '(x)=e^{i\, g \sum _a \epsilon _a(x) \, T^a} \, \psi (x) . \end{aligned}$$
(6.315)
From now on, we shall not write explicitly the sum over a, the index that labels the generators (and hence the gauge bosons): a sum is implied whenever the index is repeated. The generators close under commutation, i.e., they form a Lie algebra. We do not associate any particular meaning to the fact that a is written as a subscript or as a superscript.
If the commutation relations hold
$$\begin{aligned}{}[T^a, T^b]=if^{abc}T^c \, , \end{aligned}$$
(6.316)
one can define the covariant derivative as
$$\begin{aligned} D_\mu = \partial _\mu + igT^a\mathcal{{A}}^a_\mu \end{aligned}$$
(6.317)
where $$\mathcal{{A}}^a_\mu $$ are the vector potentials, and g is the coupling parameter. In four dimensions, the coupling parameter g is a pure number; for an SU(N) group there are $$N^2-1$$ generators, so that $$a,b, c=1, \ldots , N^2-1$$.
The gauge field Lagrangian has the form
$$\begin{aligned} \mathcal {L}_\mathrm {gf} = - \frac{1}{4}F^{a\mu \nu } F_{\mu \nu }^a \, . \end{aligned}$$
(6.318)
The relation
$$\begin{aligned} F_{\mu \nu }^a = \partial _\mu \mathcal{{A}}_\nu ^a-\partial _\nu \mathcal{{A}}_\mu ^a+gf^{abc}\mathcal{{A}}_\mu ^b\mathcal{{A}}_\nu ^c \end{aligned}$$
(6.319)
can be derived from the commutator
$$\begin{aligned}{}[D_\mu , D_\nu ] = -igT^aF_{\mu \nu }^a \, . \end{aligned}$$
(6.320)
The field is self-interacting: from the given Lagrangian, one can derive the equations
$$\begin{aligned} \partial ^\mu F_{\mu \nu }^a+gf^{abc}\mathcal{{A}}^{\mu b}F_{\mu \nu }^c=0 \, . \end{aligned}$$
(6.321)
A source $$J_\mu ^a$$ enters into the equations of motion as
$$\begin{aligned} \partial ^\mu F_{\mu \nu }^a+gf^{abc}\mathcal{{A}}^{b\mu }F_{\mu \nu }^c=-J_\nu ^a \, . \end{aligned}$$
(6.322)
One can demonstrate that a Yang–Mills theory is not renormalizable for dimensions greater than four.

6.4.2 The Lagrangian of QCD

QCD is based on the gauge group SU(3), the Special Unitary group in 3 dimensions (each dimension is a color, conventionally $$Red,\,Green,\, Blue$$). This group is represented by the set of unitary $$3\times 3$$ complex matrices with determinant one (see Sect. 5.3.5).

A unitary $$3\times 3$$ complex matrix depends on nine real parameters; the requirement of unit determinant removes one of them, so that there are eight independent directions in this matrix space, i.e., the carriers of color (called gluons) are eight. Another way of seeing that the number of gluons is eight is that SU(3) has eight generators; each generator represents a color exchange, and thus a gauge boson (a gluon) in color space.

These matrices can operate both on each other (combinations of successive gauge transformations, physically corresponding to successive gluon emissions and/or gluon self-interactions) and on a set of complex 3-vectors, representing quarks in color space.

Due to the presence of color, a generic particle wave function can be written as a three-vector $$\psi = ({{\psi _{qR}}},{\psi _{qG}}, {\psi _{qB}})$$ which is a superposition of fields with a definite color index $$i=Red,\,Green,\, Blue$$. The SU(3) symmetry corresponds to the freedom of rotation in this three-dimensional space. As we did for the electromagnetic gauge invariance, we can express the local gauge invariance as the invariance of the Lagrangian with respect to the gauge transformation
$$\begin{aligned} \psi (x) \rightarrow \psi '(x)=e^{i\, g_s \epsilon _a(x) \, t^a} \psi (x) \end{aligned}$$
(6.323)
where the $$t^a\, (a=1\,\ldots \, 8)$$ are the eight generators of the SU(3) group, and the $$\epsilon _a(x)$$ are generic local transformations. $$g_s$$ is the strong coupling, related to $$\alpha _s$$ by the relation $$g_s^2 = 4\pi \alpha _s$$; we shall return to the strong coupling in more detail later.
Usually, the generators of SU(3) are written as
$$\begin{aligned} t^a = \frac{1}{2} \lambda ^a \end{aligned}$$
(6.324)
where the $$\lambda ^a$$ are the so-called Gell-Mann matrices, defined as:
$$\lambda ^{1}=\left( \begin{array}{ccc} 0 &{} 1 &{} 0\\ 1 &{} 0 &{} 0\\ 0 &{} 0 &{} 0 \end{array}\right) \;;\;\lambda ^{2}=\left( \begin{array}{ccc} 0 &{} -i &{} 0\\ i &{} 0 &{} 0\\ 0 &{} 0 &{} 0 \end{array}\right) \;;\;\lambda ^{3}=\left( \begin{array}{ccc} 1 &{} 0 &{} 0\\ 0 &{} -1 &{} 0\\ 0 &{} 0 &{} 0 \end{array}\right) $$
$$ \lambda ^{4}=\left( \begin{array}{ccc} 0 &{} 0 &{} 1\\ 0 &{} 0 &{} 0\\ 1 &{} 0 &{} 0 \end{array}\right) \;;\;\lambda ^{5}=\left( \begin{array}{ccc} 0 &{} 0 &{} -i\\ 0 &{} 0 &{} 0\\ i &{} 0 &{} 0 \end{array}\right) $$
$$ \lambda ^{6}=\left( \begin{array}{ccc} 0 &{} 0 &{} 0\\ 0 &{} 0 &{} 1\\ 0 &{} 1 &{} 0 \end{array}\right) \;;\;\lambda ^{7}=\left( \begin{array}{ccc} 0 &{} 0 &{} 0\\ 0 &{} 0 &{} -i\\ 0 &{} i &{} 0 \end{array}\right) \;;\ \lambda ^{8}=\frac{1}{\sqrt{3}}\left( \begin{array}{ccc} 1 &{} 0 &{} 0\\ 0 &{} 1 &{} 0\\ 0 &{} 0 &{} -2 \end{array}\right) \, . $$
As discussed in Sect. 5.3.5, these generators are just the SU(3) analogs of the Pauli matrices in SU(2) (one can see it by looking at $$\lambda ^1$$, $$\lambda ^2$$ and $$\lambda ^3$$). Note that writing the index of a matrix as a superscript or as a subscript makes no difference in this case.
As a consequence of the local gauge symmetry, eight massless fields $$\mathcal{{A}}^a_\mu $$ will appear (one for each generator); these are the gluon fields. The covariant derivative can be written as
$$\begin{aligned} D_{\mu } = \partial _\mu + i g_s t^a \mathcal{{A}}_\mu ^a~. \end{aligned}$$
(6.325)
Finally, the QCD Lagrangian can be written as
$$\begin{aligned} \mathcal{L} = \bar{\psi }_q (i\gamma ^\mu )(D_\mu )\psi _q - m_q\bar{\psi }_q\psi _{q} - \frac{1}{4} G^a_{\mu \nu } G^{a\mu \nu }~, \end{aligned}$$
(6.326)
where $$m_q$$ is the quark mass, and $$G^a_{\mu \nu }$$ is the gluon field strength tensor for a gluon with color index a, defined as
$$\begin{aligned} G^a_{\mu \nu } = \partial _\mu \mathcal{{A}}^a_{\nu } - \partial _\nu \mathcal{{A}}^a_{\mu } + g_s f^{abc}\mathcal{{A}}^b_{\mu }\mathcal{{A}}^c_{\nu } \, , \end{aligned}$$
(6.327)
and the $$f^{abc}$$ are defined by the commutation relation $$ [ t^a , t^b ] = i f^{abc} t^c$$. These terms arise since the generators do not commute.
To guarantee the local invariance, the field $$\mathcal{{A}}^c$$ transforms as:
$$\begin{aligned} \mathcal{{A}}^c_\mu \rightarrow \mathcal{{A}}'^c_\mu = \mathcal{{A}}^c_\mu - \partial _\mu \epsilon ^c - g_s f^{abc} \epsilon ^a \mathcal{{A}}^b_\mu \, . \end{aligned}$$
(6.328)
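The algebra of the generators can be checked numerically. The following is a minimal sketch in plain Python (the helper names `gell_mann`, `matmul`, `trace`, and `structure_constant` are ours, introduced only for illustration): it builds the Gell-Mann matrices listed above and extracts the structure constants from the commutation relation, using $$\mathrm {Tr}(t^a t^b) = \delta ^{ab}/2$$, which gives $$f^{abc} = -2i\,\mathrm {Tr}([t^a, t^b]\,t^c)$$.

```python
import math

def gell_mann():
    """Return the eight Gell-Mann matrices as nested lists (rows of a 3x3 matrix)."""
    i, s = 1j, 1 / math.sqrt(3)
    return [
        [[0, 1, 0], [1, 0, 0], [0, 0, 0]],
        [[0, -i, 0], [i, 0, 0], [0, 0, 0]],
        [[1, 0, 0], [0, -1, 0], [0, 0, 0]],
        [[0, 0, 1], [0, 0, 0], [1, 0, 0]],
        [[0, 0, -i], [0, 0, 0], [i, 0, 0]],
        [[0, 0, 0], [0, 0, 1], [0, 1, 0]],
        [[0, 0, 0], [0, 0, -i], [0, i, 0]],
        [[s, 0, 0], [0, s, 0], [0, 0, -2 * s]],
    ]

def matmul(a, b):
    """Product of two 3x3 matrices."""
    return [[sum(a[r][k] * b[k][c] for k in range(3)) for c in range(3)]
            for r in range(3)]

def trace(a):
    """Trace of a 3x3 matrix."""
    return sum(a[k][k] for k in range(3))

def structure_constant(a, b, c):
    """f^{abc} = -2i Tr([t^a, t^b] t^c), with t^a = lambda^a / 2 and a, b, c in 1..8."""
    lam = gell_mann()
    la, lb, lc = lam[a - 1], lam[b - 1], lam[c - 1]
    ab, ba = matmul(la, lb), matmul(lb, la)
    comm = [[ab[r][k] - ba[r][k] for k in range(3)] for r in range(3)]
    # three factors 1/2 (one per generator) give the overall 1/8
    return (-2j * trace(matmul(comm, lc)) / 8).real

if __name__ == "__main__":
    lam = gell_mann()
    print("traces:", [trace(m) for m in lam])            # generators are traceless
    print("Tr(l1 l1) =", trace(matmul(lam[0], lam[0])))  # normalization of lambda^1
    print("f^{123} =", structure_constant(1, 2, 3))
    print("f^{458} =", structure_constant(4, 5, 8))
```

With these conventions one finds $$f^{123}=1$$, the same value as the SU(2) structure constant built from the Pauli matrices, and $$f^{458}=\sqrt{3}/2$$; the antisymmetry of $$f^{abc}$$ can be verified by exchanging two arguments.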

6.4.3 Vertices in QCD; Color Factors

The only stable hadronic states are neutral in color. The simplest example is the combination of a quark and antiquark, which in color space corresponds to
$$\begin{aligned} 3 \otimes \overline{3} = 8 \oplus 1~. \end{aligned}$$
(6.329)
A random (color-uncorrelated) quark–antiquark pair has a $$1/N^2=1/9$$ chance to be in a singlet state, corresponding to the symmetric wave function $$\frac{1}{{\sqrt{3}}} \left( \left| R\bar{R}\right\rangle +\left| G\bar{G}\right\rangle +\left| B\bar{B}\right\rangle \right) $$; otherwise it is in an overall octet state (Fig. 6.50).
images/304327_2_En_6_Chapter/304327_2_En_6_Fig50_HTML.gif
Fig. 6.50

Combinations of a quark and an antiquark in color space

Correlated production processes like $$Z\rightarrow q\bar{q}$$ or $$g\rightarrow q\bar{q}$$ will project out specific components (here the singlet and octet, respectively).

When computing the probability of a process, we average over all incoming colors and sum over all possible outgoing ones. Color factors are thus associated with QCD processes; such factors basically count the number of “paths through color space” that the process can take, and multiply the probability for a process to happen.

A simple example is given by the decay $$Z\rightarrow q\bar{q}$$ (see Sect. 7.5.1). This vertex contains a $$\delta _{ij}$$ in color space: the outgoing quark and antiquark must have identical (anti-)colors. Squaring the corresponding matrix element and summing over final state colors yields a color factor
$$\begin{aligned} e^+e^-\rightarrow Z \rightarrow q\bar{q}~~~:~~~\sum _{\mathrm {colors}}|\mathcal {M}|^2 \propto \delta _{ij}\delta ^*_{ji} = \mathrm {Tr}\{\delta \} = N_C = 3~, \end{aligned}$$
(6.330)
since i and j are quark indices.
Another example is given by the so-called Drell–Yan process, $$q\bar{q}\rightarrow \gamma ^*/Z\rightarrow \ell ^+\ell ^-$$ (Sect. 6.4.7.1) which is just the reverse of the previous one. The square of the matrix element must be the same as before, but since the quarks are here incoming, we must average rather than sum over their colors, leading to
$$\begin{aligned} q\bar{q}\rightarrow Z\rightarrow e^+e^-~~~:~~~\frac{1}{9}\sum _{\mathrm {colors}}|\mathcal {M}|^2 \propto \frac{1}{9}\delta _{ij}\delta ^*_{ji} = \frac{1}{9} \mathrm {Tr}\{\delta \} = \frac{1}{3}~, \end{aligned}$$
(6.331)
and the color factor entails now a suppression due to the fact that only quarks of matching colors can produce a Z boson. The chance that a quark and an antiquark picked at random have a corresponding color–anticolor is $$1/N_C$$.
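The counting behind Eqs. 6.330 and 6.331 can be made explicit with a few lines of Python. This is only an illustrative sketch (the function names are ours): it enumerates the color indices of the quark and of the antiquark, sums the product of Kronecker deltas, and then either keeps the sum (outgoing quarks) or divides by the $$N_C^2$$ incoming color combinations (incoming quarks).

```python
from fractions import Fraction

N_C = 3  # number of colors

def delta(i, j):
    """Kronecker delta in color space."""
    return 1 if i == j else 0

def color_sum():
    """Sum over quark (i) and antiquark (j) colors of delta_ij * delta_ji."""
    return sum(delta(i, j) * delta(j, i) for i in range(N_C) for j in range(N_C))

def z_to_qqbar_factor():
    """Outgoing colors are summed (Eq. 6.330)."""
    return Fraction(color_sum())

def drell_yan_factor():
    """Incoming colors are averaged: one factor 1/N_C per incoming quark (Eq. 6.331)."""
    return Fraction(color_sum(), N_C * N_C)

if __name__ == "__main__":
    print("Z -> q qbar :", z_to_qqbar_factor())  # 3
    print("q qbar -> Z :", drell_yan_factor())   # 1/3
```

The ratio between the two factors, $$N_C^2 = 9$$, is just the number of color combinations of a random quark–antiquark pair.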

Color factors enter also in the calculation of probabilities for the vertices of QCD. In Fig. 6.51, one can see the definition of color factors for the three-body vertices $$q\rightarrow q g$$, $$g \rightarrow g g$$ (notice the difference from QED: since gluons are colored, the “triple gluon vertex” can exist, while the $$\gamma \rightarrow \gamma \gamma $$ vertex does not exist) and $$g \rightarrow q \bar{q}$$.

After tedious calculations, the color factors are
$$\begin{aligned} T_F = \frac{1}{2} \qquad \qquad C_F = \frac{4}{3} \qquad \qquad C_A = N_C = 3~. \end{aligned}$$
(6.332)
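The values in Eq. 6.332 can be cross-checked numerically from the generators $$t^a = \lambda ^a/2$$. The sketch below assumes the standard group-theory definitions $$\mathrm {Tr}(t^a t^b) = T_F\,\delta ^{ab}$$, $$\sum _a t^a t^a = C_F\,\mathbb {1}$$ and $$\sum _{c, d} f^{acd}f^{bcd} = C_A\,\delta ^{ab}$$; these definitions and all function names are our assumptions for the purpose of illustration, and they are evaluated for a single choice of the free indices.

```python
import math

def gell_mann():
    """The eight Gell-Mann matrices as nested lists."""
    i, s = 1j, 1 / math.sqrt(3)
    return [
        [[0, 1, 0], [1, 0, 0], [0, 0, 0]],
        [[0, -i, 0], [i, 0, 0], [0, 0, 0]],
        [[1, 0, 0], [0, -1, 0], [0, 0, 0]],
        [[0, 0, 1], [0, 0, 0], [1, 0, 0]],
        [[0, 0, -i], [0, 0, 0], [i, 0, 0]],
        [[0, 0, 0], [0, 0, 1], [0, 1, 0]],
        [[0, 0, 0], [0, 0, -i], [0, i, 0]],
        [[s, 0, 0], [0, s, 0], [0, 0, -2 * s]],
    ]

# generators t^a = lambda^a / 2 (index a runs from 0 to 7 here)
T = [[[x / 2 for x in row] for row in m] for m in gell_mann()]

def matmul(a, b):
    return [[sum(a[r][k] * b[k][c] for k in range(3)) for c in range(3)]
            for r in range(3)]

def trace(a):
    return sum(a[k][k] for k in range(3))

def f(a, b, c):
    """Structure constant f^{abc} = -2i Tr([t^a, t^b] t^c), zero-based indices."""
    ab, ba = matmul(T[a], T[b]), matmul(T[b], T[a])
    comm = [[ab[r][k] - ba[r][k] for k in range(3)] for r in range(3)]
    return (-2j * trace(matmul(comm, T[c]))).real

def color_factors():
    """Return (T_F, C_F, C_A) evaluated from the explicit matrices."""
    t_f = trace(matmul(T[0], T[0])).real            # Tr(t^1 t^1)
    squares = [matmul(t, t) for t in T]
    c_f = sum(sq[0][0] for sq in squares).real      # (sum_a t^a t^a)_{11}
    c_a = sum(f(0, c, d) ** 2 for c in range(8) for d in range(8))
    return t_f, c_f, c_a

if __name__ == "__main__":
    print("T_F, C_F, C_A =", color_factors())
```

Up to rounding, the three numbers reproduce $$1/2$$, $$4/3$$ and 3.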

6.4.4 The Strong Coupling

When we discussed QED, we analyzed the fact that renormalization can be absorbed in a running value for the charge, or a running value for the coupling parameter.

This can be interpreted physically as follows. A point-like charge polarizes the vacuum, creating electron–positron pairs which orient themselves as dipoles screening the charge itself. As $$q^2$$ increases (i.e., as the distance from the bare charge decreases), the effective charge perceived increases, because there is less screening. Mathematically, this is equivalent to the assumption that the coupling parameter increases as $$q^2$$ increases.
images/304327_2_En_6_Chapter/304327_2_En_6_Fig51_HTML.gif
Fig. 6.51

Basic three-body vertices of QCD, and definition of the color factors

Also in the case of QCD, the calculation based on the currents gives a logarithmic expression for the coupling parameter, which is governed by the so-called beta function,
$$\begin{aligned} Q^2 \frac{\partial \alpha _s}{\partial Q^2} = \frac{\partial \alpha _s}{\partial \ln Q^2} = \beta (\alpha _s)~, \end{aligned}$$
(6.333)
where
$$\begin{aligned} \beta (\alpha _s) = -\alpha _s^2(b_0 + b_1\alpha _s + b_2\alpha _s^2 + \cdots ), \end{aligned}$$
(6.334)
with
$$\begin{aligned} b_0= & {} \frac{11C_A - 4 T_F n_f}{12\pi }~,\end{aligned}$$
(6.335)
$$\begin{aligned} b_1= & {} \frac{17C_A^2 - 10 T_F C_A n_f - 6 T_F C_F n_f}{24\pi ^2} ~=~ \frac{153-19\, n_f}{24\pi ^2}. \end{aligned}$$
(6.336)
In the expression for $$b_0$$, the first term is due to gluon loops and the second to the quark loops. In the same way, the first term in the $$b_1$$ coefficient comes from double gluon loops, and the others represent mixed quark–gluon loops.
images/304327_2_En_6_Chapter/304327_2_En_6_Fig52_HTML.gif
Fig. 6.52

Dependence of $$\alpha _s$$ on the energy scale Q; a fit to QCD is superimposed.

From K.A. Olive et al. (Particle Data Group), Chin. Phys. C 38 (2014) 090001

At variance with the QED expression (6.212), the running parameter increases with decreasing $$q^2$$:
$$\begin{aligned} \alpha _s(q^2) = \alpha _s(\mu ^2) \frac{1}{1+b_0 \alpha _s(\mu ^2)\ln \frac{q^2}{\mu ^2} + \mathcal{O}(\alpha _s^2)}~. \end{aligned}$$
(6.337)
There is thus no possibility to define a limiting value for $$q^2 \rightarrow 0$$, starting from which a perturbative expansion could be made (this was the case for QED). The value of the strong coupling must thus be specified at a given reference scale, typically $$q^2=M^2_Z$$ (where most measurements have been performed thanks to LEP), from which we can obtain its value at any other scale by solving Eq. 6.333,
$$\begin{aligned} \alpha _s(q^2) \simeq \alpha _s(M_Z^2) \frac{1}{1+b_0 \alpha _s(M_Z^2)\ln \frac{q^2}{M_Z^2}} \, . \end{aligned}$$
(6.338)
The running coupling parameter is shown as calculated from $$\alpha _s(M_Z)=0.1185$$, in Fig. 6.52, and compared to the experimental data.
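As an illustration of Eq. 6.338, the following Python sketch evaluates the one-loop running coupling at a few scales. It is a simplified example: the number of flavors is kept fixed (no matching across flavor thresholds), the value of $$M_Z$$ is an assumed input, and the names `b0` and `alpha_s` are ours.

```python
import math

M_Z = 91.1876        # GeV, Z boson mass (assumed reference value)
ALPHA_S_MZ = 0.1185  # alpha_s(M_Z), as quoted in the text

def b0(n_f, c_a=3.0, t_quark=0.5):
    """One-loop coefficient of Eq. 6.335 for n_f active flavors."""
    return (11 * c_a - 4 * t_quark * n_f) / (12 * math.pi)

def alpha_s(q, n_f):
    """One-loop running coupling of Eq. 6.338; q in GeV, n_f kept fixed."""
    log = math.log(q * q / (M_Z * M_Z))
    return ALPHA_S_MZ / (1 + b0(n_f) * ALPHA_S_MZ * log)

if __name__ == "__main__":
    for q in (5.0, 10.0, 35.0, M_Z, 200.0, 1000.0):
        print(f"q = {q:8.2f} GeV   alpha_s(n_f=5) = {alpha_s(q, 5):.4f}")
```

The coupling decreases logarithmically with the scale; at scales approaching 1 GeV the denominator becomes small and the one-loop expression is no longer reliable.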

The dependence of $$b_0$$ on the number of flavors $$n_f$$ entails a dependence of the slope of the energy evolution on the number of contributing flavors: the running changes slope across quark flavor thresholds. However, from $$q \sim 1$$ GeV to present accelerator energies, an effective $$n_f=3$$ approximation is reasonable, since the production of heavier quarks is strongly suppressed.

Notice that in QCD, quark–antiquark pairs screen the color charge, like $$e^+e^-$$ pairs in QED. Antiscreening (which makes the charge increase at larger distances) comes from gluon loops; getting closer to a quark, the antiscreening effect of the virtual gluons is reduced. Since the contributions of virtual quarks and virtual gluons to screening are opposite, the winner is decided by the number of different flavors. For standard QCD with three colors, $$b_0 > 0$$ requires $$33 - 2 n_f > 0$$, so antiscreening prevails for $$n_f \le 16$$.

6.4.5 Asymptotic Freedom and Confinement

When quarks are very close to each other, they behave almost as free particles. This is the famous “asymptotic freedom” of QCD. As a consequence, perturbation theory becomes accurate at higher energies (Eq. 6.337). Conversely, the potential grows at large distances.

In addition, the evolution of $$\alpha _s$$ with energy must make it comparable to the electromagnetic and weak couplings at some (large) energy, which, according to present extrapolations, may lie at some $$10^{15}$$–$$10^{17}\,$$GeV; such a “unification” might however happen at lower energies if new, yet undiscovered, particles generate large corrections to the evolution. Beyond this point, we do not know how the evolution could continue.

At a scale
$$\begin{aligned} \varLambda \sim 200\, \text{ MeV } \end{aligned}$$
(6.339)
the perturbative coupling (6.337) starts diverging; this is called the Landau pole. Note however that Eq. 6.337 is perturbative, and more terms are needed near the Landau pole: strong interactions indeed do not exhibit a divergence for $$Q\rightarrow \varLambda $$.
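The order of magnitude of $$\varLambda $$ can be recovered from the one-loop formula: the denominator of Eq. 6.338 vanishes when $$\ln (\varLambda ^2/M_Z^2) = -1/(b_0\,\alpha _s(M_Z^2))$$. The short Python sketch below evaluates this condition for a fixed number of flavors; it is only indicative, since the result depends on $$n_f$$ and on the neglected higher orders, and the inputs $$M_Z$$ and $$\alpha _s(M_Z)$$ are assumed values.

```python
import math

M_Z = 91.1876        # GeV (assumed reference value)
ALPHA_S_MZ = 0.1185  # alpha_s(M_Z), as quoted in the text

def b0(n_f):
    """One-loop coefficient of Eq. 6.335 with C_A = 3 and quark color factor 1/2."""
    return (33 - 2 * n_f) / (12 * math.pi)

def landau_pole(n_f):
    """Scale (GeV) at which the one-loop coupling of Eq. 6.338 diverges."""
    return M_Z * math.exp(-1.0 / (2.0 * b0(n_f) * ALPHA_S_MZ))

if __name__ == "__main__":
    for n_f in (3, 4, 5):
        print(f"n_f = {n_f}:  Lambda ~ {1000 * landau_pole(n_f):.0f} MeV")
```

With $$n_f = 3$$ the one-loop estimate is about 250 MeV, while with $$n_f = 5$$ it drops to about 90 MeV: the scale of a few hundred MeV is robust, the precise number is not.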

6.4.5.1 Quark–Gluon Plasma

Asymptotic freedom entails that at extremely high temperature and/or density, a new phase of matter should appear due to QCD. In this phase, called quark–gluon plasma (QGP), quarks and gluons become free: the color charges of partons are screened. It is believed that during the first few microseconds after the Big Bang the Universe was in a QGP state, and flavors were equiprobable.

QGP should be formed when temperatures are close to 200 MeV and density is large enough. This makes the ion–ion colliders the ideal place to reproduce this state.

One characteristic of QGP should be that jets are “quenched”: the high density of particles in the “fireball” which is formed after the collision absorbs jets in such a way that in the end no jet or just one jet appears.

Many experiments at hadron colliders tried to create this new state of matter in the 1980s and 1990s, and CERN announced indirect evidence for QGP in 2000. Current experiments at the Relativistic Heavy Ion Collider (RHIC) at BNL and at CERN’s LHC are continuing this effort, by colliding relativistically accelerated gold (at RHIC) or lead (at LHC) ions. RHIC experiments have also claimed to have created a QGP with a temperature $$T \sim 4 \times 10^{12}$$ K (about 350 MeV).
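Temperatures are quoted here both in kelvin and in energy units; the two are related by the Boltzmann constant, $$E = k_B T$$. A minimal Python sketch of the conversion (the function names are ours) is:

```python
K_B_EV_PER_K = 8.617333262e-5  # Boltzmann constant in eV/K

def kelvin_to_mev(t_kelvin):
    """Temperature in kelvin -> k_B T in MeV."""
    return K_B_EV_PER_K * t_kelvin / 1.0e6

def mev_to_kelvin(e_mev):
    """Energy k_B T in MeV -> temperature in kelvin."""
    return e_mev * 1.0e6 / K_B_EV_PER_K

if __name__ == "__main__":
    print(f"4e12 K  -> {kelvin_to_mev(4.0e12):.0f} MeV")
    print(f"200 MeV -> {mev_to_kelvin(200.0):.2e} K")
```

A temperature of $$4 \times 10^{12}$$ K corresponds to about 345 MeV, and the 200 MeV scale corresponds to about $$2.3 \times 10^{12}$$ K.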

The observation and the study of the QGP at the LHC are discussed in more detail in Sect. 6.4.7.3.

6.4.6 Hadronization; Final States from Hadronic Interactions

Hadronization is the process by which a set of colored partons becomes a set of color-singlet hadrons.

At large energies, QCD processes can be described directly by the QCD Lagrangian. Quarks radiate gluons, which branch into gluons or generate $$q\bar{q}$$ pairs, and so on. This is a parton shower, quite similar in concept to the electromagnetic showers described by QED.

However, at a certain hadronization scale $$Q_{\mathrm {had}}$$ we are no longer able to perform perturbative calculations. We must turn to QCD-inspired phenomenological models to describe the transition of colored partons into colorless states, and the further branchings.

The problem of hadron generation from a high-energy collision is thus modeled through four steps (Fig. 6.53):
  1. Evolution of partons through a parton shower.
  2a. Grouping of the partons into high-mass color-neutral states. Depending on the model, these states are called “strings” or “clusters”; the difference is not relevant for the purpose of this book, and we shall describe the “string” model in more detail in the following.
  2b. Mapping of strings/clusters onto a set of primary hadrons (via string breaking or cluster splitting).
  3. Sequential decays of the unstable hadrons into secondaries (e.g., $$\rho \rightarrow \pi \pi $$, $$\varLambda \rightarrow n \pi $$, $$\pi ^0 \rightarrow \gamma \gamma $$, ...).
The physics governing steps 2a and 2b is nonperturbative, and pertains to hadronization; some of its properties are nevertheless constrained by the QCD Lagrangian.
images/304327_2_En_6_Chapter/304327_2_En_6_Fig53_HTML.gif
Fig. 6.53

The creation of a multihadronic final state from the decay of a Z boson or from a virtual photon state generated in an $$e^+e^-$$ collision

An important result in lattice QCD,12 confirmed by quarkonium spectroscopy, is that the potential of the color-dipole field between a charge and an anticharge at distances $$r \gg 1$$ fm can be approximated as $$V \sim kr$$ (Fig. 6.54). This is called “linear confinement,” and it justifies the string model of hadronization, discussed below in Sect. 6.4.6.1.

6.4.6.1 String Model

The Lund string model, implemented in the Pythia [F6.10] simulation software, is nowadays commonly used to model hadronic interactions. We shall now briefly describe the main characteristics of this model; many of the basic concepts are shared by any string-inspired method. A more complete discussion can be found in the book by Andersson [F6.9].
images/304327_2_En_6_Chapter/304327_2_En_6_Fig54_HTML.gif
Fig. 6.54

The QCD effective potential

Consider the production of a $$q\bar{q}$$ pair, for instance in the process $$e^+e^-\rightarrow \gamma ^*/Z\rightarrow q\bar{q} \rightarrow \text{ hadrons }$$. As the quarks move apart, a potential
$$\begin{aligned} V(r) = \kappa \, r \end{aligned}$$
(6.340)
is stretched between them (at short distances, a Coulomb term proportional to 1/r should be added). Such a potential describes a string with energy per unit length $$\kappa $$, which has been determined from hadron spectroscopy and from fits to simulations to have the value $$\kappa ~\sim ~1\,\text{ GeV/fm }~\sim ~0.2\,\text{ GeV }^2$$ (Fig. 6.54). The color flow in a string stores energy (Fig. 6.55).
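The two values quoted for $$\kappa $$ are the same quantity in different units: in natural units a length of 1 fm corresponds to $$1/(\hbar c) \simeq 1/(0.197\,\mathrm {GeV})$$. A small Python sketch of the conversion, together with the length of string needed to store a given energy according to Eq. 6.340 (function names are ours), is:

```python
HBAR_C = 0.1973269804  # GeV * fm

def tension_gev2(kappa_gev_per_fm):
    """Convert a string tension from GeV/fm to natural units (GeV^2)."""
    return kappa_gev_per_fm * HBAR_C

def string_length_fm(energy_gev, kappa_gev_per_fm=1.0):
    """Length r (fm) at which V = kappa * r equals the given energy."""
    return energy_gev / kappa_gev_per_fm

if __name__ == "__main__":
    print(f"kappa = 1 GeV/fm = {tension_gev2(1.0):.3f} GeV^2")
    print(f"a string storing 10 GeV is {string_length_fm(10.0):.0f} fm long")
```

A tension of 1 GeV/fm thus corresponds to about 0.197 GeV$$^2$$, consistent with the value 0.2 GeV$$^2$$ quoted above.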
images/304327_2_En_6_Chapter/304327_2_En_6_Fig55_HTML.gif
Fig. 6.55

The color flow in a string stores energy in a tube.

Adapted from a lecture by T. Sjöstrand

The emission of a soft gluon does not affect the string evolution very much (string fragmentation is “infrared safe” with respect to the emission of soft and collinear gluons). A hard gluon, instead, can store enough energy that the qg and the $$g\bar{q}$$ elements operate as two different strings (Fig. 6.56). Quark fragmentation differs from gluon fragmentation since quarks are connected to a single string only, while gluons have one on either side; the energy transferred to strings by gluons is thus roughly double that transferred by quarks.
images/304327_2_En_6_Chapter/304327_2_En_6_Fig56_HTML.gif
Fig. 6.56

Illustration of a $$qg\bar{q}$$ system. Color conservation entails the fact that the color string goes from quarks to gluons and vice versa rather than from quark to antiquark

As the string endpoints move apart, their kinetic energy is converted into potential energy stored in the string itself (Eq. 6.340). This process continues until by quantum fluctuation a quark–antiquark pair emerges transforming energy from the string into mass. The original endpoint partons are now screened from each other, and the string is broken in two separate color-singlet pieces, $$(q\bar{q}) \rightarrow (q\bar{q}')+(q'\bar{q})$$, as shown in Fig. 6.57. This process then continues until only final state hadrons remain, as described in the following.
images/304327_2_En_6_Chapter/304327_2_En_6_Fig57_HTML.gif
Fig. 6.57

String breaking by quark pair creation in the string field; time evolution goes from bottom to top

The individual string breaks are modeled as quantum mechanical tunneling, which leads to a suppression of transverse energies and masses:
$$\begin{aligned} \mathrm {Prob}(m_q^2,p_{\perp q}^2)~\propto ~\exp \left( \frac{-\pi m_q^2}{\kappa }\right) \exp \left( \frac{-\pi p_{\perp q}^2}{\kappa }\right) ~, \end{aligned}$$
(6.341)
where $$m_q$$ is the mass of the produced quark and $$p_{\perp }$$ is the transverse momentum with respect to the string. The $$p_{\perp }$$ spectrum of the quarks is thus independent of the quark flavor, and
$$\begin{aligned} \left\langle p_{\perp q}^2\right\rangle = \sigma ^2 = \kappa /\pi \sim (250\,\text{ MeV })^2~. \end{aligned}$$
(6.342)
The mass suppression implied by Eq. 6.341 is such that the production of strange quarks is suppressed with respect to the creation of u or d quarks by a factor $$s/u \sim s/d \sim 0.2$$–0.3. This suppression is consistent with experimental measurements, e.g., of the $$K/\pi $$ ratio in the final states from Z decays.

By inserting the charm quark mass in Eq. 6.341, one obtains a relative suppression of charm of the order of $$10^{-11}$$. Heavy quarks can therefore be produced only in the perturbative stage and not during fragmentation.
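The numbers quoted above follow directly from Eq. 6.341 once a value of the quark mass is chosen. The Python sketch below uses $$\kappa = 0.2$$ GeV$$^2$$; the charm mass of 1.27 GeV and, in particular, the effective strange mass of 0.3 GeV are assumptions made for illustration (quark masses are not sharply defined in this context, which is why in practice the strangeness suppression is a tuned parameter).

```python
import math

KAPPA = 0.2  # string tension in GeV^2, as quoted in the text

def tunneling_factor(mass_gev, pt_gev=0.0, kappa=KAPPA):
    """Relative tunneling probability of Eq. 6.341 (unnormalized)."""
    return (math.exp(-math.pi * mass_gev ** 2 / kappa)
            * math.exp(-math.pi * pt_gev ** 2 / kappa))

def relative_suppression(m_heavy, m_light=0.0):
    """Ratio of the tunneling factors of a heavy and a light quark at equal pT."""
    return tunneling_factor(m_heavy) / tunneling_factor(m_light)

def pt_sigma(kappa=KAPPA):
    """Root mean square transverse momentum of Eq. 6.342, in GeV."""
    return math.sqrt(kappa / math.pi)

if __name__ == "__main__":
    print(f"sigma(pT)           = {1000 * pt_sigma():.0f} MeV")
    print(f"s/u (m_s = 0.3 GeV)  = {relative_suppression(0.3):.2f}")
    print(f"c/u (m_c = 1.27 GeV) = {relative_suppression(1.27):.1e}")
```

With these inputs the width of the transverse momentum distribution is about 250 MeV, the strange quark is suppressed by a factor between 0.2 and 0.3, and charm by a factor of order $$10^{-11}$$.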

Baryon production can be incorporated in the same picture if string breaks occur also by the production of pairs of diquarks, bound states of two quarks in a $$\bar{3}$$ representation (e.g., “red $$\,+\,$$ blue $$=$$ antigreen”). The relative probability of diquark–antidiquark to quark–antiquark production is extracted from experimental measurements, e.g., of the $$p/\pi $$ ratio.

The creation of excited states (e.g., hadrons with nonzero orbital angular momentum between quarks) is modeled by a probability that such events occur; this probability is again tuned on the final multiplicities measured for particles in hard collisions.

With $$p_{\perp }^2$$ and $$m^2$$ in the simulation of the fragmentation fixed from the extraction of random numbers distributed as in Eq. 6.341, the final step is to model the fraction, z, of the initial quark’s longitudinal momentum that is carried by the final hadron; in first approximation, this should scale with energy for large enough energies. The form of the probability density for z used in the Lund model, the so-called fragmentation function f(z), is
$$\begin{aligned} f(z) \propto \frac{1}{z} (1-z)^a \exp \left( -\frac{b\,(m_h^2 + p_{\perp h}^2)}{z}\right) ~, \end{aligned}$$
(6.343)
which is known as the Lund symmetric fragmentation function (normalized to unit integral). These functions can be flavor dependent, and they are tuned from the experimental data. The mass dependence in f(z) suggests a harder fragmentation function for heavier quarks (Fig. 6.58): this means that charm and beauty primary hadrons take most of the energy.
images/304327_2_En_6_Chapter/304327_2_En_6_Fig58_HTML.gif
Fig. 6.58

Fragmentation function in the Lund parametrization for quark–antiquark strings. Curves from left to right correspond to higher masses.

Adapted from a lecture by T. Sjöstrand

The process of iterative selection of flavors, transverse momenta, and z values for pairs breaking a string is illustrated in Fig. 6.59. A quark u produced in a hard process at high energy emerges from the parton shower, and lies at one extreme of a string. A $$d\bar{d}$$ pair is created from the vacuum; the $$\bar{d}$$ combines with the u and forms a $$\pi ^+$$, which carries a fraction $$z_1$$ of the total momentum $$p_+$$. The next hadron takes a fraction $$z_2$$ of the remaining momentum, etc. The $$z_i$$ are random numbers generated according to a probability density function corresponding to the Lund fragmentation function.
images/304327_2_En_6_Chapter/304327_2_En_6_Fig59_HTML.gif
Fig. 6.59

Iterative selection of flavors and momenta in the Lund string fragmentation model.
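The iterative procedure can be sketched in a few lines of Python. The example below is a deliberately simplified toy, not the algorithm implemented in Pythia: all hadrons are given the same transverse mass, flavors are not tracked, and the iteration simply stops when the remaining momentum falls below a cutoff. The values $$a = 0.68$$ and $$b = 0.98$$ GeV$$^{-2}$$ are illustrative assumptions, and all function names are ours. The z values are drawn from the function of Eq. 6.343 by rejection sampling.

```python
import math
import random

def lund_f(z, mt2, a=0.68, b=0.98):
    """Unnormalized function of Eq. 6.343; mt2 = m_h^2 + pT_h^2 in GeV^2."""
    if z <= 0.0 or z >= 1.0:
        return 0.0
    return (1.0 - z) ** a * math.exp(-b * mt2 / z) / z

def mean_z(mt2, a=0.68, b=0.98, steps=20000):
    """Mean value of z under f(z), by midpoint integration on (0, 1)."""
    num = den = 0.0
    for k in range(steps):
        z = (k + 0.5) / steps
        w = lund_f(z, mt2, a, b)
        num += z * w
        den += w
    return num / den

def sample_z(mt2, rng, a=0.68, b=0.98, grid=2000):
    """Draw one z by rejection sampling under a flat envelope."""
    # envelope height: maximum of f on a grid, with a 5% safety margin
    f_max = 1.05 * max(lund_f((k + 0.5) / grid, mt2, a, b) for k in range(grid))
    while True:
        z = rng.random()
        if rng.random() * f_max < lund_f(z, mt2, a, b):
            return z

def fragment(p_plus, mt2, rng, p_min=1.0):
    """Split off hadrons one by one; each takes a fraction z of what is left."""
    hadron_momenta = []
    remaining = p_plus
    while remaining > p_min:
        z = sample_z(mt2, rng)
        hadron_momenta.append(z * remaining)
        remaining *= 1.0 - z
    return hadron_momenta, remaining

if __name__ == "__main__":
    rng = random.Random(2018)
    momenta, rest = fragment(45.0, 0.1, rng)
    print("hadron momenta (GeV):", [round(p, 2) for p in momenta])
    print("left over (GeV):", round(rest, 2))
    print("mean z, light vs heavy:", mean_z(0.1), mean_z(4.0))
```

The comparison of the mean z for a small and a large transverse mass illustrates the harder fragmentation of heavy hadrons.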

6.4.6.2 Multiplicity in Hard Fragmentation

Average multiplicity is one of the basic observables characterizing hadronic final states. It is extensively studied both theoretically and experimentally at several center-of-mass energies. Experimentally, since the detection of charged particles is simpler than the detection of neutrals, one studies the average charged particle multiplicity. In the limit of large energies, most of the particles in the final state are pions, and one can assume, by isospin symmetry, that the number of neutral pions is half the number of charged pions (pions are an isospin triplet).

In order to define the number of particles, one has to define what a stable hadron is. Typically, multiplicity is computed at a time $$\varDelta t= 10^{-12}$$ s after the collision; this interval is larger than the typical lifetime of hadronically decaying particles, $$10^{-23}$$ s, but shorter than the typical weak decay lifetimes.

The problem of the energy dependence of the multiplicity was already studied by Fermi and Landau in the 1950s. With simple thermodynamical arguments, they concluded that the multiplicity from a hard interaction should be proportional to the square root of the center-of-mass energy:
$$\begin{aligned} \langle n\rangle ({E_{CM}}) = a \sqrt{E_{CM}} \end{aligned}$$
(6.344)
A more precise expression has been obtained from QCD. The expression including leading- and next-to-leading order calculation is:
$$\begin{aligned} \langle n\rangle ({E_{CM}}) = a\, [\alpha _s({E_{CM}})]^b\, e^{c/\sqrt{\alpha _s({E_{CM}})}} \left( 1+\mathcal{{O}}\left( \sqrt{\alpha _s({E_{CM}})}\right) \right) \ , \end{aligned}$$
(6.345)
where a is a parameter (not calculable from perturbation theory) whose value should be fitted from the data. The constants $$b=0.49$$ and $$c=2.27$$ are calculated from the theory.

The summary of the experimental data is shown in Fig. 6.60; a plot comparing the charged multiplicity in $$e^+e^-$$ annihilations with expression 6.345 in a wide range of energies will be discussed in more detail in the next chapter (Fig. 7.18). The charged particle multiplicity at the Z pole, 91.2 GeV, is about 21 (the total multiplicity including $$\pi ^0$$ before their decays is about 30).
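To get a feeling for the energy dependence predicted by Eq. 6.345, one can evaluate it numerically. In the Python sketch below the parameter a is not fitted to data: it is simply fixed, as an assumption made for illustration, by requiring a charged multiplicity of 21 at the Z pole. The coupling is taken from the one-loop formula (6.338) with five flavors, and the higher-order correction factor is dropped. The square-root law of Eq. 6.344, normalized at the same point, is shown for comparison.

```python
import math

M_Z = 91.2           # GeV, Z pole energy as quoted in the text
ALPHA_S_MZ = 0.1185  # alpha_s(M_Z)
B_EXP, C_EXP = 0.49, 2.27   # constants b and c of Eq. 6.345
N_CH_Z = 21.0        # charged multiplicity at the Z pole (normalization point)

def alpha_s(e_cm, n_f=5):
    """One-loop running coupling (Eq. 6.338) with fixed n_f."""
    b0 = (33 - 2 * n_f) / (12 * math.pi)
    return ALPHA_S_MZ / (1 + b0 * ALPHA_S_MZ * math.log(e_cm ** 2 / M_Z ** 2))

def shape(e_cm):
    """Energy dependence of Eq. 6.345 without the normalization a."""
    a_s = alpha_s(e_cm)
    return a_s ** B_EXP * math.exp(C_EXP / math.sqrt(a_s))

A_NORM = N_CH_Z / shape(M_Z)  # fixed by the Z-pole value, not by a fit

def n_charged(e_cm):
    """Charged multiplicity from Eq. 6.345, normalized at the Z pole."""
    return A_NORM * shape(e_cm)

def n_sqrt_law(e_cm):
    """Square-root law of Eq. 6.344, normalized at the same point."""
    return N_CH_Z * math.sqrt(e_cm / M_Z)

if __name__ == "__main__":
    for e_cm in (14.0, 35.0, 91.2, 200.0):
        print(f"E_CM = {e_cm:6.1f} GeV   QCD: {n_charged(e_cm):5.1f}"
              f"   sqrt law: {n_sqrt_law(e_cm):5.1f}")
```

The QCD expression grows more slowly with energy than the square-root law; a quantitative comparison with the data requires a proper fit of a.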

The thermodynamical model by Fermi and Landau predicts that the multiplicity of a particle of mass m is asymptotically proportional to $$1/m^2$$.
images/304327_2_En_6_Chapter/304327_2_En_6_Fig60_HTML.gif
Fig. 6.60

Charged particle multiplicity in $$e^+e^-$$, $$p\bar{p}$$, pp, and ep collisions versus the center-of-mass energy.

From K.A. Olive et al. (Particle Data Group), Chin. Phys. C 38 (2014) 090001

6.4.6.3 Jets in Electron–Positron Annihilation

In the quark–antiquark fragmentation into hadrons at low energies, the dominant feature is the production of resonances.

When energy increases, however, primary quarks and antiquarks start carrying a sizable momentum, large enough to allow string breakings. The fragmentation, as seen in the previous section, is essentially a soft process as far as the generation of transverse momenta is concerned. The phenomenological consequence is the materialization of jets of particles along the direction of the primary quark and antiquark (Fig. 6.61, left).

Since transverse momenta are almost independent of the collision energy while longitudinal momenta are of the order of half the center-of-mass energy, the collimation of jets increases as energy increases.

The angular distribution of jet axes in a blob of energy generated by $$e^+e^-$$ annihilation follows the dependence
$$\begin{aligned} \frac{d\sigma }{d \cos \theta } \propto (1 + \cos ^2 \theta ) \end{aligned}$$
expected for spin 1/2 objects.
Some characteristics of quarks can be seen also from the ratio of the cross section into hadrons to the cross section into $$\mu ^+\mu ^-$$ pairs, as discussed in Sect. 5.4.2. QED predicts that this ratio should be equal to the sum of squared charges of the charged hadronic particles produced; due to the nature of QCD, the sum has to be extended over quarks and over colors. For $$2m_t \gg \sqrt{s} \gg 2m_b$$,
$$\begin{aligned} R = 3 \left( \frac{1}{9} + \frac{4}{9} + \frac{1}{9} + \frac{4}{9} + \frac{1}{9} \right) = \frac{11}{3} \, . \end{aligned}$$
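The same counting can be repeated at any energy by including only the quark flavors that are kinematically accessible. The Python sketch below is a leading-order illustration: it uses sharp thresholds at $$\sqrt{s} = 2m_q$$, ignores resonances and QCD corrections, and the quark masses in the table are approximate values assumed for the example.

```python
from fractions import Fraction

N_C = 3  # number of colors

# (name, electric charge in units of e, approximate mass in GeV; masses are assumptions)
QUARKS = [
    ("d", Fraction(-1, 3), 0.005),
    ("u", Fraction(2, 3), 0.002),
    ("s", Fraction(-1, 3), 0.095),
    ("c", Fraction(2, 3), 1.27),
    ("b", Fraction(-1, 3), 4.18),
    ("t", Fraction(2, 3), 173.0),
]

def r_ratio(sqrt_s):
    """Leading-order R: N_C times the sum of squared charges for 2 m_q < sqrt(s)."""
    return N_C * sum(q * q for _, q, m in QUARKS if 2 * m < sqrt_s)

if __name__ == "__main__":
    for sqrt_s in (2.0, 5.0, 30.0):
        print(f"sqrt(s) = {sqrt_s:5.1f} GeV   R = {r_ratio(sqrt_s)}")
```

Below the charm threshold the ratio is 2, between the charm and beauty thresholds it is 10/3, and above the beauty threshold it is 11/3, as in the expression above; without the color factor all values would be three times smaller.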
The $$\mathcal{{O}}(\alpha _s)$$ process $$qg\bar{q}$$ (Fig. 6.56) can give events with three jets (Fig. 6.61, right). Notice that, as one can see from Fig. 6.56, one expects an excess of particles in the direction of the gluon jet with respect to the opposite direction, since this is where most of the color field is. This is evident also from the comparison of the color factors, as well as from considerations based on color conservation. This effect is called the string effect and has been observed by the LEP experiments at CERN in the 1990s; we shall discuss it in the next chapter.
images/304327_2_En_6_Chapter/304327_2_En_6_Fig61_HTML.gif
Fig. 6.61

A two-jet event (left) and a three-jet event (right) observed by the ALEPH experiment at LEP. Source: CERN

Jet production was first observed at $$e^+e^-$$ colliders only in 1975. It was not an easy observation, and the reason is that the question “how many jets are there in an event,” which at first sight seems to be trivial, is in itself ill-defined, because there is arbitrariness in the definition of jets. A jet is a bunch of particles flying in similar directions in space; the number of jets in a final state of a collision depends on the clustering criteria which define two particles as belonging to the same bunch.
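The dependence of the jet count on the clustering criterion can be illustrated with a toy example. The sketch below is not the algorithm of any experiment: it assumes a purely angular criterion (the closest pair of clusters is merged as long as their opening angle is below a chosen cut), and the six momenta in `TOY_EVENT` are invented for illustration.

```python
import math


def opening_angle(p, q):
    """Angle in radians between two three-momenta."""
    dot = sum(a * b for a, b in zip(p, q))
    norm = math.sqrt(sum(a * a for a in p)) * math.sqrt(sum(b * b for b in q))
    return math.acos(max(-1.0, min(1.0, dot / norm)))


def count_jets(momenta, max_angle):
    """Merge the closest pair of clusters until all pairs exceed max_angle."""
    clusters = [tuple(p) for p in momenta]
    while len(clusters) > 1:
        theta, i, j = min(
            (opening_angle(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        if theta > max_angle:
            break
        merged = tuple(a + b for a, b in zip(clusters[i], clusters[j]))
        clusters = [c for n, c in enumerate(clusters) if n not in (i, j)]
        clusters.append(merged)
    return len(clusters)


# Hypothetical final state (momenta in GeV): three collimated pairs.
TOY_EVENT = [
    (10.0, 0.5, 0.0), (8.0, -0.4, 0.2),
    (-9.0, 1.0, 0.0), (-7.0, 2.5, 0.3),
    (-3.0, -4.0, 0.0), (-2.5, -4.5, 0.2),
]

for cut in (0.05, 0.3, 1.5):
    print(cut, count_jets(TOY_EVENT, cut))
```

The same six particles are counted as six, three, or two jets depending on the angular cut, which is why a jet multiplicity is meaningful only together with the clustering criterion used.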

6.4.6.4 Jets in Hadron–Hadron Collisions

The situation is more complicated when final state hadrons come from a hadron–hadron interaction. On top of the interaction between the two partons responsible for a hard scattering, there are in general additional interactions between the beam remnant partons; the products of such interactions are called the “underlying event” (Fig. 6.62).
images/304327_2_En_6_Chapter/304327_2_En_6_Fig62_HTML.gif
Fig. 6.62

Pictorial representation of a hadron–hadron interaction.

From J.M. Campbell et al. Rept. Prog. Phys. 70 (2007) 89

Usually, the underlying event comes from a soft interaction involving low momentum transfer; therefore, perturbative QCD cannot be applied and it has to be described by models. Contributions to the final energy may come from additional gluon radiation from the initial state or from the final state partons; typically, the products have small transverse momentum with respect to the direction of the collision (in the center-of-mass system). In particular, in a collision at accelerators, many final products of the collision will be lost in the beam pipe.

To characterize QCD interactions, a useful quantity is the so-called rapidity y of a particle:
$$\begin{aligned} y = \frac{1}{2} \ln \frac{E+p_{z}}{E-p_{z}} , \end{aligned}$$
(6.346)
where z is the common direction of the colliding hadrons in the center-of-mass13 (the “beam” axis).

Under a boost in the z direction, rapidity transforms by the addition of a fixed quantity. This means that rapidity differences between pairs of particles are invariant with respect to Lorentz boosts along z.
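The additive behavior can be checked numerically. The following sketch boosts two particles along z and compares rapidities before and after; the particle momenta and the boost velocity are arbitrary illustrative values, and the function names are not taken from any library.

```python
import math


def rapidity(energy, pz):
    """y = (1/2) ln[(E + p_z) / (E - p_z)]."""
    return 0.5 * math.log((energy + pz) / (energy - pz))


def boost_along_z(energy, pz, beta):
    """Lorentz boost of (E, p_z) with velocity beta along z (c = 1)."""
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    return gamma * (energy + beta * pz), gamma * (pz + beta * energy)


def energy_of(mass, pt, pz):
    return math.sqrt(mass**2 + pt**2 + pz**2)


# Two hypothetical pions (GeV units) and an arbitrary boost.
PION_MASS = 0.1396
e1, pz1 = energy_of(PION_MASS, 0.5, 3.0), 3.0
e2, pz2 = energy_of(PION_MASS, 0.8, -1.0), -1.0
beta = 0.6

y1, y2 = rapidity(e1, pz1), rapidity(e2, pz2)
y1_boosted = rapidity(*boost_along_z(e1, pz1, beta))
y2_boosted = rapidity(*boost_along_z(e2, pz2, beta))

print(y1_boosted - y1, math.atanh(beta))  # both rapidities shift by atanh(beta)
print(y1 - y2, y1_boosted - y2_boosted)   # the difference is unchanged
```

The fixed quantity added by the boost is $$\tanh ^{-1}\beta $$, the rapidity of the boost itself.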

In most collisions in high-energy hadronic scattering, the distribution of final state hadrons is approximately uniform in rapidity, within kinematic limits: the distribution of final state hadrons is approximately invariant under boosts in the z direction. Thus, detector elements should be approximately uniformly spaced in rapidity—indeed they are.

For a nonrelativistic particle, rapidity is the same as velocity along the z-axis:
$$\begin{aligned} y= & {} \frac{1}{2} \ln \frac{E+p_{z}}{E-p_{z}} \simeq \frac{1}{2} \ln \frac{m+mv_{z}}{m-mv_{z}} \simeq v_{z} . \end{aligned}$$
(6.347)
Note that nonrelativistic velocities also transform additively under boosts (as guaranteed by the Galilei transformation).
The rapidity of a particle is not easy to measure, since one should know its mass. We thus define a variable easier to measure: the pseudorapidity $$\eta $$
$$\begin{aligned} \eta = - \ln \tan \frac{\theta }{2} , \end{aligned}$$
(6.348)
where $$\theta $$ is the angle of the momentum of the particle relative to the $$+z$$ axis. One can derive an expression for rapidity in terms of pseudorapidity and transverse momentum:
$$\begin{aligned} y = \ln \frac{\sqrt{m^{2} + p_{T}^{2} \cosh ^{2}\eta } + p_{T} \sinh \eta }{\sqrt{m^{2}+p_{T}^{2}}} \end{aligned}$$
(6.349)
In the limit $$m \ll p_{T}$$, $$y\rightarrow \eta $$. This explains the name “pseudorapidity.” Angles, and hence pseudorapidity, are easy to measure, but it is really the rapidity that is of physical significance.

To make the distinction between rapidity and pseudorapidity clear, let us examine the limit on the rapidities of the produced particles of a given mass at a given c.m. energy. There is clearly a limit on rapidity, but there is no limit on pseudorapidity, since a particle can be physically produced at zero angle (or at $$180^{\circ }$$), where pseudorapidity is infinite. The particles for which the distinction is very significant are those for which the transverse momentum is substantially less than the mass. Note that $$|y|<|\eta |$$ always for a massive particle.
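A numerical comparison of the two variables may help. The sketch below evaluates Eq. (6.349) and cross-checks it against the definition (6.346); the proton mass and the kinematic values are illustrative choices.

```python
import math


def pseudorapidity(theta):
    """eta = -ln tan(theta / 2), with theta measured from the +z axis."""
    return -math.log(math.tan(theta / 2.0))


def rapidity_from_eta(mass, pt, eta):
    """Rapidity from pseudorapidity and transverse momentum, Eq. (6.349)."""
    numerator = math.sqrt(mass**2 + pt**2 * math.cosh(eta) ** 2) + pt * math.sinh(eta)
    return math.log(numerator / math.sqrt(mass**2 + pt**2))


def rapidity_direct(mass, pt, eta):
    """Rapidity from its definition, using p_z = p_T sinh(eta)."""
    pz = pt * math.sinh(eta)
    energy = math.sqrt(mass**2 + pt**2 + pz**2)
    return 0.5 * math.log((energy + pz) / (energy - pz))


PROTON_MASS = 0.938  # GeV
for pt in (0.3, 5.0, 50.0):
    print(pt, rapidity_from_eta(PROTON_MASS, pt, 2.5))
```

For a soft proton the rapidity is well below the pseudorapidity of 2.5, while for a transverse momentum much larger than the mass the two practically coincide.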

6.4.7 Hadronic Cross Section

The two extreme limits of QCD, asymptotic freedom (perturbative) and confinement (nonperturbative), translate into two radically different strategies for the computation of the cross sections of hadronic processes. At large momentum transfer (hard processes), cross sections can be computed as the convolution of the partonic (quarks and gluons) elementary cross sections over the parton distribution functions (PDFs). At low momentum transfer (soft interactions), cross sections must be computed using phenomenological models that describe the distribution of matter inside hadrons and whose parameters must be determined from data. The soft processes are dominant. At the LHC for instance (Fig. 6.63), the total proton–proton cross section is of the order of 100 millibarn while the Higgs production cross section is of the order of tens of picobarn (a difference of 10 orders of magnitude!).
images/304327_2_En_6_Chapter/304327_2_En_6_Fig63_HTML.gif
Fig. 6.63

Proton–(anti)proton cross sections at high energies. Cross-section values for several important processes are given. The right vertical axis reports the number of events for a luminosity value $$L=10^{33}\,\mathrm{cm}^{-2}\,\mathrm{s}^{-1}$$.

From N. Cartiglia, arXiv:1305.6131 [hep-ex]
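The orders of magnitude quoted above translate into event rates through the relation rate = cross section × luminosity. The sketch below uses the luminosity of Fig. 6.63 and takes 10 pb as a representative value for "tens of picobarn"; that value is an assumption made only for the arithmetic.

```python
import math

BARN_IN_CM2 = 1e-24  # 1 barn = 1e-24 cm^2


def event_rate(cross_section_barn, luminosity_cm2_s):
    """Events per second for a given cross section (barn) and luminosity."""
    return cross_section_barn * BARN_IN_CM2 * luminosity_cm2_s


LUMINOSITY = 1e33     # cm^-2 s^-1, as in Fig. 6.63
SIGMA_TOTAL = 100e-3  # 100 mb, in barn
SIGMA_HIGGS = 10e-12  # 10 pb, in barn (assumed representative value)

rate_total = event_rate(SIGMA_TOTAL, LUMINOSITY)
rate_higgs = event_rate(SIGMA_HIGGS, LUMINOSITY)
orders = math.log10(SIGMA_TOTAL / SIGMA_HIGGS)

print(rate_total, rate_higgs, orders)
```

With these inputs there are about a hundred million proton–proton interactions per second for roughly one Higgs boson every hundred seconds, which is why the selection of hard processes is so demanding.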

At high momentum transfer, the number of partons, mostly gluons, at small x, increases very fast as shown in Fig. 5.25. This fast rise, responsible for the increase of the total cross sections, can be explained by the possibility, at these energies, that gluons radiated by the valence quarks themselves radiate new gluons, forming gluonic cascades. However, at higher energies, the gluons in the cascades interact with each other, suppressing the emission of new soft gluons, and a saturation state often described as the Color Glass Condensate (CGC) is reached. In high-energy heavy-ion collisions, high densities may be accessible over extended regions and a Quark–Gluon Plasma (QGP) may be formed.
images/304327_2_En_6_Chapter/304327_2_En_6_Fig64_HTML.gif
Fig. 6.64

First-order representation of a hadronic hard interaction producing a final state X

6.4.7.1 Hard Processes

In hadronic hard processes the factorization assumption, tested first in the deep inelastic scattering, holds. The time scale of the elementary interaction between partons (or as in case of deep inelastic scattering between the virtual photon and the quarks) is basically given by the inverse of the transferred momentum Q
$$\begin{aligned} {\tau }_{int}\sim \ Q^{-1} \end{aligned}$$
(6.350)
while the hadron timescale is given by the inverse of the QCD nonperturbative scale
$$\begin{aligned} {\varLambda }_\mathrm{{QCD}}\sim 200\ \mathrm{{MeV}} \Longrightarrow {\tau }_\mathrm{{had}}\sim 1/{\varLambda }_\mathrm{{QCD}} \sim \ 3\times 10^{-24} \mathrm{{s}} \, . \end{aligned}$$
(6.351)
Hence, whenever $${\tau }_\mathrm{int}\ll {\tau }_\mathrm{had}$$ the processes at each timescale can be considered independent. Thus in the production of the final state X (for instance a $${\mu }^{+}{\mu }^{-}$$ dilepton, or a multijet system, or a Higgs boson, ...) by the collision of two hadrons $$h_{1}$$ and $${\ h}_{2}$$ with, respectively, four-momenta $$p_{1}$$ and $$p_{2}$$:
$$\begin{aligned} h_{1}\left( p_{1}\right) {\ h}_{2}\left( p_{2}\right) \rightarrow X+\cdots \, , \end{aligned}$$
(6.352)
the inclusive cross section can be given in leading order (LO) by (see Fig. 6.64):
$$\begin{aligned} {\sigma }_{h_{1}h_{2}\rightarrow X}=\sum _{ij\ }{\int _{0}^{1}{dx_{1}}\int _{0}^{1}{dx_{2}}}\ f_{h_{1}}^{i}{\left( x_{1}, Q\right) \ \ f}_{h_{2}}^{j}\left( x_{2}, Q\right) \ \ {\widehat{\sigma }}_{ij\rightarrow X}\left( \hat{s}\right) \end{aligned}$$
(6.353)
where $${f}_{{h}_{{1}}}^{{i}}$$ and $${f}_{{h}_{{2}}}^{{j}}$$ are the parton distribution functions evaluated at the scale Q, $$x_{1}$$ and $$x_{2}$$ are the fractions of momentum carried, respectively, by the partons i and j, and $${\widehat{\sigma }}_{ij\rightarrow X}$$ is the partonic cross section evaluated at an effective squared c.m. energy
$$\begin{aligned} \hat{s}={x}_{{1}}{x}_{{2}}s \, , \end{aligned}$$
(6.354)
s being the square of the c.m. energy of the hadronic collision.

The scale Q is usually set to the effective c.m. energy $$\sqrt{\hat{s}}$$ (if X is a resonance, its mass) or to half of the jet transverse energy for high $$p_{\bot }$$ processes. The exact value of this scale is somewhat arbitrary. If one were able to compute the diagrams at all orders for a given process, the final result would not depend on this particular choice. In practice, however, it is important to choose the scale such that the corrections from higher-order diagrams are small.

Lower-order diagrams give the right order of magnitude, but to match the present experimental accuracy (in particular at the LHC) higher-order diagrams are needed. Next-to-leading-order (NLO), one-loop calculations have been available for many processes for many years, and nowadays predictions at the two-loop level, next-to-next-to-leading-order (NNLO), are available for several processes, for instance Higgs boson production at the LHC.

The partons not involved in the hard scattering (spectator partons) carry a non-negligible fraction of the total energy and may be involved in interactions with small momentum transfer. These interactions contribute to the so-called underlying event.

Drell–Yan Processes. The production of dileptons in the collision of two hadrons (known as the Drell–Yan process) was first interpreted in terms of quark–antiquark annihilation by Sidney Drell and Tung-Mow Yan in 1970. Its leading-order diagram (Fig. 6.65) follows the factorization scheme discussed above, where the annihilation cross section $${\widehat{\sigma }}_{q\overline{q}\rightarrow \ell \overline{\ell }}$$ of the pure QED process is given by:
$$\begin{aligned} {\widehat{\sigma }}_{q\overline{q}\rightarrow \ell \overline{\ell }}=\frac{1}{N_{c}}\ {Q_{q}}^{2}\frac{4\pi {\alpha }^{2}}{3M^{2}} \, . \end{aligned}$$
(6.355)
$$Q_{q}$$ is the quark charge, and $$M^{2}$$ is the square of the c.m. energy of the system of the colliding quark–antiquark pair (i.e., the square of the invariant mass of the dilepton system). $$M^{2}$$ is thus given by
$$\begin{aligned} M^{2}=\hat{s}={x}_{{1}}{x}_{{2}}s \, . \end{aligned}$$
(6.356)
images/304327_2_En_6_Chapter/304327_2_En_6_Fig65_HTML.gif
Fig. 6.65

Leading-order diagram of the Drell–Yan process.

By user:E2m [public domain], via Wikimedia Commons

Finally note that, as already discussed in Sect. 5.4.2, the color factor $$N_{c}$$ appears in the denominator (average over the incoming colors) in contrast with what happens in the reverse process $$\ell \overline{\ell }\rightarrow q\overline{q}$$ (sum over outgoing colors) whose cross section is given by
$$\begin{aligned} {\sigma }_{\ell \overline{\ell }\rightarrow q\overline{q}}=\ \ N_{c}\ {Q_{q}}^{2}\frac{4\pi {\alpha }^{2}}{3s} \, . \end{aligned}$$
(6.357)
There is a net topological difference between the final states of the $$e^{+}e^{-}$$ and $$q\bar{q}$$ processes. While in $$e^{+}e^{-}$$ interactions the scattering into two leptons or two jets implies a back-to-back topology, in the Drell–Yan process the topology is back-to-back only in the plane transverse to the beam axis: since each quark or antiquark carries an arbitrary fraction of the momentum of the parent hadron, the system has in general a nonzero momentum component along the beam axis.
It is then important to observe that the rapidity of the dilepton system is by energy–momentum conservation equal to the rapidity of the quark–antiquark system,
$$\begin{aligned} y=y_{\ell \overline{\ell }}=y_{q\overline{q}}\ . \end{aligned}$$
(6.358)
Neglecting the transverse momentum, the rapidity is given by
$$\begin{aligned} y\equiv \frac{1}{2}\ln \frac{E_{\ell \overline{\ell }}+P_{Z\ell \overline{\ell }}}{E_{\ell \overline{\ell }}-P_{Z\ell \overline{\ell }}}=\frac{1}{2}\ln \frac{E_{q\overline{q}}+P_{Zq\overline{q}}}{E_{q\overline{q}}-P_{Zq\overline{q}}}=\frac{1}{2}\ln \frac{{x}_{{1}}}{{x}_{{2}}}\ \, . \end{aligned}$$
(6.359)
Then, if the mass M and the rapidity y of the dilepton are measured, the momentum fractions of the quark and antiquark can, in this particular case, be directly accessed. In fact, inverting the equations relating M and y with $${x}_{{1}},{x}_{{2}}$$ one obtains:
$$\begin{aligned} {x}_{{1}}=\frac{M}{\sqrt{s}}\; e^{{y}} \; ; \; {x}_{{2}}=\frac{M}{\sqrt{s}}\; e^{-y} \, . \end{aligned}$$
(6.360)
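As a quick illustration of Eq. (6.360) and of its inverse, the sketch below converts between $$(M, y)$$ and $$(x_1, x_2)$$. The example values (a dilepton at the Z mass with rapidity 2 in a 13 TeV collision) are chosen only for illustration.

```python
import math


def momentum_fractions(mass, y, sqrt_s):
    """(x1, x2) from the dilepton mass and rapidity, Eq. (6.360)."""
    tau = mass / sqrt_s
    return tau * math.exp(y), tau * math.exp(-y)


def mass_and_rapidity(x1, x2, sqrt_s):
    """Inverse relations: M^2 = x1 x2 s and y = (1/2) ln(x1 / x2)."""
    return math.sqrt(x1 * x2) * sqrt_s, 0.5 * math.log(x1 / x2)


SQRT_S = 13000.0  # GeV, illustrative collider energy
x1, x2 = momentum_fractions(91.2, 2.0, SQRT_S)
mass, y = mass_and_rapidity(x1, x2, SQRT_S)

print(x1, x2)   # a forward dilepton pairs a large-x with a small-x parton
print(mass, y)  # the round trip returns the inputs
```

A dilepton produced at large rapidity therefore probes simultaneously a relatively large and a very small momentum fraction.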
The Drell–Yan differential cross section can now be written in terms of M and y. The Jacobian of the change of variables from $$\left( x_{1}, x_{2}\right) $$ to $$\left( M, y\right) $$ is
$$\begin{aligned} \frac{\partial \left( x_{1}, x_{2}\right) }{\partial \left( y, M\right) }=\frac{2M}{s} \, . \end{aligned}$$
(6.361)
It can be easily shown that the differential Drell–Yan cross section for the collision of two hadrons is just:
$$\begin{aligned} \frac{d\sigma }{dMdy}=\frac{8\pi {\alpha }^{2}}{9Ms}\, f(x_{1};x_{2}) \end{aligned}$$
(6.362)
where $$f(x_1;x_2)$$ is the combined PDF for the fractions of momentum carried by the colliding quark and antiquark weighted by the square of the quark charge. For instance, in the case of proton–antiproton scattering one has, assuming that the quark PDFs in the proton are identical to the antiquark PDFs in the antiproton and neglecting the contributions of the antiquark (quark) of the proton (antiproton) and of quarks other than u and d:
$$\begin{aligned} f(x_{1};x_{2})=\frac{4}{9}\, u\left( x_{1}\right) u\left( x_{2}\right) +\frac{1}{9}\, d\left( x_{1}\right) d\left( x_{2}\right) \end{aligned}$$
(6.363)
where
$$\begin{aligned} {u}\left( {x}\right) \,=\,{u}^{p}(x)={\overline{{u}}}^{\overline{p}}(x) \end{aligned}$$
(6.364)
$$\begin{aligned} {d}\left( {x}\right) \,=\,{d}^{p}(x)={\overline{{d}}}^{\overline{p}}(x)\, . \end{aligned}$$
(6.365)
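The Jacobian (6.361) and the resulting cross section (6.362) can be checked numerically. In the sketch below the functions `toy_u` and `toy_d` are invented shapes, not fitted parton distributions, so only the structure of the calculation is meaningful; the result is in natural units (GeV$$^{-3}$$).

```python
import math

ALPHA = 1.0 / 137.036  # fine-structure constant


def momentum_fractions(mass, y, sqrt_s):
    tau = mass / sqrt_s
    return tau * math.exp(y), tau * math.exp(-y)


def jacobian_numeric(mass, y, sqrt_s, h=1e-6):
    """|d(x1, x2)/d(y, M)| from central finite differences."""
    dm = h * mass
    up, down = momentum_fractions(mass, y + h, sqrt_s), momentum_fractions(mass, y - h, sqrt_s)
    dx1_dy, dx2_dy = (up[0] - down[0]) / (2 * h), (up[1] - down[1]) / (2 * h)
    up, down = momentum_fractions(mass + dm, y, sqrt_s), momentum_fractions(mass - dm, y, sqrt_s)
    dx1_dm, dx2_dm = (up[0] - down[0]) / (2 * dm), (up[1] - down[1]) / (2 * dm)
    return abs(dx1_dy * dx2_dm - dx1_dm * dx2_dy)


def toy_u(x):
    """Hypothetical u-quark density, for illustration only."""
    return 2.0 * x**-0.5 * (1.0 - x) ** 3


def toy_d(x):
    """Hypothetical d-quark density, for illustration only."""
    return 1.0 * x**-0.5 * (1.0 - x) ** 4


def dsigma_dm_dy(mass, y, sqrt_s, u=toy_u, d=toy_d):
    """Leading-order proton-antiproton Drell-Yan cross section, Eq. (6.362)."""
    x1, x2 = momentum_fractions(mass, y, sqrt_s)
    if not (0.0 < x1 < 1.0 and 0.0 < x2 < 1.0):
        return 0.0  # outside the kinematically allowed region
    f = 4.0 / 9.0 * u(x1) * u(x2) + 1.0 / 9.0 * d(x1) * d(x2)
    return 8.0 * math.pi * ALPHA**2 / (9.0 * mass * sqrt_s**2) * f


print(jacobian_numeric(30.0, 0.5, 1960.0), 2 * 30.0 / 1960.0**2)
print(dsigma_dm_dy(30.0, 0.5, 1960.0))
```

Replacing the toy densities by a real PDF set evaluated at the scale M would turn this into the naive leading-order prediction discussed in the text.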
In proton–proton collisions at the LHC, the antiquark must come from the sea. In any case, to obtain a good description of the dilepton data (see Fig. 6.66) it is not enough to consider the leading-order diagram discussed above. In fact, the peak observed around $$M\sim 91$$ GeV corresponds to the Z resonance, not accounted for in the naïve Drell–Yan model, and next-to-next-to-leading-order (NNLO) diagrams are needed for good agreement between data and theory.
images/304327_2_En_6_Chapter/304327_2_En_6_Fig66_HTML.gif
Fig. 6.66

Dilepton cross section measured by CMS.

From V. Khachatryan et al. (CMS Collaboration), The European Physical Journal C75 (2015) 147

Multijet Production. Multijet events in hadronic interactions at high energies are an important background for all the hard physics channels with final hadronic states, in particular for the searches for new physics; the calculation of their characteristics is a direct test of QCD. At large transferred momentum, their cross section may be computed following the factorization scheme discussed above but involving at LO already a large number of elementary two-parton diagrams ($$qq\rightarrow {qq}$$, $$qg\rightarrow {qg}$$, $$gg\rightarrow {gg}$$, $$q\overline{q}\rightarrow {gg}$$, ...).
images/304327_2_En_6_Chapter/304327_2_En_6_Fig67_HTML.gif
Fig. 6.67

Inclusive jet cross section measured by CMS.

From S. Chatrchyan et al. (CMS Collaboration), Phys. Rev. Lett. 107 (2011) 132001

The transverse momentum ($$P_{T}$$) of the jets is, in these processes, a key final state variable; together with the jet rapidities ($${y}_{i}$$), it has to be related to the partonic variables so that data and theory can be compared. For instance, in the production of two jets from the t-channel gluon exchange, the elementary LO cross section is given by
$$\begin{aligned} \frac{d\sigma }{dQ^{2}dx_{1}dx_{2}}=\frac{4\pi {{\alpha }_{s}}^{2}}{9Q^{4}}\left[ 1+\left( 1-\frac{Q^{2}}{\hat{s}}\right) ^{2}\right] \end{aligned}$$
(6.366)
and the following relations can be established between the partonic and the final state variables:
$$\begin{aligned} {x}_{{1}}=\frac{P_{T}}{\sqrt{s}}\ \left( e^{y_{1}}+e^{y_{2}}\right) \end{aligned}$$
(6.367)
$$\begin{aligned} {x}_{{2}}=\frac{P_{T}}{\sqrt{s}}\ \left( e^{{-y}_{1}}+e^{{-y}_{2}}\right) \end{aligned}$$
(6.368)
$$\begin{aligned} Q^{2}={P_{T}}^{2}\left( 1+e^{y_{1}-y_{2}}\right) \, . \end{aligned}$$
(6.369)
In practice, such calculations are performed numerically using sophisticated computer programs. However, the comparison of the prediction from this calculation with the LHC data provides a powerful test of QCD which spans many orders of magnitude (see Fig. 6.67).
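The relations (6.367)–(6.369) between jet and partonic variables are simple enough to be coded directly. The sketch below is only an illustration of this kinematic mapping, with arbitrary example values; it is not one of the numerical programs mentioned above.

```python
import math


def dijet_partonic_variables(pt, y1, y2, sqrt_s):
    """Return (x1, x2, Q^2) from the jet P_T and rapidities, Eqs. (6.367)-(6.369)."""
    x1 = pt / sqrt_s * (math.exp(y1) + math.exp(y2))
    x2 = pt / sqrt_s * (math.exp(-y1) + math.exp(-y2))
    q2 = pt**2 * (1.0 + math.exp(y1 - y2))
    return x1, x2, q2


SQRT_S = 7000.0  # GeV, illustrative collider energy

# Two central jets: symmetric partons and scattering at 90 degrees in the c.m.
x1, x2, q2 = dijet_partonic_variables(100.0, 0.0, 0.0, SQRT_S)
s_hat = x1 * x2 * SQRT_S**2
print(x1, x2, q2, s_hat)

# Jets at different rapidities.
x1_fwd, x2_fwd, q2_fwd = dijet_partonic_variables(100.0, 1.5, -0.5, SQRT_S)
s_hat_fwd = x1_fwd * x2_fwd * SQRT_S**2
print(x1_fwd, x2_fwd, q2_fwd, s_hat_fwd)
```

For two central jets one finds $$Q^{2}=\hat{s}/2$$, and in general the relations imply $$\hat{s}=2P_{T}^{2}\left[ 1+\cosh \left( y_{1}-y_{2}\right) \right] $$.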

6.4.7.2 Soft Processes

At low momentum transfer, the factorization assumption breaks down. It is therefore no longer possible to compute the cross sections by adding up perturbative interactions between partons, with the nonperturbative aspects of the hadrons “frozen” in the parton distribution functions. The interaction between hadrons is thus described by phenomenological models.

A strategy is to use optical models and their application to quantum mechanics (for an extended treatment see Ref. [F6.4] ). The interaction of a particle with momentum $$\mathbf {p}=\ \hbar \mathbf {k}$$ with a target may be seen as the scattering of a plane wave by a diffusion center (see Fig. 6.68). The final state at large distance from the collision point can then be described by the superposition of the incoming plane wave with an outgoing spherical scattered wave:
$$\begin{aligned} \psi \left( \mathbf {r}\right) \ \sim \ e^{ikz}+F\left( E,\theta \right) \ \frac{e^{ikr}}{r} \end{aligned}$$
(6.370)
where z is the coordinate along the beam axis, $$\theta $$ is the scattering angle, E the energy, and $$F\left( E,\theta \right) $$ is called the elastic scattering amplitude.
images/304327_2_En_6_Chapter/304327_2_En_6_Fig68_HTML.gif
Fig. 6.68

Plane wave scattering by a diffusion center, resulting in an outgoing spherical wave

The elastic differential cross section can be shown to be
$$\begin{aligned} \frac{d\sigma }{d\varOmega }={\left| F\left( E,\theta \right) \right| }^{2} \, . \end{aligned}$$
(6.371)
In the forward region ($$\theta \simeq 0$$), the interference between the incident and the scattered waves is non-negligible. In fact, this term has a net effect on the reduction of the incident flux that can be seen as a kind of “shadow” created by the diffusion center. An important theorem, the Optical Theorem, connects the total cross section with the imaginary part of the forward elastic scattering amplitude:
$$\begin{aligned} {\sigma }_{{tot}\ }\left( E\right) =\frac{4\pi }{k}\ \mathrm {Im}F\left( E, 0\right) . \end{aligned}$$
(6.372)
The elastic cross section is just the integral of the elastic differential cross section,
$$\begin{aligned} {\sigma }_{el}\left( E\right) =\int {{\left| F\left( E,\theta \right) \right| }^{2}}\ d\varOmega \end{aligned}$$
(6.373)
and the inelastic cross section just the difference of the two cross sections
$$\begin{aligned} {\sigma }_{inel}\left( E\right) ={\sigma }_{tot}\left( E\right) -{\sigma }_{el}\left( E\right) \, . \end{aligned}$$
(6.374)
It is often useful to decompose the elastic scattering amplitude in terms of angular quantum number l (for spinless particles scattering the angular momentum $$\mathbf {L}$$ is conserved; in the case of particles with spin the good quantity will be the total angular momentum $$\mathbf {J}=\mathbf {L}+\mathbf {S}$$):
$$\begin{aligned} F\left( E,\theta \right) =\frac{1}{k}\sum _{l=0}^{l=\infty }{\left( 2l+1\right) }\ f_{l}\left( E\right) \ P_{l}\left( \cos \theta \right) \end{aligned}$$
(6.375)
where the functions $$f_{l}\left( E\right) $$ are the partial wave amplitudes and $$P_{l}$$ are the Legendre polynomials, which form an orthogonal basis.
Cross sections can be also written as a function of the partial wave amplitudes:
$$\begin{aligned} {\sigma }_{el}\left( E\right) =\frac{4\pi }{k^{2}}\sum _{l=0}^{l=\infty }{\left( 2l+1\right) }{\left| f_{l}\left( E\right) \right| }^{2} \end{aligned}$$
(6.376)
$$\begin{aligned} {\sigma }_{tot}\left( E\right) =\frac{4\pi }{k^{2}}\sum _{l=0}^{l=\infty }{\left( 2l+1\right) }\ \mathrm {Im}f_{l}\left( E\right) \end{aligned}$$
(6.377)
and again $$\sigma _{inel}$$ is simply the difference between $$\sigma _{tot}$$ and $$\sigma _{el}$$.
The optical theorem applied now at each partial wave imposes the following relation (unitarity condition):
$$\begin{aligned} \mathrm {Im}f_{l}\left( E\right) \ge {\left| f_{l}\left( E\right) \right| }^{2}\, . \end{aligned}$$
(6.378)
Noting that
$$\begin{aligned} \ {\left| f_{l}-\frac{i}{2}\right| }^{2} = {\left| f_{l}\right| }^{2}-\mathrm {Im}f_{l} + \frac{1}{4} \end{aligned}$$
(6.379)
this condition can be expressed as
$$\begin{aligned} {\left| f_{l}-\frac{i}{2}\right| }^{2}\le \frac{1}{4} \, . \end{aligned}$$
(6.380)
This relation is automatically satisfied if the partial wave amplitude is written as
$$\begin{aligned} f_{l}=\frac{i}{2}\ \left( 1-e^{2i{\delta }_{l}}\right) \end{aligned}$$
(6.381)
where $${\delta }_{l}$$ is a complex number with non-negative imaginary part.
Whenever $${\delta }_{l}$$ is a pure real number
$$\begin{aligned} \mathrm {Im}f_{l}\left( E\right) ={\left| f_{l}\left( E\right) \right| }^{2} \end{aligned}$$
(6.382)
and the scattering is totally elastic (the inelastic cross section is zero).
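The partial wave relations above can be verified with a few lines of code. The sketch below builds the amplitude from a short list of phase shifts, which are arbitrary illustrative numbers, and compares the partial wave sums with the optical theorem and with a direct angular integration; all function names are illustrative.

```python
import cmath
import math


def legendre_values(l_max, x):
    """[P_0(x), ..., P_lmax(x)] from the three-term recurrence."""
    values = [1.0, x]
    for l in range(1, l_max):
        values.append(((2 * l + 1) * x * values[l] - l * values[l - 1]) / (l + 1))
    return values[: l_max + 1]


def partial_wave(delta):
    """f_l = (i/2) (1 - exp(2 i delta_l)), Eq. (6.381)."""
    return 0.5j * (1.0 - cmath.exp(2j * delta))


def amplitude(k, deltas, cos_theta):
    """Elastic scattering amplitude F(E, theta), Eq. (6.375)."""
    p = legendre_values(len(deltas) - 1, cos_theta)
    return sum((2 * l + 1) * partial_wave(d) * p[l] for l, d in enumerate(deltas)) / k


def sigma_el(k, deltas):
    return 4 * math.pi / k**2 * sum((2 * l + 1) * abs(partial_wave(d)) ** 2 for l, d in enumerate(deltas))


def sigma_tot(k, deltas):
    return 4 * math.pi / k**2 * sum((2 * l + 1) * partial_wave(d).imag for l, d in enumerate(deltas))


def sigma_el_integrated(k, deltas, steps=4000):
    """Integral of |F|^2 over the solid angle (midpoint rule in cos theta)."""
    dx = 2.0 / steps
    total = sum(abs(amplitude(k, deltas, -1.0 + (i + 0.5) * dx)) ** 2 for i in range(steps))
    return 2 * math.pi * total * dx


K = 2.0
ELASTIC = [0.8, 0.4, 0.1]                   # real phase shifts
ABSORPTIVE = [0.8 + 0.3j, 0.4 + 0.1j, 0.1]  # Im(delta) > 0: inelastic channels open

print(sigma_el(K, ELASTIC), sigma_tot(K, ELASTIC))
print(sigma_el(K, ABSORPTIVE), sigma_tot(K, ABSORPTIVE))
print(4 * math.pi / K * amplitude(K, ABSORPTIVE, 1.0).imag)
```

With real phase shifts the elastic and total cross sections coincide, while a positive imaginary part opens a nonzero inelastic cross section; in both cases the total cross section agrees with the optical theorem.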
On the other hand, if the wavelength associated with the beam particle is much smaller than the target region,
$$\begin{aligned} \lambda \sim \frac{1}{k}\ll R \end{aligned}$$
(6.383)
a description in terms of the classical impact parameter b (Fig. 6.69) is appropriate.
images/304327_2_En_6_Chapter/304327_2_En_6_Fig69_HTML.gif
Fig. 6.69

Impact parameter definition in the scattering of a particle with momentum $$\mathbf {k}$$ over a target region with radius R

Defining
$$\begin{aligned} b\equiv \frac{1}{k}\left( l+\frac{1}{2}\right) \end{aligned}$$
(6.384)
the elastic scattering amplitude can then be expressed as
$$\begin{aligned} F\left( E,\theta \right) =2k\sum _{b=1/(2k)}^{\infty }{b\ \triangle b}\ f_{bk-1/2}\left( E\right) \ P_{bk-1/2}\left( \cos \theta \right) \end{aligned}$$
(6.385)
with $$\triangle b=1/k$$ which is the granularity of the sum.
In the limit $$k\rightarrow \infty $$, $$\triangle b\rightarrow 0$$, and the sum can be approximated by an integral
$$\begin{aligned} F\left( E,\theta \right) =2k\int _{0}^{\infty }{b\ db\ a\left( b, E\right) \ P_{bk-1/2}\left( \cos \theta \right) } \end{aligned}$$
(6.386)
where the Legendre polynomials $$P_{l}\left( \cos \theta \right) $$ were replaced by the Legendre functions $$P_{\nu }\left( \cos \theta \right) $$, with $$\nu $$ a real positive number, and the partial wave amplitudes $$f_{l}$$ were interpolated, giving rise to the scattering amplitude $$a\left( b, E\right) $$.
For small scattering angles, the Legendre functions may be approximated by a zeroth-order Bessel function $$J_{0}\left( b,\theta \right) $$, whose argument is $$kb\theta $$, and finally one can write
$$\begin{aligned} F\left( E,\theta \right) \cong 2k\int _{0}^{\infty }b\ db\ a\left( b, E\right) \ J_{0}(b,\theta ) \, . \end{aligned}$$
(6.387)
The scattering amplitude $$a\left( b, E\right) $$ is thus related to the elastic wave amplitude discussed above basically by a Bessel–Fourier transform.
Following a similar strategy to ensure unitarity automatically, $$a\left( b, s\right) $$ may be parametrized as
$$\begin{aligned} a\left( b, s\right) =\ \frac{i}{2}\ (1-e^{i\chi (b, s)}) \end{aligned}$$
(6.388)
where
$$\begin{aligned} \chi \left( b,s\right) =\chi _{R}\left( b,s\right) +i\ {\chi }_{I}\left( b, s\right) \end{aligned}$$
(6.389)
is called the eikonal function.
It can be shown that the cross sections are related to the eikonal by the following expressions:
$$\begin{aligned} {\sigma }_{el}\left( s\right) =\int {d^{2}b{\left| 1-e^{i\chi (b, s)}\right| }^{2}} \end{aligned}$$
(6.390)
$$\begin{aligned} {\sigma }_{tot}\left( s\right) =2\int {d^{2}b\left( 1-\cos \left( {\chi }_{R}\left( b,s\right) \right) {e}^{-{\chi }_{I}(b, s)}\right) } \end{aligned}$$
(6.391)
$$\begin{aligned} {\sigma }_{inel}\left( s\right) =\int {d^{2}b\ \left( 1-e^{-2{\chi }_{I}(b, s)}\right) } \end{aligned}$$
(6.392)
(the integrations run over the target region with a radius R).
Note that:
  • if $${\chi }_{I}=0$$ then $${\sigma }_{inel}=0$$ and all the interactions are elastic;

  • if $${\chi }_{R}=0$$ and $${\chi }_{I}\rightarrow \infty $$ for $$b\le R$$, then $${\sigma }_{inel}={\sigma }_{el}$$ and $${\sigma }_{tot\ }=2\pi R^{2}$$. This is the so-called black disk limit.

In a first approximation, hadrons may be described by gray disks with mean radius R and $$\chi \left( b, s\right) =i\varOmega {(s)}$$ for $$b\le R$$ and 0 otherwise. The opacity $$\varOmega $$ is a real number ($$0<\varOmega <\infty $$). In fact, the main features of proton–proton cross sections can be reproduced in such a simple model (Fig. 6.70). In the high energy limit, the gray disk tends asymptotically to a black disk and thus thereafter the increase of the cross section, limited by the Froissart Bound to $${\ln }^{2}(s)$$, is just determined by the increase of the mean radius.
images/304327_2_En_6_Chapter/304327_2_En_6_Fig70_HTML.gif
Fig. 6.70

The total cross section (left) and the ratio of the elastic to the total cross section (right) in proton–proton interactions as a function of the c.m. energy. Points are experimental data and the lines come from a fit using a gray disk model.

From R. Conceição et al. Nuclear Physics A 888 (2012) 58
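For the gray disk, with $$\chi _{R}=0$$ and $$\chi _{I}=\varOmega $$ constant inside the radius R, the integrals (6.390)–(6.392) can be done in closed form. The sketch below evaluates them; the radius of 1 fm and the opacities are illustrative numbers, not fitted values.

```python
import math


def gray_disk(radius, opacity):
    """Return (sigma_el, sigma_inel, sigma_tot) for a gray disk.

    The eikonal is chi = i * opacity for b <= radius and zero outside,
    so each integrand of Eqs. (6.390)-(6.392) is constant over the disk.
    """
    area = math.pi * radius**2
    transmission = math.exp(-opacity)
    sigma_el = area * (1.0 - transmission) ** 2
    sigma_inel = area * (1.0 - transmission**2)
    sigma_tot = 2.0 * area * (1.0 - transmission)
    return sigma_el, sigma_inel, sigma_tot


RADIUS = 1.0  # fm, illustrative (1 fm^2 = 10 mb)
for opacity in (0.2, math.log(2.0), 5.0, 50.0):
    el, inel, tot = gray_disk(RADIUS, opacity)
    print(opacity, el, inel, tot, el / tot)
```

The ratio of the elastic to the total cross section is $$(1-e^{-\varOmega })/2$$: it grows with the opacity and saturates at 1/2 in the black disk limit.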

The eikonal has no dimensions: it is just a complex number and it is a function of the impact parameter. Using a semiclassical argument, its imaginary part can be associated with the mean number of parton–parton collisions $$\overline{n}\left( b, s\right) $$. In fact, if such collisions were independent (no correlation means no diffraction), the probability to have n collisions at an impact parameter b would follow a Poisson distribution around the average:
$$\begin{aligned} P\left( n,\overline{n}\right) =\frac{{\left( \overline{n}\right) }^{n}{e}^{{-}\overline{n}}}{n!} \, . \end{aligned}$$
(6.393)
The probability to have at least one collision is given by
$$\begin{aligned} {\sigma }_{inel}\left( s, b\right) =1-{e}^{{-}\overline{n}} \end{aligned}$$
(6.394)
and thus
$$\begin{aligned} {\sigma }_{inel}\left( s\right) =\int {d^{2}b}\left( 1-{e}^{{-}\overline{n}}\right) . \end{aligned}$$
(6.395)
Hence in this approximation
$$\begin{aligned} {\chi }_{I}\left( b, s\right) =\frac{1}{2}\overline{n}\left( b, s\right) \, . \end{aligned}$$
(6.396)
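The step from the Poisson distribution to the inelastic probability can be checked explicitly by summing the probabilities of one or more collisions. The mean value used below is an arbitrary illustrative number.

```python
import math


def poisson(n, mean):
    """Probability of exactly n collisions for a given mean, Eq. (6.393)."""
    return mean**n * math.exp(-mean) / math.factorial(n)


def prob_at_least_one(mean, n_max=100):
    """Sum of the Poisson probabilities for n = 1 ... n_max."""
    return sum(poisson(n, mean) for n in range(1, n_max + 1))


MEAN_COLLISIONS = 2.5            # illustrative value of n-bar at some (b, s)
CHI_I = 0.5 * MEAN_COLLISIONS    # Eq. (6.396)

print(prob_at_least_one(MEAN_COLLISIONS))
print(1.0 - math.exp(-MEAN_COLLISIONS))
print(1.0 - math.exp(-2.0 * CHI_I))
```

The three numbers coincide, which is the content of the identification between the imaginary part of the eikonal and half the mean number of parton–parton collisions.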
$${\chi }_{I}\left( b, s\right) $$ is often computed as the sum over the different kinds of parton–parton interactions, factorizing each term into a transverse density function and the corresponding cross section:
$$\begin{aligned} {\chi }_{I}\left( b,s\right) =\sum {G_{i}\left( b, s\right) }{\sigma }_{i}\, . \end{aligned}$$
(6.397)
For instance,
$$\begin{aligned} {\chi }_{I}\left( b,s\right) =G_{qq}\left( b,s\right) {\sigma }_{qq}+G_{qg}\left( b,s\right) {\sigma }_{qg}+G_{gg}\left( b, s\right) {\sigma }_{gg} \end{aligned}$$
(6.398)
where qq, qg, gg stand, respectively, for the quark–quark, quark–gluon, and gluon–gluon interactions.
On the other hand, there are models where $${\chi }_{I}$$ is divided in perturbative (hard) and nonperturbative (soft) terms:
$$\begin{aligned} {\chi }_{I}\left( b,s\right) =G_\mathrm{soft}\left( b,s\right) {\sigma }_\mathrm{soft}+G_\mathrm{hard}\left( b, s\right) {\sigma }_\mathrm{hard}. \end{aligned}$$
(6.399)
The transverse density functions $$G_{i}\left( b, s\right) $$ must take into account the overlap of the two hadrons and can be computed as the convolution of the Fourier transform of the form factors of the two hadrons.
This strategy can be extended to nucleus–nucleus interactions, which are then seen as an independent sum of nucleon–nucleon interactions. This approximation, known as the Glauber14 model, can be written as:
$$\begin{aligned} {\sigma }_{NN\ }\left( s\right) =\int {d^{2}b\ }\left( 1-{e}^{{-}G\left( b, s\right) {\ \sigma }_{nn}(s)}\right) \, . \end{aligned}$$
(6.400)
The function $$G\left( b, s\right) $$ takes now into account the geometrical overlap of the two nuclei and indicates the probability per unit of area of finding simultaneously one nucleon in each nucleus at a given impact parameter.
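A numerical evaluation of Eq. (6.400) requires a model for the overlap function. The sketch below assumes, purely for illustration, a Gaussian $$G(b)$$ normalized to the number of nucleon pairs; the width, the nucleon–nucleon cross section, and the function names are assumptions and do not correspond to realistic nuclear density profiles.

```python
import math


def gaussian_overlap(b, n_pairs, width):
    """Hypothetical overlap function: Gaussian in b, integrating to n_pairs."""
    return n_pairs * math.exp(-b * b / (2.0 * width**2)) / (2.0 * math.pi * width**2)


def glauber_cross_section(sigma_nn, n_pairs, width, steps=20000):
    """Integrate 1 - exp(-G(b) sigma_nn) over the impact parameter plane."""
    b_max = 12.0 * width
    db = b_max / steps
    total = 0.0
    for i in range(steps):
        b = (i + 0.5) * db
        g = gaussian_overlap(b, n_pairs, width)
        # -expm1(-x) = 1 - exp(-x), accurate also for very small x
        total += 2.0 * math.pi * b * (-math.expm1(-g * sigma_nn)) * db
    return total


# Lengths in fm, cross sections in fm^2 (1 fm^2 = 10 mb); toy inputs.
dilute = glauber_cross_section(sigma_nn=1e-4, n_pairs=1, width=1.0)
dense = glauber_cross_section(sigma_nn=7.0, n_pairs=208**2, width=4.0)

print(dilute)                  # close to sigma_nn: no shadowing
print(dense, 208**2 * 7.0)     # far below the incoherent sum of all pairs
```

In the dilute limit the nucleus–nucleus cross section reduces to the sum of the nucleon–nucleon ones, while for large overlap the exponential saturates and the cross section becomes essentially geometrical.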

6.4.7.3 High Density, High Energy; Quark–Gluon Plasma

At high density and high energy new phenomena may appear.

At high density, whenever one is able to pack hadronic matter densely, as for instance in the core of dense neutron stars, in the first microseconds of the Universe (the Big Bang), or in heavy-ion collisions at high energy (the little bangs), we can expect that some kind of color screening occurs and partons become asymptotically free. The confinement scale is basically set by the size of hadrons, with an energy density $$\varepsilon $$ of the order of 1 GeV/fm$$^{3}$$; thus, if such an energy density is attained in larger space regions, a free gas of quarks and gluons may be formed. That order of magnitude, which corresponds to a transition temperature of around 170–190 MeV, is confirmed by nonperturbative QCD calculations using lattices (see Fig. 6.71). At this temperature, following a simplified Stefan–Boltzmann law for a relativistic free gas, there should be a fast increase of the energy density corresponding to the increase of the effective number of internal degrees of freedom $${g}_{*}$$ from a free gas of pions ($${g}_{*}=3$$) to a new state of matter where quarks and gluons are asymptotically free ($${g}_{*}=37$$, considering two quark flavors). This new state of matter is usually dubbed the Quark–Gluon Plasma (QGP).
images/304327_2_En_6_Chapter/304327_2_En_6_Fig71_HTML.gif
Fig. 6.71

Energy density of hadronic state of matter, with baryonic number zero, according to a lattice calculation. A sharp rise is observed near the critical temperature $${\ T}_{c}\sim 170$$–190 MeV.

From C. Bernard et al. hep-lat/0610017
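The counting of degrees of freedom and the corresponding Stefan–Boltzmann energy density can be made explicit. The sketch below assumes an ideal massless gas, $$\varepsilon = g_{*}\,\pi ^{2}T^{4}/30$$ in natural units, with fermions weighted by 7/8; the function names are illustrative.

```python
import math

HBAR_C = 0.1973269804  # GeV fm, converts GeV^4 to GeV/fm^3


def g_star_qgp(n_flavors):
    """Gluons: 8 colors x 2 polarizations. Quarks: 7/8 x (3 colors x 2 spins
    x 2 for particle and antiparticle) per flavor."""
    return 16 + 7.0 / 8.0 * 12 * n_flavors


def energy_density(g_star, temperature_gev):
    """Ideal massless gas energy density in GeV/fm^3."""
    return g_star * math.pi**2 / 30.0 * temperature_gev**4 / HBAR_C**3


G_PIONS = 3
G_QGP = g_star_qgp(2)

print(G_QGP)
print(energy_density(G_PIONS, 0.170), energy_density(G_QGP, 0.170))
```

At a temperature of 170 MeV the ideal quark–gluon gas has an energy density slightly above 1 GeV/fm$$^{3}$$, about twelve times that of the pion gas, consistent with the order of magnitude quoted in the text.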

The phase transition between hadronic and QGP states depends also strongly on the net baryon content of the system. At the core of dense neutron stars, QGP may occur at very low temperatures. The precise QCD phase diagram is therefore complex and still controversial. A simplified sketch is presented in Fig. 6.72 where the existence of a possible critical point is represented.
images/304327_2_En_6_Chapter/304327_2_En_6_Fig72_HTML.gif
Fig. 6.72

Schematic representation of the QCD phase diagram as a function of the temperature and of the baryonic potential (which measures the difference between the quark and antiquark contents of the system).

In Pb–Pb collisions at the LHC, c.m. energies per nucleon of $$5.02\,\mathrm {TeV}$$, corresponding to an energy density for central events (head-on collisions, low-impact parameters) above 15 GeV/fm$$^{3}$$, have been attained. The multiplicity of such events is huge, with thousands of particles detected (Fig. 6.73). Such events are an ideal laboratory to study the formation and the characteristics of the QGP. Both global observables, such as the asymmetry of the flow of the final state particles, and hard probes, like high transverse momentum particles, di-jet events, and specific heavy hadrons, are under intense scrutiny.
images/304327_2_En_6_Chapter/304327_2_En_6_Fig73_HTML.gif
Fig. 6.73

First lead-lead event recorded by ALICE detector at LHC at c.m. energy per nucleon of 2.76 TeV. Thousands of charged particles were recorded by the time-projection chamber. Source: CERN

images/304327_2_En_6_Chapter/304327_2_En_6_Fig74_HTML.gif
Fig. 6.74

Artistic representation of a heavy-ion collision. The reaction plane is defined by the momentum vectors of the two ions, and the shape of the interaction region is due to the sharp pressure gradients.

images/304327_2_En_6_Chapter/304327_2_En_6_Fig75_HTML.gif
Fig. 6.75

Shear viscosity to entropy density ratio for several fluids. $$T_{c}$$ is the critical temperature at which transition occurs (deconfinement in the case of QCD).

From S. Cremonini et al. JHEP 1208 (2012)

An asymmetry in the flow of the final state particles can be predicted as a consequence of the anisotropies in the pressure gradients due to the shape and structure of the nucleus–nucleus interaction region (Fig. 6.74). In fact, more and faster particles are expected, and seen, in the region of the interaction plane (defined by the directions of the two nuclei in the c.m. reference frame), where compression is higher. Although the in-out modulation (elliptic flow) agrees qualitatively with the predictions, quantitatively the effect is smaller than expected under the assumption of a QGP formed by a free gas of quarks and gluons. Some kind of collective phenomenon must therefore exist. In fact, the QGP behaves rather like a strongly coupled liquid with low viscosity. The measured ratio of its shear (dynamic) viscosity to its entropy density ($$\eta /s$$) is lower than in ordinary liquids and is close to the ideal hydrodynamic limit (Fig. 6.75). This surprising behavior was first discovered at the RHIC collider, at energies lower than those of the LHC.
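The in-out modulation is commonly quantified by the second Fourier coefficient of the azimuthal distribution of the particles relative to the reaction plane, $$dN/d\phi \propto 1+2v_{2}\cos [2(\phi -\varPsi )]$$, so that $$v_{2}=\langle \cos [2(\phi -\varPsi )]\rangle $$. This convention is standard in the field but is not defined in the text above, and it is taken here as an assumption. The following Python lines are a minimal sketch of the estimate; the function name, the known reaction-plane angle, and the value $$v_2=0.1$$ used in the check are illustrative choices, not measured quantities.

```python
import math


def elliptic_flow_v2(phis, psi_rp=0.0, weights=None):
    """Estimate v2 = <cos 2(phi - psi_rp)> over a set of particles.

    phis    : azimuthal angles in radians
    psi_rp  : reaction-plane angle in radians (assumed known in this sketch)
    weights : optional per-particle weights (default: 1 for each particle)
    """
    if weights is None:
        weights = [1.0] * len(phis)
    total = sum(weights)
    if total == 0:
        raise ValueError("no particles (or zero total weight)")
    numerator = sum(
        w * math.cos(2.0 * (phi - psi_rp)) for phi, w in zip(phis, weights)
    )
    return numerator / total


# Illustrative check: a uniform grid of angles weighted by an assumed
# distribution 1 + 2*v2*cos(2*phi), with a made-up value v2 = 0.1.
n_bins = 360
grid = [2.0 * math.pi * k / n_bins for k in range(n_bins)]
w = [1.0 + 2.0 * 0.1 * math.cos(2.0 * phi) for phi in grid]
v2_estimate = elliptic_flow_v2(grid, weights=w)
```

In a real analysis the reaction-plane angle is not known a priori and has to be estimated event by event, which this sketch does not attempt.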

The study at the LHC of two-particle correlation functions for pairs of charged particles also showed unexpected features, like a “ridge”-like structure at $$\varDelta \Phi \sim 0$$ extending over several units of pseudorapidity $$\eta $$ (Fig. 6.76).
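The two variables of such a correlation function are the differences in pseudorapidity and in azimuthal angle between the two particles of a pair. As a minimal sketch, assuming the usual definition $$\eta =-\ln \tan (\theta /2)$$ with $$\theta $$ the polar angle to the beam axis, they can be computed as follows; the function names are hypothetical, and the azimuthal difference is wrapped into $$[-\pi ,\pi )$$, which is only one of the possible conventions.

```python
import math


def pseudorapidity(theta):
    """eta = -ln tan(theta/2); theta is the polar angle to the beam axis (rad)."""
    return -math.log(math.tan(theta / 2.0))


def pair_separation(theta1, phi1, theta2, phi2):
    """Return (delta_eta, delta_phi) for a pair of particles.

    delta_phi is wrapped into the interval [-pi, pi).
    """
    d_eta = pseudorapidity(theta1) - pseudorapidity(theta2)
    d_phi = (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi
    return d_eta, d_phi
```

A correlation function is then built by histogramming these two quantities over all pairs in an event sample; a structure at small azimuthal difference that persists over several units of pseudorapidity difference is what is called a ridge.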
images/304327_2_En_6_Chapter/304327_2_En_6_Fig76_HTML.gif
Fig. 6.76

2-D two-particle correlation function for high-multiplicity p-Pb collision events at 5.02 TeV for pairs of charged particles. The sharp near-side peaks from jet correlations were truncated to better visualize the “ridge”-like structure.

From CMS Collaboration, Phys. Lett. B718 (2013) 795

images/304327_2_En_6_Chapter/304327_2_En_6_Fig77_HTML.gif
Fig. 6.77

Display of an unbalanced di-jet event recorded by the CMS experiment at the LHC in lead–lead collisions at a c.m. energy of 2.76 TeV per nucleon. The plot shows the sum of the electromagnetic and hadronic transverse energies as a function of the pseudorapidity and the azimuthal angle. The two identified jets are highlighted.

From S. Chatrchyan et al. (CMS Collaboration), Phys. Rev. C84 (2011) 024906

Partons resulting from elementary hard processes inside the QGP have to cross a highly dense medium and thus may suffer significant energy losses, or even be absorbed, in what is generically called “quenching”. The most spectacular observation of such phenomena is in di-jet events, where one of the high $$P_{T}$$ jets loses a large fraction of its energy (Fig. 6.77). This “extinction” of jets is usually quantified in terms of the nuclear suppression factor $$R_{AA}$$, defined as the ratio between the differential $$P_{T}$$ distributions in nucleus–nucleus and in proton–proton collisions:
$$\begin{aligned} R_{AA}=\frac{d^{2}N_{AA}/dydP_{T}}{N_\mathrm{coll}d^{2}N_{pp}/dydP_{T}} \, , \end{aligned}$$
(6.401)
where $$N_\mathrm{coll}$$ is the average number of nucleon–nucleon collisions at each specific rapidity bin.
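As a minimal illustration of Eq. (6.401), the following Python sketch evaluates $$R_{AA}$$ bin by bin from two binned $$P_{T}$$ spectra. The function name and all the numbers in the example are hypothetical; they are chosen only to show the arithmetic and do not correspond to any measurement.

```python
def nuclear_modification_factor(yield_aa, yield_pp, n_coll):
    """Return R_AA for each pT bin: yield_aa / (n_coll * yield_pp).

    yield_aa : d2N/dy dPT in nucleus-nucleus collisions, one value per bin
    yield_pp : d2N/dy dPT in proton-proton collisions, same binning
    n_coll   : average number of nucleon-nucleon collisions
    """
    if n_coll <= 0:
        raise ValueError("n_coll must be positive")
    if len(yield_aa) != len(yield_pp):
        raise ValueError("the two spectra must have the same binning")
    return [aa / (n_coll * pp) for aa, pp in zip(yield_aa, yield_pp)]


# Made-up spectra in three pT bins and a made-up n_coll, for illustration only.
example_pp = [1.0e-2, 1.0e-3, 1.0e-4]
example_aa = [5.0, 0.3, 0.05]
example_raa = nuclear_modification_factor(example_aa, example_pp, n_coll=1000)
```

Values of $$R_{AA}$$ close to one in every bin would indicate that the nucleus–nucleus spectrum is a simple superposition of nucleon–nucleon collisions, while values well below one signal suppression.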

In the absence of “medium effects,” $$R_{AA}$$ may reflect a possible modification of the PDFs in nuclei as compared to the ones in free nucleons, but it should not be far from unity. The measurement at the LHC (Fig. 6.78, left) showed, however, a clear suppression, demonstrating significant energy losses in the medium; in this way it can provide information on the dynamical properties of the medium, such as its density.

Energy loss is not the only process that may occur in the presence of a hot and dense medium (QGP). The production of high-energy quarkonia (bound states of heavy quark–antiquark pairs) may also be suppressed whenever QGP is formed, as initially proposed in a seminal paper in 1986 by Matsui and Satz for the case of J/$$\psi $$ ($$c\bar{c}$$ pair) production in high-energy heavy-ion collisions. The underlying mechanism proposed was a color analog of Debye screening, which describes the screening of electrical charges in a plasma. Evidence of such suppression was soon reported at CERN in fixed target oxygen–uranium collisions at $$200\;\mathrm {GeV}$$ per nucleon by the NA38 collaboration. Many other results were published in the following years, and a long discussion was held on whether the observed suppression was due to the absorption of these fragile $$c\bar{c}$$ states by the surrounding nuclear matter or to the possible existence of the QGP. In 2007 the NA60 Collaboration reported, in indium–indium fixed target collisions at $$158\;\mathrm {GeV}$$ per nucleon, the existence of an anomalous J/$$\psi $$ suppression not compatible with nuclear absorption effects. However, this anomalous suppression did not increase at higher c.m. energies, and recently showed a clear decrease at the LHC (Fig. 6.78, right). Meanwhile, the possible (re)combination of charm and anticharm quarks at the boundaries of the QGP region was proposed as a mechanism enhancing the production, and this mechanism seems to be able to describe the present data.
images/304327_2_En_6_Chapter/304327_2_En_6_Fig78_HTML.gif
Fig. 6.78

Left: The nuclear modification factor $$R_{AA}$$ as a function of $$P_{T}$$, measured by the ATLAS experiment at the LHC at a c.m. energy per nucleon of 5.02 TeV, for five centrality intervals. From ATLAS-CONF-2017-012J. Right: $$R_{AA}$$ for inclusive J/$$\psi $$ production at mid rapidity as reported by the PHENIX (RHIC) and ALICE (LHC) experiments at c.m. energies per nucleon of 0.2 and 2.76 TeV, respectively.

The study of $$J/\psi $$ production, as well as of other quarkonium states, is extremely important because it allows for a thermal spectroscopy of the QGP evolution. The dissociation/association of these $$q\overline{q}$$ pairs is intrinsically related to the QGP temperature: as the medium expands and cools down, these pairs may recombine, and each flavor has a different recombination temperature. However, the competition between the dissociation and association effects is not trivial, and so far it has not been experimentally assessed.

The process of formation of the QGP in high-energy heavy-ion collisions is theoretically challenging. It is generally accepted that in the first moments of the collision the two nuclei have already reached the saturation state described by the color glass condensate (CGC) referred to at the beginning of Sect. 6.4.7. Then a fast thermalization process occurs, ending in the formation of a QGP state described by relativistic hydrodynamic models. The intermediate stage, not experimentally accessible and not theoretically well established, is designated as glasma. Finally, the QGP “freezes out” into a gas of hadrons. This scheme is pictured in an artistic representation in Fig. 6.79.
images/304327_2_En_6_Chapter/304327_2_En_6_Fig79_HTML.gif
Fig. 6.79

An artistic representation of the time–space diagram of the evolution of the states created in heavy-ion collisions.

From “Relativistic Dissipative Hydrodynamic description of the Quark-Gluon Plasma” A. Monnai 2014 (http://www.springer.com/978-4-431-54797-6)

In ultrahigh-energy cosmic ray experiments (see Chap. 10), events with c.m. energies well above those presently attainable in human-made accelerators are detected. Higher Q$$^{2}$$, and thus smaller distance scales, can then be explored, opening a possible new window to test hadronic interactions.
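The c.m. energy reached when a cosmic ray of energy $$E$$ hits a nucleon at rest in the atmosphere follows from $$s=m_{1}^{2}+m_{2}^{2}+2Em_{2}$$ (natural units). The Python sketch below evaluates it for a primary proton; the primary energy of $$10^{20}$$ eV is an illustrative value chosen here, and the proton-on-proton assumption ignores that the real targets are nuclei of the air.

```python
import math

PROTON_MASS_GEV = 0.938272  # proton mass in GeV (natural units, c = 1)


def cm_energy_fixed_target(e_beam_gev,
                           m_beam_gev=PROTON_MASS_GEV,
                           m_target_gev=PROTON_MASS_GEV):
    """Return sqrt(s) in GeV for a beam particle hitting a target at rest.

    Uses s = m_beam^2 + m_target^2 + 2 * E_beam * m_target.
    """
    s = m_beam_gev ** 2 + m_target_gev ** 2 + 2.0 * e_beam_gev * m_target_gev
    return math.sqrt(s)


# Illustrative primary energy: 1e20 eV = 1e11 GeV (an assumed value).
sqrt_s_gev = cm_energy_fixed_target(1.0e11)
sqrt_s_tev = sqrt_s_gev / 1.0e3
```

Under these assumptions the result is of the order of a few hundred TeV, well above the proton–proton c.m. energies reached at the LHC.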

Further Reading

  1. [F6.1]

    M. Thomson, “Modern Particle Physics,” Cambridge University Press 2013. A recent, pedagogical and rigorous book covering the main aspects of particle physics at advanced undergraduate and early graduate level.

     
  2. [F6.2]

    A. Bettini, “Introduction to Elementary Particle Physics” (second edition), Cambridge University Press 2014. A very good introduction to Particle Physics at the undergraduate level starting from the experimental aspects and deeply discussing relevant experiments.

     
  3. [F6.3]

    D. Griffiths, “Introduction to Elementary Particles” (second edition), Wiley-VCH 2008. A reference book at the undergraduate level with many proposed problems at the end of each chapter; rather oriented on the theoretical aspects.

     
  4. [F6.4]

    S. Gasiorowicz, “Quantum Physics” (third edition), Wiley 2003. Provides a concise and solid introduction to quantum mechanics. It is very useful for students who have already been exposed to the subject.

     
  5. [F6.5]

    I.J.R. Aitchison, A.J.G. Hey, “Gauge Theories in Particle Physics: A Practical Introduction” (fourth edition—2 volumes), CRC Press, 2012. Provides a pedagogical and complete discussion on gauge field theories in the Standard Model of Particle Physics from QED (vol. 1) to electroweak theory and QCD (vol. 2).

     
  6. [F6.6]

    F. Halzen, A.D. Martin, “Quarks and Leptons: An Introductory Course in Modern Particle Physics”, Wiley 1984. A book at early graduate level that presents the theories of modern particle physics in a clear way, with an approach that teaches readers how to do calculations.

     
  7. [F6.7]

    M. Merk, W. Hulsbergen, I. van Vulpen, “Particle Physics 1”, Nikhef 2016. Concise and clear lecture notes at master level, covering topics from QED to electroweak symmetry breaking.

     
  8. [F6.8]

    J. Romão, “Particle Physics”, 2014, http://porthos.ist.utl.pt/Public/textos/fp. Lecture notes for a one-semester master course in theoretical particle physics; also a very good introduction to quantum field theory.

     
  9. [F6.9]

    B. Andersson, “The Lund Model”, Cambridge University Press, 2005. The physics behind the Pythia/Lund model.

     
  10. [F6.10]

    T. Sjöstrand et al. “An Introduction to PYTHIA 8.2”, Computer Physics Communications 191 (2015) 159. A technical explanation of the reference Monte Carlo code for the simulation of hadronic processes, with links to the physics behind.

     
Exercises
  1. 1.

    Spinless particles interaction. Determine, in the high-energy limit, the electromagnetic differential cross section between two spinless charged nonidentical particles.

     
  2. 2.

    Dirac equation invariance. Show that the Dirac equation written using the covariant derivative is gauge-invariant.

     
  3. 3.
    Bilinear covariants. Show that
    1. (a)

      $$\overline{\psi }\psi $$ is a scalar;

       
    2. (b)

      $$\overline{\psi }{\gamma }^5\psi $$ is a pseudoscalar;

       
    3. (c)

      $$\overline{\psi }{\gamma }^{\mu }\psi $$ is a four-vector;

       
    4. (d)

      $$\overline{\psi }{\gamma }^{\mu }{\gamma }^5\psi $$ is a pseudo four-vector.

       
     
  4. 4.
    Chirality and helicity. Show that the right helicity eigenstate $$u_{\uparrow }$$ can be decomposed in the right ($$u_R$$) and left ($$u_L$$) chiral states as follows:
    $$ {u_{\uparrow }=\frac{1}{2}\left( 1+\frac{p}{E+m}\right) \ u_R+\ \frac{1}{2}\left( 1-\frac{p}{E+m}\right) u_L} \, . $$
     
  5. 5.

    Running electromagnetic coupling. Calculate $$\alpha (Q^2)$$ for $$Q = 1000$$ GeV.

     
  6. 6.

    $${{{\nu }}}_{{{\mu }}}$$ beams. Consider a beam of $${\nu }_{\mu }$$ produced through the decay of a primary beam containing pions (90%) and kaons (10%). The primary beam has a momentum of 10 GeV and an intensity of 10$${}^{10}$$ s$${}^{-1}$$.

    1. (a)

      Determine the number of pions and kaons that will decay in a tunnel 100 m long.

       
    2. (b)

      Determine the energy spectrum of the decay products.

       
    3. (c)

      Calculate the contamination of the $${\nu }_{\mu }$$ beam, i.e., the fraction of $${\nu }_{{e}}$$ present in that beam.

       
     
  7. 7.
    $${{{\nu }}}_{{{\mu }}}$$ semileptonic interaction. Considering the process $${\nu }_{\mu }p\longrightarrow {\mu }^-X$$:
    1. (a)

      Discuss what X could be (start by computing the available energy in the center of mass).

       
    2. (b)

      Write the amplitude at lowest order for the interaction of the $${\nu }_{\mu }$$ with the valence quark d ($${\nu }_{\mu }d\longrightarrow {\mu }^-u$$).

       
    3. (c)

      Compute the effective energy in the center of mass for this process supposing that the energy of the $${\nu }_{\mu }$$ is 10 GeV and the produced muon takes 5 GeV and is detected at an angle of 10$$^{\circ }$$ with the $${\nu }_{\mu }$$ beam.

       
    4. (d)

      Write the cross section of the process $${\nu }_{\mu }p\longrightarrow {\mu }^-X$$ as a function of the elementary cross section $${\nu }_{\mu }d\longrightarrow {\mu }^-u$$.

       
     
  8. 8.
    Neutrino and antineutrino deep inelastic scattering. Determine, in the framework of the quark parton model, the ratio:
    $$ \frac{\sigma \left( {\overline{\nu }}_{\mu }N\longrightarrow {\mu }^+X\right) }{\sigma \left( {\nu }_{\mu }N\longrightarrow {\mu }^-X\right) } $$
    where N stands for an isoscalar (same number of protons and neutrons) nucleus. Consider that the involved energies are much higher than the particle masses. Take into account only diagrams with valence quarks.
     
  9. 9.

    Feynman rules. What is the lowest-order diagram for the process $$\gamma \gamma \rightarrow e^+ e^-$$?

     
  10. 10.

    Bhabha scattering. Draw the QED Feynman diagrams at lowest (leading) order for the elastic $$e^+e^-$$ scattering and discuss why the Bhabha scattering measurements at LEP are done at very small polar angle.

     
  11. 11.

    Bhabha scattering: higher orders. Draw the QED Feynman diagrams at next-to-leading order for the Bhabha scattering.

     
  12. 12.

    Compton scattering and Feynman rules. Draw the leading-order Feynman diagram(s) for the Compton scattering $$\gamma e^- \rightarrow \gamma e^-$$ and compute the amplitude for the process.

     
  13. 13.

    Top pair production. Consider the pair production of top/antitop quarks at a proton–antiproton collider. Draw the dominant first-order Feynman diagram for this reaction and estimate what should be the minimal beam energy of a collider to make the process happen. Discuss which channels have a clear experimental signature.

     
  14. 14.

    c quark decay. Consider the decay of the c quark. Draw the dominant first-order Feynman diagrams of this decay and express the corresponding decay rates as a function of the muon decay rate and of the Cabibbo angle. Estimate the c quark lifetime, knowing that the muon lifetime is about 2.2 $$\upmu $$s.

     
  15. 15.

    Gray disk model in proton–proton interactions. Determine, in the framework of the gray disk model, the mean radius and the opacity of the proton as a function of the c.m. energy (you can use Fig. 6.70 to extract the total and the elastic proton–proton cross sections).