335 13.2 Dirac equation

consisting of a single atomic layer. Its low-energy spectrum consists of two cones symmetric in energy space, the states in the lower cone being completely filled. This spectrum is a close match to the spectrum of (two-dimensional) massless Dirac fermions, see Exercise 4 at the end of this chapter.

We can re-express the operator of a bispinor Dirac field in terms of the CAPs and the solutions (13.55) and (13.56),

$$\hat\psi(\mathbf r) = \frac{1}{\sqrt V}\sum_{\mathbf k,\alpha}\left[\psi^{(e)}_\alpha(\mathbf k)\,\hat c_{\alpha,\mathbf k}\exp\{i\mathbf k\cdot\mathbf r\} + \psi^{(p)}_\alpha(\mathbf k)\,\hat b^\dagger_{\alpha,\mathbf k}\exp\{-i\mathbf k\cdot\mathbf r\}\right], \tag{13.67}$$

with $\psi^{(e)}_\alpha(\mathbf k) = \psi_{\alpha,+}(\mathbf k)$ and $\psi^{(p)}_\alpha(\mathbf k) = i(\sigma_y)_{\alpha\beta}\,\psi_{\beta,-}(-\mathbf k)$ to make it conform to our definition of the positron CAPs. The indices $\alpha$ and $\beta$ are $\pm 1$ and indicate spin, and $\psi_{\pm,\pm}(\mathbf k)$ is a four-component bispinor.

Control question. Do you see how the definitions of $\psi^{(e)}_\alpha(\mathbf k)$ and $\psi^{(p)}_\beta(\mathbf k)$ follow?

Let us introduce the compact notations

$$\mathbf u(\mathbf k) = \frac{\hbar c\,\mathbf k}{\sqrt{2E(k)\,[mc^2 + E(k)]}} \quad\text{and}\quad w(\mathbf k) = \sqrt{1 - u(\mathbf k)^2}, \tag{13.68}$$

where all $E(k)$ are taken positive. With this,

$$\psi^{(e)}_\alpha = \begin{pmatrix} w\,\psi_\alpha \\ (\mathbf u\cdot\boldsymbol\sigma)\,\psi_\alpha \end{pmatrix} \quad\text{and}\quad \psi^{(p)}_\alpha = \begin{pmatrix} i(\mathbf u\cdot\boldsymbol\sigma)\,\sigma_y\psi_\alpha \\ i w\,\sigma_y\psi_\alpha \end{pmatrix}. \tag{13.69}$$

Let us emphasize here for the sake of clarity that the functions $\psi^{(e)}_\alpha$ and $\psi^{(p)}_\alpha$ are four-component bispinors while the function $\psi_\alpha$ is a two-component spinor. In both cases the index $\alpha$ labels spin, and can take the two values $\pm 1$.

As we know, the fermion field is not a physical quantity by itself: physical quantities are at least bilinear in terms of the fields. The most important quantities are the fermion density and current. Let us express the corresponding operators in terms of CAPs; we will need these relations when dealing with interactions.
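The definitions (13.68) and (13.69) can be checked numerically. The following is a minimal sketch, assuming units with $\hbar = c = m = 1$ and basis spinors $\psi_{+} = (1,0)$, $\psi_{-} = (0,1)$; the helper names (`u_w_energy`, `sigma_dot`, `psi_e`, `psi_p`) are introduced here for illustration only. It verifies that both bispinors are normalized and that $\psi^{(e)}(\mathbf k)$ is orthogonal to $\psi^{(p)}(-\mathbf k)$, the two bispinors that multiply the same plane wave $\exp\{i\mathbf k\cdot\mathbf r\}$ in (13.67).

```python
import numpy as np

# Pauli matrices
SX = np.array([[0, 1], [1, 0]], dtype=complex)
SY = np.array([[0, -1j], [1j, 0]], dtype=complex)
SZ = np.array([[1, 0], [0, -1]], dtype=complex)


def sigma_dot(v):
    """Return the 2x2 matrix v . sigma for a real 3-vector v."""
    return v[0] * SX + v[1] * SY + v[2] * SZ


def u_w_energy(k, m=1.0):
    """u(k), w(k) of (13.68) and E(k), in units hbar = c = 1 (assumption)."""
    k = np.asarray(k, dtype=float)
    energy = np.sqrt(m**2 + k @ k)
    u = k / np.sqrt(2.0 * energy * (m + energy))
    w = np.sqrt(1.0 - u @ u)
    return u, w, energy


def psi_e(k, spinor):
    """Electron bispinor of (13.69) built from a two-component spinor."""
    u, w, _ = u_w_energy(k)
    return np.concatenate([w * spinor, sigma_dot(u) @ spinor])


def psi_p(k, spinor):
    """Positron bispinor of (13.69) built from a two-component spinor."""
    u, w, _ = u_w_energy(k)
    return np.concatenate([1j * sigma_dot(u) @ SY @ spinor, 1j * w * SY @ spinor])


k = np.array([0.3, -1.2, 0.7])
spin_up = np.array([1.0, 0.0], dtype=complex)
spin_down = np.array([0.0, 1.0], dtype=complex)

for a in (spin_up, spin_down):
    print("norm e:", np.vdot(psi_e(k, a), psi_e(k, a)).real,
          "norm p:", np.vdot(psi_p(k, a), psi_p(k, a)).real)
    for b in (spin_up, spin_down):
        print("overlap e(k), p(-k):", abs(np.vdot(psi_e(k, a), psi_p(-k, b))))
```

The normalization follows from $w^2 + u^2 = 1$, and the overlap vanishes because $\mathbf u(-\mathbf k) = -\mathbf u(\mathbf k)$.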
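We substitute (13.67) into (13.49),

$$\begin{aligned}
\hat\rho(\mathbf r) = \frac{1}{V}\sum_{\mathbf k,\mathbf q,\alpha,\beta}\Big[\, & R^{ee}_{\alpha\beta}(\mathbf k-\mathbf q,\mathbf k)\,\hat c^\dagger_{\mathbf k-\mathbf q,\alpha}\hat c_{\mathbf k,\beta} + R^{pp}_{\alpha\beta}(\mathbf k-\mathbf q,\mathbf k)\,\hat b^\dagger_{\mathbf k-\mathbf q,\alpha}\hat b_{\mathbf k,\beta} \\
& + R^{ep}_{\alpha\beta}(\mathbf k-\tfrac12\mathbf q,\,-\mathbf k-\tfrac12\mathbf q)\,\hat c^\dagger_{\mathbf k-\mathbf q/2,\alpha}\hat b^\dagger_{-\mathbf k-\mathbf q/2,\beta} \\
& + R^{pe}_{\alpha\beta}(\mathbf k+\tfrac12\mathbf q,\,-\mathbf k+\tfrac12\mathbf q)\,\hat b_{\mathbf k+\mathbf q/2,\alpha}\hat c_{-\mathbf k+\mathbf q/2,\beta}\Big]\exp\{i\mathbf q\cdot\mathbf r\},
\end{aligned} \tag{13.70}$$

$$\begin{aligned}
\hat{\mathbf j}(\mathbf r) = \frac{c}{V}\sum_{\mathbf k,\mathbf q,\alpha,\beta}\Big[\, & \mathbf J^{ee}_{\alpha\beta}(\mathbf k-\mathbf q,\mathbf k)\,\hat c^\dagger_{\mathbf k-\mathbf q,\alpha}\hat c_{\mathbf k,\beta} + \mathbf J^{pp}_{\alpha\beta}(\mathbf k-\mathbf q,\mathbf k)\,\hat b^\dagger_{\mathbf k-\mathbf q,\alpha}\hat b_{\mathbf k,\beta} \\
& + \mathbf J^{ep}_{\alpha\beta}(\mathbf k-\tfrac12\mathbf q,\,-\mathbf k-\tfrac12\mathbf q)\,\hat c^\dagger_{\mathbf k-\mathbf q/2,\alpha}\hat b^\dagger_{-\mathbf k-\mathbf q/2,\beta} \\
& + \mathbf J^{pe}_{\alpha\beta}(\mathbf k+\tfrac12\mathbf q,\,-\mathbf k+\tfrac12\mathbf q)\,\hat b_{\mathbf k+\mathbf q/2,\alpha}\hat c_{-\mathbf k+\mathbf q/2,\beta}\Big]\exp\{i\mathbf q\cdot\mathbf r\},
\end{aligned} \tag{13.71}$$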
the coefficients $R$ and $\mathbf J$ being given by

$$R^{ee}(\mathbf k_1,\mathbf k_2) = -R^{pp}(\mathbf k_1,\mathbf k_2) = w_1w_2 + \mathbf u_1\cdot\mathbf u_2 + i\boldsymbol\sigma\cdot(\mathbf u_1\times\mathbf u_2), \tag{13.72}$$

$$R^{ep}(\mathbf k_1,\mathbf k_2) = [R^{pe}(\mathbf k_2,\mathbf k_1)]^\dagger = (w_1\mathbf u_2 + w_2\mathbf u_1)\cdot\boldsymbol\sigma\; i\sigma_y, \tag{13.73}$$

$$\mathbf J^{ee}(\mathbf k_1,\mathbf k_2) = -\mathbf J^{pp}(\mathbf k_1,\mathbf k_2) = w_1\mathbf u_2 + w_2\mathbf u_1 + i\boldsymbol\sigma\times(\mathbf u_2w_1 - \mathbf u_1w_2), \tag{13.74}$$

$$\mathbf J^{ep}(\mathbf k_1,\mathbf k_2) = [\mathbf J^{pe}(\mathbf k_2,\mathbf k_1)]^\dagger = w_1w_2\,\boldsymbol\sigma\, i\sigma_y + (\mathbf u_1\cdot\boldsymbol\sigma)\,\boldsymbol\sigma\,(\mathbf u_2\cdot\boldsymbol\sigma)\, i\sigma_y, \tag{13.75}$$

where we use the notation $\mathbf u_a \equiv \mathbf u(\mathbf k_a)$ and $w_a \equiv w(\mathbf k_a)$. The spin-independent terms in the coefficients with $ee$ and $pp$ are diagonal in spin space, i.e. they come with a 2×2 unit matrix. Note that here $\hat{\mathbf j}$ represents the particle current density: to obtain the (charge) current density one has to include a factor $e$.

In the non-relativistic limit of $k_{1,2} \ll mc/\hbar$ we recover the familiar spin-independent expressions

$$R^{ee} = 1 \quad\text{and}\quad \mathbf J^{ee}(\mathbf k_1,\mathbf k_2) = \frac{\hbar(\mathbf k_1+\mathbf k_2)}{2mc}. \tag{13.76}$$

For positrons, they are of opposite sign, as expected for particles of opposite charge. At $\mathbf k_1 = \mathbf k_2$, that is, for contributions to slowly varying densities and currents, we have $R^{ee} = 1$ and $\mathbf J^{ee} = \mathbf v(\mathbf k)/c$, in accordance with the assumptions made in Chapter 9.

A new element typical for relativistic physics is the spin dependence of density and current at $\mathbf k_1 \neq \mathbf k_2$. This is a manifestation of spin–orbit coupling, see Exercise 3 at the end of this chapter. Another new element is that in general the density and current cannot simply be separated into electron and positron contributions: there are interference terms coming with $R^{ep}$, $\mathbf J^{ep}$, etc. These terms manifest the fact that electrons and positrons are not independent particles, but arise from the same Dirac field. The interference terms can create and annihilate electron–positron pairs. Terms proportional to $i\sigma_y$ create a pair in a spin-singlet state, while those coming with $i\sigma_y\boldsymbol\sigma$ create pairs in spin-triplet states.

13.2.3 Interaction with the electromagnetic field

Electrons possess electric charge and should therefore interact with the electromagnetic field. We can derive this interaction with the same gauge trick we used in Chapter 9.
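The relativistic Dirac equation

$$\left[\hat\gamma^\mu\frac{\partial}{\partial x^\mu} + i\frac{mc}{\hbar}\right]\psi = 0, \tag{13.77}$$

is gauge-invariant. This means that if we transform the bispinor

$$\psi(x^\mu) \to \psi(x^\mu)\exp\{i\chi(x^\mu)\}, \tag{13.78}$$

and simultaneously transform the equation into

$$\gamma^\mu\left[\frac{\partial}{\partial x^\mu} - i\frac{\partial\chi}{\partial x^\mu}\right]\psi + i\frac{mc}{\hbar}\psi = 0, \tag{13.79}$$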
it describes precisely the same physics in the sense that all observable quantities remain the same. One can see this, for instance, from the fact that the four-dimensional current density remains invariant under this transformation,

$$\bar\psi\gamma^\mu\psi \to \bar\psi\gamma^\mu\psi. \tag{13.80}$$

As discussed in Chapter 9, gauge invariance is not a void abstraction. The fundamental interactions are believed to originate from gauge invariance, so (13.79) should suffice to fix the form of interaction between radiation and relativistic electrons. This is achieved by the substitution

$$\frac{\partial\chi}{\partial x^\mu} \to \frac{e}{\hbar}A_\mu, \tag{13.81}$$

which corresponds to the use of "long" derivatives in the presence of an electromagnetic field,

$$\frac{\partial}{\partial x^\mu} \to \frac{\partial}{\partial x^\mu} - i\frac{e}{\hbar}A_\mu. \tag{13.82}$$

So, the Dirac electron in the presence of an electromagnetic field is described by the following Lorentz covariant equation,

$$\gamma^\mu\left[\frac{\partial}{\partial x^\mu} - i\frac{e}{\hbar}A_\mu\right]\psi + i\frac{mc}{\hbar}\psi = 0. \tag{13.83}$$

At the moment, this equation is for a single-electron wave function in the presence of a classical electromagnetic field. This equation has many applications by itself. It describes, for instance, the relativistic hydrogen atom and spin–orbit interaction in solids. Besides, the equation is ready for a second-quantization procedure.

13.3 Quantum electrodynamics

To perform the quantization, one uses the analogy between the equation for a wave function and the equation for a corresponding fermionic field operator. So the same equation is valid for bispinor field operators $\hat\psi(x^\mu)$. The classical field $A_\mu$ is also replaced by the bosonic field operator, extensively studied in Chapters 7–9. This then defines the reaction of the electrons to the electromagnetic field. But, as we know, the electrons also affect the field. To take this into account one quantizes the Maxwell equation (13.27) and uses expression (13.49) for the four-vector current density. The resulting equations define a quantum field theory: quantum electrodynamics.
Being non-linear, the set of equations cannot be solved exactly. There are several ways to implement this quantization. A consistently Lorentz-invariant way is to use quantum Lagrangians rather than Hamiltonians: indeed, any Hamiltonian distinguishes time from the other spacetime coordinates. One would also impose the Lorentz-invariant Lorenz gauge ∂μAμ = 0. This would be the most adequate treatment, resulting in elegant short forms and formulas. However, it requires learning specific techniques not applied in general quantum mechanics and not discussed in this book.
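Instead, we do it in a simpler way. We stick to the Hamiltonian description and deal with the electromagnetic field in the Coulomb gauge. We use the electron and photon operators in the same form as in the previous chapters. The main difference is that we bring positrons into the picture.

13.3.1 Hamiltonian

The full Hamiltonian consists of terms describing the free fermions and photons, the Coulomb repulsion between the fermions, and their interaction with the radiation,

$$\hat H = \hat H_F + \hat H_C + \hat H_{\rm int}. \tag{13.84}$$

The free particle Hamiltonian assumes its usual form,

$$\hat H_F = \sum_{\sigma,\mathbf k}\left\{E(k)\left[\hat c^\dagger_{\sigma,\mathbf k}\hat c_{\sigma,\mathbf k} + \hat b^\dagger_{\sigma,\mathbf k}\hat b_{\sigma,\mathbf k}\right] + \hbar\omega(k)\,\hat a^\dagger_{\mathbf k,\sigma}\hat a_{\mathbf k,\sigma}\right\}, \tag{13.85}$$

where $\hat a^{(\dagger)}$ are photon CAPs and $\sigma$ labels both the projections of the fermion spin and the photon polarization. We write the Coulomb term in the Fourier representation,

$$\hat H_C = \frac{2\pi\alpha\hbar c}{V}\sum_{\mathbf q}\frac{1}{q^2}\,\hat\rho_{-\mathbf q}\hat\rho_{\mathbf q}, \tag{13.86}$$

where $\alpha$ is the fine structure constant, $\alpha = e^2/4\pi\varepsilon_0\hbar c$. The Fourier components of the density operator $\hat\rho(\mathbf r) = \sum_a\hat\psi^\dagger_a(\mathbf r)\hat\psi_a(\mathbf r)$ are given by (13.70).

The interaction with the photons is described by the familiar term $\hat H_{\rm int} = -e\int d\mathbf r\,\hat{\mathbf A}(\mathbf r)\cdot\hat{\mathbf j}(\mathbf r)$. We thus simply substitute the operators in terms of the corresponding CAPs to arrive at

$$\begin{aligned}
\hat H_{\rm int} = -\sqrt{2\pi\alpha}\,\frac{c\hbar}{\sqrt V}\sum_{\mathbf k,\mathbf q}\sum_{\alpha,\beta,\gamma}\frac{1}{\sqrt q}\Big\{\hat a^\dagger_{\mathbf q,\gamma}\Big[\, & K^{ee}_{\alpha\beta\gamma}(\mathbf k-\mathbf q,\mathbf k)\,\hat c^\dagger_{\mathbf k-\mathbf q,\alpha}\hat c_{\mathbf k,\beta} + K^{pp}_{\alpha\beta\gamma}(\mathbf k-\mathbf q,\mathbf k)\,\hat b^\dagger_{\mathbf k-\mathbf q,\alpha}\hat b_{\mathbf k,\beta} \\
& + K^{ep}_{\alpha\beta\gamma}(\mathbf k-\tfrac12\mathbf q,\,-\mathbf k-\tfrac12\mathbf q)\,\hat c^\dagger_{\mathbf k-\mathbf q/2,\alpha}\hat b^\dagger_{-\mathbf k-\mathbf q/2,\beta} \\
& + K^{pe}_{\alpha\beta\gamma}(\mathbf k+\tfrac12\mathbf q,\,-\mathbf k+\tfrac12\mathbf q)\,\hat b_{\mathbf k+\mathbf q/2,\alpha}\hat c_{-\mathbf k+\mathbf q/2,\beta}\Big] + \text{H.c.}\Big\},
\end{aligned} \tag{13.87}$$

where $K^{ee}_{\alpha\beta\gamma}(\mathbf k_1,\mathbf k_2) = \mathbf e_\gamma\cdot\mathbf J^{ee}_{\alpha\beta}(\mathbf k_1,\mathbf k_2)$, with $\mathbf e_\gamma$ the photon polarization vector, and similar relations hold for $K^{ep}$, $K^{pe}$, and $K^{pp}$.

So, what is this Hamiltonian good for? It turns out that relativistic quantum electrodynamics predicts a set of new experimentally observable effects at sufficiently high energies $\simeq mc^2$ that are absent in usual electrodynamics. For instance, electrons and positrons can annihilate while producing light.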
Also, virtual emission of electron–positron pairs results in non-linear electric effects present in the vacuum, like the scattering of a photon on a photon. In general one can say that the number of fermionic particles is not conserved in the course of a scattering event: new electron–positron pairs can be created. The above theory also predicts relativistic corrections to the more common scattering processes. For instance, theoretical predictions for scattering cross-sections can be obtained with the help of the Hamiltonian (13.84).
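To attach numbers to the coupling $\alpha$ defined below (13.86) and to the energy scale $mc^2$ at which the new effects set in, the following minimal sketch evaluates both from SI constants. The numerical constants are CODATA 2018 values entered by hand, and the variable names are chosen here for illustration only.

```python
import math

# SI constants (CODATA 2018 values, entered by hand as an assumption)
E_CHARGE = 1.602176634e-19    # elementary charge, C
EPS0 = 8.8541878128e-12       # vacuum permittivity, F/m
HBAR = 1.054571817e-34        # reduced Planck constant, J s
C_LIGHT = 299792458.0         # speed of light, m/s
M_ELECTRON = 9.1093837015e-31 # electron mass, kg

# Fine structure constant alpha = e^2 / (4 pi eps0 hbar c)
alpha = E_CHARGE**2 / (4.0 * math.pi * EPS0 * HBAR * C_LIGHT)

# Electron rest energy m c^2, converted from joules to MeV
rest_energy_mev = M_ELECTRON * C_LIGHT**2 / E_CHARGE / 1.0e6

# Minimum energy needed to create one electron-positron pair
pair_threshold_mev = 2.0 * rest_energy_mev

print(f"1/alpha        = {1.0 / alpha:.3f}")
print(f"m c^2          = {rest_energy_mev:.4f} MeV")
print(f"pair threshold = {pair_threshold_mev:.4f} MeV")
```

The pair threshold $2mc^2$ is the smallest energy at which the $\hat c^\dagger\hat b^\dagger$ terms of (13.87) can produce a real, rather than virtual, electron–positron pair.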
13.3.2 Perturbation theory and divergences

We see that for both interaction terms, Coulomb and radiative, the interaction enters the Hamiltonian with the small dimensionless parameter $\alpha$. This suggests that also in the relativistic world the interactions can be efficiently described by perturbation theory. Indeed, perturbation theory has been successfully used to evaluate cross-sections of unusual scattering processes as mentioned above, as well as relativistic corrections to the usual scattering processes. However, it was noted already at the first steps of the development of quantum electrodynamics that the perturbation series suffer from serious problems: divergences arising from virtual states of high energy. It took several decades to overcome these problems and to understand how and why the divergences cancel for observable quantities. At the end of the day, the problems with the calculations appeared to be more interesting and important than the results of the calculations. The understanding of how to handle the divergences of the perturbation series has changed the understanding of what a physical theory is and contributed to the concept of renormalization, the most important concept that has emerged in physics since the middle of the twentieth century.

To see the divergences, it is sufficient to look at the interaction corrections to very simple quantities. To start with, let us evaluate such a correction to the energy of an electron with wave vector $\mathbf k$ and spin $\sigma$. Without interactions, the energy is $E(k) = \sqrt{(mc^2)^2 + (\hbar ck)^2}$. It is important for the further calculation to recognize that the particle energy is in fact an excitation energy: it is the difference of the energies of the excited state $\hat c^\dagger_{\mathbf k,\sigma}|v\rangle$ and of the vacuum $|v\rangle$. We thus need to compute the corrections to the energies of both states and take the difference. Let us consider first the effect of the Coulomb term.
In this case, we deal with the first-order correction $\simeq\alpha$ that reads

$$[\delta E(\mathbf k)]_C = \langle v|\hat c_{\mathbf k,\sigma}\hat H_C\,\hat c^\dagger_{\mathbf k,\sigma}|v\rangle - \langle v|\hat H_C|v\rangle. \tag{13.88}$$

The Hamiltonian $\hat H_C$ (see (13.86)) contains two density operators $\hat\rho_{\pm\mathbf q}$. We see that the only terms in $\hat\rho$ giving a non-zero result when acting on the Dirac vacuum are those $\propto\hat c^\dagger\hat b^\dagger$ with all possible wave vectors in the density operator on the right. After the action, the vacuum is in one of the virtual states with an electron–positron pair. Most of the terms creating electron–positron pairs, however, commute with $\hat c^\dagger_{\mathbf k,\sigma}$ and therefore provide the same contribution to the energies of the vacuum and the excited state. They thus cancel in (13.88) and do not contribute to the change of the particle energy. There are two exceptions: (i) terms in $\hat H_C$ which create and annihilate an electron–positron pair with the electron in the state $\mathbf k,\sigma$ contribute to the second term in (13.88) but not to the first term, and (ii) terms with $\hat c^\dagger_{\mathbf k-\mathbf q,\sigma'}\hat c_{\mathbf k,\sigma}$ in the right density operator and $\hat c^\dagger_{\mathbf k,\sigma}\hat c_{\mathbf k-\mathbf q,\sigma'}$ in the left one contribute to the first term in (13.88) but not to the second. The states thus involved in the correction are schematically depicted in Fig. 13.2a, and we see that the correction of interest is the difference of the two contributions

$$[\delta E(\mathbf k)]_C = \alpha\int\frac{d\mathbf q}{(2\pi)^3}\,\frac{2\pi\hbar c}{q^2}\sum_{\sigma'}\left[R^{ee}_{\sigma\sigma'}(\mathbf k,\mathbf k-\mathbf q)R^{ee}_{\sigma'\sigma}(\mathbf k-\mathbf q,\mathbf k) - R^{ep}_{\sigma\sigma'}(\mathbf k,\mathbf k-\mathbf q)R^{pe}_{\sigma'\sigma}(\mathbf k-\mathbf q,\mathbf k)\right]. \tag{13.89}$$
Fig. 13.2 Quantum states involved in divergent corrections to particle energies and interaction constants. The black arrows represent electron states, the white arrows positron states, and the wiggly lines photon states. (a) The Coulomb correction to the electron energy. (b) The radiation correction to the electron energy. (c) Correction to the photon energy. (d) Vertex correction.

Let us investigate the convergence of the integral at $q \gg k$. A rough estimation would come from a dimensional analysis. The coefficients $R$ are dimensionless and tend to a constant limit at $q\to\infty$. The integral is then estimated as $[\delta E]_C \simeq \alpha\hbar c\int dq$, and we see that it diverges linearly at the upper limit! Fortunately, the situation is not so bad since the terms with $R^{ee}$ and $R^{ep}$ cancel each other at $q\to\infty$. Therefore, the existence of antiparticles helps us to fight the divergences. Substituting explicit expressions for $R^{ee}$, $R^{ep}$, and $R^{pe}$ yields

$$[\delta E(\mathbf k)]_C = 2\pi\alpha\hbar c\int\frac{d\mathbf q}{(2\pi)^3q^2}\left[\frac{m^2c^4}{E(k)\,E(\mathbf k-\mathbf q)}\right], \tag{13.90}$$

the term in brackets emerging from the cancellation of electron–electron and electron–positron terms. We note that at large $q$ the energy goes to the limit $E(\mathbf k-\mathbf q)\to\hbar cq$. Therefore, the integral over $\mathbf q$ still diverges logarithmically at large $q$. Let us assume an upper cut-off $q_{\rm up}$ in momentum space and set $E(\mathbf k-\mathbf q)\to\hbar cq$, which is valid at $q \gg q_{\rm low}\simeq mc/\hbar$. We take the integral over $q$ with logarithmic accuracy, cutting it at $q_{\rm up}$ at the upper limit and at $q_{\rm low}$ at the lower limit,

$$[\delta E(\mathbf k)]_C = 2\pi\alpha\,\frac{m^2c^4}{E(k)}\int\frac{d\mathbf q}{(2\pi)^3q^3} = \alpha\,\frac{m^2c^4}{\pi E(k)}\,L, \quad\text{where}\quad L = \int_{q_{\rm low}}^{q_{\rm up}}\frac{dq}{q} = \ln\frac{q_{\rm up}}{q_{\rm low}}. \tag{13.91}$$
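The cancellation that turns the bracket of (13.89) into the bracket of (13.90) can be checked numerically. The sketch below is an illustration under stated assumptions: units $\hbar = c = m = 1$, the coefficients taken literally from (13.72) and (13.73), and $R^{pe}(\mathbf k_2,\mathbf k_1) = [R^{ep}(\mathbf k_1,\mathbf k_2)]^\dagger$ as stated in (13.73). The function names are introduced here and are not part of the text.

```python
import numpy as np

SX = np.array([[0, 1], [1, 0]], dtype=complex)
SY = np.array([[0, -1j], [1j, 0]], dtype=complex)
SZ = np.array([[1, 0], [0, -1]], dtype=complex)
I2 = np.eye(2, dtype=complex)


def sigma_dot(v):
    """Return the 2x2 matrix v . sigma for a real 3-vector v."""
    return v[0] * SX + v[1] * SY + v[2] * SZ


def u_w_energy(k, m=1.0):
    """u(k), w(k) of (13.68) and E(k), in units hbar = c = 1 (assumption)."""
    k = np.asarray(k, dtype=float)
    energy = np.sqrt(m**2 + k @ k)
    u = k / np.sqrt(2.0 * energy * (m + energy))
    return u, np.sqrt(1.0 - u @ u), energy


def r_ee(k1, k2):
    """R^ee(k1, k2) of (13.72) as a 2x2 matrix in spin space."""
    u1, w1, _ = u_w_energy(k1)
    u2, w2, _ = u_w_energy(k2)
    return (w1 * w2 + u1 @ u2) * I2 + 1j * sigma_dot(np.cross(u1, u2))


def r_ep(k1, k2):
    """R^ep(k1, k2) of (13.73) as a 2x2 matrix in spin space."""
    u1, w1, _ = u_w_energy(k1)
    u2, w2, _ = u_w_energy(k2)
    return sigma_dot(w1 * u2 + w2 * u1) @ (1j * SY)


def coulomb_bracket(k, q):
    """Spin-space matrix of the bracket in (13.89), summed over sigma'."""
    k1 = np.asarray(k, dtype=float)
    k2 = k1 - np.asarray(q, dtype=float)
    ee_term = r_ee(k1, k2) @ r_ee(k2, k1)
    ep = r_ep(k1, k2)
    ep_term = ep @ ep.conj().T  # R^pe(k2, k1) = [R^ep(k1, k2)]^dagger
    return ee_term - ep_term


k = np.array([0.4, -0.3, 1.1])
q = np.array([2.0, 0.5, -1.5])
_, _, e1 = u_w_energy(k)
_, _, e2 = u_w_energy(k - q)
print(np.allclose(coulomb_bracket(k, q), I2 / (e1 * e2)))
```

The agreement reflects the identity $(w_1^2 - u_1^2)(w_2^2 - u_2^2) = m^2c^4/E_1E_2$, which follows from $w^2 - u^2 = mc^2/E$.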
A compact way to represent this correction is to ascribe the change of the electron's energy to a change of its rest mass. Indeed, from $E = \sqrt{(mc^2)^2 + (c\hbar k)^2}$ we see that $\delta E = (\delta m)\,mc^4/E$, and the energy shift is equivalent to a relative change of the rest mass

$$\left(\frac{\delta m}{m}\right)_C = \frac{\alpha}{\pi}\,L. \tag{13.92}$$

We see that the Coulomb term leads to a positive interaction-induced correction to the electron (and, by symmetry, to the positron) mass that eventually logarithmically diverges at large wave vectors. We postpone the discussion of this rather surprising fact until we have found all divergent quantities in quantum electrodynamics.

There is also a radiation correction to $E(k)$. It results from a second-order perturbation in $\hat H_{\rm int}$, and therefore is also proportional to $\alpha$. The correction to the energy of the one-particle state involves a virtual state composed of a photon with wave vector $\mathbf q$ and polarization $\beta$ and an electron with wave vector $\mathbf k-\mathbf q$ and spin $\sigma'$. The corresponding "missing" contribution for the correction to the vacuum energy involves the virtual state composed of an electron with $\mathbf k,\sigma$, a photon with $\mathbf q,\beta$, and a positron with $-\mathbf k-\mathbf q,\sigma'$ (Fig. 13.2(b)). We thus arrive at

$$[\delta E(\mathbf k)]_R = -2\pi\alpha\sum_{\beta,\sigma'}\int\frac{d\mathbf q\,(\hbar c)^2}{(2\pi)^3q}\left\{\frac{|K^{ee}_{\sigma\sigma'\beta}(\mathbf k,\mathbf k-\mathbf q)|^2}{E(\mathbf k-\mathbf q) + \hbar cq - E(k)} - \frac{|K^{ep}_{\sigma\sigma'\beta}(\mathbf k,-\mathbf k-\mathbf q)|^2}{E(\mathbf k-\mathbf q) + \hbar cq + E(k)}\right\}. \tag{13.93}$$

Similar to the Coulomb correction, a dimensional estimation leads to a linear divergence. Electron–electron and electron–positron terms cancel each other at large $q$, lowering the degree of divergence again to a logarithmic one. The calculation yields

$$[\delta E(\mathbf k)]_R = \frac{5\alpha}{6\pi}\,\frac{m^2c^4}{E(k)}\,L - \frac{2\alpha}{3\pi}\,E(k)\,L. \tag{13.94}$$

We know how to deal with the first term in this expression: it gives a change of the electron mass. The second term is more intriguing since it gives a correction even at large $k$, where the spectrum does not depend on the mass. One can regard this correction as a change of the speed of light.
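Indeed, for small changes of $c$,

$$\delta E = \frac{\delta c}{c}\left[E + \frac{m^2c^4}{E}\right]. \tag{13.95}$$

Adding the mass contribution $(\delta m/m)\,m^2c^4/E$ and matching the coefficients of $E$ and $m^2c^4/E$ in the sum of (13.91) and (13.94), we obtain

$$\frac{\delta m}{m} = \left(\frac{\delta m}{m}\right)_C + \left(\frac{\delta m}{m}\right)_R = \frac{5\alpha}{2\pi}\,L \quad\text{and}\quad \frac{\delta c}{c} = -\frac{2\alpha}{3\pi}\,L. \tag{13.96}$$

We can check if the change computed is indeed a change of the speed of light: we can actually compute the corresponding correction to the photon spectrum. The correction arises in second order in $\hat H_{\rm int}$. Let us compute this correction to the energy of the state with a photon having $\mathbf q,\gamma$. The correction involves a virtual state with an electron–positron pair, the electron and positron being in the states $\mathbf k,\sigma$ and $-\mathbf k+\mathbf q,\sigma'$. The "missed" virtual state in the correction to the vacuum energy consists of a photon with $\mathbf q,\gamma$, an electron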
and positron with $\mathbf k,\sigma$ and $-\mathbf k-\mathbf q,\sigma'$, respectively. The total correction is schematically depicted in Fig. 13.2(c), and reads explicitly

$$\delta\omega(q) = \frac{2\pi\alpha}{q}\int\frac{d\mathbf k\,\hbar c^2}{(2\pi)^3}\left\{\frac{|K^{ep}_{\sigma\sigma'\gamma}(\mathbf k,-\mathbf k+\mathbf q)|^2}{E(-\mathbf k+\mathbf q) + E(k) - \hbar cq} - \frac{|K^{ep}_{\sigma\sigma'\gamma}(\mathbf k,-\mathbf k-\mathbf q)|^2}{E(-\mathbf k-\mathbf q) + E(k) + \hbar cq}\right\}. \tag{13.97}$$

Singling out the logarithmically divergent part, we indeed confirm that the correction to the speed of light is given by (13.96).

Another logarithmic divergence emerges from the corrections to the matrix elements, so-called vertex corrections. Let us consider an external field with a uniform vector potential $\mathbf A$. It enters the Hamiltonian with the corresponding current and gives rise to the matrix element between two electron states $|\mathbf k,\sigma\rangle = \hat c^\dagger_{\mathbf k,\sigma}|v\rangle$. As we know from Chapter 9, the element is proportional to the electron velocity and charge,

$$M = -e\mathbf A\cdot\mathbf v(\mathbf k). \tag{13.98}$$

With the interaction, the original electron states acquire a small admixture of the states with a photon and an electron with another wave vector $\mathbf k-\mathbf q$,

$$|\mathbf k,\sigma\rangle = \hat c^\dagger_{\mathbf k,\sigma}|v\rangle + \sum_{\mathbf q,\sigma',\gamma}\beta_{\mathbf q,\sigma',\gamma}\,\hat a^\dagger_{\mathbf q,\gamma}\hat c^\dagger_{\mathbf k-\mathbf q,\sigma'}|v\rangle, \tag{13.99}$$

where the admixture coefficients $\beta_n$ of the state $|n\rangle$ into the state $|m\rangle$ are given by first-order perturbation theory, $\beta_n = (\hat H_{\rm int})_{mn}/(E_m - E_n)$. These admixtures change the matrix element $M$. This correction is pictured in Fig. 13.2(d), and reads

$$(\delta M)_1 = -e\sum_{\mathbf q,\sigma',\gamma}|\beta_{\mathbf q,\sigma',\gamma}|^2\,\mathbf A\cdot\mathbf v(\mathbf k-\mathbf q) = -e\,2\pi\alpha(\hbar c)^2\sum_{\sigma',\gamma}\int\frac{d\mathbf q}{(2\pi)^3q}\,\mathbf A\cdot\mathbf v(\mathbf k-\mathbf q)\,\frac{|K^{ee}_{\sigma\sigma'\gamma}(\mathbf k,\mathbf k-\mathbf q)|^2}{\left(E(\mathbf k-\mathbf q) - E(k) + c\hbar q\right)^2}. \tag{13.100}$$

Singling out the logarithmically divergent term at large $q$ gives (see Exercise 5)

$$(\delta M)_1 = M\,\frac{\alpha}{12\pi}\,L. \tag{13.101}$$

It is interesting to note that the integral also diverges at small $q$, that is, at $q \ll k$. Such divergences are called infrared, since they are associated with small wave vectors and long wavelengths, while the divergences at large wave vectors are called ultraviolet. This divergence is also a logarithmic one.
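The calculation (Exercise 5) gives

$$(\delta M)_{1,\rm infrared} = M\,\frac{\alpha}{\pi}\left[1 + \frac12\left(\frac{c}{v} - \frac{v}{c}\right)\ln\frac{c-v}{v+c}\right]L_{\rm infrared}, \tag{13.102}$$

where we cut the integral over $q$ at $q_{\rm low}$ and $q_{\rm up}\simeq k$ and introduce $L_{\rm infrared}\equiv\ln(q_{\rm up}/q_{\rm low})$.

Now let us note that the $(\delta M)_1$ we concentrated on is not the only correction to the matrix element. The wave function actually acquires a second-order correction,

$$|\mathbf k,\sigma\rangle \to \left\{1 - \frac12\sum_{\mathbf q,\sigma',\gamma}|\beta_{\mathbf q,\sigma',\gamma}|^2\right\}|\mathbf k,\sigma\rangle, \tag{13.103}$$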
which leads to another correction $(\delta M)_2$ to the matrix element. As a matter of fact, the first-order and second-order corrections can be collected into a nice expression. One can argue about this using gauge invariance: in fact, a time- and space-independent vector potential corresponds to a shift of the wave vector, and the diagonal matrix element considered is the derivative of the energy $E(k)$ with respect to this shift. We thus expect the relation $M = -e\mathbf A\cdot\mathbf v$ to hold irrespective of the interaction. Therefore, the sum of the corrections is proportional to the correction to the velocity,

$$(\delta M)_1 + (\delta M)_2 = -e\mathbf A\cdot\delta\mathbf v. \tag{13.104}$$

We come to the important conclusion that the electron charge is not affected by the interaction, $\delta e = 0$.

The interaction is controlled by the so-called dimensionless charge $\alpha$. Recall that $\alpha = e^2/4\pi\varepsilon_0\hbar c$. Since the speed of light is changing, $\alpha$ itself is modified as well,

$$\frac{\delta\alpha}{\alpha} = -\frac{\delta c}{c} = \frac{2\alpha}{3\pi}\,L. \tag{13.105}$$

If the vector potential is not uniform, the vertex interaction correction is less trivial. Choosing a vector potential in the form corresponding to a uniform magnetic field, and computing the matrix element gives an addition to the electron energy proportional to the field. At zero velocity, this is proportional to the spin and is in fact a Zeeman energy $\propto\mathbf B\cdot\hat{\boldsymbol\sigma}$. The interaction correction to the matrix element is thus a change of the actual magnetic moment $\mu_e$ of the electron in comparison with $2\mu_B$. The correction does not contain logarithmic divergences and is given by

$$\frac{\mu_e}{2\mu_B} = 1 + \frac{\alpha}{2\pi}, \tag{13.106}$$

up to first order in $\alpha$. This change, called the anomalous magnetic moment, can be readily measured and conforms to the theoretical predictions.

To summarize the results of this section, we attempted to apply quantum perturbation theory in $\alpha$ to quantum electrodynamics. We have found corrections $\propto\alpha$ to the electron mass, the speed of light, and the interaction strength $\alpha$ itself. The corrections are formally infinite.
They diverge either at large (ultraviolet) or small (infrared) values of the wave vectors.

13.4 Renormalization

This presents a problem: one does not expect such behavior from a good theory. To put it more precisely, the calculation done reveals a set of problems of different significance and importance. Let us recognize the problems and filter out those that are easy to deal with.

First of all, we note that there would be no problem with ultraviolet divergences if the Hamiltonian in question described excitations in a solid, rather than fundamental excitations in the Universe. For a solid, we would have a natural cut-off above which the theory would not be valid any longer. This cut-off $q_{\rm up}$ can be set to, say, the atomic scale. With this, the corrections would not diverge. Moreover, they would hardly exceed the scale of
$\alpha$ since the factor $L$ is only a logarithm of large numbers. However, in the context of this chapter, nothing suggests the existence of a natural short-distance cut-off below which the known physical laws cease to work. The ad hoc assumption of the existence of such a cut-off is a blow to the elegance and simplicity of the theory.

Infrared divergences would indicate trouble for a solid-state theory as well. However, in Chapter 12 we gave some clues how to handle these divergences. The point we made in Section 12.9 is that the vacuum can be regarded as an Ohmic environment for charged particles and leads to the orthogonality catastrophe at low energies (and, therefore, at small wave vectors). The orthogonality catastrophe taking place indicates that in this case multi-photon processes are important. Once those are taken into account, which requires summing up the contributions of all orders of perturbation theory in $\alpha$, the divergences are gone.

This may inspire a hypothesis that the ultraviolet divergences can also be removed by summing up all orders of perturbation theory. If this were the case, it would indicate a non-analytical dependence of observable quantities on $\alpha$. To give a hypothetical example, the interaction correction to the electron mass could in the end look like $\delta m/m = \alpha\ln(1/\alpha)$. This expression cannot be expanded in a Taylor series in $\alpha$: each term of perturbation theory would diverge. Physically, this would indicate the emergence of a short-distance scale $\simeq\alpha(\hbar/mc)$ below which the electromagnetic interactions become weaker. This is a fair hypothesis to check, and in fact it has been checked. The result is negative: summing up all orders of perturbation theory does not remove the ultraviolet divergences. Moreover, as we will see, electromagnetic interactions are in fact enhanced at short distance scales.
Another problem is that, from the point of view of relativistic symmetry, it seems very troublesome that we have found corrections to the speed of light, whether these corrections are divergent or finite. Indeed, relativistic symmetry is introduced at the spacetime level, long before any material fields come about in this spacetime. One can trace the origin of the correction to the fact that the perturbation theory is formulated at the Hamiltonian level, this explicitly breaking the Lorentz invariance. In particular, our upper cut-off has been set in a non-invariant way: we only cut wave vectors, and do not take any care about the frequencies. However, it remains unclear why the result of the calculation does depend on the way – invariant or non-invariant – we perform it. In principle, observable quantities should not depend on this. The problem of the ultraviolet divergences persisted for decades. The solution – the principle of renormalization – was not accepted commonly and instantly, as happens with most advances in physics. Dirac doubted its validity till his last days, yearning for a more consistent theory. Feynman, despite his crucial role in the development of quantum electrodynamics, called renormalization a “hocus-pocus.” However, the ideas and tools of renormalization have slowly spread among the next generation of physicists and nowadays they are indispensable for the analysis of any sufficiently complex physical model. The key idea of renormalization has already been mentioned in Chapter 11, where we discussed the difference between “bare” and “dressed” quantities. Bare quantities are parameters entering the Hamiltonian, while dressed quantities can be observed in an experiment. Let us take the electron mass as an example. The quantity m we measure in experiments and find in the tables is a dressed mass, corresponding to our concrete world
with an interaction constant $\alpha\approx 1/137$. The fact that the mass depends on the interaction suggests that the bare mass, the quantity $m^*$ entering the free single-particle Hamiltonian $\hat H_F$, differs from $m$. Up to first order in $\alpha$, we have $m = m^* + \delta m$, where the interaction correction is given by (13.96). We can now re-express the bare mass as $m^* = m - \delta m$, substitute this into the Hamiltonian and expand in $\delta m\propto\alpha$. In comparison with the original Hamiltonian, we get an extra term

$$\delta\hat H = -(\delta m)\,\frac{\partial\hat H_F}{\partial m} = -\frac{\delta m}{m}\sum_{\mathbf k,\sigma}\frac{m^2c^4}{E(k)}\left[\hat c^\dagger_{\sigma,\mathbf k}\hat c_{\sigma,\mathbf k} + \hat b^\dagger_{\sigma,\mathbf k}\hat b_{\sigma,\mathbf k}\right]. \tag{13.107}$$

This extra term is called a counterterm. It is proportional to $\alpha$ and as such should be taken into account while computing perturbative corrections of first order in $\alpha$. This cancels the ultraviolet divergences in this order! The results of perturbation theory then do not depend on the cut-off wave vector $q_{\rm up}$ and are therefore well defined.

In our Hamiltonian approach, we need to add counterterms that cancel ultraviolet divergent corrections to the mass and speed of light. Since the coefficients $R$ in the density operator and $K$ in the current density operator depend on these quantities, the explicit form of the counterterms is rather cumbersome. In a Lorentz-invariant (Lagrangian) formulation, one needs three simple counterterms that correspond to the renormalization of the Dirac field, the photon field, and the electron mass.

Quantum electrodynamics was shown to be renormalizable. This means that the finite set of three counterterms is sufficient to cancel the divergences in all the orders of perturbation theory. There are examples of unrenormalizable models. In that case, the number of counterterms required for the cancellation increases with the order of the perturbation series and is thus infinite. The current consensus is that these models are "bad": they do not describe any physics and have to be modified to achieve renormalizability.
Renormalization brings about the concept of rescaling: a sophisticated analogue of the method of dimensional analysis. Let us provide a simple illustration of this rescaling. We know now that quantum electrodynamics is a renormalizable theory so that physical results do not depend on the upper cut-off. The upper cut-off, however, explicitly enters the counterterms and integrals one has to deal with when computing perturbation corrections. This makes us rich: we have a set of equivalent theories that differ in the parameter $q_{\rm up}$, or, better to say, $L$. The theories differ not only in $L$. Since the corrections to observables like $m$, $c$, and $\alpha$ depend on $L$, these observables are also different for different equivalent theories. One can interpret this as a dependence of these observables on the scale. In relativistic theory, the spatial scale is related to the particle energies involved, $q_{\rm up}\simeq E/(\hbar c)$.

Let us assume that we are attempting to measure $\alpha$ using experiments involving particle scattering with different particle energies. We get a fine structure constant that depends on the scale, $\alpha(L)$. How can we evaluate this? Let us use (13.105) and, instead of writing a perturbation series, change the scale $L$ in small steps, $L\to L+\delta L$. In this way, we rewrite (13.105) as a differential equation

$$\frac{d\alpha}{dL} = \frac{2}{3\pi}\,\alpha^2. \tag{13.108}$$
This equation has a solution

$$\alpha(L) = \frac{\alpha_0}{1 - (2\alpha_0/3\pi)\,L}, \tag{13.109}$$

where $\alpha_0 = \alpha(L=0)$.

Let us interpret this result. The scale $L = 0$ corresponds to low energies of the order of $mc^2$. We know from our experience that $\alpha_0\approx 1/137$. What happens if we increase the energy? The effective strength of the interaction increases as well. The increase, which is quite slow in the beginning, accelerates so that $\alpha(L)$ reaches infinity at a certain scale $L_c = 3\pi/(2\alpha_0)\approx 646$, corresponding to $q_c\sim 10^{280}\,mc/\hbar$. The momentum (or energy) where this happens is called a Landau pole. In fact, we cannot honestly say that $\alpha$ turns to infinity at this scale. Equation (13.105) was derived perturbatively, that is, under the assumption that $\alpha\ll 1$. It will not work if $\alpha$ grows to values $\simeq 1$. However, even the first-order perturbative calculation indicates that $\alpha$ grows and reaches values of the order of 1 at sufficiently short distances or high energies. Quantum electrodynamics, being a perturbative theory at the usual scales, becomes a non-perturbative theory at very short distances.

This illustration of renormalization concludes this book on Advanced Quantum Mechanics. We have been through a variety of topics, trying to keep the technical side as simple as possible. In this way, any person of a practical inclination can get a glimpse of various fields, while making calculations involving CAPs and quantum states. This tool set constitutes the backbone of quantum mechanics and will be useful for many future applications of this marvelous theory. Dear reader, best of luck with using this!
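The scale dependence of $\alpha$ can be explored numerically. Separating variables in (13.108) with $\alpha(0) = \alpha_0$ gives $\alpha(L) = \alpha_0/[1 - (2\alpha_0/3\pi)L]$, which diverges at $L_c = 3\pi/(2\alpha_0)$. The sketch below, with function names chosen here for illustration and $\alpha_0 = 1/137.036$ taken as an input assumption, compares this closed form with a direct fourth-order Runge–Kutta integration of (13.108) and evaluates the position of the Landau pole.

```python
import math

ALPHA_0 = 1.0 / 137.036  # low-energy fine structure constant (input assumption)


def beta(alpha):
    """Right-hand side of the flow equation (13.108)."""
    return 2.0 * alpha**2 / (3.0 * math.pi)


def alpha_closed_form(scale, alpha0=ALPHA_0):
    """Solution of (13.108) with alpha(0) = alpha0, valid for scale < L_c."""
    return alpha0 / (1.0 - 2.0 * alpha0 * scale / (3.0 * math.pi))


def alpha_numeric(scale, alpha0=ALPHA_0, steps=1000):
    """Integrate (13.108) from L = 0 to L = scale with a classical RK4 scheme."""
    h = scale / steps
    a = alpha0
    for _ in range(steps):
        k1 = beta(a)
        k2 = beta(a + 0.5 * h * k1)
        k3 = beta(a + 0.5 * h * k2)
        k4 = beta(a + h * k3)
        a += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return a


def landau_pole_scale(alpha0=ALPHA_0):
    """Value L_c at which the closed-form alpha(L) diverges."""
    return 3.0 * math.pi / (2.0 * alpha0)


L_c = landau_pole_scale()
print("1/alpha at L = 100 (closed form):", 1.0 / alpha_closed_form(100.0))
print("1/alpha at L = 100 (RK4):        ", 1.0 / alpha_numeric(100.0))
print("L_c =", L_c)
print("log10 of q_c in units of mc/hbar:", L_c / math.log(10.0))
```

Because $L$ is a logarithm, even $L = 100$ corresponds to a wave vector some 43 orders of magnitude above $mc/\hbar$, and $\alpha$ has grown only by about 18 per cent there. This is the slow initial increase referred to above.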
Table 13.1 Summary: Relativistic quantum mechanics

Principle of relativity, all physics is the same in a stationary or constant-velocity reference frame
  Galilean transformation: $\mathbf{r}' = \mathbf{r} - \mathbf{v}t + \mathbf{r}_0$ and $t' = t + t_0$ relate different reference frames
    treats $\mathbf{r}$ and $t$ separately, turned out inconsistent with experiments
  Lorentz transformation: $x' = \gamma(x - vt)$, $y' = y$, $z' = z$, and $t' = \gamma[t - (v/c^2)x]$ with $\gamma = 1/\sqrt{1 - (v/c)^2}$
    is the correct relation for $\mathbf{v} = v\mathbf{e}_x$
    mixes $\mathbf{r}$ and $t$ → concept of four-dimensional spacetime
  all three-dimensional vectorial quantities are four-vectors in relativistic physics:
    spacetime $x^\mu = (ct, \mathbf{r})$, velocity $u^\mu = \gamma(c, \mathbf{v})$, momentum $p^\mu = (E/c, \gamma m\mathbf{v})$,
    vector and scalar potential $A^\mu = (\phi/c, \mathbf{A})$, current and charge density $j^\mu = (c\rho, \mathbf{j})$
    these vectors are contravariant, their covariant counterparts read $x_\mu = (ct, -\mathbf{r})$, $u_\mu = \gamma(c, -\mathbf{v})$, etc.

Dirac equation, the search for a “relativistic Schrödinger equation”
  Klein–Gordon equation: $\left(\hbar^2\Box + m^2c^2\right)\psi = 0$, with $\Box = (1/c^2)\partial_t^2 - \nabla^2$
    problem: Lorentz covariant, but can lead to negative probability densities
  Dirac equation: $\left(\hat\gamma^\mu \dfrac{\partial}{\partial x^\mu} + i\dfrac{mc}{\hbar}\right)\psi = 0$, with $\hat\gamma^0 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$, $\hat{\boldsymbol\gamma} = \begin{pmatrix} 0 & \hat{\boldsymbol\sigma} \\ -\hat{\boldsymbol\sigma} & 0 \end{pmatrix}$
    Lorentz covariant and always positive definite probability density
    explicitly: $\begin{pmatrix} -i\partial_t & -ic\hat{\boldsymbol\sigma}\cdot\nabla \\ ic\hat{\boldsymbol\sigma}\cdot\nabla & i\partial_t \end{pmatrix}\psi = -\dfrac{mc^2}{\hbar}\psi$, with bispinor $\psi = \begin{pmatrix} \psi_A \\ \psi_B \end{pmatrix}$
    look for solutions: $\psi \propto \exp\{-\frac{i}{\hbar}p_\mu x^\mu\}$ → momentum states with $E = \pm\sqrt{m^2c^4 + p^2c^2}$
  field operator: $\hat\psi(\mathbf{r}, t) = \sum_{\mathbf{k},\sigma,\epsilon} \psi_{\sigma,\epsilon}(\mathbf{k})\,\hat a_{\sigma,\epsilon,\mathbf{k}} \exp\{-\frac{i}{\hbar}\epsilon E(\mathbf{k})t + i\mathbf{k}\cdot\mathbf{r}\}$, with $\sigma, \epsilon = \pm$
  Dirac vacuum: all negative energy states are filled, $|v\rangle = \prod_{\mathbf{k}} \hat a^\dagger_{+,-,\mathbf{k}}\,\hat a^\dagger_{-,-,\mathbf{k}}|0\rangle$
    a “hole” in the Dirac sea corresponds to an antiparticle
  electron CAPs: $\hat c^\dagger_{+,\mathbf{k}} = \hat a^\dagger_{+,+,\mathbf{k}}$ and $\hat c^\dagger_{-,\mathbf{k}} = \hat a^\dagger_{-,+,\mathbf{k}}$
  positron CAPs: $\hat b^\dagger_{+,\mathbf{k}} = -\hat a_{-,-,(-\mathbf{k})}$ and $\hat b^\dagger_{-,\mathbf{k}} = \hat a_{+,-,(-\mathbf{k})}$
    in general, operators like $\hat\rho$ and $\hat{\mathbf{j}}$ now contain terms proportional to $\hat c^\dagger\hat c$, $\hat b^\dagger\hat b$, $\hat c^\dagger\hat b^\dagger$, and $\hat b\hat c$

Quantum electrodynamics, describes Dirac particles in the presence of an electromagnetic field
  Hamiltonian: (free electrons, positrons, and photons) + (Coulomb interaction) + (interaction between matter and radiation)
  perturbation theory: small parameter in interaction terms is $\alpha = e^2/4\pi\varepsilon_0\hbar c \approx 1/137$
  divergences: interaction corrections to particle energies diverge logarithmically
  introduce cut-off: corrections can be interpreted as corrections to $m$, $c$, and $\alpha$
    $\dfrac{\delta m}{m} = \dfrac{5\alpha}{2\pi}L$ and $\dfrac{\delta c}{c} = -\dfrac{\delta\alpha}{\alpha} = -\dfrac{2\alpha}{3\pi}L$, with $L = \ln\dfrac{q_{\rm up}}{q_{\rm low}}$

Renormalization
  “bare” quantities are parameters in the Hamiltonian, “dressed” quantities are observable
  counterterms: extra terms in the Hamiltonian to get “correct” observable quantities
  in QED these counterterms repair the divergences → QED is renormalizable
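To get a feeling for the numbers in the summary, the following minimal sketch evaluates the fine structure constant from its definition $\alpha = e^2/4\pi\varepsilon_0\hbar c$ and the size of the logarithmic corrections quoted in the table. The SI values of the constants, the function names, and the sample cut-off ratio $q_{\rm up}/q_{\rm low} = 10^{10}$ are assumptions made for illustration only; they do not come from the text.

```python
import math

# SI values of the constants (assumed here, CODATA 2018)
E_CHARGE = 1.602176634e-19      # elementary charge e, in C
EPSILON_0 = 8.8541878128e-12    # vacuum permittivity, in F/m
HBAR = 1.054571817e-34          # reduced Planck constant, in J s
C_LIGHT = 299792458.0           # speed of light, in m/s


def fine_structure_constant():
    """alpha = e^2 / (4 pi eps0 hbar c), the small parameter of QED."""
    return E_CHARGE**2 / (4.0 * math.pi * EPSILON_0 * HBAR * C_LIGHT)


def relative_corrections(q_ratio):
    """Relative corrections from the summary table for L = ln(q_up / q_low).

    Returns the tuple (delta_m / m, delta_c / c, delta_alpha / alpha).
    """
    alpha = fine_structure_constant()
    log_l = math.log(q_ratio)
    delta_m = 5.0 * alpha / (2.0 * math.pi) * log_l
    delta_c = -2.0 * alpha / (3.0 * math.pi) * log_l
    return delta_m, delta_c, -delta_c


if __name__ == "__main__":
    print(f"1/alpha = {1.0 / fine_structure_constant():.3f}")
    # Hypothetical cut-off ratio, chosen only to show the slow logarithmic growth
    dm, dc, dalpha = relative_corrections(1e10)
    print(f"dm/m = {dm:.4f}, dc/c = {dc:.4f}, dalpha/alpha = {dalpha:.4f}")
```

Since the corrections grow only logarithmically, even a cut-off ratio of ten orders of magnitude gives $L \approx 23$, and the relative corrections stay well below one.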
Exercises

1. Schrödinger–Pauli equation (solution included). The Hamiltonian describing the effect of a magnetic field on the spin of an electron, $\hat H = -\mu_B \mathbf{B}\cdot\hat{\boldsymbol\sigma}$, was introduced in Chapter 1 by arguing that spin is a form of angular momentum, and that it therefore adds to the magnetic moment of the electron. We now show how this Hamiltonian follows from the Dirac equation.
a. Let us assume for simplicity a time-independent vector potential $\mathbf{A}$. Write the stationary version of the Dirac equation in the presence of an electromagnetic field (13.83) in terms of $\mathbf{A}$ and $\varphi$, and as a coupled set of equations for the electron part $\psi_A$ and positron part $\psi_B$ of the bispinor, such as in (13.52). Eliminate the small $\psi_B$ from the equations.
b. We now use the fact that both the non-relativistic energy $E_{\rm nr} \equiv E - mc^2$ and $e\varphi$ are much smaller than $mc^2$. Rewrite the equation for $\psi_A$ found above in terms of $E_{\rm nr}$ instead of $E$, which makes $mc^2$ the only large scale in the equation. Expand the fraction containing $E_{\rm nr}$ in the denominator to first order in $(v/c)^2$.
c. Keep only the zeroth-order term of the expansion and show that to this order the equation for $\psi_A$ reads
$$\left[\frac{1}{2m}(\hat{\mathbf{p}} - e\mathbf{A})^2 - \frac{e\hbar}{2m}\hat{\boldsymbol\sigma}\cdot\mathbf{B} + e\varphi\right]\psi_A = E_{\rm nr}\psi_A,$$
indeed the ordinary Schrödinger equation including the interaction of the spin of the electron with the magnetic field. Hint: $(\boldsymbol\sigma\cdot\mathbf{A})(\boldsymbol\sigma\cdot\mathbf{B}) = \mathbf{A}\cdot\mathbf{B} + i\boldsymbol\sigma\cdot(\mathbf{A}\times\mathbf{B})$.

2. Contraction or dilation? A first-year Klingon cadet studies Lorentz transformations. He inspects (13.6) and finds that the length intervals in two reference systems are related by $L = \sqrt{1 - (v/c)^2}\,L'$. This brings him to the conclusion that space ships appear bigger while moving. He elaborates an example where a Klingon B’rel-class star ship of length 160 m appears as long as 1000 m, moving with a certain velocity $v$ along the enemy lines.
a. The inexperienced cadet made a common mistake based on our non-relativistic intuition. Which one?
b. Help him to derive the correct formula for the apparent size change.
c. What is the correct answer for the apparent size of a B’rel ship under the conditions of the example?

3. Spin–orbit interaction. In Exercise 1 we started expanding the electronic part of the Dirac equation in orders of $(v/c)^2$. If one includes the next order correction, one finds, inter alia, a term describing the spin–orbit coupling of the electrons. Similarly to the Zeeman term, this coupling was originally added as a quasi-phenomenological term to the electronic Hamiltonian. We now derive it from the Dirac equation.
a. Write the decoupled Dirac equation for $\psi_A$, keeping the next term in the expansion of Exercise 1(b). Assume for simplicity that we have no magnetic field, $\mathbf{A} = 0$.
There are several problems with this expression. One of them is that the resulting $\psi_A$ (and $\psi_B$) can no longer satisfy the normalization condition $\int d\mathbf{r}\,|\psi(\mathbf{r})|^2 = 1$. This can be seen as follows. We are interested in corrections to $\psi_A$ of order $\sim (v/c)^2$. However, $\psi_B$, which up to now we forgot about, is of order $\sim (v/c)\psi_A$, see (13.53). The normalization condition for the whole bispinor thus becomes
$$1 = \int d\mathbf{r}\,\left[\psi_A^\dagger(\mathbf{r})\psi_A(\mathbf{r}) + \psi_B^\dagger(\mathbf{r})\psi_B(\mathbf{r})\right] \approx \int d\mathbf{r}\,\psi_A^\dagger(\mathbf{r})\left(1 + \frac{\hat{\mathbf{p}}^2}{4m^2c^2}\right)\psi_A(\mathbf{r}),$$
which also forces corrections to $\psi_A$ of the same order $(v/c)^2$.
b. We introduce a “rescaled” two-component spinor $\psi = \hat A\psi_A$ which is properly normalized, i.e. it satisfies $\int d\mathbf{r}\,|\psi(\mathbf{r})|^2 = 1$. Find $\hat A$.
c. Write the equation found at (a) for $\psi$. This equation still contains $E_{\rm nr}$ on both sides. In order to be able to write it in the form $\hat H\psi = E_{\rm nr}\psi$, multiply it from the left by $\hat A^{-1}$. Keep only terms up to order $(v/c)^2$.
d. Show that
$$\hat{\mathbf{p}}^2\varphi = \hbar^2(\nabla\cdot\mathbf{E}) + 2i\hbar\,\mathbf{E}\cdot\hat{\mathbf{p}} + \varphi\hat{\mathbf{p}}^2,$$
and that
$$(\boldsymbol\sigma\cdot\hat{\mathbf{p}})\varphi(\boldsymbol\sigma\cdot\hat{\mathbf{p}}) = -\hbar\boldsymbol\sigma\cdot(\mathbf{E}\times\hat{\mathbf{p}}) + i\hbar\,\mathbf{E}\cdot\hat{\mathbf{p}} + \varphi\hat{\mathbf{p}}^2.$$
e. Use the relations derived at (d) to show that the equation found at (c) can be written as
$$\left[\frac{\hat{\mathbf{p}}^2}{2m} + e\varphi - \frac{\hat{\mathbf{p}}^4}{8m^3c^2} - \frac{e\hbar\,\boldsymbol\sigma\cdot(\mathbf{E}\times\hat{\mathbf{p}})}{4m^2c^2} - \frac{e\hbar^2}{8m^2c^2}(\nabla\cdot\mathbf{E})\right]\psi = E_{\rm nr}\psi, \tag{13.110}$$
the fourth term describing the spin–orbit coupling of the electron, and the fifth term, the Darwin term, accounting for a small energy shift proportional to the charge density $\nabla\cdot\mathbf{E}$.
f. Explain the significance of the third term in (13.110).

4. Graphene. The Dirac equation is traditionally only relevant in high-energy physics, when relativistic effects become important. For condensed matter systems at low temperatures, its main merit is that one can derive from it small corrections to the Hamiltonian which describe spin–orbit interaction, hyperfine coupling, etc. Some ten years ago, however, the Dirac equation suddenly started playing an important role in condensed matter physics. In 2004, it was demonstrated experimentally for the first time that one can isolate a single monatomic layer of graphite (called graphene), connect flakes of graphene to electrodes, and perform transport measurements on the flakes.
The band structure of graphene has the peculiar property that the electronic spectrum close to the Fermi level can be exactly described by a Dirac equation for massless particles.

Graphene consists of a hexagonal lattice of carbon atoms, see Fig. 13.3. Three of the four valence electrons of each carbon atom are used to form $sp^2$-bonds which hold the lattice together. The fourth electron occupies an (out-of-plane) $p_z$-orbital, and is not very strongly bound to the atom. These electrons can thus move through the lattice, hopping from one $p_z$-orbital to a neighboring one, and participate in electrical transport.

We denote the lattice vectors with $\mathbf{b}_1$ and $\mathbf{b}_2$ (see figure), and the set of all lattice sites is thus given by $\mathbf{R} = n_1\mathbf{b}_1 + n_2\mathbf{b}_2$, the $n$'s being integers. There are two atoms per unit cell (indicated with gray and white in the figure) which are separated by $\mathbf{a}$. The available electronic states are thus $\psi_{p_z}(\mathbf{r} - \mathbf{R})$ and $\psi_{p_z}(\mathbf{r} - \mathbf{R} - \mathbf{a})$, where $\psi_{p_z}(\mathbf{r})$ is the wave function of a $p_z$-state. We denote these states respectively by $|\mathbf{R}\rangle$ and $|\mathbf{R} + \mathbf{a}\rangle$, and assume the resulting electronic basis $\{|\mathbf{R}\rangle, |\mathbf{R} + \mathbf{a}\rangle\}$ to be orthonormal.

Fig. 13.3 Part of the hexagonal lattice of graphene. The basis vectors $\mathbf{b}_1$ and $\mathbf{b}_2$ are indicated, as well as the vector $\mathbf{a}$ connecting two nearest-neighbor atoms. The unit cell of the lattice contains two atoms: atoms belonging to the two different sublattices are colored white and gray.

We have a periodic lattice with a two-atom basis, so we can write the electronic states on the two sublattices in terms of Fourier components,
$$|\mathbf{k}^{(1)}\rangle = \frac{1}{\sqrt{N}}\sum_{\mathbf{R}} e^{i\mathbf{k}\cdot\mathbf{R}}|\mathbf{R}\rangle \quad\text{and}\quad |\mathbf{k}^{(2)}\rangle = \frac{1}{\sqrt{N}}\sum_{\mathbf{R}} e^{i\mathbf{k}\cdot\mathbf{R}}|\mathbf{R} + \mathbf{a}\rangle.$$
a. Hopping between neighboring $p_z$-states is described by
$$\hat H = t\sum_{\mathbf{R}}\Big\{|\mathbf{R}\rangle\langle\mathbf{R} + \mathbf{a}| + |\mathbf{R}\rangle\langle\mathbf{R} + \mathbf{b}_1 + \mathbf{a}| + |\mathbf{R}\rangle\langle\mathbf{R} + \mathbf{b}_2 + \mathbf{a}| + \text{H.c.}\Big\}, \tag{13.111}$$
where for simplicity we assume $t$ to be real. Write this Hamiltonian in the basis $\{|\mathbf{k}^{(1)}\rangle, |\mathbf{k}^{(2)}\rangle\}$.
b. Give the $x$- and $y$-components of the two lattice vectors $\mathbf{b}_{1,2}$ in terms of the lattice constant $b = |\mathbf{b}_1| = |\mathbf{b}_2|$.
c. Diagonalize the Hamiltonian (13.111) and show that the eigenenergies read
$$E(\mathbf{k}) = \pm t\sqrt{1 + 4\cos(\tfrac{1}{2}bk_x)\cos(\tfrac{1}{2}\sqrt{3}\,bk_y) + 4\cos^2(\tfrac{1}{2}bk_x)}. \tag{13.112}$$
We found two electronic bands, corresponding to the positive and negative energies of (13.112). In pure undoped graphene, each atom contributes one electron. This means that in equilibrium and at zero temperature, all electronic states of the lower band are filled, and all of the upper band are empty. In analogy to semiconductors, we can thus call the lower band the valence band and the upper band the conduction band.
d. Calculate the two allowed energies for an electronic state with wave vector $\mathbf{K} \equiv b^{-1}\left(\tfrac{2}{3}\pi, \tfrac{2}{\sqrt{3}}\pi\right)$. What does this imply for the band structure? Give the Fermi energy.
e. Electronic states with wave vectors close to $\mathbf{K}$ thus have energies close to the Fermi energy and therefore describe low-energy excitations. Let us investigate these states. We consider a state with wave vector $\mathbf{K} + \mathbf{k}$, where $|\mathbf{k}| \ll |\mathbf{K}|$ is assumed.
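Write the Hamiltonian for this state in the basis $\{|\mathbf{k}^{(1)}\rangle, |\mathbf{k}^{(2)}\rangle\}$ as you did in (a). Expand the result to linear order in $|\mathbf{k}|$, and show that it can be written as
$$\hat H = v_F\,\boldsymbol\sigma\cdot\hat{\mathbf{p}},$$
a Dirac equation without the term $mc^2$, that is, for massless particles. Give the Fermi velocity $v_F$ in terms of $t$ and $b$.

5. Ultraviolet and infrared vertex divergences. Compute the divergences of the vertex corrections given by (13.100) at large and small $q$.
a. Express $\sum_{\sigma',\gamma}|K^{ee}_{\sigma\sigma'\gamma}(\mathbf{k}_1, \mathbf{k}_2)|^2$ in terms of $u_{1,2}$ and $w_{1,2}$.
b. Give the limit of the integrand at $q \to \infty$.
c. Give the limit of the integrand at $q \to 0$.
d. Perform the integration over the direction of vector $\mathbf{q}$ in both cases.

Solutions

1. Schrödinger–Pauli equation.
a. The two coupled equations read
$$\boldsymbol\sigma\cdot(\hat{\mathbf{p}} - e\mathbf{A})\psi_B = \frac{1}{c}(E - e\varphi - mc^2)\psi_A,$$
$$-\boldsymbol\sigma\cdot(\hat{\mathbf{p}} - e\mathbf{A})\psi_A = -\frac{1}{c}(E - e\varphi + mc^2)\psi_B.$$
Eliminating $\psi_B$ results in
$$\boldsymbol\sigma\cdot(\hat{\mathbf{p}} - e\mathbf{A})\,\frac{c^2}{E - e\varphi + mc^2}\,\boldsymbol\sigma\cdot(\hat{\mathbf{p}} - e\mathbf{A})\psi_A = (E - e\varphi - mc^2)\psi_A.$$
b. The energy $E_{\rm nr}$ and $e\varphi$ are both assumed to be of the order of $\sim mv^2$, and thus smaller than $mc^2$ by a factor $(v/c)^2$. If we expand in this small parameter, we find
$$\frac{c^2}{E - e\varphi + mc^2} = \frac{1}{2m}\,\frac{2mc^2}{2mc^2 + E_{\rm nr} - e\varphi} \approx \frac{1}{2m}\left(1 - \frac{E_{\rm nr} - e\varphi}{2mc^2}\right).$$
c. To zeroth order the equation thus reads
$$\frac{1}{2m}\left[\boldsymbol\sigma\cdot(\hat{\mathbf{p}} - e\mathbf{A})\right]\left[\boldsymbol\sigma\cdot(\hat{\mathbf{p}} - e\mathbf{A})\right]\psi_A = (E_{\rm nr} - e\varphi)\psi_A.$$
Applying the rule given in the hint yields
$$\left\{\frac{1}{2m}(\hat{\mathbf{p}} - e\mathbf{A})^2 + e\varphi + \frac{i}{2m}\boldsymbol\sigma\cdot\left[(-i\hbar\nabla - e\mathbf{A})\times(-i\hbar\nabla - e\mathbf{A})\right]\right\}\psi_A = E_{\rm nr}\psi_A.$$
The cross product can then be simplified, using (i) $\mathbf{A}\times\mathbf{A} = 0$ and (ii) $\nabla\times\mathbf{A} = (\nabla\times\mathbf{A}) - \mathbf{A}\times\nabla$, where the differentiation operator inside the parentheses works only on the vector potential whereas all others work on everything to the right of it. Since also $\nabla\times\nabla = 0$, the cross product reduces to $i\hbar e(\nabla\times\mathbf{A})$. Substituting $\nabla\times\mathbf{A} = \mathbf{B}$ in the resulting expression turns the last term into $-(e\hbar/2m)\,\boldsymbol\sigma\cdot\mathbf{B}$ and yields the Schrödinger–Pauli equation.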
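The identity from the hint of Exercise 1, $(\boldsymbol\sigma\cdot\mathbf{A})(\boldsymbol\sigma\cdot\mathbf{B}) = \mathbf{A}\cdot\mathbf{B} + i\boldsymbol\sigma\cdot(\mathbf{A}\times\mathbf{B})$, can be checked numerically for ordinary (commuting) vectors. The sketch below does this with plain Python lists; the helper names are hypothetical and chosen for illustration only. Note that the check covers c-number vectors only: in the solution above the components of $\hat{\mathbf{p}} - e\mathbf{A}$ do not commute, which is exactly why the cross product of this operator with itself does not vanish.

```python
# Pauli matrices sigma_x, sigma_y, sigma_z as nested tuples
SIGMA = (
    ((0, 1), (1, 0)),
    ((0, -1j), (1j, 0)),
    ((1, 0), (0, -1)),
)


def sigma_dot(v):
    """Return the 2x2 matrix sigma . v for a 3-vector v of ordinary numbers."""
    return [[sum(v[k] * SIGMA[k][i][j] for k in range(3)) for j in range(2)]
            for i in range(2)]


def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def dot(a, b):
    """Scalar product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    """Vector product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def pauli_identity_residual(a, b):
    """Largest matrix element of (s.a)(s.b) - [a.b + i s.(a x b)]."""
    lhs = matmul(sigma_dot(a), sigma_dot(b))
    s_cross = sigma_dot(cross(a, b))
    a_dot_b = dot(a, b)
    rhs = [[(a_dot_b if i == j else 0.0) + 1j * s_cross[i][j] for j in range(2)]
           for i in range(2)]
    return max(abs(lhs[i][j] - rhs[i][j]) for i in range(2) for j in range(2))


if __name__ == "__main__":
    # Arbitrary test vectors; the residual should vanish up to rounding errors
    print(pauli_identity_residual((1.0, 2.0, 3.0), (-0.5, 0.25, 4.0)))
```

For $\mathbf{A} = \mathbf{B}$ the cross product vanishes and the identity reduces to $(\boldsymbol\sigma\cdot\mathbf{A})^2 = |\mathbf{A}|^2$, which is a convenient special case to test.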