Today is our first encounter with a non-trivial theorem, the Sard theorem.
Theorem (Sard). Given a smooth map $F: M \to N$, the set of critical values of $F$ has measure zero in $N$.
We recall the definitions.
A point $p \in M$ is a critical point of $F$ if $dF(p): T_pM \to T_{F(p)}N$ is not surjective. The set of critical points is denoted $\mathrm{Cr}(F)$.
A point $q \in N$ is a critical value of $F$ if $q$ is the image of a critical point. The set of critical values is denoted $\Delta_F$ and is called the discriminant locus.
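To make the two definitions concrete, here is a small worked example. The function $f$ below is chosen purely for illustration and is not part of the original notes.

```latex
\begin{align*}
f &: \mathbb{R} \to \mathbb{R}, \qquad f(x) = x^3 - 3x, \\
df(x) &= (3x^2 - 3)\,dx, \qquad df(x) = 0 \iff x = \pm 1, \\
\mathrm{Cr}(f) &= \{-1,\ 1\}, \\
\Delta_f &= f(\mathrm{Cr}(f)) = \{f(-1),\ f(1)\} = \{2,\ -2\}.
\end{align*}
```

Here the discriminant locus is a finite set, so it has measure zero in $\mathbb{R}$, as the theorem predicts.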
A subset $A \subset N^n$ is of measure zero (or negligible) if, for any coordinate chart $(U, \varphi)$ on $N$, the set $\varphi(A \cap U)$ is of measure zero in $\mathbb{R}^n$.
Note: there is no canonical measure on $N$, but there is a class of smooth densities on $N$, each of which can be written in local coordinates as a smooth positive function times the Lebesgue measure induced by the coordinates. All of these densities have the same sets of measure zero.
It suffices to prove a local version of the Sard theorem, in a coordinate chart.
Theorem (Sard on $\mathbb{R}^n$). Let $U \subset \mathbb{R}^n$ be an open set and $F: U \to \mathbb{R}^m$ a smooth map. Assume $n \ge m$. Then the discriminant set $\Delta_F$ is of measure zero.
Remark: The theorem says nothing about the measure of $\mathrm{Cr}(F)$, which can in fact be quite large. For example, consider a smooth non-negative function $f: \mathbb{R} \to \mathbb{R}$ that vanishes on $[-1,1]$. Then $[-1,1] \subset \mathrm{Cr}(f)$, but $f([-1,1])$ is just the single point $0$.
(I am copying the proof from Nicolaescu's notes, page 30; the argument is originally due to Milnor and Pontryagin.)
Let $\mathrm{Cr}^k(F) \subset \mathrm{Cr}(F)$ (also written $\mathrm{Cr}_F^k \subset \mathrm{Cr}_F$ below) denote the set of points in $U$ at which all partial derivatives of $F$ of order $1$ through $k$ vanish (check: this notion does not depend on the choice of coordinates). We obtain a decreasing filtration of closed sets
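$$\mathrm{Cr}(F) \supset \mathrm{Cr}^1(F) \supset \mathrm{Cr}^2(F) \supset \mathrm{Cr}^3(F) \supset \cdots$$
Note: $p \in \mathrm{Cr}(F)$ means that $dF(p)$ is not surjective, while $p \in \mathrm{Cr}^1(F)$ means that $dF(p) = 0$.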
We argue by induction on $n$. The case $n = 0$ is trivial. Assume the statement holds for every $n' < n$ and every $m \le n'$. The inductive step is divided into three steps:
Step 1: Show that $F(\mathrm{Cr}_F \setminus \mathrm{Cr}_F^1)$ is negligible.
Step 2: Show that $F(\mathrm{Cr}_F^k \setminus \mathrm{Cr}_F^{k+1})$ is negligible for all $k \ge 1$.
Step 3: Show that $F(\mathrm{Cr}_F^k)$ is negligible for some sufficiently large $k$.
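Step 1: Set $\mathrm{Cr}_F' = \mathrm{Cr}_F \setminus \mathrm{Cr}_F^1$. We will show that there exists a countable open cover $\{O_j\}_{j=1}^{\infty}$ of $\mathrm{Cr}_F'$ such that $F(O_j \cap \mathrm{Cr}_F')$ is negligible for all $j$. Since $\mathrm{Cr}_F'$ is a subset of the second countable space $\mathbb{R}^n$, every open cover of it has a countable subcover, and a countable union of negligible sets is negligible. Hence it suffices to prove that each $p \in \mathrm{Cr}_F'$ has a neighborhood $N$ such that $F(N \cap \mathrm{Cr}_F')$ is negligible.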
Suppose $p \in \mathrm{Cr}_F'$. Since $dF(p)$ is not identically zero, we may choose a coordinate chart $(U, (x_i))$ centered at $p$ and a chart $(V, (y_j))$ centered at $F(p)$ such that, in these coordinates, $F(x_1, \dots, x_n) = (F_1(x), \dots, F_m(x))$ with
$$F_1(x) = x_1.$$
(Why can we find such coordinates? Choose the coordinates $(V, (y_j))$ first and consider $y_j \circ F$ on $F^{-1}(V)$ for each $j$. There is at least one $j$ such that $d(y_j \circ F)(p) \neq 0$, since otherwise $dF(p) = 0$. Without loss of generality $j = 1$; define $x_1 = y_1 \circ F$ and complete it to a coordinate system near $p$.)
Next, for every $t \in \mathbb{R}$, set
$$N_t = \{x \in N \mid x_1 = t\}$$
and define
$$G_t : N_t \to \mathbb{R}^{m-1}, \qquad p \mapsto (F_2(p), \dots, F_m(p)).$$
Observe that
$$F(N \cap \mathrm{Cr}_F') = \bigcup_{t} \{t\} \times G_t(\mathrm{Cr}_{G_t}).$$
By the induction hypothesis, the theorem holds when the source has dimension $n-1$. Hence the $(m-1)$-dimensional Lebesgue measure of $G_t(\mathrm{Cr}_{G_t})$ is zero for every $t$. By Fubini's theorem,
$$\mu_m\bigl(F(N \cap \mathrm{Cr}_F')\bigr) = \int \mu_{m-1}\bigl(G_t(\mathrm{Cr}_{G_t})\bigr)\,dt = 0.$$
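As a sanity check on the slicing argument of Step 1, here is a worked instance with $n = m = 2$. The map $F$ below is an illustrative choice, not taken from the notes, and we take $N = \mathbb{R}^2$.

```latex
\begin{align*}
F(x_1, x_2) &= (x_1,\ x_2^2), \qquad
dF = \begin{pmatrix} 1 & 0 \\ 0 & 2x_2 \end{pmatrix}, \\
\mathrm{Cr}_F &= \{x_2 = 0\}, \qquad \mathrm{Cr}_F^1 = \emptyset, \qquad
\mathrm{Cr}_F' = \{x_2 = 0\}, \\
G_t(x_2) &= x_2^2, \qquad \mathrm{Cr}_{G_t} = \{x_2 = 0\}, \qquad
G_t(\mathrm{Cr}_{G_t}) = \{0\}, \\
F(N \cap \mathrm{Cr}_F') &= \bigcup_{t \in \mathbb{R}} \{t\} \times \{0\}
  = \mathbb{R} \times \{0\}, \\
\mu_2\bigl(F(N \cap \mathrm{Cr}_F')\bigr) &= \int_{\mathbb{R}} \mu_1(\{0\})\,dt = 0.
\end{align*}
```

In this example the set of critical values is an entire line, so it is uncountable, yet each slice contributes measure zero and Fubini's theorem gives total measure zero.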
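Step 2: Set $\mathrm{Cr}_F^{(k)} := \mathrm{Cr}_F^k \setminus \mathrm{Cr}_F^{k+1}$ and suppose $p \in \mathrm{Cr}_F^{(k)}$. We may choose local coordinates $s_1, \dots, s_n$ around $p$ and $y_1, \dots, y_m$ around $F(p)$ such that $s_i(p) = 0$ for all $i$ and $y_j(F(p)) = 0$ for all $j$. Furthermore, since some partial derivative of order $k+1$ does not vanish at $p$, we may assume that
$$\frac{\partial^{k+1} y_1}{\partial s_1^{k+1}}(p) \neq 0,$$
where $y_1$ is shorthand for $y_1 \circ F$.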
Then we define $x_1 = \dfrac{\partial^{k} y_1}{\partial s_1^{k}}$ and $x_2 = s_2, \dots, x_n = s_n$. We choose $N$ to be a sufficiently small neighborhood of $p$ on which $(x_i)$ forms a coordinate system. Then $\mathrm{Cr}_F^k \cap N$ is contained in the hyperplane $x_1 = 0$. (Indeed, if $x_1 \neq 0$ at a point, then a $k$-th order derivative of $F$ is non-zero there, hence the point is not in $\mathrm{Cr}_F^k$ by definition.)
Define
$$G : N \cap \{x_1 = 0\} \to \mathbb{R}^m, \qquad G(p) = F(p) \quad \forall\, p \in N \cap \{x_1 = 0\}.$$
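Then, since the derivatives of the restriction $G$ are combinations of derivatives of $F$ of the same order,
$$\mathrm{Cr}_F^k \cap N \subset \mathrm{Cr}_G^k, \qquad F(\mathrm{Cr}_F^k \cap N) \subset G(\mathrm{Cr}_G^k).$$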
By the induction hypothesis, $G(\mathrm{Cr}_G^k)$ is negligible in $\mathbb{R}^m$, hence $F(\mathrm{Cr}_F^k \cap N)$ is negligible. Covering $\mathrm{Cr}_F^{(k)}$ by such neighborhoods and passing to a countable subcover, we conclude that $F(\mathrm{Cr}_F^{(k)})$ is negligible.
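The change of coordinates in Step 2 can be followed on a small example with $n = 2$, $m = 1$, $k = 1$ and $p = 0$. The function $F$ below is an illustrative assumption, not part of the notes; the target coordinate $y_1$ is the standard coordinate on $\mathbb{R}$, so $y_1 \circ F = F$.

```latex
\begin{align*}
F(s_1, s_2) &= \tfrac{1}{2} s_1^2 + s_2^3, \qquad
\frac{\partial F}{\partial s_1} = s_1, \qquad
\frac{\partial F}{\partial s_2} = 3 s_2^2, \\
\mathrm{Cr}_F^1 &= \{(0,0)\}, \qquad
\frac{\partial^2 F}{\partial s_1^2}(0) = 1 \neq 0
  \ \Longrightarrow\ 0 \in \mathrm{Cr}_F^{(1)}, \\
x_1 &= \frac{\partial F}{\partial s_1} = s_1, \qquad x_2 = s_2, \\
G(x_2) &= F(0, x_2) = x_2^3, \qquad \mathrm{Cr}_G^1 = \{x_2 = 0\}, \\
F(\mathrm{Cr}_F^1 \cap N) &= G(\mathrm{Cr}_G^1) = \{0\}.
\end{align*}
```

The point of the construction is that the critical points of order $k$ are trapped in the hypersurface $\{x_1 = 0\}$, so the problem drops to a source of dimension $n - 1$.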
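Step 3 (the key step): Suppose $k > n/m - 1$. We will show that $F(\mathrm{Cr}_F^k)$ is negligible. More precisely, for every compact subset $S \subset U$, we will show that $F(S \cap \mathrm{Cr}_F^k)$ is negligible; this suffices because $U$ is a countable union of compact sets.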
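From the Taylor expansion around points in $S \cap \mathrm{Cr}_F^k$, we know that there exist $0 < r_0 < 1$ and $\lambda_0 > 0$, depending only on $S$, such that if $C$ is a cube with side length $r < r_0$ that intersects $\mathrm{Cr}_F^k \cap S$, then
$$\mathrm{diam}(F(C)) < \lambda_0 r^{k+1},$$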
where, for any set $A \subset \mathbb{R}^m$, the diameter is defined as
$$\mathrm{diam}(A) = \sup\{|a_1 - a_2| : a_1, a_2 \in A\}.$$
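(Recall the Taylor expansion formula: if $f: \mathbb{R}^n \to \mathbb{R}$ is a smooth function, then for any $k \ge 0$ there exist $r > 0$ and $C > 0$ such that for any $|x| < r$ we have
$$f(x_1, \dots, x_n) = f(0) + \sum_{1 \le |\alpha| \le k} \frac{f^{(\alpha)}(0)}{\alpha_1! \cdots \alpha_n!}\, x_1^{\alpha_1} \cdots x_n^{\alpha_n} + R_k(x),$$
where the remainder satisfies $|R_k(x)| \le C|x|^{k+1}$. Applying this remainder estimate to each component $F_i$ gives the diameter bound.)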
Hence, for a constant $C_1 > 0$ depending only on $\lambda_0$ and $m$, the Lebesgue measure of the image satisfies
$$\mu_m(F(C)) < C_1 r^{m(k+1)} = C_1\, \mu_n(C)^{m(k+1)/n}.$$
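The passage from the Taylor remainder to the diameter and measure bounds can be spelled out as follows. This is a sketch under stated assumptions: $C_0$ denotes a Taylor remainder constant that works uniformly for all base points in $S$ (called $C$ in the recalled formula), $\omega_m$ denotes the volume of the unit ball in $\mathbb{R}^m$, and the explicit values of $\lambda_0$ and $C_1$ below are one possible choice rather than the ones intended in the notes. Fix $x_0 \in C \cap \mathrm{Cr}_F^k \cap S$ and let $x \in C$, so that $|x - x_0| \le \sqrt{n}\, r$.

```latex
\begin{align*}
|F_i(x) - F_i(x_0)| &\le C_0\, |x - x_0|^{k+1} \le C_0\, (\sqrt{n}\, r)^{k+1}
  && \text{(derivatives of order $1,\dots,k$ vanish at $x_0$)} \\
|F(x) - F(x_0)| &\le \sqrt{m}\; C_0\, n^{(k+1)/2}\, r^{k+1} \\
\mathrm{diam}(F(C)) &\le 2\sqrt{m}\; C_0\, n^{(k+1)/2}\, r^{k+1} < \lambda_0\, r^{k+1},
  && \lambda_0 := 2\sqrt{m}\; C_0\, n^{(k+1)/2} + 1 \\
\mu_m(F(C)) &\le \omega_m\, \mathrm{diam}(F(C))^m < \omega_m \lambda_0^m\, r^{m(k+1)}
  && \text{($F(C)$ lies in a ball of radius $\mathrm{diam}(F(C))$)} \\
&= C_1\, \mu_n(C)^{m(k+1)/n},
  && C_1 := \omega_m \lambda_0^m,\quad \mu_n(C) = r^n .
\end{align*}
```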
Now we cover $\mathrm{Cr}_F^k \cap S$ by finitely many cubes $\{C_l\}_{l=1}^{N}$ of edge length $r < r_0$ with disjoint interiors. For each positive integer $P$, we subdivide each $C_l$ into $P^n$ subcubes of equal size. For every subcube $C_l^\sigma$ that intersects $\mathrm{Cr}_F^k$, we have
$$\mu_m(F(C_l^\sigma)) \le C_1\, \mu_n(C_l^\sigma)^{m(k+1)/n} = \frac{C_1}{P^{m(k+1)}}\, \mu_n(C_l)^{m(k+1)/n}.$$
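Hence, summing over the at most $P^n$ subcubes that meet $\mathrm{Cr}_F^k$,
$$\mu_m\bigl(F(C_l \cap \mathrm{Cr}_F^k)\bigr) \le \sum_{\sigma} \mu_m\bigl(F(C_l^\sigma \cap \mathrm{Cr}_F^k)\bigr) \le C_1\, P^{\,n - m(k+1)}\, \mu_n(C_l)^{m(k+1)/n}.$$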
Since $k > n/m - 1$ means $n - m(k+1) < 0$, we may send $P \to \infty$ and conclude that $\mu_m(F(C_l \cap \mathrm{Cr}_F^k)) = 0$.
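To see why the condition $k > n/m - 1$ is exactly what makes the count work, here is the exponent bookkeeping for the illustrative case $n = 2$, $m = 1$ (so $k$ must exceed $1$, and we take $k = 2$). The specific numbers are an example and not part of the notes; $r$ is the edge length of $C_l$ and $C_1$ is the constant from the measure estimate.

```latex
\begin{align*}
\text{each subcube has edge } r/P:\quad
  \mu_1\bigl(F(C_l^\sigma)\bigr) &\le C_1 \Bigl(\frac{r}{P}\Bigr)^{m(k+1)}
  = C_1\, \frac{r^3}{P^3}, \\
\text{at most } P^n = P^2 \text{ subcubes}:\quad
  \mu_1\bigl(F(C_l \cap \mathrm{Cr}_F^2)\bigr) &\le P^2 \cdot C_1\, \frac{r^3}{P^3}
  = \frac{C_1 r^3}{P} \xrightarrow[P \to \infty]{} 0, \\
\text{with } k = 1 \text{ instead}:\quad
  P^2 \cdot C_1 \Bigl(\frac{r}{P}\Bigr)^{2} &= C_1 r^2
  \quad \text{(no decay in $P$).}
\end{align*}
```

The number of subcubes grows like $P^n$ while each image shrinks like $P^{-m(k+1)}$, so the total tends to zero precisely when $m(k+1) > n$.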