The words you are searching are inside this book. To get more targeted content, please make full-text search by clicking here.

ASSESSING THE ACCURACY OF THE LANDSAT-BASED CROP ACREAGE ESTIMATES E. M. Hsu, Lockheed Electronics Company, Inc.* A. G. Houston, National Aeronautics ...

Discover the best professional documents and content resources in AnyFlip Document Base.
Search
Published by , 2016-02-27 03:18:03

Assessing the Accuracy of the Landsat-Based Crop Acreage ...

ASSESSING THE ACCURACY OF THE LANDSAT-BASED CROP ACREAGE ESTIMATES E. M. Hsu, Lockheed Electronics Company, Inc.* A. G. Houston, National Aeronautics ...

ASSESSING THE ACCURACYOF THE LANDSAT-BASED CROP ACREAGE ESTIMATES

E. M. Hsu, Lockheed Electronics Company, I n c . *
A. G. Houston, National Aeronautics and Space Administration

ABSTRACT 2. FUNDAMENTALAPPROACHTO ACREAGE ESTIMATION

The Landsat-based crop acreage estimates are The Landsat-based crop acreage estimates are
derived by computer-aided analysis of images of derived from computer-aided analyses of Landsat-
Landsat-acquired segment areas. These segment acquired images of segment areas. A segment is
images are interpreted by the image analyst to a 5 by 6 nautical mile area that is used as a
obtain t r a i n i n g information and are then c l a s s i - sampling u n i t on which estimates of crop acreage
fied to obtain crop acreages for the areas. for a large area are based. Each segment image
consists of 22 932 pixels. (The term " p i x e l "
This paper presents an overview of the funda- refers to the basic resolution element in the
mental approach to acreage estimation and dis- image, which corresponds to a l . l - a c r e area on
cusses in some detail the LACIE c l a s s i f i c a t i o n the ground.)
procedure. The i n v e s t i g a t i o n s of the error
sources in the Landsat-based acreage estimation The use of Landsat data to perform crop acre-
are described, as well as the study of the age estimation depends on the recognition of
dependence of the errors on various causative pixels by the analyst and the c l a s s i f i e r . The
factors. Some of the results of this accuracy analyst selects t r a i n i n g samples by i n t e r p r e t i n g
assessment are also presented. the Landsat image through the aid of ancillary
data such as cropping practice data, soil data,
I. INTRODUCTION crop calendars, and h i s t o r i c a l crop percentages
for the p o l i t i c a l regions. The c l a s s i f i e r uses
The Landsat is a land observatory s a t e l l i t e the computer to process the training information,
with an onboard multispectral scanner, which assigning the pixels to the various crops. The
records reflected radiance values in four c l a s s i f i c a t i o n is treated as a s t r a t i f i c a t i o n of
channels. To f a c i l i t a t e the a p p l i c a t i o n of a scene into potential crop classes. This
Landsat data, digital values are represented s t r a t i f i c a t i o n is then used to perform a s t r a t i -
as colors and are presented in imagery form. fied areal estimate. The crop acreage estimation
The Landsat-based acreage estimates are derived flow for a segment is shown in Figure I.
by a complex computer-aided analysis of these
Landsat images. 3. CLASSIFICATION PROCEDURE

Various c l a s s i f i c a t i o n techniques have been When the crop signature ( c h a r a c t e r i s t i c re-
developed and discussed in the many a r t i c l e s flectance of a crop) has been determined by an
related to remote sensing and pattern recog- analyst, t r a i n i n g samples are defined for computer
n i t i o n . However, the procedure for which this c l a s s i f i c a t i o n of a segment. The following sec-
accuracy assessment is conducted, p a r t i c u l a r l y tions describe the approach to defining training
the t r a i n i n g sample determination, has been samples that has been developed and implemented
uniquely developed in the Large Area Crop in LAClE.
Inventory Experiment (LAClE), and only LAClE
numerical examples are used in discussing the 3.1 Determination of Training Samples
results.
The image display of a segment has I17 row
The LAClE is an interagency experiment in the spacings ( l i n e s ) and 196 column spacings ( p i x e l s ) .
use of Landsat and meteorological data to iden- A t o t a l of 209 p i x e l s , which coincide with a grid
t i f y and inventory crop acreage, y i e l d , and spacing every l Oth row and every l Oth column of
production. The p a r t i c i p a t i n g agencies include a segment, are selected as candidates for image
the U.S. Department of A g r i c u l t u r e (USDA), the i n t e r p r e t a t i o n . Two independently preselected
National Aeronautics and Space Administration random subsamples of these 209 pixels are speci-
(NASA), and the National Oceanic and Atmospheric fied to be i n t e r p r e t e d (labeled) by the analyst,
Administration (NOAA). as follows:

This paper presents an overview of the funda- I . The type-I subsample consists of a minimum of
mental approach to acreage estimation and dis- 30 pixels. These labeled pixels are used to
cusses in some detail the LAClE c l a s s i f i c a t i o n i n i t i a t e the c l u s t e r i n g process and to help
procedure. The i n v e s t i g a t i o n s of the error in determining crop labels for clusters.
sources in the Landsat-based acreage estimation
are described, as well as the study of the de- 2. The type-2 subsample consists of approximately
pendence of the errors on various causative 60 pixels. These pixels are used to obtain a
factors. Some of the results of this accuracy s t r a t i f i e d areal estimate (see Section 3.3).
assessment are also presented.

*This material was developed under NASA Con- When the type-I and type-2 subsamples are se-
t r a c t NAS 9-15200 and prepared f o r the Earth lected and labeled, c l u s t e r i n g (computer process-
Observations Division, National Aeronautics ing) is performed to group the pixels according to
and Space Administration, Lyndon B. Johnson a certain distance measure (in p a r t i c u l a r , the ~I
Space Center, Houston, Texas. metric). In LAClE, the radiance value of each
pixel in a segment is compared with the value of
each of the type-I starting pixels, which are used

160

to i n i t i a t e clustering. Each pixel is then as- NW (3.4)
signed to the group of the closest s t a r t i n g
A_
pixel. After all the pixels are clustered, the
mean and standard deviation of each cluster are PW N
computed, and the cluster map is genera ted by
and
the computer. The mean value of each cluster
is again distance-compared to the radiance value NN (3.5)
of each of the type-I pixels. The cluster is
given a crop label according to the label of the A_
closest type-I pixel. All the pixels in a given
cluster are treated as observations from a crop PN N

class (or subclass) and are used to estimate where N is the total number of pixels in a
class (or subclass) statistics. These statistics segment ( i . e . , 22 932), NW is the number of
are then used as estimators of the c l a s s i f i c a t i o n
parameters (means and covariances). pixels c l a s s i f i e d as small grains, and NN is
the number of pixels c l a s s i f i e d as nonsmall
grains. Consequently, the acreage estimate
for small grains, ~W, can be approximated by

3.2 Bayes Decision Model ~W = I . I acres x NW (3.6)

The c l a s s i f i c a t i o n procedure of LACIE is a 3.3 Stratified Areal Estimation
Bayes procedure (Anderson 1958) with the assump-
tions that the loss of each class due to mis- Since experience has shown that there is bias

c l a s s i f i c a t i o n is equal and that the underlying in PW (the computer-classified areal estimate),
distributions are normal. the s t r a t i f i c a t i o n technique is applied after

Let pixel X be the random variable (or random the c l a s s i f i c a t i o n . The s t r a t i f i e d areal e s t i -
vector) having probability density functions mate for small grains is obtained by combining
the results of the type-2 sample with the direct
fw(x) and fN(x) in populations ~W and ~N, respec- computer c l a s s i f i c a t i o n results, using the
t i v e l y , where x is the observation of X and i s following formula"
drawn from the mixed population {~W, ~N}_ The
Bayes procedure is to minimize the expected loss : hW~W + hN~N (3.7)
due to the costs of m i s c l a s s i f i c a t i o n . For a
given observed x, i f

qwfw(X)L(N/W) _> qNfN(x)L(W/N) (3.1) where ~W is the number of type-2 pixels called
small grains by the analyst and c l a s s i f i e d small
~W is chosen; otherwise, ~N is chosen; where qN grains by the computer divided by the number of
is the a p r i o r i probability that x belongs to type-2 pixels c l a s s i f i e d by the computer as
~N' qw is the a p r i o r i p r o b a b i l i t y that x belongs small grains, and hN is the number of type-2
to ~W, ~(N/W) is the loss due to misclassifying pixels called small grains by the analyst and
x from ~W into ~N, and ~(W/N) is the loss due to c l a s s i f i e d as nonsmall grains by the computer
misclassifying x from ~N into ~W. divided by the number of type-2 pixels c l a s s i f i e d
by the computer as nonsmall grains.
I f ~(N/W) = ~(W/N), as is assumed in LACIE,
the decision model expressed in (3.1) becomes It is clear that the s t r a t i f i c a t i o n is not
made according to the true ground-observed in-
qwfw(X) _> qNfN(x) (3.2) formation. That is, the bias correction factors
~W and ~N w i l l not correct the mislabeling of
pixels by the analyst.

I f ~W and ~N have subpopulations ~Wi (i = I , 4. FIELD DATA ACQUISITION
2, . - . , k) and ~Nj (J = I , 2, . . . , m), respec-
tively, the decision model in (3.2) is In order to assess the accuracy of a Landsat-
based crop acreage estimate at the segment level,
~ q w fw i ( x ) _>~ q N fN (x) (3.3) the f i e l d observation data for the segment must
be acquired. The segments which are designated
1i j. j j for f i e l d data acquisition are chosen at random
from the sample segments. The i d e n t i t y of these
where qwi and fwi are, respectively, the a p r i o r i selected segments is withheld from the analysts
so that these segments can be treated the same
p r o b a b i l i t y and the p r o b a b i l i t y density function as the other segments. Then, the high-resolution
that correspond to ~Wi; s i m i l a r l y qNj and fNj color-infrared aerial photography over these
segments is acquired before harvesting of the
correspond to ~Nj" In the LAClE application, crop of primary interest. Simultaneously, field
teams c o l l e c t ground observation data for these
~Wi (i = I , 2 , - . . , k)and ~Ni (j = I , 2, . . . , m) segments. The f i e l d team labels all the f i e l d s
on a pre-prepared f i e l d overlay according to the
represent the subpopulations ~subclasses) for ground-observed crop types and makes a general
small grains and nonsmall grains, respectively. appraisal of crop conditions in the segment. The
The subclass s t a t i s t i c s are generated in the application of ground-observed data is described
clustering process as described in Section 3.1. in Sections 5 and 6. Figure 2 shows the f i e l d
data acquisition flow.
When the computer c l a s s i f i c a t i o n is made using
the decision model in (3.3) for a l l the pixels in
a segment, the small-grain proportion estimate
PW and the nonsmall-grain proportion estimate PN
are calculated by

161

5. RESULTSOF COMPARISONSBETWEEN PROPORTION 6. RESULTSOF LABELING EVALUATION
ESTIMATES AND GROUND-OBSERVEDPROPORTIONS
The accuracy of the s t r a t i f i e d areal estimate
When the f i e l d data become available, the is c r i t i c a l ly dependent on correct labeling of
ground-observed crop proportions can be obtained the type-2 pixels by the analyst. The s t r a t i f i e d
through the use of computer d i g i t i z a t i o n or man- random sampling technique provides an optimal
ual planimetry. A study comparing the LAClE estimate provided that the analyst can c o r r e c t l y
small-grain proportion estimates obtained from i d e n t i f y the type-2 pixe.ls. Thus, this labeling
the s t r a t i f i e d areal estimator and the corre- evaluation study concentrates on the type-2
sponding ground-observed proportions was con- pixels.
ducted for both winter and spring small grains
based on the ground-observed segments. Ninety- Each segment is evaluated by checking for con-
one ground-observed segments for winter small sistency and discrepancy between analyst and
grains were randomly selected from the states ground-observed labels. The confusion matrix is
that are the major producers of winter small calculated from type-2 labeled pixels for a set
grains; namely, Colorado, Kansas, Nebraska, of ground-observed segments. This matrix is used
Oklahoma, Texas, Montana, and South Dakota. A for quantitative examination since it contains
total of 53 ground-observed segments for spring the p r o b a b i l i t i e s of correct labeling and mis-
small grains were randomly selected from the labeling for both small grains and nonsmall
U.S. northern Great Plains states, which consist grains. Specifically, the confusion matrix M is
of Minnesota, Montana, North Dakota, and South defi ned by
Dakota. The ground-observed proportions were
compared with both the early-season and l a t e - M = [p(W/W) p(N/W)]
season proportion estimates for each type of the p(W/N) p(N/N)
small grains. Only 77 early-season proportion
estimates were available for the 91 winter where p(W/W) is the probability of correctly
small-grain ground-observed segments; 31 were
available for the 53 spring small-grain ground- labeling small-grain pixels as small grains,
observed segments. p(N/N) is the p r o b a b i l i t y of c o r r e c t l y labeling
nonsmall-grain pixels as nonsmall grains,
Table 1 and Figure 3 show that on the average p(W/N) is the p r o b a b i l i t y of mislabeling nonsmall-
the proportions were s i g n i f i c a n t l y underestimated grain pixels as small grains, and p(N/W) is the
for both winter and spring small grains at the p r o b a b i l i t y of mislabeling small-grain pixels as
lO-percent level. However, the bias due to clas- nonsmall grains. When the areal estimate is made
s i f i c a t i o n , D, decreased f r o m - 9 . 8 percent in for small grains, p(W/N) is called the commission
the early season to -2.4 percent in the late error, and p(N/W) is the omission error.
season for the winter small grains and decreased
from-3.8 percent to -3.3 percent for the spring The following results were obtained from the
small grains. The primary reason for this large type-2 samples of 18 ground-observed segments in
underestimation during the early season is that North Dakota for crop year 1976-77.
the small-grain signatures are not well-developed
at this time, and as a r e s u l t , some small-grain 341 = 0.750 415145 - 0.25 Ol
t r a i n i n g pixels were mislabeled as other crops by M = 30 533 94-"
the analyst. When a s i g n i f i c a n t improvement in 563 - 0
detectable small-grain signatures is noted in the 5--6-3 = 0.053
l a t e r season or some backup acquisitions become
available for multitemporal interpretation, the This matrix shows that 75 percent of the type-2
LAClE estimates begin to approach the ground- small-grain pixels and 94.7 percent of the type-2
observed proportions. nonsmall-grain pixels were c o r r e c t l y labeled by
the analyst. In other words, in the State of
The plots in Figure 3 also indicate an over- North Dakota, the labeling error for small grains
all trend toward a negative value of PW PW as is larger than that for nonsmall grains. The
PW increases. In other words, LAClE tends to causative factors for labeling errors in North
underestimate the true small-grain proportion Dakota are investigated by evaluating such data
when that proportion is large. as the acquisition h i s t o r y , individual f i e l d size,
pixel location, and signature q u a l i t y . Table 3
In another interesting study, a comparison is presents the error sources and the percents of
made between small-grain areal estimation errors commission error and omission error caused by
that resulted from the computer c l a s s i f i c a t i o n each error source.
estimation, s t r a t i f i e d areal estimation, and
random sample estimation from the type-2 pixels. 7. MAJORERRORSOURCES IN ACREAGE ESTIMATION
The mean square errors for these three types of
areal estimates are tabulated in Table 2 for the The major error sources in the acreage e s t i -
States of North Dakota and Montana. This table mates at the segment level have been i d e n t i f i e d
shows that the mean square error of the s t r a t i f i e d through the study in various experiments. These
areal estimate is much smaller than that of the major error sources include:
computer c l a s s i f i c a t i o n estimate. However, the
mean square error of the s t r a t i f i e d areal estimate I. Missing key acquisitions, which generally
is very close to that of the random sample esti- cause the analys~ to be unable to separate
mate from the type-2 labeled pixels. the signatures of competing crops and thus
lead to mislabeled training data.

162

. Abnormal signature development, which is Multivariate Statistical Analysis, New York"
often caused by such problems as late John Wiley and Sons, Inc.
planting, drought, crop rotation, disease,
and soil v a r i a b i l i t y . Duda, R. 0., and Hart, P. E. (1973), Pattern
Classification and Scene Analysis, New York"
3. Inadequacy of the Landsat scanner in re- John Wiley and Sons, Inc.
solving small fields.
Fukunaga, K. (1972), Introduction to Statistical
ACKNOWLEDGEMENTS Pattern Recognition, Academic Press.

Thanks are given to J.'Clinton and E. Cooper National Aeronautics and Space Administration,
for providing information on labeling evaluation. Lyndon B. Johnson Space Center (1978), Large
Area Crop Inventory Experiment Phase I and
REFERENCES Phase II Accuracy Assessment Final Report.

Anderson~ T. W. (1958), An Introduction to

TABLE I. COMPARISONOF PROPORTION ESTIMATES FIGURE I. LANDSAT-BASEDCROP ACREAGE
AND GROUND-OBSERVEDPROPORTIONS FOR ESTIMATION FLOW
CROP YEAR 1976-77
11
IMAGE ~ CLUSTERING AND
90-percent AREAL
Type of Acquisition M PW D Sg confidence INTERPRETATION~CLASSIFICATION ESTIMATION

small grains PW l i m i t s f o r PD

Early (prior 2__2
to February) 77 15.6 25.3 -9.8 1.7 (-12.6, -7.1 )*

Winter Late (prior 91 22.2 24.6 -2.4 0.7 ( - 3 . 6 , - 1 . 2 ) * ANCILLARY DATA
Spring to October)

Early (prior

to August) 31 15.3 19.1 -3.8 1.5 ( - 6 . 3 , - 1 . 3 ) *

Late (prior 53 15.7 18.9 -3.3 0.9 ( - 4 . 8 , - 1 . 8 ) *
to October)
H EVALUATION
LEGEND:
M - Number of ground-observed segments a v a i l a b l e l"
PW - Average of small-grain proportion estimates obtained from the
l SEGMENT
stratified areal estimator ACREAGE
PW - Average ground-observed small-grain proportion ESTIMATE

U - PW - PW FIGURE 2. FIELD DATA ACQUISITION FLOW
S~ - Standard e r r o r of
PD - Population
* - PD not s i g n i f i c a n t l y d i f f e r e n t from zero at lO-percent level

TABLE 2. MEANSQUARE ERRORS FOR COMPUTER SEGMENT DATA
CLASSIFICATION ESTIMATION, STRATIFIED SELECTION ACQUISITION
AREAL ESTIMATION, AND RANDOMSAMPLE
ESTIMATION FROM TYPE-2 PIXELS GROUND
TRUTH

Computer Stratified Random
areal sample
State I~umber of c l a s s i f i c a t i o n estimate
estimate
segments used estimate 170.5
168.2
Nortil Dakota 22 235.4 42.2
55.4
Montana 12 8G. 5

TABLE 3. NORTH DAKOTALABELING ERRORS COMPLETE INVENTORIES
I. COVERTYPE
Error source Omission error Commission error 2. FALL AND SPRING
3. GENERALAPPRAISALOF
(percent) (percent)
CROP CONDITION
Insufficient acquisitions 13.2 33.3
ACCURACY ]
Fields too narrow If.4 3.3
ASSESSMENT
Border pixel 28.9 23.3

Abnormal signature 21.1 16.7

163

FIGURE 3. PLOT OF PROPORTION ESTIMATION ERRORS
VERSUS GROUND-OBSERVED PROPORTIONS

(a) Winter ~mall Grains (b) Spring Smal Grains

October August October

February 25

25 • ~..." .
c-
c~ • .... . c= 0 • , . •. •
r~ 0
~,.., ~.. z~
13= "~.:.':. • . . c= - 2 5
-25
-.~ . : -. -25
. .. -50
-25
1 -50 (c=
-50
13=

v -50

°

IIII IIII IIII II II
0 20 40 60 80 lO0 0
20 40 60 80 0 20 40 60 80 lO0 0 20 40 60 80 lO0

PW' percent PW' percent

164


Click to View FlipBook Version