
July/August 2009 $9.95 www.StickyMinds.com


Test Connection

Three Kinds of Measurement and Two Ways to Use Them

by Michael Bolton

People often quote Lord Kelvin: “I often say that when you can measure what you are speaking about, and express it in numbers, you know something about it; but when you cannot express it in numbers, your knowledge is of a meager and unsatisfactory kind; it may be the beginning of knowledge, but you have scarcely, in your thoughts, advanced to the stage of science, whatever the matter may be.” [1] But few note the sentence that precedes the passage: “In physical science the first essential step in the direction of learning any subject is to find principles of numerical reckoning and practicable methods for measuring some quality connected with it.” The missing sentence prompts some questions: Are software development and testing sciences subject to the same kind of numerical measurement that we use in physics? If not, what kinds of measurements should we use? How could we think more usefully about measurement?

Gerald M. (Jerry) Weinberg suggests thinking in terms of three broad categories. [2] First-order measurement, he says, is what we need to get started: “just adequate to the task of getting something built.” First-order measurement tends to be qualitative, fast, and inexpensive; it generally doesn’t require mechanisms or devices to enhance or extend the observation. In a recent conversation, Jerry told me that first-order measurements “are unobtrusive, or minimally obtrusive, and can be used without a whole lot of fuss. They help give you a lot of important information that can lead to other information or, in the best case, to immediate action if needed.” [3]

First-order measurement is what we’re doing most of the time as we’re driving a car. We look through the windows, listen to the engine, and feel the acceleration and deceleration. We make observations and comparisons without getting hung up on quantification. “The road is dry. It’s cloudy. There’s traffic on the right and a car up ahead with its brake lights on.” First-order measurement suggests answers to the questions What seems to be happening? and What should I do now? In this situation, if you feel like you’re driving too fast, you probably are driving too fast. If so, first-order measurement is enough to inform an immediate and appropriate action: slow down.

Because it’s based on ongoing experience and feelings, rather than on careful experiments and controlled data intake, wise use of first-order measurement requires us to consider a number of possible interpretations of the meaning and significance of what we see. Suppose you feel like you’re driving fast, but not too fast. Now you observe a set of red and blue lights on the top of the car ahead. The extra data suddenly prompts you to realize that you’re uncertain about your relationship to the speed limit. The situation and first-order measurement prompt a different response in the form of questions: What else do I need to know? and Where should I look? At this point, you move into second-order measurement and refer to the speedometer.

Second-order measurement, says Jerry, is the kind of measurement that engineers use to tune relatively stable systems, making them cheaper, stronger, lighter, more reliable, faster (or slower, if that’s what’s desired). Second-order measurement focuses on questions like What’s really happening? and How is it changing? It tends to be more quantitative, subject to more refined models, and generally busier than first-order measurement. It is often assisted by external instruments to supplement or refine direct sensory intake. In particular, metrics (mathematical functions that relate objects or events to numbers via a model) are second-order measurements.

Back in the car, second-order measurement is the kind of information that you obtain from looking at the dashboard. You note that your speed is forty-three miles per hour and that the posted limit is thirty-five miles per hour. Your quantitative, second-order measurement tells you that you’re above the legal limit. The apprehensive feeling in your gut, triggered by the combination of police car and the second-order measurement, informs a decision to slow down.


What of third-order measurement? That, says Jerry, is the kind of precise, highly quantitative measurement that supports the physicist’s search for new natural laws. It helps us answer the question What happens? in a universal and general sense. But third-order measurement can be precise only because it tends to be about very simple systems (such as two interacting masses) or very simple models of complex systems (in which we choose to ignore many dimensions of the system, but analyze a very small number of dimensions very thoroughly). Perhaps most significantly, third-order measurement emphasizes and depends upon keeping messy human traits (variability, subjectivity, and values) out of the way. As noted in an important paper by Cem Kaner and Walter P. Bond, [4] using metrics and higher-order measurement wisely depends on construct validity: critical rigor in evaluating the models and the functions that form the basis for the measurement.

In Rapid Testing, we define a control metric as any metric that drives a decision. Some development groups standardize the decision to ship the product when it contains a low-enough threshold number of high-severity bugs. Others consider a program adequately tested if there’s one positive and one negative test per “requirement” (meaning per line in a requirements document). Still others deem a test group “successful” if there is a low-enough percentage of rejected bug reports. By contrast, an inquiry metric is one that prompts a question: We have three open high-severity bugs. What’s the story there? Jim and Mark are two days behind where we thought they’d be. Do they need help? The program managers are deferring a lot of problem reports. Are the problems insignificant, or do we need more training because we don’t understand the product?

One of my recent clients rated the quality of its products and customer satisfaction with a basket of five second-order metrics. Each measurement collapsed months of work and tons of data into a single number. “Better” numbers earned praise; “worse” numbers earned a reprimand, so management meetings dragged on while people tried to explain changes from last month’s numbers, especially when things had gotten worse. At this company, schedules frequently slipped and shipments were often delayed. Yet when I asked testers the simple question: What slows you down? I got a wealth of information. They told me about broken and buggy builds, inadequate test environments, excessive emphasis on scripts that were out of date by the time the product arrived, and a lack of information about real customer needs. They also said they were wasting time collecting data that wasn’t being used to help speed up development or testing, and they offered dozens of ways in which the numbers could be gamed.

A different client, also working on one-year project cycles, focused on questions like: What happened this week? What did we get done? What problems did we run into? Managers used personal contact (direct observation of and conversation with people) as their primary approach to assessing the project’s status. They took a good number of quantitative measurements, but used them only as indicators to refine their initial assessments and to inform new first-order questions. The team made rough long-term estimates and more precise short-term estimates, dividing two-week cycles into tasks of two days or less, with clear deliverables that signaled completion. When tasks weren’t finished in the estimated time, no one was punished; instead, everyone considered what he hadn’t understood earlier, what he had learned, and what might inform a better estimate next time. Team members didn’t collect metrics on things that weren’t immediately interesting and important to them. They were interested in understanding the situation and optimizing the quality of the work, not in the appearances afforded by the metrics. They emphasized the game and the season over the box scores. And they consistently shipped high-quality products on time.

They did use one, and only one, control metric. When the number of open problems exceeded a certain threshold, they stopped working on new features and fixed problems until the list was comprehensible and manageable again.

Jerry observes that in software engineering we seem obsessed with higher-order measurements. Why? He suggests that decisions about quality are political and emotional, based on discussions and decisions about whose values count and how much they count relative to one another. [5] Such issues are often distasteful to people who want to appear rational and “scientific,” so we try to avoid those issues with appeals to higher-order measurement.

Each new software project involves a human context: interaction between different sets of clients, developers, tasks, and problems to solve, with high variability, contending values, and small sample sizes. In those environments, third-order measurement isn’t achievable; it’s an expensive distraction. That leaves us with cycles of first- and second-order inquiry measurement: not physics, but easily good enough to build and tune our systems.

References
[1] Thomson, William (Lord Kelvin). “Electrical Units of Measurement.” Popular Lectures and Addresses I (London, 1891-94).
[2] Weinberg, Gerald M. Quality Software Management, Vol. 2: First-Order Measurement. Dorset House Publishing, New York, 1993.
[3] Weinberg, Gerald M. Personal correspondence with the author, May 18, 2009.
[4] Kaner, Cem and Walter P. Bond. “Software Engineering Metrics: What Do They Measure and How Do We Know?” 10th International Software Metrics Symposium. Chicago, IL, 2004. www.kaner.com/pdfs/metrics2004.pdf
[5] Weinberg, Gerald M. Quality Software Management, Vol. 1: Systems Thinking. Dorset House Publishing, New York, 1991.

What’s your experience with observation and measurement in your organization? Follow the link on the StickyMinds.com homepage to join the conversation.

www.StickyMinds.com JULY/AUGUST 2009 BETTER SOFTWARE 17

