
A bold new course gives every undergraduate the power to harness data science • Data research: Crossing the lines to crack open insights in health care, energy, the environment & more

Published by Duke Pratt School of Engineering, 2017-03-02 10:26:51

Data & the Duke Engineer




6 A bold new course gives every
undergraduate the power to harness
data science
12 Data research: Crossing the lines to crack
open insights in health care, energy, the
environment & more


Dear friends,

The explosion of information from technological advances and global connectivity is
giving rise to both a promise and a challenge. If we can plumb this sea of data to sift
out significant patterns, humanity will gain an unprecedented ability to understand
and control complex systems—from the spread of disease to the future of the climate.
The potential impact is huge—and so is the need for people who can lead the way.

That’s why we’ve decided to make sure every Duke Engineer knows how to leverage data.

We are investing in our students,
our faculty and our infrastructure to
create a signature undergraduate
experience that will give every
individual comfort with data analysis
early in their education—and to expand
our team of faculty thought
leaders to inspire and mentor them. I’m
proud to share our progress, plans and
successes with you in these pages.

Ravi Bellamkonda
Vinik Dean of Engineering

2 01101111 01110101 01110100 01110010 01100001 01100111 01100101 01101111 01110101 01110011 01101100 01111


Learn about these and other Duke Engineering faculty engaged in data-driven research, plus our open positions at

Statistical signal processing; leading development of new Data & Decisions undergraduate course

Vice Provost for Research; applied statistics and machine learning

Machine learning, data mining, applied statistics and knowledge discovery

8 | ROBERT CALDERBANK, CS & ECE
Director, Information Initiative at Duke; Member, National Academy of Engineering for leadership in communications research, from advances in algebraic coding theory to signal processing

High performance computational bioengineering

Information theory, statistical signal processing and machine learning

9 | LESLIE COLLINS, ECE
Physics-based statistical signal processing algorithms

11 | DANIEL EGGER
Director, Master of Engineering Management Center for Quantitative Modeling

14 | RICARDO HENAO, ECE
Applied genomics and population health research

15 | GEOFFREY GINSBURG, MD, MEDICINE & BME
Translating genomic information into medical practice

15 | CLAUDIA GUNSCH, CEE
Exploring the microbiome through bioinformatics

18 | STEFANO CURTAROLO, MEMS
Advanced computation for discovery of novel materials

18 | HENRY PFISTER, ECE
Information theory, communications and probabilistic graphical models

18 | INGRID DAUBECHIES, MATH & ECE
Member, National Academy of Engineering for contributions to the mathematics and applications of wavelets

Computer vision, computer graphics, medical imaging, image analysis and machine learning

GUGLIELMO SCOVAZZI, CEE
Heavy data analytics for complex infrastructure systems

BME: Biomedical Engineering
CEE: Civil & Environmental Engineering
CS: Computer Science
ECE: Electrical & Computer Engineering
MEMS: Mechanical Engineering & Materials Science

data & THE DUKE ENGINEER 3


for the Information Initiative at Duke (iiD) will boost the work of a landmark cross-campus effort harnessing massive amounts of information to tackle society’s biggest challenges.

Launched in 2013, iiD brings together faculty and students from engineering, math, social sciences, medicine and beyond to make sense of “Big Data” (information characterized by tremendous volume, variety and rapid change) to address a broad range of issues facing our world. The gifts from an anonymous donor, plus challenge funds from philanthropists Anne T. and Robert M. Bass through Duke’s interdisciplinary Bass Connections program, will endow iiD professorships, graduate fellowships in engineering, and educational programs on data-driven problem-solving, both in the classroom and in the field.

(p.8) took home gold in the 2016 Reimagine Education global competition recognizing “projects that enhance learning and employability and are both innovative and scalable.”


teamed up with faculty to conduct 25 real-world, data-driven
research projects through Duke’s Data+ program in 2016 (p.8)


in undergraduate courses (p.10)—from analyzing international
carbon dioxide emissions to applying linear algebra to autism
detection in infants

4 PRATT.DUKE.EDU 01101111 01110101 01110100 01110010 01100001 01100111 01100101 01101111 01110101 01110011 01101100 01111

250,000+
enrolled in Duke’s “Data Analytic Techniques for Business” online course sequence via Coursera (p.10), making it the platform’s #3 most popular specialization worldwide in 2016

30+ PARTNERS
from industry, government, and nonprofits have collaborated with the Information Initiative at Duke on research, workshops, internships and other hands-on learning for students

“I think the emergence of information and big data will be as transformational for our economy as the Industrial Revolution was during its time.”
Member, Duke Engineering Board of Visitors

Michael and Maureen Rhodes gave $1.667 million (matched through Duke’s Bass Connections program) to create the new Rhodes Family Professorship at Duke Engineering.

NEW
THE DATA SCIENCES

Duke Engineering is expanding its faculty expertise in the data sciences. Currently we are recruiting:

1 Professor of the Practice in Data Science
1 Rhodes Family Professor with expertise in information, computing and data science to advance interdisciplinary learning for the information age

Learn more & read about Duke Engineering’s commitment to diversity in hiring at
1001 00100000 01100001 01101101 01100010 01101001 01110100 01101001 01101111 01110101 01110011



A new course aims to make data competency a signature
strength of every Duke Engineering undergraduate

Imagine what we could do with the streams of data rushing from Fitbits, Nests,
satellites, microscopy, smart buildings, mobile phones, auto dashboards, credit card
purchases and just about every other source of information that’s captured, tracked
and shared. Stacy Tantum wants every Duke engineer to be able to envision the
possibilities—and to turn them into reality.

“With all this rich data, how can we interpret it to learn new things, see unexpected
patterns and make better decisions?” she says. “These are skills we want all of our
students to have.”

Professor of the practice in Duke ECE, Tantum is leading development of a creative
new course designed to give every engineering undergraduate a foundation in the
fast-growing field of data science. Piloting this fall, “Introduction to Data & Decision
Sciences” will soon be a sophomore-year staple.

It’s part of a triad of new or revised courses set to debut over the next year that will
transform the Duke Engineering undergraduate experience. The requirements, which also include
a first-year, team-based engineering design course and an applied computing course, will
create “a signature undergraduate curriculum that gives all students project- and
problem-based engineering experiences early in their studies,” according to Dean Ravi
Bellamkonda.

With a focus on team-based, authentic problem-solving, the new data course builds
on a highly successful summer program called “Data+.” The award-winning program
led by the Information Initiative at Duke (iiD) brings small groups of undergraduates,
graduate students and faculty together to design data-driven solutions to
interdisciplinary challenges, most provided by external clients (p.8).

The approach has proven enormously popular, and may even aid recruitment
and retention of students underrepresented in the field, according to iiD
director Robert Calderbank. “Data+ had over 300 applications for 70 spots in 2016,
over half from women,” he says. “In a world where women are still pursuing STEM
majors at lower rates than men, that is encouraging.”

Plus, team-based labs that challenge students to solve real-world, potentially messy challenges, combined with structured teaching, give the core concepts a kind of “stickiness” that by-the-book approaches just can’t match, says Tantum.

“We want our students to really dig into data science: to learn to define problems, determine what data they need to answer questions, explore and visualize the data using concepts from probability and statistics,” Tantum says.

“Our goal is for each student to emerge with an understanding of how to use data to inform and guide their work as engineers.”

“Comfort with data will be an integral skill for engineers of the future,” adds Bellamkonda. “We see Duke as the place that will lead the way.”

INTRO TO DATA & DECISION SCIENCES

New course for all Duke Engineering sophomores
Piloting 2017-18 academic year

FORMAT:
• 1 lecture, 2 labs each week
• Includes both structured exercises and real-world problems
• Student teams work on authentic problems from clients in industry, academia & beyond

LEARNING OBJECTIVES:
• Defining the problem
• Experiment design (gathering data)
• Data visualization, exploration, and presentation
• Probability and statistics concepts
• Decision-making in the presence of uncertainty

All students learn how to extract information from real-world data to derive insights, support informed decisions and guide their work as engineers


Data+: Duke’s award-winning approach to education for the data age

Duke’s nationally recognized Data+ program brings teams of faculty and students together each summer
to explore data-driven approaches to real-world problems—including many from clients such as Accenture,
Fidelity Charitable and Duke Health. Impactful projects like the ones below will soon become part of every
Duke Engineering undergraduate’s experience with the debut of the new Data Science course.

How can we optimize parking on a crowded campus?

Working with Duke Parking and Transportation, a student team examined parking patterns across the campus and built an interactive “redirection” tool that could help students and employees figure out the best place to park if their preferred lot is full. The partners are now discussing ways to operationalize the tool at Duke.

“Data+ was immensely beneficial for me,” said participant Mitchell Parekh, ECE’19. “I gained an understanding of what big data really is, along with the amount of work required to extract usable information from it.”

What’s the best location for Zika vaccination clinics?

Lindsay Hirschhorn ME’19 was part of a team charged with determining optimal vaccination clinic locations in Durham County for a simulated Zika virus outbreak. Working with researchers at RTI International to construct models of disease spread and health impact, the team developed an interactive visualization tool to show results.

“Duke has done a tremendous job in creating a learning environment that promotes collaboration, drives innovation and creative problem-solving, and provides enough structure and mentoring to guide students to successful outcomes,” said RTI’s Thom Miano. “Data+ gives students an opportunity for valuable experience and employers an avenue for potential recruitment.”

Can we build a better mouse map?

BME major Pablo Ortiz and team worked with Duke’s Department of Biostatistics and Bioinformatics to develop a tool that helps researchers more effectively utilize LungMAP, an open-access database of images of developing mouse lungs. They built an image segmentation pipeline to allow biologists and clinical researchers to quantify changes in lung structure during fetal development, and improve understanding of normal lung structure and function.

“Medical applications, robots, infrastructure, new materials—it’s hard to think of an area of
engineering that’s not going to be driven by data. At Duke, we’re training students to navigate
in the digital economy not just by teaching them how to analyze data,
but how to ask the right questions. The world needs people who know
how to put the chain together.”


Charles S. Sydnor Professor of Computer Science
Professor, ECE, Math and Physics
Director, Information Initiative at Duke


HANDS-ON LEARNING

Solar Counts

With enough solar installations to power 6 million homes, the U.S. energy infrastructure is rapidly evolving. But while individual states track solar locations and capacities on a broad scale, there is little information about exactly where solar energy is emerging on a city or neighborhood level.

With this more precise data, officials could predict where to install new technology to meet changing demands, social scientists could better understand how policy affects solar energy adoption and economists could better value the future of the 8,000 solar companies employing more than 200,000 American workers.

It’s a gigantic undertaking of great importance, but Duke has just the people for the job: our undergraduates, who recently began figuring out how to tally solar capacity from satellite imagery through Duke’s Data+ Program.

A small group of four students, mentored by graduate students and faculty from Duke Engineering and Duke’s Energy Initiative, spent the summer building a dataset by meticulously annotating 58 square miles of satellite imagery of Fresno, California. The team then coded their own proof-of-principle machine learning algorithm that was able to identify solar panels with over 90 percent accuracy.

The group then passed the baton to Bass Connections, a program at Duke supporting interdisciplinary collaborations between faculty and students. Led by the same faculty, undergraduates expanded the dataset to include three more cities in California, annotating 20,000 individual solar panels from 1.5 billion pixels of satellite imagery from the U.S. Geological Survey.

The students recently published the massive dataset in an open, online journal, providing other researchers worldwide with ground-truth data that could help program algorithms to spot all types of objects from the sky.

“We get a lot of amazing students who come in wanting to change the world. With these data-focused programs, it’s not just about the academics of data analysis. They’re getting to work on actual challenges being faced by industries and corporations on a daily basis, and that is a valuable, rewarding and motivational experience.”

LESLIE COLLINS
Professor, ECE
Member, Duke Energy Initiative

In ongoing interdisciplinary research led by Leslie Collins of Duke ECE, Kyle Bradbury of Duke’s Energy Data Analytics Lab and Tim Johnson of the Nicholas School
of the Environment, Duke undergraduate teams use machine learning to assess U.S. solar capacity and energy consumption—providing valuable data to inform
smart grid infrastructure planning.


At Duke, engineering doctoral student Chris Tralie discovered a passion for analyzing the topology of music—and for teaching.
Above, Tralie with advisors John Harer (Math) and Guillermo Sapiro (ECE).


A Mind—and an Ear—for Big Data

It was the newly launched Information Initiative at Duke that showed Chris Tralie he had the technical skillset to reveal structures and patterns where others saw chaos.

“The initiative was brilliant because it brought everyone together and let them learn from each other’s work,” recalls Tralie, an ECE PhD student and National Science Foundation Graduate Research Fellow, of iiD’s 2013 launch. “There was real and sudden excitement in the air.”

Tralie found his niche while learning about topology with John Harer, a professor in math and engineering. The class boiled down to understanding the “shape” of data. Tralie thought, “Why can’t we do this with music?”

He designed a program that analyzes musical parameters of songs and mathematically reduces each time point into 3D space. The resulting shape can help determine which genre a song belongs to and can even recognize covers of songs by other bands.

Tralie took his own academic journey and used it to turn other Duke students on to big data, creating a “Data Expedition” using his method for visualizing songs as a fun and approachable way to teach undergraduates how to design data-crunching algorithms.

Data Expeditions are projects proposed and taught by graduate students within the context of an existing undergraduate course. “Data Expeditions and Data+ both benefit our undergraduates by making technical subjects more relevant and exciting, but they’re also professional development opportunities for our graduate students,” said Robert Calderbank, director of iiD, which sponsors both programs. “Industry and academia both need people who can lead projects and manage multidisciplinary teams, so these experiences can provide a competitive advantage for Duke graduates.”

“Leading a Data Expedition really helped me grow as a mentor. I got to work with talented students who were still learning the basics and yet had amazing new ideas that I could learn from too,” said Tralie, who also developed a new course for graduate students on data analytics for video recognition, a topic he studies with advisor Guillermo Sapiro (p.16).

“Those skills will translate to my future career, where I hope to be a faculty member advising graduate students of my own someday in engineering or applied math.”



Data and Decision-Making

Master’s students take a practical approach

For Master of Engineering Management (MEM) students in the fast-growing Data & Decisions track, companies are becoming classrooms, thanks to the program’s capstone consulting practicums.

“Companies get help with their problems while students get experience consulting. It’s a great bargain,” said faculty member John Nicholson, who has arranged over 25 practicums with companies such as IBM, Cisco and Lenovo. “With more and more projects focused on data-driven solutions, Duke is really becoming a place that students are seeking out for data analytics.”

In the semester-long practicums, small teams of students work with a faculty mentor and industry representative to deliver solutions corporate partners need, typically investing more than 800 cumulative hours to complete a project’s research, analysis and strategic elements.

“I’ve worked in teams in other courses, but the practicum was a chance to work on a real live initiative in a professional environment,” said MEM student Rounak Mehta, whose team turned to data analytics to help a midsize company differentiate itself in the marketplace. “It was incredibly satisfying.”

Such real-world experience is one reason the program’s Data & Decisions track is quickly growing in popularity. Another is the overwhelming success of a data analytics course on Coursera co-created by Daniel Egger, director of the Duke MEM Center for Quantitative Modeling (see box), which offers online learners exposure to some of the topics and strategies taught in the MEM program.

“Anyone working in business these days has to understand the impact of Big Data and data analytics,” Egger said. “Between the Information Initiative at Duke and the MEM program, Duke is ideally positioned to offer a course like this.”

Not only have a quarter-million people to date accessed the online course, but dozens of online learners liked the teaching so much they have applied to Duke’s full degree program. The demand is no surprise, says Brad Fox, associate dean and executive director for Duke Engineering’s Professional Masters Programs.

“As you look at what engineering managers are expected to do in today’s business world, using data to make informed decisions is a crucial piece of their job,” he said. “It’s imperative that students understand how they can leverage data to glean new insights.”

Duke’s bringing data analytics education to professionals everywhere with a four-course “Analytic Techniques for Business” specialization through Coursera. Free to audit ($395 for certification), the six-week online course series is a worldwide hit: Outfits such as Business Insider consistently rank the class as one of the best and most popular available on the platform, with more than 250,000 active learners enrolled to date.

Learn more at
analytic-techniques-business

“All of our lives are being transformed by data, and the impact on industries will be even bigger. By 2019, data from the industrial internet is expected to be 49 times greater than the consumer internet. At the same time, industries globally are undergoing historic transformations through the intersection of engineering hardware and digital platforms. As an example, the electricity industry has arguably more data than any other, yet only 2 percent is currently being utilized.

“There is tremendous opportunity to unleash new solutions and productivity by reimagining electricity as well as all industries ... and there will be big benefits from companies working together with universities and students to solve new problems with big data and analytics.”

STEVE BOLZE E’85
President and CEO, GE Power
SVP, General Electric
Member, Duke Engineering Board of Visitors



Ask faculty what’s special about Duke Engineering, and
one word comes up again and again: “Collaboration.”

In a place where engineers can (and do) simply cross the street or even the hallway
to connect with Medical Center clinicians, statisticians, or energy experts,
it’s perhaps not surprising that Duke’s data-science research is literally all over
the map, with teams coming together from across disciplines to apply analytics
to knotty problems in healthcare, environmental sustainability, materials discovery
and more.

“In the information age, data science is key to advancing progress on a number of
globally important fronts,” says Ravi Bellamkonda, dean of engineering. “We recognize
that strength in data science is a strategic advantage in many disciplines,
which is why we’re expanding our already outstanding corps of faculty both in
engineering and across the university.”

“Duke faculty are known for working across department and school lines, but
over the past five years we’ve created a number of mechanisms specifically
to connect our data scientists with faculty who can tap into their
expertise to solve problems in other fields,” adds Lawrence Carin, professor
in ECE and vice provost for research at Duke.

Duke Engineering professors have played key roles in these efforts, including
the Information Initiative at Duke (p.4), a hub of research and teaching
where “superb data science can be thoughtfully focused to transform every
discipline in the university,” as director Robert Calderbank puts it, and the Duke
Quantitative Initiative, launched to hire up to 10 new faculty in the broad area
of quantitative sciences with the express goal of strengthening cross-university
connections, particularly with Duke Health. Duke MEDx, a new joint initiative
of Duke’s engineering and medical schools, is also spurring innovation through
seed funding for collaborative data-driven research.

“Duke is in an unusually strong position to make advances in the data
science of health,” Carin says, pointing to collaborations with the university’s
top-10 medical school and nationally recognized health system on projects
ranging from early disease diagnosis to the social science of population health.

Engineering students are intimately involved in the work, too. For example, a
new Mobile Health Application Prototyping Group launched this semester by
the Duke Institute for Health Innovation is pairing physicians with undergraduates
to develop digital apps for real-world mobile health care and clinical research.

Further afield, Duke engineers are diving into data-based explorations of the
microbiome with biologists and statisticians, examining solar capacity with
colleagues from Duke’s Energy Initiative, and sifting data to discover new materials,
among others. Read on for just a sampler of how our faculty are bringing big
data expertise to bear upon big questions.


Simulating Heartbeats

In the early 17th century, William Harvey made the first known complete, detailed description of the body’s circulatory system. Roughly 400 years later, biomedical engineers at Duke are changing the definition of “detailed.”

Stephanie Musinsky, a junior double-majoring in BME and ECE, is working on an evolving supercomputer code that models blood flow down to the individual cell. Aptly named “HARVEY,” the code could help physicians determine cardiovascular risks and predict which treatments would return the best outcomes.

HARVEY is the brainchild of Amanda Randles, assistant professor of BME at Duke, who has already shown the code capable of modeling blood flow through the human aorta and other complex vascular regions.

As an undergraduate research fellow at Duke Engineering, Musinsky has been working with Randles for the past year to complete three major upgrades, including making the simulation more intuitive, introducing a virtual patient-specific “pulse” and enabling much longer simulations to be run. In fact, she is the first undergraduate to introduce changes to the code that are now being used by the entire group.

“I never really enjoyed working in a traditional wet laboratory, so working on this project has been great for me,” said Musinsky, who has found a home in data analytics. “I’ve become better at coding and problem-solving, and I’ve learned a lot I wouldn’t have otherwise.”

“Our students can benefit by getting firsthand knowledge of how the material they learn in class can be applied to translational research and potentially influence future treatment for some of the highest-burden diseases today,” said Randles.

“I think large-scale simulations like the ones Stephanie is working on will play a key role in the future of biomedical engineering.”

participate in faculty-mentored research, such as junior Stephanie Musinsky, who works with Duke BME’s Amanda Randles on an evolving supercomputer code that models blood flow down to the individual cell.

Less than 100 yards away from one of the Southeast’s busiest medical centers, Duke engineers Lawrence Carin and Ricardo Henao are working to keep people out of those inpatient beds. By designing algorithms to sift through reams of data in clinical records, the team is helping clinicians find better ways to predict which patients will experience complications and then intervene to keep them healthy.

In a recent project with Duke Health colleagues, the researchers analyzed data from five years’ worth of electronic health records gathered by the Southeastern Diabetes Initiative (taking care to protect individual privacy). With more than 16,000 records and thousands of potential parameters to consider, they enlisted help from a

student Data+ team to reduce the data to a count of the number of times patients visited the hospital, took a medication or had a procedure.

“It turns out we can do pretty well, even with such a simplified dataset,” said Henao. “With those metrics, our machine learning algorithm could predict any one of 13 potential comorbidities over a six-month period better than existing tools.”

Now, with colleagues Katherine Heller in Statistics and Erich Huang in Biostatistics & Bioinformatics, they’re taking a similar approach to help Duke Health improve overall care quality and reduce medical costs. Using a massive database of insurance claims, the team built a model that combines data on diagnostics, medications and procedures with demographic information to predict which patients are at risk of admission and readmission within six months. The early results proved promising, encouraging Henao to begin refining the model.

“If we can identify the patients most at risk and prioritize their follow-ups, we can raise the level of care while decreasing hospital costs for everyone,” said Henao.

“Pioneering research universities like Duke are harnessing the power of cross-campus collaborations to drive discovery and innovation. We are bringing together engineers, clinicians, computer scientists and other researchers to pursue bold ideas for improving human health. By doing so, we believe we are creating a dynamic model for research universities to achieve even greater heights of excellence and impact.”

A. EUGENE WASHINGTON, MD, MSC
Chancellor for Health Affairs, Duke University
President and CEO, Duke University Health System

Data-Based Diagnosis

As overuse of antibiotics gives rise to new generations of resistant bacteria, clinicians face an important decision: hold back the drugs in case the patient turns out to have a viral infection, or prescribe them just in case? Duke engineers are working closely with genome scientists and clinicians to make the decision a snap.

A few years ago, ECE’s Lawrence Carin teamed up with Dr. Geoffrey Ginsburg, professor of medicine and BME and director of Duke MEDx, to combine increasingly efficient gene-expression technology with rapid statistical analysis. A series of DARPA-funded pilot studies proved it possible to read signals of gene expression activity in response to illness before symptoms set in, an advance that could speed up treatment and recovery, and help prevent the spread of disease across populations.

Subsequent collaborative projects between the Center for Applied Genomics and Precision Medicine and ECE have improved the methodology as the researchers fine-tuned statistical methods to analyze the data. In a recent study conducted with ECE’s Ricardo Henao, the team proved their technology could rapidly discern the difference between gene expression caused by viruses and bacteria in blood samples, paving the way for a new diagnostic test.

Today, the researchers are working with NIH’s Antibacterial Resistance Leadership Group and industry partners to translate their innovation into practice, and designing a clinical trial with Duke Health to validate the diagnostic tool in a larger, more diverse population.

Teaming the Microbiome

If ever a match were made in scientific heaven, it’s between Big Data and Bioinformatics. Machine learning techniques have already begun to revolutionize our understanding of how bacteria influence human and environmental health and can efficiently produce a wide variety of products.

Getting biologists and statisticians to speak the same language, however, is not as straightforward. Funded by $3 million from the National Science Foundation, the Integrative Bioinformatics Graduate Training Program to Investigate and Engineer Microbiomes (IBIEM) promotes team science by bringing Duke and NC A&T University graduate students in engineering, microbiology, and other disciplines together to work on real-world projects and engage with local companies conducting microbiome-related research.

“All too often scientists only think about statistics at the end of an experiment, which can create issues for the data analysis and rigor in how good the experiment actually is,” said IBIEM director Claudia Gunsch of Duke CEE. “Having students and scientists that are able to communicate across these disciplines and think about the statistics from the start saves a lot of time in the long run and helps produce higher-quality data.”


“If you look at the papers coming out from Duke in big-data science,
in mobile health, you’ll see undergrads, grad students, developers,
physicians, psychologists, social scientists and engineers all listed
as contributors. That kind of integration is very unusual—
and very Duke.”


Edmund T. Pratt, Jr. School Professor, ECE
Pictured left with collaborator Helen Egger, MD

An App for Autism Screening

As a toddler watches videos on a new Duke-developed app, sophisticated algorithms analyze “selfie”-camera recordings of her head movements and facial expressions for small but significant signs of autistic behavior, such as a lack of emotions, social-reference, or delayed response.

To date more than 2,000 people from the US to South Africa have downloaded the Apple ResearchKit “Autism & Beyond” app, which is being studied as a possible tool to expand autism screening and enable earlier diagnosis and intervention. It’s a critical need given that most children with autism aren’t diagnosed until age 5 or later.

Developed by Duke ECE’s Guillermo Sapiro, Duke Health’s Dr. Helen Egger and Dr. Geraldine Dawson, and a team of over 20 programmers, scientists and students in close partnership with Apple, the new research app is not a diagnostic tool, but provides parents with information and encouragement to seek consultation with a care provider if the data appear to indicate their child may be at risk.

“Our goal is to develop a screening, like checking kids’ hearing or eyesight at schools,” said Sapiro. “They don’t get glasses; they get a referral.”

After validating the app’s feasibility as a screening tool, the team is starting large-scale clinical testing at sites in South Africa, Argentina, Singapore and the US. They’re also using the software as a springboard to develop similar screening tools for post-traumatic stress disorder and picky

Interpretable Machine Learning

Cynthia Rudin

Machine learning programs are great at helping computers crunch raw data and spit out recommendations. Unfortunately, it’s often impossible for the human user to understand how the program came up with the answer, and that can be a problem in fields such as health care.

“Doctors need to be able to explain their decisions and recommendations to their patients,” said Cynthia Rudin, associate professor in ECE and computer science. “So they have to be able to interpret the results themselves.”

Rudin works on problems where the models clearly show how they produce their conclusions. For example, a Falling Rule List (developed with Rudin’s student Fulton Wang) is a logical model that categorizes patients in order of decreasing risk.

A Falling Rule List could use data from electronic health records to predict how likely individuals are to be readmitted to a hospital after they were released. The top category might contain patients who have serious problems and do not follow their doctor’s instructions, listing their chances of being readmitted as 92 percent. The bottom category might have the most compliant, healthiest patients at 10 percent.

Rudin’s team has also created practical applications for predicting criminal recidivism, diagnosing sleep apnea, and improving surgical recovery, using different flavors of interpretable machine learning and statistical algorithms.
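For readers who want to see the shape of the Falling Rule List described above, here is a minimal sketch in Python. It is an illustration only, not Rudin and Wang's method: their rule lists are learned from data, while the rules here are written by hand. The feature names (`serious_problems`, `follows_instructions`) and the two middle risk values are invented for the example; only the 92 percent and 10 percent figures come from the article.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# A patient is a dictionary of yes/no features (hypothetical names).
Patient = Dict[str, bool]


@dataclass(frozen=True)
class Rule:
    description: str
    condition: Callable[[Patient], bool]
    risk: float  # estimated probability of readmission


def check_falling(rules: List[Rule]) -> None:
    """Raise ValueError unless the risks never increase down the list."""
    for earlier, later in zip(rules, rules[1:]):
        if later.risk > earlier.risk:
            raise ValueError(
                f"'{later.description}' has higher risk than '{earlier.description}'"
            )


def classify(patient: Patient, rules: List[Rule]) -> Rule:
    """Return the first rule whose condition the patient satisfies."""
    for rule in rules:
        if rule.condition(patient):
            return rule
    raise ValueError("the rule list needs a catch-all final rule")


# Hand-written rules, ordered from highest to lowest risk.
RULES = [
    Rule("serious problems and does not follow instructions",
         lambda p: p["serious_problems"] and not p["follows_instructions"],
         0.92),  # figure quoted in the article
    Rule("serious problems",
         lambda p: p["serious_problems"],
         0.45),  # invented for illustration
    Rule("does not follow instructions",
         lambda p: not p["follows_instructions"],
         0.25),  # invented for illustration
    Rule("everyone else",
         lambda p: True,
         0.10),  # figure quoted in the article
]

check_falling(RULES)

patient = {"serious_problems": True, "follows_instructions": False}
matched = classify(patient, RULES)
print(f"{matched.description}: {matched.risk:.0%} chance of readmission")
```

Because a patient is assigned to the first rule that applies and the risks only fall from top to bottom, a doctor can read the list from the top and explain a prediction by pointing to a single line.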


Neural Networks

Sometimes datasets are too small for standard analytics tools to find trends and patterns in the data. Henry Pfister, associate professor of ECE and mathematics, designs novel graphical models and algorithms to infer answers to the questions being asked when data are sparse.

For example, deep neural networks are a type of machine learning that performs well only with very large datasets for training, but nobody knows why. The programs work by finding patterns in raw data through a series of mathematical transformations and manipulations.

Though used for applications like image recognition, voice identification, speech transcription and text classification, many questions remain about their fundamental workings.

“With more insights into how these programs arrive at their answers, we could extend their usefulness into fields that don’t have the luxury of enormous datasets,” said Pfister. “Smaller datasets for training would also translate into valuable savings.”

Modeling Traffic with Self-Driving Cars

Imagine having your car drop you at the office and then go park itself several miles away. Not only would it make your morning that much more bearable, it’d open up parking spots on otherwise congested city streets.

As the concept of driverless cars speeds from science fiction to reality, researchers are making many promises about potential improvements in safety, traffic and convenience. But is there evidence to support those claims?

Efe Aras, a Duke Engineering senior, is using ideas from statistics and mathematics to determine just how big of a traffic boon self-parking cars could be. A Pratt Undergraduate Research Fellow, Aras has teamed up with Galen Reeves, an assistant professor in ECE known for his work in information theory, statistical signal processing and machine learning.

Aras and Reeves developed a model incorporating variables such as the density of cars on the road, how long cars stay in roadside parking spots and how picky both drivers and driverless cars might be about those spots, among others. The model represents a small, generic grid of roads, though Aras said it could be scaled up to represent an actual city’s streets.

The duo ran simulations to determine how traffic patterns change depending on the percentage of driverless cars on the road. And to be fair, they accounted for the fact that driverless cars could actually increase traffic congestion.

That was not, however, the case—they found that as the number
of future autonomous cars goes up, the level of potential road rage
goes down.

“You also get more traffic relief depending on what city you’re
modeling,” said Aras. “As the population and congestion goes up,
the effects of driverless cars become more apparent.”

“It’s exciting to see how old ideas from physics
and math can be used to address difficult
questions about how new technologies, such as
self-driving cars, will impact society,” added Reeves.

Modeling self-parking cars isn’t the only data analytics experience Aras is
getting at Duke. Through his participation in the National Academy
of Engineering Grand Challenge Scholars Program, he is
also working on making big data experiments more protective of
individual privacy.

“What makes Duke so special is all of the different connections
in different departments,” said Aras. “Engineers look at problems
differently than mathematicians, so it’s sometimes harder for us to
deal with complex models. We may think we have a robust model
and solution, but to understand a model completely, you have to
understand the mathematical beauty of it.”

Galen Reeves and Efe Aras

“It has always been my belief that ‘applicability’ doesn’t have to come
at the expense of mathematical rigor or beauty.”

James B. Duke Professor, Math & ECE

Analyzing Art

Ingrid Daubechies is renowned for her creative approach to applied mathematics. Inventor of the mathematical constructs called wavelets that are widely used to compress digital image files, she’s now using math to preserve the creative works of others.

In a recent collaboration with the North Carolina Museum of Art, Daubechies combined mathematical techniques and image analysis to guide restoration of the 14th-century St. John Altarpiece, including the reconstruction of a panel lost long ago. By studying crack patterns and other processes on the older panels, she can digitally “learn” how they affect the piece and then transpose that to the recreated piece, so the new art closely resembles its historic counterparts.

In addition, “We have developed new mathematical techniques that can not only reverse the observed effects of aging, but also untangle and remove the effects of well-intentioned but now-regretted conservation efforts,” said Daubechies, the James B. Duke Professor of Mathematics and Electrical & Computer Engineering. “These techniques are now available for other art conservators around the world to apply to their artworks.”

A Renaissance woman who also applies data analysis to problems in neuroscience, archaeology and geology, Daubechies in 2016 won the $1.5 million Math + X Investigator grant from the Simons Foundation, which supports novel collaborations between mathematicians and other scientists or engineers. She was named a member of the National Academy of Engineering in 2015.

Materials Discovery

In the search for new materials, researchers are abandoning hunches and intuition for theoretical models and pure computing power. The research is part of the White House Materials Genome Initiative, launched in 2011 to accelerate the pace of discovery and deployment of advanced material systems crucial to achieving global competitiveness in the 21st century.

At Duke, this effort takes the shape of the AFLOW Library built and maintained by Stefano Curtarolo, director of the Center for Materials Genomics, which is funded by an $8.6 million grant from the Department of Defense’s Multidisciplinary University Research Initiative (MURI) program. The library contains experimental data of known binary and ternary compounds and allows users to predict properties of theoretical new materials by building models of similar compounds atom-by-atom.

“Physically going through potential combinations would take tens of thousands of hours,” said Curtarolo. By employing data analytics, “We help identify targets for new compounds much faster and more cheaply.”

For example, Curtarolo used the approach to identify several dozen theoretical materials that could potentially replace the expensive and rare platinum found in applications such as catalytic converters and cancer therapies. He also recently discovered a way to predict which alloys will form metallic glasses found in electrical applications, nuclear reactor engineering, medical industries, structural reinforcement and razor blades.

“Experimentalists are constantly using our data to guide them in new experiments,” said Curtarolo. “We’re planning to develop a common format so that we can compare our theoretical calculations with actual experimental data.”

A Global Approach to

Since 2008, Duke has led an international consortium of researchers studying how industrial nanoparticles affect the environment. Other scientists are doing the same, and the global effort has produced an enormous amount of valuable data.

But in such a nascent field, experimental procedures are inconsistent, as are the types of measurements recorded. The resulting data are difficult to merge into a unified set for deeper analysis and insight.

Duke is leading a global charge to unify the field through the Center for the Environmental Implications of NanoTechnology Nanoinformatics Knowledge Commons, or CEINT-NIKC for short. With buy-in from the Environmental Protection Agency and European Union, the center’s goal is to standardize the methods used and measurements taken in nanoparticle experiments.

In addition to building a custom cyberinfrastructure allowing researchers to upload their data in a consistent manner, CEINT-NIKC is creating apps that can sort through the database, pull requested data and run different types of data analytics, building a one-stop shopping experience for future researchers.

“Nanoinformatics is trying to draw from the wins we’ve been seeing in bioinformatics and other Big Data fields, but adapted for application to a still emerging, relatively immature field,” said Christine Hendren, executive director of CEINT and co-chair of the National Cancer Informatics Program Nano Working Group, which has been vital to the effort’s success. “We’re building a community and providing a path forward to creating depth to our data.”


Understanding Complex Infrastructure Systems

From prospecting for fossil fuels to implanting synthetic heart valves, human engineering is fast transforming natural environments. But how are our interventions affecting these systems, and vice-versa?

Duke engineers are tackling those questions in a new initiative using “heavy data” methodologies to analyze complex infrastructure systems that change over time and space, such as those in the biological or geological realms.

“We can use data analytics techniques to predict how these systems will behave, but the standard datasets are based on a relatively small number of observations and measurements,” said John Dolbow, professor in CEE. “We want to augment the collected data with model-based simulations of the physical mechanisms and processes involved in the system being studied.”

By incorporating both real-world data and computer simulations, the resulting “heavy data” offers more robust information, lending a greater degree of certainty to predictions.

“We believe this will be a powerful approach to help engineers understand, design and manage complex systems across a range of important areas, from climate mitigation to energy exploration,” said Guglielmo Scovazzi, associate professor of CEE.

“This is really a new paradigm for engineering,” said CEE professor Wilkins Aquino. “Duke has considerable strengths in both data analytics and model-based simulation, and we anticipate this initiative will serve as a model for how methodologies from different fields can be integrated to solve big problems in new ways.”

Learn more about Duke Engineering’s data science initiatives at

To get involved, please contact:

Ravi Bellamkonda
Vinik Dean of Engineering
[email protected]

Jim Ruth
Associate Dean for Development & Alumni Relations
[email protected]

Kirsten Shaw
Director of Corporate & Industry Relations
[email protected]


BOX 90271
DURHAM, NC 27708-0271
