"It's kind of like finding a needle in a haystack."
Jignesh Patel is sitting in a Madison café talking about big data. Between sips of coffee, the University of Wisconsin computer sciences professor uses the familiar expression to explain just what this buzzy tech phrase is all about before launching into a remarkable story about Madison's connection to its past, present and future.
The tech trend du jour, "big data" refers to the vast amount of information that's out in the world, made possible by the digital trails we leave as an ever-increasing portion of our routine activities moves online: email, web searches, online banking and shopping, social media posts and so on. All of this digital activity, coupled with the minuscule cost of storing the information, enables those who can harness and analyze large sets of data to learn all kinds of valuable information about the world and those living in it. It's big data you have to thank (or blame) when you see a web ad for a store you were just browsing online, and it's what makes it possible for the weather app on your phone to tell you if it's going to snow today. In Patel's hay and needle narrative, the enormity of all the data that's out there is the hay; the tiny portion that someone actually wants is the proverbial needle. That weather app can tell you it's going to snow in Madison today at noon, yes, but it also has information on the temperature in New York, the wind speed in San Diego and the overnight low in Little Rock.
And while the term big data has been around for a decade, crossing over from geekspeak to mainstream media usage over the past year or two, what it stands for—having an infinite amount of information available at the swipe of a finger—is what defines this digital revolution we're in.
Patel, forty-three, has spent the last twenty years, most of them in Madison, studying data. His work has led him to receive such prestigious honors as the National Science Foundation Career Award and faculty awards from companies like Google, IBM and Microsoft. Research from his PhD, conducted at UW in the 1990s, was commercialized by computer hardware giant NCR Corp, where Patel worked as a software engineer consultant for a year. He's published more than eighty academic papers and has been courted by all the major social networking companies, including Twitter, which acquired a startup Patel co-founded, called Locomatix, last August. Locomatix utilizes large sets of data to power real-time mobile analytics for businesses. While Twitter's plans for the Locomatix platform remain veiled, it's clear that the pithy—and now public—social messaging company is after Patel's kind of brainpower.
Originally from Mumbai, India, Patel first came to UW in 1991 as a graduate student to study computer hardware. He took one fateful course in databases, a sub-field of computer sciences that deals with organizing and analyzing data sets, and was hooked. While the topic of databases itself was interesting, it was UW's reputation in this field and the people pushing it forward that won him over.
"A lot of data processing stuff that now runs the world, the ideas for that were invented here," Patel says. "The influence of Wisconsin, going back to its legacy as being the pioneer in ... building database technology, is at the heart of technology today."
The groundbreaking research Patel references is the stuff of the Wisconsin Database Systems Group, a research collective launched by then-associate professor David DeWitt in the late 1970s. "It was one of the very few departments around the world at the time that actually saw the value of data," says Patel, who received his master's and PhD from UW in 1993 and 1998, respectively, with DeWitt as his advisor.
DeWitt, now professor emeritus after a thirty-two-year tenure in the UW computer sciences department, including four years as its chair, is big stuff. He is thought of as the father of parallel database systems—a technical term that Patel says is the precursor to big data. Parallel database systems changed the game. It was like going from a single-lane highway plagued by traffic jams to a six-lane freeway where the cars are all still headed in the same direction but are able to move faster and more efficiently. "The world's economy runs on databases," DeWitt says.
With DeWitt at the helm, the group quickly grew to five members and became one of the premier academic research collectives of its kind right from the start. "Nobody else had five people in database systems," DeWitt says. Considering that the "nobody else" here refers to Berkeley, MIT and the University of Michigan—the only other institutions with similar groups at the time—it's no wonder Wisconsin's early dedication to database systems secured its prominence in the field early on.
Jeff Naughton, current chair of UW's computer sciences department, left his faculty post at Princeton to return to his hometown of Madison because of UW's expertise in databases, his area of research.
"This was like the center of the universe for database systems," he says.
And because UW's computer sciences department played such a large role in developing this core technology, many higher-ups at technology companies are UW alumni. "Google has a strong influence of Wisconsin folks on the back-end," Patel says. "Many were graduate students with me in the '90s. Same thing with Facebook and Twitter." Companies like Microsoft, Yahoo, Oracle and IBM have also had former members of the Wisconsin Database Systems Group within their senior leadership teams.
After receiving his PhD and consulting with NCR Corp. for a year, Patel moved to Ann Arbor to teach at the University of Michigan. He stayed there for nine years, eventually coming back to Madison in 2008 in a rare move for UW's computer sciences department, which, according to DeWitt, seldom hires back its own PhD graduates, preferring to bring new blood into the department. But they wanted Patel back. He's that good.
Back with the Wisconsin Database Systems Group, Patel developed a particular interest in how much energy is consumed in running database systems, especially when he thought about the energy needs his two young children's generation will inherit. The problem? "You can't put in more power than we already put in," he says. The amount of data in the world doubles every eighteen months to two years. "But to match that we obviously can't start to double our power budget every two years," Patel says. "That's unsustainable." This power limit appears across the spectrum of computing, from a personal tablet or phone to hundreds of thousands of corporate- and government-owned servers across the globe.
Think of it this way: If you wanted your iPad to be twice as fast (taking half as much time to load a Netflix movie, for example), the extra power it would need would also heat up the device, making it twice as hot. "You probably wouldn't like that very much," Patel says. More than just making the device painful to the touch, such a heat increase would melt its inner parts.
So that's what Patel's working on now. His current research at UW, a project launched in 2011 called Quickstep, is working toward developing a next-generation database system that will organize and process big data more efficiently to make the best use of the power that's available. If you can't put more power in, you have to make use of what you have, doing more with less.
Patel and his ten graduate research assistants are taking a nuanced approach with the Quickstep project, developing both computer hardware and software together to complement each other and make this new database system as efficient as possible.
Craig Chasseur, a current PhD student and one of the Quickstep research assistants, says that the advances in software and hardware over the past few decades have not matched up. "Current database systems are leaving a lot of potential ... on the table. A big thing we're addressing with Quickstep is unlocking all that potential."
Quickstep is an ambitious project, with funding from Google, Oracle and the National Science Foundation, among others. It's also a tad ironic. Currently, only big firms with deep pockets and hundreds of thousands of servers—like the aforementioned funders—have the ability to fully leverage what such a vast pool of data has to offer. It's too expensive, time-consuming and complicated for the little guys and non-techies to tackle. But if Patel's Quickstep project succeeds in developing a more efficient system, that could change. The world's data—ninety-eight percent of which is now stored digitally—could be captured and processed by more organizations, a result in sync with a recently published study from Intuit that forecasted as-yet-untapped benefits big data can bring to small businesses and individual consumers.
"One outcome of Quickstep is just being able to do more with less," says Chasseur. "Having smaller organizations that have fewer resources being able to take advantage of big data."
The implications here are huge. More people leveraging large data sets could impact everything from marketing campaigns to health care trends to hiring practices, continuing modern society's shift away from experience-based decision making and toward data-driven decision making. Understanding demographic trends and predicting purchasing behaviors would no longer be limited to the big shots. Want to know which time of day is most common for women in their thirties to be searching online for health news? Easy. Want to narrow the focus to women in their thirties in south-central Wisconsin searching for common cold remedies? No problem.
Patel says Quickstep is a five- to ten-year project, and though the team has started seeing the first signs of tangible results, it may take another year or two for private industry to commercialize it. When the outcomes from Quickstep eventually do make it into products on the market, Patel says consumers will see better services at a lower price, with richer interactions. This could mean more powerful and interactive apps for phones and tablets and faster processing on laptops and desktop computers, all for less than what we're paying now. Online activities like paying bills and booking flights—things that run on databases—would come with fewer headaches.
Quickstep is still solely a UW project, but as Patel brings in collaborators from around the country, this research could make for a striking shift in who and what benefits from big data in the future.
Traditionally thought of as a field of introverts and brainiacs, computer science itself is benefitting from the way big data is changing the world.
"Twenty-five years ago, I thought we were the people in the background—we weren't on people's radar," says department chair Naughton, talking in his office about the industry's gradual shift into the mainstream. "Now people are interested in what we're doing. You have a connection to what's going on in society now."
Naughton thinks this could partially explain why his department has seen such growth in recent years. He pegs the number of declared undergrad computer sciences majors at seven hundred. The graduate program in particular has become extremely competitive, with 1,250 applications and only 219 students admitted last year, he says. "There's tremendous interest from students. The wave isn't even close to cresting."
Moving forward, Naughton wants to see his department collaborate even more with the local tech community, strengthening the pipeline between the university and local startups and larger companies like Epic. The department also maintains its ties with tech firms across the globe, regularly hooking up students with internships and jobs at the Googles and the Facebooks of the world.
Microsoft has a strong and unusual connection with UW as well. When DeWitt retired from academia in 2008, he took on the role of director of a new Microsoft lab in Madison. Called the Jim Gray Systems Lab, the office occupies fourth-floor space in a gentrified warehouse on West Main Street, right next to the Southwest Commuter Bike Path.
"It was really a thank-you from Microsoft to the university," DeWitt says. The lab now employs nine full-time Microsoft staffers who collaborate with select UW computer sciences graduate students and their advisors. Both Patel and Naughton serve as affiliated faculty. DeWitt says that as far as he knows, this partnership between Microsoft and the university is totally unique, noting that the kind of sharing of intellectual property between Microsoft and UW that goes on at the lab is rare—even unheard of—in typical industry–academia relationships. So rare, and legally complicated, that it almost didn't happen. It came down to a final, personal push from Bill Gates and then-chancellor John Wiley to make the lab a reality.
This forward-looking type of research is important to the department, Naughton says, but he also wants to maintain a high level of instruction to prepare the next generation of Patels and DeWitts. "We're very serious about teaching in this department."
Patel shares this commitment to education, adding that if he wanted to (and he doesn't) he could drop teaching and focus all his time at UW on research. Instead Patel continues to lead both undergrad and graduate-level courses, and he even started an annual software competition for undergraduates on campus called NEST. "The idea is to provide an environment outside the regular curriculum to allow students to take an adventurous leap to a software startup-ish type of an idea," Patel says.
One local startup that has competed in NEST is the online food ordering business EatStreet. Eric Martell, one of EatStreet's three co-founders and a computer sciences alumnus, remembers Patel reaching out to his team when they signed up to compete. "Jignesh immediately scheduled up a few times when he wanted to talk, check in and kind of mentor us through the contest," he says. Martell and company took third place in the 2010 contest, but their ties to Patel were never severed; Patel now serves on EatStreet's board of advisors. "As a young company learning to scale infrastructure, we continued to stay in touch with him. He provided mentorship on a technical side and a business side," says Martell.
While indicators like a growing department, a campus software contest and good alumni relations aren't exclusive to UW, Patel and DeWitt each explain, in their own way, that UW's program stands out. "I really believe there's something unique here in Wisconsin and in Madison," Patel says. "It's a very nurturing, intellectually creative environment." He particularly notes how the university supports bold new ideas—like those of David DeWitt in the '70s and '80s. "Somehow we allow individuals to flourish, and the whole ecosystem comes around to let them flourish."
DeWitt also credits university leaders such as former dean of the College of Letters and Science Gary Sandefur, who served from 2004 until just last year. "The administrations have treated computer sciences very generously," he says.
Whatever the reason, groundbreaking research in the field of computer sciences marches on over on West Dayton Street. And while the West Coast is usually thought of as the hotbed of the digital revolution, people like Patel, DeWitt and Naughton are putting Madison on the map and showing that places like UW are not only in the race to harness big data—they're at the head of the pack.
Grace Edquist is associate/web editor of Madison Magazine.