Science in a Complex World: Big Data: Opportunity or Threat?

Chris Wood/ Courtesy Santa Fe Institute

What do the National Security Agency, the National Science Foundation, Google, Netflix, Amazon and even your local grocery have in common?

Big Data, that’s what.

Big Data is a loose term for the collection, storage and sophisticated analysis of massive amounts of data, far larger and from many more kinds of sources than ever before. Organizations like those above, and more every day, are collecting and analyzing the myriad electronic bread crumbs we generate in our daily activities, and they’re exploiting that data to predict our actions and behaviors to help accomplish their objectives.

The Economist recently enthused: “Big data is the electricity of the 21st century — a new kind of power that changes everything it touches in business, government and private life.”

In the biggest Big Data effort of all, the NSA’s goal is to be able to acquire intelligence data from “anyone, anytime, anywhere.” The classified documents leaked by whistle-blower Edward Snowden make clear that NSA’s penetration of the telecommunications and computer industries is far broader and deeper than even the agency’s most extreme critics imagined.

In addition to being the hottest new trend in business and government, Big Data is fast becoming a pervasive force in modern science. Last year, the Obama administration launched a $200 million Big Data in Science initiative, with the goals of enhancing economic growth and job creation, education and health, clean energy and environmental sustainability, public safety and global development.

What are we to make of all this? Are Big Data and predictive analytics truly a gold mine for business, science and government? Or are they a serious threat to our privacy and freedom?

This is a highly complex problem with enormous consequences for both science and society. That’s why my colleagues and I at the Santa Fe Institute and its Business Network recently invited more than 100 experts from industry, science and government to Bishop’s Lodge in Santa Fe to give careful thought to Big Data’s opportunities and threats. Here are some highlights of what we learned:

• Kenneth Cukier, data editor for The Economist, made the case stated in the title of his recent book Big Data: A Revolution That Will Transform How We Live, Work, and Think.

• Computer technologist and author Jaron Lanier summarized the key idea of his recent book, Who Owns the Future?, that the Internet is an engine of increasing inequality in wealth and power. Unless we act quickly to stem this trend, he contends, our economy and society will grow increasingly extreme, polarized and dysfunctional.

• Dan Wagner, CEO of the startup Civis Analytics and data analytics lead of President Barack Obama’s 2012 campaign, described how he and his colleagues helped transform political campaigning from a focus on traditional voting blocs based on age, gender and ethnicity to campaigns targeted to specific individual citizens.

• Astrophysicist Alex Szalay of Johns Hopkins University said science is moving rapidly toward a “Fourth Paradigm: Data-Intensive Scientific Discovery.”

• In “Big Data, from Galileo to Gödel,” Simon DeDeo, a former SFI Omidyar Postdoctoral Fellow, showed how the constructive interplay of Big Data, theory and computation can reveal underlying truths, not only in the physical and biological sciences, but also in the social sciences and even the humanities.

• Noted historian of the NSA James Bamford addressed the “Dangerous Duo: When Big Brother and Big Data Come Together.” Without needed legal constraints and congressional and court oversight, he argued, the NSA’s ever more sophisticated data collection, analysis and code-breaking capabilities pose serious threats to privacy and freedom.

So, what should we conclude? Is Big Data the opportunity its proponents contend? Or is it a threat whose costs outweigh its potential benefits?

Based on our assessment, Big Data is quite clearly both, depending upon the specific application being considered.

In business, the mix can vary across the type of business and the degree to which customers perceive Big Data to be in their own interest or merely in the interest of those trying to sell them something. For example, some of us will find that the “free” services and the convenience of “you might be interested in” offered by Google, Facebook and Amazon are well worth the costs of providing them extensive information about ourselves or viewing the ads they relentlessly deliver to us. Others will decide the benefits are not worth those costs and will “just say no.” But at least in the cases of Google, Facebook, Amazon and their kin, we have the opportunity to choose. In other cases (auto insurance, credit histories, law enforcement, the NSA), we do not.

In science, the mix of opportunity and threat varies, too. A number of our speakers, including SFI Distinguished Professor and past president Geoffrey West, emphasized the essential role of theory in using and understanding Big Data. In a world where scientists are drinking from the data fire hose, the data are of little use without theory. And the data need to be the right data for the scientific questions at hand. For example, the availability of large-scale social network data from Twitter and Facebook has captured the attention of social scientists. But are the conclusions drawn from studies of our behavior on social media networks likely to generalize to the real world of everyday interpersonal interactions? We shall see. What is clear is that the scientific questions need to drive the collection of data and not vice versa.

The tension between opportunity and threat is most acute for the NSA. Gen. Keith Alexander, NSA director, has argued that hunting for terrorists in the deluge of telecommunications and Internet data is like trying to find a needle in a haystack, and “you need the haystack to find the needle.” This “collect it all” strategy ignores the fact that as the total amount of data increases without bound while the number of real threats does not, the ratio of true-positive “needles” to false-positive “chaff” decreases accordingly. A data collection approach targeted at suspected individuals and groups is likely to be more productive, not to mention more constitutional.
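The arithmetic behind the haystack point can be sketched in a few lines of Python. Every number here is a hypothetical assumption chosen for illustration (100 real targets, a filter that catches 99 percent of them and wrongly flags 0.1 percent of everyone else); none of these figures come from the NSA or from our meeting, and the function name is invented for this sketch.

```python
def screening_outcome(population, true_targets, hit_rate, false_alarm_rate):
    """Return (true_positives, false_positives, precision) for a simple filter.

    All inputs are illustrative assumptions, not measured values.
    """
    innocents = population - true_targets
    true_positives = true_targets * hit_rate        # real "needles" that get flagged
    false_positives = innocents * false_alarm_rate  # "chaff" flagged by mistake
    flagged = true_positives + false_positives
    precision = true_positives / flagged if flagged else 0.0
    return true_positives, false_positives, precision


# Hold the number of real targets fixed and let the haystack grow.
for population in (1_000_000, 100_000_000, 10_000_000_000):
    tp, fp, precision = screening_outcome(
        population, true_targets=100, hit_rate=0.99, false_alarm_rate=0.001
    )
    print(f"{population:>14,} records: {tp:.0f} needles, "
          f"{fp:,.0f} false alarms, {precision:.4%} of flags are real")
```

Under these assumptions the count of real needles never changes, while the false alarms grow in step with the haystack, so the share of flags worth following up shrinks each time the collection widens.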

Telecommunications and Internet companies are starting to push back against unfettered data collection by the NSA, and Congress on both sides of the aisle is beginning to question the “you need the haystack” rationale. Whatever your own views on Big Data and the NSA, I believe we can achieve a better balance between our government’s legitimate role of protecting its citizens and its equally important role of ensuring the constitutional guarantees of privacy and freedom.

I also believe that the proliferation of Big Data and predictive analytics in all their manifestations is an urgent matter requiring our immediate attention. In the Steven Spielberg film Minority Report, individuals could be arrested for crimes they had not yet committed but were deemed likely to commit by psychics called “pre-cogs.” Without adequate legal and regulatory protection, Big Data and predictive data analytics threaten to become the Minority Report of the all-too-near future.

ABOUT THE AUTHOR

Chris Wood is vice president for administration and director of the Business Network at the Santa Fe Institute. He received his doctorate from Yale University in 1973 and was on the Yale faculty with joint appointments in the departments of psychology, neurology and neurosurgery until 1989. Wood led the biophysics group at Los Alamos National Laboratory from 1989 until he became SFI’s vice president in 2005. His research interests include imaging and modeling the human brain, computational neuroscience and biological computation.

ABOUT THE SERIES

The Santa Fe Institute is a private, not-for-profit, independent research and education center founded in 1984 where top researchers from around the world gather to study and understand the theoretical foundations and patterns underlying the complex systems that are most critical to human society — economies, ecosystems, conflict, disease, human social institutions and the global condition. This column is part of a series written by researchers at the Santa Fe Institute and published in The Santa Fe New Mexican.

(4) comments

Chris Wood

Three significant events regarding NSA surveillance have happened in the short time since "Big Data: Opportunity or Threat?" was published:

1. Federal Judge Richard Leon ruled that the NSA’s program to collect metadata on US phone calls "almost certainly" violates Fourth Amendment constitutional guarantees.

2. Leaders of major tech companies such as Google, Microsoft, Twitter, Apple, Netflix, Yahoo, and AT&T met with President Obama to urge restraint on NSA surveillance.

3. An independent panel of tech and security experts established by President Obama released its report containing dozens of recommendations to limit NSA programs. According to the New York Times, “Taken together, the recommendations would remove from the N.S.A.'s hands the authority to conduct many of its operations without review by the president, Congress or the courts.”

These developments do not, by themselves, eliminate the threats to privacy and freedom posed by the NSA surveillance programs. But at least the public discussion has begun.

Chris Wood

Readers interested in more technical aspects of NSA's electronic surveillance programs may wish to see recent blog entries by cryptographer Matthew Green of Johns Hopkins ("On the NSA") and computational complexity theorist Scott Aaronson of MIT ("NSA: Possibly breaking US laws, but still bound by laws of computational complexity").

The SFNM comment system does not allow internet addresses (URLs) so I can't provide the direct links, but just search for each author's name and "NSA" using your favorite search engine. The titles of relevant blog posts are given above.

Chris Wood

Hi Kim,

Thanks for your comments.

The question you raise about strategy for political change is a topic in which we all have a significant stake as citizens, but it’s well beyond the scope of my already broad article.

I agree with you that change is an uphill fight given the record of the FISA court, congressional gridlock, and public apathy. But if we use pessimism about change as a reason to do nothing, that pessimism will surely become a self-fulfilling prophecy.

With respect to the FISA court, there was an interesting piece on NPR's Morning Edition yesterday. One of the interviewees was Jennifer Granick, Director of Civil Liberties at the Stanford Law School Center for Internet and Society:

Interviewer: "Granick says adding new oversight, more technology, and better rules might help bring those programs in line. But she's not convinced that surveillance this complicated can ever be controlled."

Granick: "You may be seeing evidence that there is no way to make bulk collection and mass surveillance work in a democracy - period. That it is just incompatible with a democracy."

That's not far from the position I was trying to articulate.

With respect to Smith v. Maryland (1979), I'm of course no constitutional authority, but the readings of that decision I’m familiar with limit it to the then-present landline telephone metadata, not the pervasive internet metadata the NSA is collecting. It will be interesting to see whether the question of internet metadata, or any NSA case, makes it to the Supreme Court.

Kim McCoy

I'll accept that this is the first article in a series, so Mr. Wood may just have intended to give an account of the problem, but I would have been interested to know what solutions Mr. Wood has to offer. Specifically, I want to know how he thinks any changes to our current system of surveillance could possibly be implemented considering 1.) the FISA court's powers and opinions are largely secretive, 2.) the federal legislature is not unified on the issue and is incompetent, and 3.) the American people, Snowden excluded, are apparently uninterested in protesting such invasions of privacy.

Furthermore, legal precedent has already been set with Smith v. Maryland in 1979, in which the court established that people do NOT have a "reasonable expectation" of privacy for electronic metadata. With so much invested in the system of surveillance, how can we possibly expect change? Spy agencies and the FISA court don't want to give up their power and secrecy, and I'm not sure anyone has the power to compel them to.

Mr. Wood, I would love to see specific answers to the tough questions next time!
