Big Data, Correlation Or Causation?

Gordon Crovitz wrote about Big Data in the Wall Street Journal (25 March 2013) this week.

He cites an interesting notion from the book “Big Data: A Revolution That Will Transform How We Live, Work, and Think”: in processing the massive amounts of data we are capturing today, society will “shed some of its obsession for causality in exchange for simple correlation.”

The idea is that, in the effort to speed up decision-making, we will to a greater or lesser extent lack the time and resources for the scientific method to determine why something is happening, and will instead settle for knowing what is happening–through the massive data pouring in.

While seeing the trends in the data is a big step ahead of being overwhelmed, possibly drowning in data and not knowing what to make of it, it is still important that we validate what we think we are seeing by scientifically testing it and determining whether there is a real reason for what is going on.

Correlating loads of data can lead to interesting conclusions, as when Google Flu predicts outbreaks (before the CDC) by combing through millions of searches for things like cough medicine. But correlations can be spurious: for example, when a new cough medicine comes out and people are just looking up information about it, there is no real outbreak of the flu. (Maybe not the best example, but you get the point.)
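
To make the cough medicine scenario concrete, here is a toy sketch in Python. All the numbers are made up for illustration (the weekly flu counts, the search volumes, and the size of the "product launch" bump are assumptions, not real Google or CDC data). It shows how a search signal that tracks flu cases closely can drift away from them once something unrelated starts driving the searches.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical weekly flu cases over ten weeks (invented numbers).
flu_cases = [10, 12, 15, 30, 60, 90, 70, 40, 20, 12]

# Searches driven only by the flu: a baseline plus a few searches per case.
searches_flu_only = [100 + 5 * cases for cases in flu_cases]

# A new cough medicine launches in the last two weeks, when flu is low,
# and curious shoppers add extra searches unrelated to illness.
launch_bump = [0, 0, 0, 0, 0, 0, 0, 0, 400, 400]
searches_with_launch = [s + b for s, b in zip(searches_flu_only, launch_bump)]

r_flu_only = pearson(flu_cases, searches_flu_only)
r_with_launch = pearson(flu_cases, searches_with_launch)

print(f"flu-driven searches vs. flu cases: r = {r_flu_only:.2f}")
print(f"with product launch vs. flu cases: r = {r_with_launch:.2f}")
```

In the last two weeks the search volume alone would look like an outbreak, even though the (made-up) flu counts are low. Knowing the cause of the searches is what separates the two cases.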

Also, just knowing that something is happening, like an epidemic, global warming, flight delays, or whatever, is helpful for situational awareness, but without knowing why it’s happening (i.e. the root cause), how can we really address the issue and fix it?

It is good to know when data is pointing us to a new reality, because then at least we can take some action to avoid getting sick or waiting endlessly in the airport. But if we want to cure the disease or fix the airlines, then we have to go deeper, find the cause, and attack it–to make it right.

Correlation is good for a quick reaction, but causation is necessary for long-term prevention and improvement.

Computing resources can be used not just to sift through petabytes of data points (e.g. to come up with neighborhood crime statistics), but to actually help test various causal factors (e.g. socio-economic conditions, community investment, law enforcement efforts, etc.) by processing the results of true scientific testing with proper controls, analysis, and drawn conclusions.

Alienware Rocks

So this is the nicest looking laptop I have ever seen by far–and it’s made by Alienware, a subsidiary of Dell (acquired in 2006).

Apple, I never thought I’d be saying it.

But Alienware rocks!

The sci-fi style with beautifully lit keyboard and advanced features for gaming make this one awesomely powerful piece of hardware.

I can’t believe that kids are actually carrying these into school nowadays.

See video review of premier M18X Alienware gaming laptop here.

If you want unbelievable graphics display, memory, sound, processing power, storage, and style–this is it in laptop computers.

Plus it comes with the cute alien figure etched on the cover.

I want one! 😉

How Good Is Our DNA

Where do we store the vast and expanding information in our universe?

These days it’s typically in 0s and 1s–binary code–on computer chips.

But according to the Wall Street Journal (18 August 2012), in the future, it could be encoded in the genetic molecules of DNA.

DNA has “vastly more capacity for their size than today’s computer chips and drives”–where a thumb-sized amount could store the entire Internet–or “1.5 milligrams, about half the weight of a house ant, could hold 1 petabyte of data, which equals to 1,000 1-terabyte hard drives.”

As opposed to binary code, DNA will store information as strands made up of four base chemicals: adenine (A), guanine (G), cytosine (C) and thymine (T).

Just as letters in the alphabet make up words, sequences of these 4 base chemicals can store biological instructions (e.g. 3 billion base pairs for a person) or any other information.

Using DNA for storage involves 4 key steps:

1) Encoding information into binary code

2) Synthesizing the chemical molecules

3) Sequencing them in a string to hold the information

4) Decoding the molecules back into information
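
The encoding and decoding ends of these steps (1 and 4) can be sketched in a few lines of Python. This is only an illustration: the mapping of two bits to each base (00→A, 01→C, 10→G, 11→T) is an assumption chosen for simplicity, not the scheme the researchers used, and the chemical synthesis and sequencing in steps 2 and 3 are represented here by nothing more than a string of letters.

```python
# Hypothetical mapping: each pair of bits becomes one of the four bases.
BASE_FOR_BITS = {"00": "A", "01": "C", "10": "G", "11": "T"}
BITS_FOR_BASE = {base: bits for bits, base in BASE_FOR_BITS.items()}


def encode(text: str) -> str:
    """Turn text into binary, then into a strand of A/C/G/T letters."""
    bits = "".join(f"{byte:08b}" for byte in text.encode("utf-8"))
    return "".join(BASE_FOR_BITS[bits[i:i + 2]] for i in range(0, len(bits), 2))


def decode(strand: str) -> str:
    """Turn a strand of A/C/G/T letters back into the original text."""
    bits = "".join(BITS_FOR_BASE[base] for base in strand)
    data = bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))
    return data.decode("utf-8")


strand = encode("DNA")
print(strand)          # four bases per byte of text
print(decode(strand))  # round-trips back to the original text
print(encode("A"))     # → CAAC
```

With two bits per base, every byte of data takes four bases. Real schemes add error correction and avoid long runs of the same base, which this sketch ignores.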

Overall, DNA is seen as a “stable, long-term archive for ordinary information”–such as books, files, records, photos, and more.

Researchers have actually been able to store an entire book on genetic engineering–with 53,426 words–in actual DNA, and “if you wanted to have your library encoded in DNA, you could probably do that now.”

With the cost declining for synthesizing and sequencing DNA, this type of data storage may become commercially practical in the future.

And with the amount of information roughly doubling every 2 years, large amounts of reliable and cost-effective memory remain an important foundation for the future of computing.

Frankly, when we talk about storing so much information in these minute areas, it is completely mind-boggling–really no different than trying to imagine all the stars in the vastness of the sky.

It is almost incredible to me that we have people who can not only understand these things, but make them work for us.

With NASA’s Curiosity Rover exploring Mars over 34 million miles away, and geneticists storing libraries of information in test tubes of DNA coding, we are truly expanding our knowledge at the edges of the great and small in our Universe.

How far can we continue to go before we discover the limitations to our quest or the underlying mysteries of life itself?

What is also curious to me is how on one hand, we are advancing our scientific and technological knowledge as a society, yet on the other, as individuals, we seem to be losing our knowledge for even basic human survival.

How many people these days are proficient on the computer in an office setting, but couldn’t survive in the wilderness for even a few days?

Our skill sets are changing drastically–this is the age of the microwave, but knowing how to cook is a lost art to many.

So are we really getting smarter or just engaging our minds in a new direction–I hope we have the DNA to do more than just one! 😉

(Source Photo: adapted from here with attribution to Allen Gathmen)