Monday, April 17, 2017

Data Analytics

What is Data Analytics?

According to techopedia:
Data analytics refers to qualitative and quantitative techniques and processes used to enhance productivity and business gain. Data is extracted and categorized to identify and analyze behavioral data and patterns, and techniques vary according to organizational requirements.
Data analytics has become a huge part of the business and tech world. As computers and algorithms become more refined, it has seemingly endless uses for finding meaning in huge amounts of data that might otherwise be considered useless.

What is "Big Data"?
Big data simply refers to huge amounts of data that are analyzed to create insights.

Why is this a Big Deal?
Recently, Wired posted an article titled "Big Data and Analytics: The Hero or the Villain". The article details how big data analytics has become one of the most controversial topics in the tech world today, including its use by the NSA. The NSA is able to analyze petabytes of user data in order to gain national security intelligence. The fact that the NSA can analyze PETABYTES of data and draw strong connections from it is absolutely incredible.

Data is also used by online advertisers and retailers, who can target advertising based on your gender, age, likes, dislikes, geographic location, and hundreds of other data points, just from the websites you frequent and the items you have purchased online. Remember Amazon's "if you buy x item, you might want y item" recommendations.
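
To make the "if you buy x, you might want y" idea concrete, here is a minimal sketch in Python. It is not Amazon's actual method, just a simple count of which items show up together in the same order. The function names and the sample orders are made up for illustration.

```python
from collections import Counter
from itertools import combinations


def co_purchase_counts(orders):
    """Count how often each pair of items appears in the same order."""
    counts = Counter()
    for order in orders:
        # sorted(set(...)) drops duplicates and gives each pair a stable key
        for a, b in combinations(sorted(set(order)), 2):
            counts[(a, b)] += 1
    return counts


def recommend(item, orders, top_n=3):
    """Return the items most often bought together with `item`."""
    scores = Counter()
    for (a, b), n in co_purchase_counts(orders).items():
        if a == item:
            scores[b] += n
        elif b == item:
            scores[a] += n
    return [other for other, _ in scores.most_common(top_n)]


# Hypothetical order history, invented for this example
orders = [
    ["tent", "sleeping bag"],
    ["tent", "sleeping bag", "lantern"],
    ["book"],
]
print(recommend("tent", orders))
```

Real recommendation systems weigh far more signals than this, but the basic intuition of "people who bought x also bought y" is the same.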

What are the Uses of Data Analysis?
Data analysis occurs in almost every industry. Most prominently, it's used in marketing to target new consumers for products and services, in healthcare to make medical breakthroughs (not Google Flu Trends), in financial trading, and simply to help make business operations more efficient. Data analytics is even used in the sports industry. For example, in tennis, analysts use players' previous match data to predict the likelihood of a player winning a match if they keep their first serve percentage above X percent. The same can be done with dozens of other data points. Data analysis is also heavily used in basketball and other professional team sports, although Charles Barkley may disagree.
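
The tennis example boils down to a conditional win rate: out of the past matches where a player's first serve percentage was above X, how many did they win? Here is a rough sketch in Python. The function name and the match data are hypothetical, and a real model would use far more matches and more variables.

```python
def win_rate_above(matches, threshold):
    """Share of matches won among those with first serve % above threshold.

    `matches` is a list of (first_serve_pct, won) pairs, where
    first_serve_pct is a fraction like 0.65 and won is True or False.
    Returns None when no match clears the threshold.
    """
    outcomes = [won for pct, won in matches if pct > threshold]
    if not outcomes:
        return None
    return sum(outcomes) / len(outcomes)


# Made-up match history for one player (not real data)
matches = [
    (0.70, True),
    (0.65, True),
    (0.62, False),
    (0.55, False),
    (0.50, True),
]
print(win_rate_above(matches, 0.60))
```

With only a handful of matches the estimate is very noisy, which is exactly why analysts want large amounts of historical data.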

Why is it considered Controversial?
There are two main controversies surrounding data analytics: privacy, and causation vs. correlation. Privacy has become a huge issue in recent years due to the unfolding of government surveillance programs as well as online data being sold for marketing and advertising purposes.

In regards to correlation and causation: correlation does not equal causation and causation does not equal correlation. This was an issue that Google faced when it created Google Flu Trends. The premise of the operation was to analyze Google search data in order to predict, better than the CDC, where the flu had hit and would hit. But it is very difficult to prove that people Googling certain terms are sick, or about to be sick, with the flu. Data scientists have to be very careful not to draw inferences from data that may or may not be true.
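
As a small illustration of why correlation alone is not enough, here is a sketch in Python that computes the Pearson correlation between two series. The weekly numbers are invented for this example and are not Google's or the CDC's data. The two series move together, but the number by itself cannot tell us whether the people searching are actually sick or are just reacting to news coverage of flu season.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two series of the same length (at least 2)")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


# Invented weekly counts: flu-related searches and reported flu cases
searches = [120, 150, 310, 400, 380, 200]
flu_cases = [10, 14, 30, 41, 37, 19]

# A strongly positive value says the series rise and fall together,
# not that one causes the other.
print(pearson(searches, flu_cases))
```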



6 comments:

  1. Hi there, Caroline. I was wondering whether, in your research on Google Flu Trends, you found any statistical data comparing the correlation Google speculated with what actually turned out to happen once time had passed. I did not notice the article saying anything about it. I'm just curious if there were any studies of the accuracy of Google's predictions of where the flu would hit.
    Thank you!

    1. Hey Matt, https://www.wired.com/2015/10/can-learn-epic-failure-google-flu-trends/ this article helped explain it a lot. The major issue for Google Flu Trends was "big data vs good data". One point of info was that Google missed the peak of flu season by 140%.

  2. Hi Caroline,
    Do you think all the new IoT devices are complicating data analytics or helping? In other words, is the new generation of IoT and automation helping the NSA collect the data they want, like everyone says?

    1. It's an interesting idea. I think IoT allows more information to be collected, and collected continuously. Alexa, for example, records everything you say to it. In terms of complicating data analytics, I think the issue is still the same: even with huge amounts of data, analysts still need to be careful about the inferences they make. One example would be data brokers. Our information is constantly being collected from our web traffic to build a consumer profile, in which we are bucketed with people who have similar browsing or online shopping habits. Some of these profiles may not be completely accurate, which can be problematic for advertisers when they want to pinpoint consumers. The same can be said for the NSA's data collection. I hope this answers your question.

  3. I thought you outlined the different uses of data analytics very well! It's interesting to see how one specific practice can be applicable in so many different industries and fields. I also thought the articles you referenced are great resources to learn more about data analytics.

  4. Data analytics is a huge part of how the world works today and needs to be emphasized more, especially given the amount of data being put onto the internet now and the amount there is going to be in the future!

    I thought you outlined every aspect of data analytics very well and gave a great description of how this career field works, as well as how important it is to the future and to how our world is going to function, especially through the eyes of IT.
