Data Science – Introduction to Data Science

by The WebGate

 

From social media to IoT devices, an enormous amount of data is generated every day. By one often-quoted estimate, 1.76MB of data will be generated every second by 2025. Data Science is the process of using data to find solutions to a problem or to predict the outcome of an event. A company like Uber, for example, uses a surge pricing algorithm to draw more drivers to its busiest locations.

Data Science Processes

The process starts with the business requirement, that is, understanding the problem you are trying to solve. Uber, for example, aims to build a dynamic pricing model that takes effect when many people in the same area request rides at the same time. The steps that follow are:

  1. Data Collection: the data relevant to the problem is gathered.
  2. Data Cleaning: sometimes unnecessary data is collected. It only increases the complexity of the problem, so it is removed at this stage.
  3. Data Exploration and Analysis: this is where patterns in the data are understood.
  4. Data Modelling: this includes building a machine learning model that predicts the surge at a given time and location. The model is trained by feeding it thousands of records and events so that it can predict the outcome more precisely.
  5. Data Validation: in this stage, the model is tested when a new user books a ride. The data from the new booking is compared with the historic data to check for anomalies in the surge prices or false predictions.
  6. Deployment and Optimization: after the model has been tested and its efficiency improved, it is rolled out to all users.
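
The steps above can be sketched in a few lines of Python. This is only an illustration, not Uber's actual algorithm: the ride records, the function names, and the choice of a straight-line model fitted to the ratio of requests to drivers are all assumptions made for the example.

```python
from statistics import mean

# Hypothetical ride records: (ride_requests, available_drivers, observed_surge).
# None marks a missing value. All numbers are invented for illustration.
RAW_RECORDS = [
    (120, 40, 2.0),
    (30, 30, 1.0),
    (90, 45, 1.5),
    (None, 20, 1.2),   # missing request count, dropped during cleaning
    (200, 50, 2.5),
    (60, 40, 1.3),
    (150, 50, 2.0),
    (45, 0, 1.8),      # zero drivers makes the ratio undefined, dropped
]


def clean(records):
    """Data cleaning: keep only complete records with at least one driver."""
    return [r for r in records if None not in r and r[1] > 0]


def demand_ratio(requests, drivers):
    """Data exploration: ride requests per available driver."""
    return requests / drivers


def fit_line(xs, ys):
    """Data modelling: ordinary least squares for y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    slope = (
        sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
        / sum((x - x_bar) ** 2 for x in xs)
    )
    return slope, y_bar - slope * x_bar


def predict_surge(model, requests, drivers):
    """Predict a surge multiplier, never going below the base fare (1.0)."""
    slope, intercept = model
    return max(1.0, slope * demand_ratio(requests, drivers) + intercept)


def mean_absolute_error(model, records):
    """Data validation: average gap between predicted and observed surge."""
    return mean(abs(predict_surge(model, q, d) - s) for q, d, s in records)


cleaned = clean(RAW_RECORDS)
train, holdout = cleaned[:4], cleaned[4:]   # train on some records, validate on the rest
model = fit_line(
    [demand_ratio(q, d) for q, d, _ in train],
    [s for _, _, s in train],
)
print("hold-out error:", round(mean_absolute_error(model, holdout), 3))
print("predicted surge:", round(predict_surge(model, 180, 45), 2))
```

A production system would train on far more records and features (time of day, location, weather), but the order of the steps stays the same: clean, explore, model, validate, then deploy.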

Applications of Data Science

Data Science is used on e-commerce platforms like Amazon and Flipkart. Netflix also uses it to recommend videos to its users.

Its applications range from credit card fraud detection to self-driving cars, and it is also used to create virtual assistants like Alexa, Cortana, Google Assistant, and Siri.

Example:

Suppose you look at a shoe on Amazon but do not buy it then and there. The next day you are watching a video on YouTube and you see an ad for the very shoe you looked at on Amazon. You switch to Facebook and see the same ad. How did this happen? It happens because your searches are tracked by the search engine you use, for example Google, which then recommends ads based on your search results or history. This is one of the applications of Data Science. An often-cited estimate is that 35% of Amazon’s revenue is generated from product recommendations, whose backbone is Data Science.
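
To make the idea of recommendations concrete, here is a minimal sketch that suggests products based on what other shoppers viewed in the same session. It is not how Amazon or Google actually work; the session data and the function names `build_co_views` and `recommend` are hypothetical, and real systems use far richer signals.

```python
from collections import Counter
from itertools import permutations

# Hypothetical browsing sessions: each set holds the products one shopper viewed.
SESSIONS = [
    {"running shoes", "sports socks", "water bottle"},
    {"running shoes", "sports socks"},
    {"running shoes", "fitness tracker"},
    {"sports socks", "water bottle"},
]


def build_co_views(sessions):
    """Count how often each ordered pair of products appears in the same session."""
    co_views = {}
    for session in sessions:
        for viewed, other in permutations(sorted(session), 2):
            co_views.setdefault(viewed, Counter())[other] += 1
    return co_views


def recommend(co_views, history, top_n=2):
    """Rank products seen alongside the shopper's history, skipping items already viewed."""
    scores = Counter()
    for item in history:
        scores.update(co_views.get(item, {}))
    for item in history:
        scores.pop(item, None)
    # Sort by score, then by name, so ties are resolved the same way every time.
    ranked = sorted(scores.items(), key=lambda pair: (-pair[1], pair[0]))
    return [name for name, _ in ranked[:top_n]]


co_views = build_co_views(SESSIONS)
print(recommend(co_views, ["running shoes"]))
```

A shopper who only looked at running shoes is shown the items that most often appeared next to running shoes in other sessions.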

Another example is a smartwatch such as the Apple Watch, which monitors the health of an individual. Wearables of this kind collect data such as the person’s heart rate, sleep cycle, breathing rate, activity and, on some devices, blood pressure, and keep a record of these measures every day. The collected data can then be processed and analyzed to build a model that predicts the risk of a heart attack.
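
The collect, process, analyze flow can be illustrated with a small sketch. It does not predict heart attacks and is not how any real watch works; it only flags days whose average heart rate is far from a baseline. The readings, the `flag_unusual_days` helper, and the threshold of three standard deviations are assumptions chosen for the example.

```python
from statistics import mean, stdev

# Hypothetical resting heart rate readings (beats per minute), grouped by day.
DAILY_READINGS = {
    "Mon": [62, 64, 63],
    "Tue": [61, 63, 65],
    "Wed": [64, 62, 63],
    "Thu": [63, 65, 64],
    "Fri": [84, 88, 86],   # an unusually high day
}


def daily_averages(readings):
    """Process the raw readings into one average per day."""
    return {day: mean(values) for day, values in readings.items()}


def flag_unusual_days(averages, baseline_days, threshold=3.0):
    """Flag days whose average is more than `threshold` standard deviations from the baseline."""
    baseline = [averages[day] for day in baseline_days]
    center, spread = mean(baseline), stdev(baseline)
    return [
        day for day, avg in averages.items()
        if abs(avg - center) > threshold * spread
    ]


averages = daily_averages(DAILY_READINGS)
print(flag_unusual_days(averages, ["Mon", "Tue", "Wed", "Thu"]))
```

A genuine risk model would need clinical data, many more measurements, and medical validation; the sketch only shows how daily records turn into a signal that can be analyzed.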

Why Should You Become a Data Scientist?

A Data Scientist is a professional who can transform raw data into useful insights that lead to better business decisions. The average salary of a Data Scientist ranges from $100K to $182K.

Skills needed to become a Data Scientist

  1. Be skilled in programming languages like Python and R.
  2. Understand processes like data extraction, wrangling, and exploration.
  3. Be well versed in the different machine learning algorithms and how they work.
  4. Know advanced machine learning concepts, like deep learning.
  5. Have a good understanding of big data frameworks like Hadoop and Spark.
  6. Know how to use Data Science tools like Tableau and Power BI.
