Data analysis is the process of examining data to identify trends and draw conclusions about the information it contains. It is carried out with specialized systems and software and is widely used across industries. It enables organizations to analyze market needs and make better-informed business decisions. It is also widely used in research to verify or disprove theories, hypotheses, and scientific models.
Data analytics is a broad field, but it can be divided into four major categories: descriptive, predictive, diagnostic, and prescriptive analysis. These are also the primary applications of data analytics in business.
A data analyst uses programming skills to analyze large amounts of complex data and extract the relevant information from it. In short, an analyst derives meaning from messy data. A data analyst needs the following skills:
A data analyst works with data throughout the data analysis pipeline. The primary steps in data analysis are data mining, data management, statistical analysis, and data presentation.
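The four steps above can be sketched on a tiny in-memory dataset. This is only an illustration under stated assumptions: the records, the `region` and `sales` column names, and the choice of summary statistics are hypothetical and do not come from the article.

```python
import pandas as pd

# 1. Data mining / collection: hypothetical raw records (illustrative only)
raw = [
    {'region': 'north', 'sales': '120'},
    {'region': 'south', 'sales': '95'},
    {'region': 'north', 'sales': None},   # a missing value to clean up
    {'region': 'south', 'sales': '105'},
]
df = pd.DataFrame(raw)

# 2. Data management: drop rows with missing sales, store sales as integers
clean = df.dropna(subset=['sales']).astype({'sales': int})

# 3. Statistical analysis: row count and mean sales per region
summary = clean.groupby('region')['sales'].agg(['count', 'mean'])

# 4. Data presentation: print the summary as a plain-text table
print(summary.to_string())
```

In a real project each step is usually far larger (databases, validation rules, charts), but the order of operations stays the same.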
Below, I am going to walk you through some Python libraries that are very helpful for data analysis.
NumPy and pandas are your two evergreen friends on this journey. Both libraries are extremely important, and the logic you develop while studying them also carries over to other languages such as SQL.
import numpy as np
import pandas as pd
# One nominal column (bucket) and three numerical columns
x1 = np.array(['a', 'a', 'b', 'b', 'b'])
x2 = np.array([50, 250, 100, 400, 350])
x3 = np.array([10, 20, 20, 50, 40])
x4 = np.array([2, 3, 1, 4, 3])
df1 = pd.DataFrame({'bucket': x1, 'quantity': x2, 'risk': x3, 'weight': x4})
df1
Output:
  bucket  quantity  risk  weight
0      a        50    10       2
1      a       250    20       3
2      b       100    20       1
3      b       400    50       4
4      b       350    40       3
We have a simple table with four columns (one nominal and three numerical).
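One way to confirm the split between nominal and numerical columns is to ask pandas which columns are numeric. The sketch below rebuilds the same table so it runs on its own; the variable names `numeric_cols` and `nominal_cols` are illustrative.

```python
import numpy as np
import pandas as pd

# Same table as in the article, rebuilt so this snippet is self-contained
df1 = pd.DataFrame({
    'bucket': np.array(['a', 'a', 'b', 'b', 'b']),
    'quantity': np.array([50, 250, 100, 400, 350]),
    'risk': np.array([10, 20, 20, 50, 40]),
    'weight': np.array([2, 3, 1, 4, 3]),
})

# Columns with a numeric dtype versus everything else
numeric_cols = df1.select_dtypes(include='number').columns.tolist()
nominal_cols = df1.select_dtypes(exclude='number').columns.tolist()

print(numeric_cols)  # ['quantity', 'risk', 'weight']
print(nominal_cols)  # ['bucket']
```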
Let’s try to solve two problems: counting the elements in each bucket, and computing the weighted average of quantity per bucket.
df1.groupby('bucket').agg({'bucket': len}).rename(columns={'bucket': 'elements'})
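As an alternative sketch (not the article's own approach), the same per-bucket count can be obtained with `GroupBy.size()`, which avoids aggregating the grouping column itself. It is roughly the pandas counterpart of SQL's `SELECT bucket, COUNT(*) ... GROUP BY bucket`. The variable name `counts` is illustrative.

```python
import numpy as np
import pandas as pd

# Same table as in the article, rebuilt so this snippet is self-contained
df1 = pd.DataFrame({
    'bucket': np.array(['a', 'a', 'b', 'b', 'b']),
    'quantity': np.array([50, 250, 100, 400, 350]),
    'risk': np.array([10, 20, 20, 50, 40]),
    'weight': np.array([2, 3, 1, 4, 3]),
})

# Number of rows in each bucket, returned as a one-column DataFrame
counts = df1.groupby('bucket').size().to_frame('elements')
print(counts)
```

For this table, bucket `a` has 2 elements and bucket `b` has 3.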
df1.groupby('bucket').apply(lambda g: np.average(g.quantity, weights=g.weight)).to_frame('W_AVG_QTY')
The same weighted average can also be computed per bucket as sum(weight * quantity) / sum(weight).
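The manual formula can be written out directly, which also makes it easy to check against `np.average`. This is a minimal sketch; the intermediate names `wq`, `w`, and `manual` are illustrative.

```python
import numpy as np
import pandas as pd

# Same table as in the article, rebuilt so this snippet is self-contained
df1 = pd.DataFrame({
    'bucket': np.array(['a', 'a', 'b', 'b', 'b']),
    'quantity': np.array([50, 250, 100, 400, 350]),
    'risk': np.array([10, 20, 20, 50, 40]),
    'weight': np.array([2, 3, 1, 4, 3]),
})

# Numerator: sum of weight * quantity within each bucket
wq = (df1['weight'] * df1['quantity']).groupby(df1['bucket']).sum()
# Denominator: sum of weight within each bucket
w = df1.groupby('bucket')['weight'].sum()

manual = wq / w
print(manual)

# Cross-check each bucket against np.average with explicit weights
for bucket, g in df1.groupby('bucket'):
    assert np.isclose(manual[bucket], np.average(g['quantity'], weights=g['weight']))
```

Worked by hand: bucket `a` gives (2*50 + 3*250) / (2 + 3) = 850 / 5 = 170, and bucket `b` gives (1*100 + 4*400 + 3*350) / (1 + 4 + 3) = 2750 / 8 = 343.75.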
Moving ahead, we’ll learn how to gather data, visualize it, and make sense out of it.