1. Univariate

import pandas as pd
import matplotlib
%matplotlib inline
import seaborn as sns

Let's see how the data is distributed across 'hours-per-week'.

adult_data['hours-per-week'].plot.hist(bins=5)


Let's check the same for 'age', this time with the seaborn library.

# sns.distplot is deprecated in recent seaborn releases; histplot is its replacement
sns.histplot(adult_data['age'], kde=True)


Let's see the same for 'marital-status', which is a categorical variable.

adult_data['marital-status'].value_counts().plot(kind='bar')


2. Bivariate

Contingency Table - Let's check how the data is distributed across the 'workclass' and 'marital-status' columns.

pd.crosstab(index=adult_data['workclass'], columns=adult_data['marital-status'])
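
Raw counts can be hard to compare when the 'workclass' groups differ a lot in size. As a minimal sketch, assuming a small made-up frame named sample in place of adult_data, pd.crosstab can also return row-wise proportions through its normalize argument:

```python
import pandas as pd

# Hypothetical stand-in for adult_data; the values are made up for illustration.
sample = pd.DataFrame({
    'workclass': ['Private', 'Private', 'Private', 'State-gov', 'State-gov'],
    'marital-status': ['Divorced', 'Never-married', 'Never-married',
                       'Divorced', 'Widowed'],
})

# normalize='index' divides each row by its row total,
# so every workclass row sums to 1.
row_share = pd.crosstab(index=sample['workclass'],
                        columns=sample['marital-status'],
                        normalize='index')
print(row_share)
```

Passing normalize='columns' instead would give the share of each 'workclass' within a marital status, which answers a different question, so the choice depends on which variable is treated as the grouping.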

Scatter Plot

adult_data.plot.scatter(x='age', y='hours-per-week')


Here the scatter plot shows no clear relationship between these two variables. When two variables are related, the plot also shows whether the relationship is directly or inversely proportional.

Beyond plots, a chi-square test (for categorical variables) or a correlation coefficient (for numeric variables) can be used to measure the relationship between two variables.
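
Both checks can be sketched in a few lines. The example below is only an illustration: it assumes SciPy is installed, uses the counts from the 'workclass' by 'marital-status' contingency table in this section for the chi-square test, and uses a few made-up 'age' and 'hours-per-week' values (not taken from adult_data) for the correlation.

```python
import pandas as pd
from scipy.stats import chi2_contingency

# Counts copied from the workclass x marital-status contingency table in this section.
marital = ['Divorced', 'Married-AF-spouse', 'Married-civ-spouse',
           'Married-spouse-absent', 'Never-married', 'Separated', 'Widowed']
observed = pd.DataFrame(
    [[5, 0, 9, 0, 7, 0, 0],
     [11, 0, 29, 0, 19, 5, 4],
     [96, 1, 274, 10, 270, 14, 21],
     [4, 0, 25, 0, 2, 1, 0],
     [9, 0, 53, 1, 11, 5, 1],
     [3, 0, 21, 1, 8, 1, 1]],
    index=['Federal-gov', 'Local-gov', 'Private',
           'Self-emp-inc', 'Self-emp-not-inc', 'State-gov'],
    columns=marital,
)

# Chi-square test of independence for two categorical variables.
chi2, p_value, dof, expected = chi2_contingency(observed)
print(f"chi2={chi2:.2f}, p={p_value:.4f}, dof={dof}")

# Pearson correlation for two numeric variables.
# Hypothetical values standing in for adult_data[['age', 'hours-per-week']].
numeric = pd.DataFrame({'age': [25, 38, 47, 52, 61],
                        'hours-per-week': [40, 45, 40, 50, 20]})
r = numeric['age'].corr(numeric['hours-per-week'])
print(f"pearson r={r:.3f}")
```

A small p-value suggests the two categorical variables are not independent, while r close to 0 suggests no linear relationship between the numeric ones. Many cells in this table have small expected counts, so the chi-square result should be read with caution.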

Output of the pd.crosstab call in the Contingency Table step (rows: workclass, columns: marital-status):

workclass	Divorced	Married-AF-spouse	Married-civ-spouse	Married-spouse-absent	Never-married	Separated	Widowed
Federal-gov	5	0	9	0	7	0	0
Local-gov	11	0	29	0	19	5	4
Private	96	1	274	10	270	14	21
Self-emp-inc	4	0	25	0	2	1	0
Self-emp-not-inc	9	0	53	1	11	5	1
State-gov	3	0	21	1	8	1	1