hist

The hist function allows histogram visualizing DataFrame data. At a minimum, the hist function requires the following keywords:

  • df: a pandas DataFrame

  • x: the name of the DataFrame column containing the x-axis data

The y-axis will display the histogram counts for the specified data set.

Other optional keywords for this function are described in Keyword Arguments.

Setup

Imports

In [1]:
%load_ext autoreload
%autoreload 2
%matplotlib inline
import fivecentplots as fcp
import pandas as pd
import numpy as np
import os, sys, pdb
osjoin = os.path.join
st = pdb.set_trace

Sample data

In [2]:
df = pd.read_csv(osjoin(os.path.dirname(fcp.__file__), 'tests', 'fake_data_box.csv'))
df.head()
Out[2]:
Batch Sample Region Value ID
0 101 1 Alpha123 3.5 ID701223A
1 101 1 Alpha123 2.1 ID7700-1222B
2 101 1 Alpha123 3.3 ID701223A
3 101 1 Alpha123 3.2 ID7700-1222B
4 101 1 Alpha123 4.0 ID701223A

Set theme

(Only needs to be run once)

In [3]:
#fcp.set_theme('gray')
#fcp.set_theme('white')

Other

In [4]:
SHOW = False

Simple histogram

Vertical bars

A simple histogram with default bin size of 20:

In [5]:
fcp.hist(df=df, x='Value', show=SHOW)
_images/hist_15_0.png

Horizontal bars

Same data as above but with histogram bars oriented horizontally:

In [6]:
fcp.hist(df=df, x='Value', show=SHOW, horizontal=True)
_images/hist_18_0.png

Legend

Add a legend:

In [7]:
fcp.hist(df=df, x='Value', show=SHOW, legend='Region')
_images/hist_21_0.png

Kernal density estimator

In [8]:
fcp.hist(df=df, x='Value', show=SHOW, legend='Region', kde=True, kde_width=2)
_images/hist_23_0.png

Row/column plot

Make multiple subplots with different row/column values:

In [9]:
fcp.hist(df=df, x='Value', show=SHOW, legend='Region', col='Batch', row='Sample', ax_size=[250, 250])
_images/hist_26_0.png

Wrap plot

By column values

In [10]:
fcp.hist(df=df, x='Value', show=SHOW, legend='Region', wrap='Batch', ax_size=[250, 250], horizontal=True)
_images/hist_29_0.png

By column names

In [11]:
df['Value*2'] = 2*df['Value']
df['Value*3'] = 3*df['Value']
fcp.hist(df=df, x=['Value', 'Value*2', 'Value*3'], wrap='x', show=SHOW, ncol=3, ax_size=[250, 250])
_images/hist_31_0.png