Index, Select, and Filter pandas DataFrames


Import modules

import pandas as pd

Create a dataframe

data = {'name': ['Jason', 'Molly', 'Tina', 'Jake', 'Amy'],
        'year': [2012, 2012, 2013, 2014, 2014],
        'reports': [4, 24, 31, 2, 3],
        'coverage': [25, 94, 57, 62, 70]}
df = pd.DataFrame(data, index=['Cochice', 'Pima', 'Santa Cruz', 'Maricopa', 'Yuma'])
df
            coverage   name  reports  year
Cochice           25  Jason        4  2012
Pima              94  Molly       24  2012
Santa Cruz        57   Tina       31  2013
Maricopa          62   Jake        2  2014
Yuma              70    Amy        3  2014

5 rows × 4 columns

View a column of the dataframe

df['name']
Cochice       Jason
Pima          Molly
Santa Cruz     Tina
Maricopa       Jake
Yuma            Amy
Name: name, dtype: object

View two columns of the dataframe

df[['name', 'reports']]
             name  reports
Cochice     Jason        4
Pima        Molly       24
Santa Cruz   Tina       31
Maricopa     Jake        2
Yuma          Amy        3

5 rows × 2 columns
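Rows and columns can also be selected together by label. The sketch below is a minimal example that rebuilds the same dataframe so it runs on its own; the variable name subset is only illustrative.

```python
import pandas as pd

data = {'name': ['Jason', 'Molly', 'Tina', 'Jake', 'Amy'],
        'year': [2012, 2012, 2013, 2014, 2014],
        'reports': [4, 24, 31, 2, 3],
        'coverage': [25, 94, 57, 62, 70]}
df = pd.DataFrame(data, index=['Cochice', 'Pima', 'Santa Cruz', 'Maricopa', 'Yuma'])

# .loc takes row labels first, then column labels;
# passing lists keeps the result a dataframe
subset = df.loc[['Pima', 'Yuma'], ['name', 'reports']]
print(subset)
```

Passing a single label instead of a list for either axis reduces that axis, so df.loc['Pima', ['name', 'reports']] returns a Series.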

View the first two rows of the dataframe

df[:2]
         coverage   name  reports  year
Cochice        25  Jason        4  2012
Pima           94  Molly       24  2012

2 rows × 4 columns
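Slicing with df[:2] works by position. Two more explicit spellings of the same idea are sketched below, with the dataframe rebuilt so the snippet runs on its own; the variable names are only illustrative.

```python
import pandas as pd

data = {'name': ['Jason', 'Molly', 'Tina', 'Jake', 'Amy'],
        'year': [2012, 2012, 2013, 2014, 2014],
        'reports': [4, 24, 31, 2, 3],
        'coverage': [25, 94, 57, 62, 70]}
df = pd.DataFrame(data, index=['Cochice', 'Pima', 'Santa Cruz', 'Maricopa', 'Yuma'])

# .iloc selects by integer position: rows 0 and 1
first_two_iloc = df.iloc[:2]

# .head(n) returns the first n rows
first_two_head = df.head(2)

print(first_two_iloc)
```

Both return the same two rows as df[:2]; .iloc makes it explicit that the selection is positional rather than label based.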

View all rows where coverage is more than 50

df[df['coverage'] > 50]
            coverage   name  reports  year
Pima              94  Molly       24  2012
Santa Cruz        57   Tina       31  2013
Maricopa          62   Jake        2  2014
Yuma              70    Amy        3  2014

4 rows × 4 columns
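The same boolean-mask approach extends to several conditions. This is a minimal sketch on the same data; the thresholds and variable names are chosen only for illustration.

```python
import pandas as pd

data = {'name': ['Jason', 'Molly', 'Tina', 'Jake', 'Amy'],
        'year': [2012, 2012, 2013, 2014, 2014],
        'reports': [4, 24, 31, 2, 3],
        'coverage': [25, 94, 57, 62, 70]}
df = pd.DataFrame(data, index=['Cochice', 'Pima', 'Santa Cruz', 'Maricopa', 'Yuma'])

# AND: both conditions must hold; each comparison needs its own parentheses
high_and_recent = df[(df['coverage'] > 50) & (df['year'] >= 2014)]

# OR: either condition may hold
high_or_few = df[(df['coverage'] > 90) | (df['reports'] < 3)]

# isin: keep rows whose value appears in a list
early_years = df[df['year'].isin([2012, 2013])]

print(high_and_recent)
```

Use & and | rather than the Python keywords and / or, which raise an error because the truth value of a whole Series is ambiguous.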

View a row by its index label

df.loc['Maricopa']
coverage      62
name        Jake
reports        2
year        2014
Name: Maricopa, dtype: object

View a column by its label

df.loc[:, 'coverage']
Cochice       25
Pima          94
Santa Cruz    57
Maricopa      62
Yuma          70
Name: coverage, dtype: int64

View the value at a given row and column label

df.loc['Yuma', 'coverage']
70
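A single value can be looked up in a few equivalent ways. The sketch below compares label-based and position-based access on the same data; the variable names are only illustrative.

```python
import pandas as pd

data = {'name': ['Jason', 'Molly', 'Tina', 'Jake', 'Amy'],
        'year': [2012, 2012, 2013, 2014, 2014],
        'reports': [4, 24, 31, 2, 3],
        'coverage': [25, 94, 57, 62, 70]}
df = pd.DataFrame(data, index=['Cochice', 'Pima', 'Santa Cruz', 'Maricopa', 'Yuma'])

# Label-based lookup: row label, then column label
value_loc = df.loc['Yuma', 'coverage']

# .at is a label-based accessor for exactly one cell
value_at = df.at['Yuma', 'coverage']

# Position-based lookup: Yuma is the fifth row (position 4);
# get_loc finds the column position so the code does not depend on column order
value_iloc = df.iloc[4, df.columns.get_loc('coverage')]

print(value_loc, value_at, value_iloc)
```

All three return the same cell; .at is the narrowest of them because it accepts only a single row label and a single column label.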