Dropping Rows And Columns In pandas Dataframe

Want to learn more? I recommend these Python books: Python for Data Analysis, Python Data Science Handbook, and Introduction to Machine Learning with Python.

Import modules

import pandas as pd

Create a dataframe

data = {'name': ['Jason', 'Molly', 'Tina', 'Jake', 'Amy'],
        'year': [2012, 2012, 2013, 2014, 2014],
        'reports': [4, 24, 31, 2, 3]}
df = pd.DataFrame(data, index = ['Cochice', 'Pima', 'Santa Cruz', 'Maricopa', 'Yuma'])
df
name reports year
Cochice Jason 4 2012
Pima Molly 24 2012
Santa Cruz Tina 31 2013
Maricopa Jake 2 2014
Yuma Amy 3 2014

5 rows × 3 columns

Drop an observation (row)

df.drop(['Cochice', 'Pima'])
name reports year
Santa Cruz Tina 31 2013
Maricopa Jake 2 2014
Yuma Amy 3 2014

3 rows × 3 columns

Drop a variable (column)

Note: axis=1 denotes that we are referring to a column, not a row

df.drop('reports', axis=1)
name year
Cochice Jason 2012
Pima Molly 2012
Santa Cruz Tina 2013
Maricopa Jake 2014
Yuma Amy 2014

5 rows × 2 columns

Drop a row if it contains a certain value (in this case, "Tina")

Specifically: Create a new dataframe called df that includes all rows where the value of a cell in the name column does not equal "Tina"

df = df[df.name != 'Tina']
df
name reports year
Cochice Jason 4 2012
Pima Molly 24 2012
Maricopa Jake 2 2014
Yuma Amy 3 2014

4 rows × 3 columns