Excel files and CSV into Python Pandas#

Loading data into Python from standard Excel files and comma-separated value (CSV) data is fundamental for many data scientists and analysts.

Excel and CSV files are among the most common data storage formats. Python provides various tools to read, manipulate, and analyze this data. In this process, Excel files can be read using libraries like pandas, xlrd, and openpyxl. In contrast, CSV files can be imported using the built-in csv module or the pandas library.

Understanding how to load data from these file formats is essential for data analysis and machine learning tasks in Python. This article will explore different methods to read data from Excel and CSV files and analyze some everyday use cases.

How To#

import pandas as pd
df = pd.read_excel("data/housing.xlsx", engine="openpyxl")
df.head(5)
longitude latitude housing_median_age total_rooms total_bedrooms population households median_income median_house_value ocean_proximity
0 -122.23 37.88 41.0 880.0 129.0 322.0 126.0 3252-08-01 00:00:00 452600.0 NEAR BAY
1 -122.22 37.86 21.0 7099.0 1106.0 2401.0 1138.0 3014-08-01 00:00:00 358500.0 NEAR BAY
2 -122.24 37.85 52.0 1467.0 190.0 496.0 177.0 2574-07-01 00:00:00 352100.0 NEAR BAY
3 -122.25 37.85 52.0 1274.0 235.0 558.0 219.0 6431-05-01 00:00:00 341300.0 NEAR BAY
4 -122.25 37.85 52.0 1627.0 280.0 565.0 259.0 8462-03-01 00:00:00 342200.0 NEAR BAY
pd.read_csv("data/housing.csv").head()
longitude latitude housing_median_age total_rooms total_bedrooms population households median_income median_house_value ocean_proximity
0 -122.23 37.88 41.0 880.0 129.0 322.0 126.0 8.3252 452600.0 NEAR BAY
1 -122.22 37.86 21.0 7099.0 1106.0 2401.0 1138.0 8.3014 358500.0 NEAR BAY
2 -122.24 37.85 52.0 1467.0 190.0 496.0 177.0 7.2574 352100.0 NEAR BAY
3 -122.25 37.85 52.0 1274.0 235.0 558.0 219.0 5.6431 341300.0 NEAR BAY
4 -122.25 37.85 52.0 1627.0 280.0 565.0 259.0 3.8462 342200.0 NEAR BAY
pd.read_csv("https://raw.githubusercontent.com/freeCodeCamp/open-data/master/medium-fCC-data/data/medium_titles%20-%20rawdata.tsv", sep="\t", )
order inv-ord selection CollTitle CorrectedTitle OrigRecommends classfication Recommends class_WD class_WD&CS
0 541 112 0.0 Angular 2 versus React: There Will Be BloodIn ... Angular 2 versus React: There Will Be Blood 2.5K TECH 2500.0 0.0 0.0
1 18 635 1.0 The mind-blowing AI announcement from Google t... The mind-blowing AI announcement from Google t... 2.4K TECH 2400.0 0.0 0.0
2 650 3 0.0 Things I Wish Someone Had Told Me When I Was L... Things I Wish Someone Had Told Me When I Was L... 3.2K MOT 3200.0 0.0 0.0
3 94 559 1.0 How to encrypt your entire life in less than a... How to encrypt your entire life in less than a... 7.4K SEC 7400.0 0.0 0.0
4 409 244 0.0 Being A Developer After 40In freeCodeCampView ... Being A Developer After 40 6K MOT 6000.0 0.0 0.0
... ... ... ... ... ... ... ... ... ... ...
647 636 17 NaN December 2014 December 2014 NaN date NaN NaN NaN
648 644 9 NaN November 2014 November 2014 NaN date NaN NaN NaN
649 647 6 NaN October 2014 October 2014 NaN date NaN NaN NaN
650 649 4 NaN November 2013 November 2013 NaN date NaN NaN NaN
651 651 2 NaN October 2013 October 2013 NaN date NaN NaN NaN

652 rows × 10 columns

Exercise#

Change the data loading to contain the correct data types and explore the keywords for reading CSV files and Excel files using Shift + Tab.

pd.read_csv("https://raw.githubusercontent.com/freeCodeCamp/open-data/master/medium-fCC-data/data/medium_titles%20-%20rawdata.tsv", 
            sep="\t", )
order inv-ord selection CollTitle CorrectedTitle OrigRecommends classfication Recommends class_WD class_WD&CS
0 541 112 0.0 Angular 2 versus React: There Will Be BloodIn ... Angular 2 versus React: There Will Be Blood 2.5K TECH 2500.0 0.0 0.0
1 18 635 1.0 The mind-blowing AI announcement from Google t... The mind-blowing AI announcement from Google t... 2.4K TECH 2400.0 0.0 0.0
2 650 3 0.0 Things I Wish Someone Had Told Me When I Was L... Things I Wish Someone Had Told Me When I Was L... 3.2K MOT 3200.0 0.0 0.0
3 94 559 1.0 How to encrypt your entire life in less than a... How to encrypt your entire life in less than a... 7.4K SEC 7400.0 0.0 0.0
4 409 244 0.0 Being A Developer After 40In freeCodeCampView ... Being A Developer After 40 6K MOT 6000.0 0.0 0.0
... ... ... ... ... ... ... ... ... ... ...
647 636 17 NaN December 2014 December 2014 NaN date NaN NaN NaN
648 644 9 NaN November 2014 November 2014 NaN date NaN NaN NaN
649 647 6 NaN October 2014 October 2014 NaN date NaN NaN NaN
650 649 4 NaN November 2013 November 2013 NaN date NaN NaN NaN
651 651 2 NaN October 2013 October 2013 NaN date NaN NaN NaN

652 rows × 10 columns

Additional Resources#