crosstab function in pandas so you can summarize
and group data.
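As a rough illustration of what a crosstab summary looks like, here is a minimal sketch. The `orders` DataFrame and its column names are made up for this example and do not come from the linked tutorial.

```python
import pandas as pd

# Hypothetical sample data; the column names are illustrative only.
orders = pd.DataFrame({
    "region": ["east", "east", "west", "west", "west"],
    "product": ["a", "b", "a", "a", "b"],
})

# Count how many rows fall into each region/product combination.
# Rows of the result are regions, columns are products.
counts = pd.crosstab(orders["region"], orders["product"])
print(counts)
```

Passing `margins=True` to `pd.crosstab` adds row and column totals if you also want the overall counts.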
Calculating streaks in pandas
shows how to measure and report on streaks in data, where the same
event occurs several times in a row.
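One common way to measure streaks, sketched below, is to compare each value with the previous one and number the runs with a cumulative sum. This is an assumption about a reasonable approach rather than the linked article's exact method, and the `results` data is invented.

```python
import pandas as pd

# Hypothetical win/loss record, in chronological order.
results = pd.Series(["W", "W", "L", "W", "W", "W", "L"])

# A new streak starts whenever a value differs from the one before it.
starts = results != results.shift()

# The running total of streak starts gives every row a streak number.
streak_id = starts.cumsum()

# Position within the current streak, counting from 1.
streak_len = results.groupby(streak_id).cumcount() + 1

# Length of the longest winning streak.
longest_win = streak_len[results == "W"].max()
print(longest_win)
```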
How to Convert a Python Dictionary to a Pandas DataFrame
is a straightforward tutorial with example code for loading and adding
data stored in a typical Python dictionary into a DataFrame.
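A minimal sketch of the two usual dictionary layouts follows. The names and values are placeholders chosen for this example.

```python
import pandas as pd

# Dictionary of columns: each key becomes a column name.
data = {"name": ["Ada", "Grace"], "year": [1815, 1906]}
df = pd.DataFrame(data)

# Dictionary of rows: each outer key becomes a row label.
by_row = {"ada": {"year": 1815}, "grace": {"year": 1906}}
df_rows = pd.DataFrame.from_dict(by_row, orient="index")

# Add a column by looking up each existing value in another dictionary.
fields = {"Ada": "mathematics", "Grace": "computer science"}
df["field"] = df["name"].map(fields)
print(df)
```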
This two-part series on loading data into a pandas DataFrame presents
what to do
when CSV files do not match your expectations
and
how to handle missing values
so you can start performing your analysis rather than getting frustrated
with common issues at the beginning of your workflow.
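The sketch below shows the kind of adjustments this usually involves: a non-default delimiter, a custom missing-value marker, and then a decision about each gap. The inline data and the `unknown` marker are assumptions made for the example, not details from the series.

```python
import io
import pandas as pd

# Hypothetical file contents: semicolon-delimited, with "unknown" used
# as a missing-value marker and one empty city field.
raw = io.StringIO(
    "id;amount;city\n"
    "1;10.5;Boston\n"
    "2;unknown;Seattle\n"
    "3;7.0;\n"
)

# Tell read_csv about the delimiter and the extra missing-value marker.
df = pd.read_csv(raw, sep=";", na_values=["unknown"])

# Count missing values per column before deciding how to handle them.
missing_per_column = df.isna().sum()

# Fill numeric gaps with the column mean, then drop rows with no city.
df["amount"] = df["amount"].fillna(df["amount"].mean())
df = df.dropna(subset=["city"])
print(df)
```

Whether to fill or drop depends on the analysis; the mean fill here is only one option.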
Building a financial model with pandas
explains how to create an amortization schedule with a corresponding table
and charts that show the payoff period broken down by interest and
principal.
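For a sense of what such a schedule contains, here is a minimal sketch that assumes a fixed-rate loan with monthly payments and uses the standard annuity payment formula. The loan figures are invented, and the linked article may build its table differently.

```python
import pandas as pd

# Hypothetical loan terms.
principal = 10_000.0
annual_rate = 0.06
months = 12

# Fixed monthly payment from the standard annuity formula.
r = annual_rate / 12
payment = principal * r / (1 - (1 + r) ** -months)

rows = []
balance = principal
for period in range(1, months + 1):
    interest = balance * r               # interest accrued this month
    principal_paid = payment - interest  # remainder reduces the balance
    balance -= principal_paid
    rows.append({
        "period": period,
        "payment": payment,
        "interest": interest,
        "principal": principal_paid,
        "balance": balance,
    })

schedule = pd.DataFrame(rows).set_index("period")
print(schedule.round(2))
```

The `interest` and `principal` columns can then be plotted against `period` to chart the breakdown.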
Efficiently cleaning text with pandas
is a practical tutorial on different approaches
for cleaning a large data set so that you can begin to do your analysis.
The tutorial also shows how to use the
sidetable library, which
creates summary tables of a DataFrame.
tabula-py: Extract table from PDF into Python DataFrame
presents how to use the Python wrapper for the
Tabula library that makes it easier to
extract table data from PDF files.
Time Series Forecast Case Study with Python: Monthly Armed Robberies in Boston
walks through the data wrangling, analysis and visualization steps
with a public data set of armed robberies in Boston from 1966 to 1975.
This particular data problem may not be your thing, but by going through
the process you can learn a lot that can be applied to any data set.
A Gentle Visual Intro to Data Analysis in Python Using Pandas
presents spreadsheet-like pictures to show conceptually what
pandas is doing with your data as you apply various functions like
groupby and loc.
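In code, those two functions look roughly like the sketch below. The small `df` table is invented for the example.

```python
import pandas as pd

# Hypothetical ride counts with string row labels.
df = pd.DataFrame(
    {"city": ["Boston", "Seattle", "Boston", "Seattle"],
     "rides": [3, 5, 4, 1]},
    index=["a", "b", "c", "d"],
)

# groupby: split rows by city, then sum the rides within each group.
totals = df.groupby("city")["rides"].sum()

# loc with a boolean mask: keep rows where the condition holds,
# and select which columns to return.
busy = df.loc[df["rides"] > 3, ["city", "rides"]]

# loc with labels: a single cell by row label and column name.
one = df.loc["b", "rides"]
print(totals)
```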
Data Manipulation with Pandas: A Brief Tutorial
uses some example data sets to show how the most commonly used functions
in pandas work.
Analyzing Pronto CycleShare Data with Python and Pandas
uses Seattle bikeshare data as a source for wrangling, analysis and
visualization.
Stylin' with pandas shows how
to add colors and sparklines to your output when using pandas for data
visualization.
Python and JSON: Working with large datasets using Pandas
is a well-done, detailed tutorial that shows how to munge and analyze
JSON data.
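A small sketch of one way to flatten nested JSON into a DataFrame is shown below, using `pd.json_normalize`. The records are made up, and the linked tutorial may take a different route for very large files.

```python
import json
import pandas as pd

# Hypothetical JSON payload with a nested "user" object per record.
raw = """
[
  {"id": 1, "user": {"name": "ada", "city": "London"}, "score": 9},
  {"id": 2, "user": {"name": "grace", "city": "New York"}, "score": 7}
]
"""
records = json.loads(raw)

# Nested keys are flattened into dotted column names such as "user.name".
df = pd.json_normalize(records)

mean_score = df["score"].mean()
print(df)
```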
Fun with NFL Stats, Bokeh, and Pandas
uses National (American) Football League data as a source for
wrangling and visualization.
Analyzing my Spotify Music Library With Jupyter And a Bit of Pandas
shows how to grab all of your user data from the Spotify API, then
analyze it using pandas in Jupyter Notebook.
Scalable Python Code with Pandas UDFs
explains that pandas operations can often be parallelized for better
performance using the Pandas UDFs feature in PySpark version 2.3
or greater.
How to use Pandas read_html to Scrape Data from HTML Tables
has many code examples that show how to load
data from HTML directly into your DataFrames.
How to download fundamentals data with Python
shows how to obtain financial data, such as balance sheets,
stock prices, and various ratios, and use it in your own analysis.
How to convert JSON to Excel with Python and pandas
provides instructions for creating a spreadsheet out of a JSON file.
Loading large datasets in Pandas
explains how to get around the MemoryError issue that occurs
when using read_csv because the data set is larger than the
available memory on a machine. You can use chunking with
the read_csv function to divide the data set into smaller parts that
can each be loaded into memory. Alternatively, you can use a
SQLite database to create a relational database
with the data, then use SQL queries or an
object-relational mapper (ORM)
to load the data and perform analysis in pandas.
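Both approaches can be sketched in a few lines. The example below uses a tiny in-memory CSV and a hypothetical `readings` table so that it runs anywhere; a real file would use a path and a much larger `chunksize`.

```python
import io
import sqlite3
import pandas as pd

# Stand-in for a large CSV file: one "value" column holding 0 through 9.
csv_data = io.StringIO("value\n" + "\n".join(str(i) for i in range(10)))

# Approach 1: read in chunks and aggregate as you go, so only one
# chunk is held in memory at a time.
total = 0
n_chunks = 0
for chunk in pd.read_csv(csv_data, chunksize=4):
    total += chunk["value"].sum()
    n_chunks += 1

# Approach 2: append each chunk to a SQLite table, then query back
# only the rows needed for analysis.
csv_data.seek(0)
conn = sqlite3.connect(":memory:")  # use a file path for real data
for chunk in pd.read_csv(csv_data, chunksize=4):
    chunk.to_sql("readings", conn, if_exists="append", index=False)

big = pd.read_sql_query(
    "SELECT value FROM readings WHERE value >= 7 ORDER BY value", conn
)
conn.close()
print(total, n_chunks, len(big))
```

Chunking suits one-pass aggregations, while the SQLite route is more convenient when you expect to run several different queries over the same data.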
Real-world Excel spreadsheets are often a mess of unstructured data, so
this tutorial on
Reading Poorly Structured Excel Files with Pandas
gives example code for extracting only part of a file as well
as reading ranges and tables.