This post is a quick start with pandas through a few concrete examples. pandas is a Python library for reading, transforming, cleaning, and analyzing data. It is built on the NumPy package.
pandas is a fast, powerful, flexible and easy to use open source data analysis and manipulation tool, built on top of the Python programming language.
https://pandas.pydata.org
The key data structure in pandas is the DataFrame. It holds tabular data as rows of observations and columns of features.
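As a minimal sketch of that idea, the snippet below builds a small DataFrame in which each row is an observation and each column is a feature. The column names and values are made up for illustration and do not come from the notebook.

```python
import pandas as pd

# Hypothetical data: each row is one observation, each column one feature.
df = pd.DataFrame(
    {
        "name": ["Asha", "Ben", "Chen"],
        "age": [31, 25, 42],
        "city": ["Pune", "Leeds", "Oslo"],
    }
)

print(df.shape)   # (number of rows, number of columns)
print(df.dtypes)  # one dtype per column

# A single column is a Series, the one-dimensional building block.
ages = df["age"]
print(type(ages).__name__)
```

Selecting one column returns a Series, which is why Series is usually introduced before DataFrames.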
Download or fork the entire Jupyter notebook from my GitHub to play around: https://github.com/sandeep-mewara/python-examples
The pandas basics cover:
- Series
- DataFrames
  - Create
    - from a list of tuples
    - from a dictionary
    - from a CSV
    - from a built-in dataset (e.g. from sklearn.datasets)
  - Data retrieval
  - Modifying data
  - Group by operation
  - Custom functions: apply method
- Pre-processing
  - drop, mean, mode
  - ordinal feature
  - nominal feature
- Reshaping
  - CrossTab
  - Merge
  - Melt
  - Pivot
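A few of the creation and reshaping topics listed above can be sketched as follows. This is an illustrative example under assumed data: the student names, subjects, and scores are invented, and the CSV is read from an in-memory string rather than a real file.

```python
import io

import pandas as pd

# Create from a list of tuples (column names supplied separately).
marks = pd.DataFrame(
    [
        ("Asha", "math", 82),
        ("Asha", "physics", 75),
        ("Ben", "math", 64),
        ("Ben", "physics", 91),
    ],
    columns=["student", "subject", "score"],
)

# Create from a dictionary of column name -> values.
houses = pd.DataFrame({"student": ["Asha", "Ben"], "house": ["red", "blue"]})

# Create from CSV text (a file path would work the same way).
csv_text = "student,house\nAsha,red\nBen,blue\n"
houses_csv = pd.read_csv(io.StringIO(csv_text))

# Pivot: long -> wide, one row per student and one column per subject.
wide = marks.pivot(index="student", columns="subject", values="score")

# Melt: wide -> long again.
long = wide.reset_index().melt(
    id_vars="student", var_name="subject", value_name="score"
)

# Merge: join two tables on a shared key column.
merged = marks.merge(houses, on="student", how="left")

# Crosstab: count co-occurrences of two categorical columns.
counts = pd.crosstab(merged["house"], merged["subject"])

print(wide)
print(counts)
```

Pivot and melt are roughly inverse operations, so going from `marks` to `wide` and back to `long` returns the same four observations.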
Key learnings:
- `.info()`, `.head()`, and `.sample()` are handy methods to use first with a DataFrame to get high-level details
- the index may not be unique, so a lookup can return multiple rows
- boolean indexing (masking) helps select a certain set of rows
- `.isin()` is useful when building a boolean index
- `.where()` is useful to retain the shape of the original table
- column names and indexes can be set if needed
- to modify the table right away, use `inplace=True`
- aggregate operations can be applied on a `groupby` object
- `dropna()`, `mean()`, or `mode()` are handy ways to pre-process missing data
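The points above can be tried on a small table. The sketch below uses invented city and sales values, with a deliberately non-unique index, and filling missing values with the column mean is only one of several reasonable choices.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {
        "city": ["Pune", "Leeds", "Pune", "Oslo"],
        "sales": [120.0, np.nan, 80.0, 200.0],
    },
    index=["a", "b", "a", "c"],  # deliberately non-unique index
)

# First look at the data.
df.info()
print(df.head(2))

# A non-unique index label returns every matching row.
rows_a = df.loc["a"]

# Boolean indexing (masking) keeps only rows where the condition is True.
big = df[df["sales"] > 100]

# .isin() builds a boolean mask from a list of allowed values.
in_cities = df[df["city"].isin(["Pune", "Oslo"])]

# .where() keeps the original shape; non-matching cells become NaN.
same_shape = df[["sales"]].where(df[["sales"]] > 100)

# Methods return a new table by default; inplace=True changes the object itself.
renamed = df.copy()
renamed.rename(columns={"sales": "revenue"}, inplace=True)

# Missing data: drop the affected rows, or fill with the column mean.
dropped = df.dropna()
filled = df.fillna({"sales": df["sales"].mean()})

# Aggregations apply to a groupby object.
totals = df.groupby("city")["sales"].sum()
print(totals)
```

Note the difference between masking and `.where()`: `big` has fewer rows than `df`, while `same_shape` keeps all four rows and marks the non-matching ones as missing.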
The examples notebook includes:
- Uber taxi drivers
- Apple stock price
- Day or Night
- Students marks
- Balance Calculator
Key learnings:
- `.describe()` is a handy method to get a statistical summary of numerical columns
- one-hot encoding is really helpful for nominal features (which cannot be ordered)
- converting columns to the right datatype helps
- converting data into meaningful numbers helps analysis
- `groupby` is a powerful tool for analysis with DataFrames
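These learnings can be illustrated with a short sketch. The ride data below is made up for this example and is not taken from the notebook; `pd.get_dummies` is used here as one common way to one-hot encode a column.

```python
import pandas as pd

# Made-up ride data; not taken from the notebook.
rides = pd.DataFrame(
    {
        "pickup": ["2021-03-01 08:15", "2021-03-01 22:40", "2021-03-02 09:05"],
        "fare": ["12.5", "30.0", "8.0"],      # numbers stored as text
        "payment": ["card", "cash", "card"],  # nominal feature
    }
)

# Convert columns to the right datatype before analysing them.
rides["pickup"] = pd.to_datetime(rides["pickup"])
rides["fare"] = rides["fare"].astype(float)

# Turn data into meaningful numbers, e.g. the hour of the pickup.
rides["hour"] = rides["pickup"].dt.hour

# Statistical summary of the numerical columns.
summary = rides[["fare", "hour"]].describe()

# One-hot encode the nominal feature: one indicator column per category.
encoded = pd.get_dummies(rides, columns=["payment"])

# groupby for analysis: mean fare per payment type.
mean_fare = rides.groupby("payment")["fare"].mean()

print(summary)
print(encoded.columns.tolist())
print(mean_fare)
```

Without the `astype(float)` step, `fare` would stay a text column and would be left out of the numerical summary and the mean.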
Cheat sheet
Download cheat sheet pdf from here
For more details about pandas, look at the documentation reference.
Keep learning!







