Site icon Learn by Insight…

pandas – get started with examples

This is to get started with pandas and try few concrete examples. pandas is a Python based library that helps in reading, transforming, cleaning and analyzing data. It is built on the NumPy package.

pandas is a fast, powerful, flexible and easy to use open source data analysis and manipulation tool, built on top of the Python programming language.

https://pandas.pydata.org


Key data structure in pandas is called DataFrame – it helps to work with tabular data translated as rows of observations and columns of features.

Download or fork entire Jupiter notebook from my GitHub to play around: https://github.com/sandeep-mewara/python-examples

pandas basics includes:

# .info(), .head(), .sample are handy method to use first off with dataframe to get a high level details

# index may be not unique – can return multiple values

# boolean indexing (masking) can help select certain set of rows

# .isin() is a useful when building a boolean index

# .where() is useful to retain shape of the original table

# Column names & Indexes can be set if needed

# to modify the table right away, use inplace=True

# aggregate operations can be applied on a groupby object

# dropna(), mean() or mode() are handy ways for pre-processing missing data

Key learning’s …

Examples notebook includes:

# .describe() is a handy method to get the statistical summary of numerical columns

# one-hot-encoding is really helpful for nominal features (that cannot be ordered)

# converting the columns into right datatype helps

# converting data into meaningful numbers help for analysis

# groupby is a powerful tool with dataframes for analysis

Key learning’s …

Cheat sheet

Credit: Pandas website

Download cheat sheet pdf from here
For more details about pandas, look at the documentation reference.

Keep learning!