The Easy Way to Extend Pandas API

Writing your own flavor of Pandas

In this article, you’ll learn how to tailor the pandas API to your business, research, or personal workflow using pandas_flavor.
Pandas-flavor is a library that introduces an API for extending Pandas. This API handles the boilerplate code for registering custom accessors onto Pandas objects.
There are plenty of examples of extensions in the wild including:
  • GeoPandas: Pandas for geographic data and information.
  • pyjanitor: data “cleaning” API for Pandas DataFrames.
  • Pandas’ plot API: yes, this is part of Pandas’ core library, but acts as an extension.
  • Many many more
But in this article, I will show its capabilities on a dummy dataset:
import pandas

df = pandas.DataFrame({"value": [5, -5, 45, 65, 30],
                       "gains_and_losses": [5, -10, 50, 20, -35]})

Registering Methods

Pandas-flavor provides the following decorators to register custom methods directly onto Pandas’ DataFrame/Series:
  • register_dataframe_method()
  • register_series_method()
These two decorators allow you to attach a custom method directly to a DataFrame or Series. For example, we can attach a “get_losses” method directly to the DataFrame:
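from pandas_flavor import register_dataframe_method

@register_dataframe_method
def get_losses(df):
    # Keep only the rows where the gain/loss value is negative
    losses = df[df["gains_and_losses"] < 0]
    return losses

# The function is now available as a DataFrame method
df.get_losses()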
However, if everyone starts monkey-patching DataFrames in their libraries with custom methods, it could lead to confusion in the Pandas community. The preferred Pandas approach is to namespace your methods by registering an accessor that contains your custom methods.
Source: https://www.famouslogos.us/images/funny-logos/money-patch-logo.jpg
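
To make the naming-collision concern concrete, here is a minimal sketch that uses plain attribute assignment on the DataFrame class instead of pandas_flavor. The two "libraries" and their function names are hypothetical and exist only for illustration; the point is that the second assignment silently replaces the first.

```python
import pandas as pd

df = pd.DataFrame({"value": [5, -5, 45, 65, 30],
                   "gains_and_losses": [5, -10, 50, 20, -35]})


# Hypothetical "library A": losses are rows with a negative gains_and_losses
def get_losses_a(df):
    return df[df["gains_and_losses"] < 0]


# Hypothetical "library B": same method name, but a different definition
def get_losses_b(df):
    return df[df["value"] < 0]


# Library A patches the DataFrame class
pd.DataFrame.get_losses = get_losses_a
rows_from_a = df.get_losses()

# Library B patches the same name and silently overwrites library A's method
pd.DataFrame.get_losses = get_losses_b
rows_from_b = df.get_losses()

print(len(rows_from_a), len(rows_from_b))

# Clean up so the patch does not leak into the rest of the session
del pd.DataFrame.get_losses
```

The same call, df.get_losses(), returns different rows depending on which library was imported last. Putting each library's methods behind its own accessor name avoids this.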

Registering Accessor

To register an accessor on Pandas’ DataFrame/Series, pandas-flavor provides:
  • register_dataframe_accessor()
  • register_series_accessor()
As an example, here’s a simple “finance” accessor that has a “get_losses” method:
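from pandas_flavor import register_dataframe_accessor

@register_dataframe_accessor("finance")
class FinanceAccessor:
    def __init__(self, df):
        # Pandas passes in the DataFrame the accessor is called on
        self._df = df

    def get_losses(self):
        # Keep only the rows where the gain/loss value is negative
        df = self._df
        losses = df[df["gains_and_losses"] < 0]
        return losses

# The method now lives under the "finance" namespace
df.finance.get_losses()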
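
The Series decorator works the same way. Below is a minimal sketch of a Series-level “finance” accessor; the class name SeriesFinanceAccessor and the total_losses method are made up for illustration. As an assumption for environments without pandas_flavor, the import falls back to the decorator of the same name that pandas ships in pandas.api.extensions.

```python
import pandas as pd

try:
    from pandas_flavor import register_series_accessor
except ImportError:
    # Assumption: fall back to pandas' built-in decorator of the same name
    from pandas.api.extensions import register_series_accessor


@register_series_accessor("finance")
class SeriesFinanceAccessor:
    def __init__(self, series):
        # Pandas passes in the Series the accessor is called on
        self._series = series

    def get_losses(self):
        # Keep only the negative entries
        s = self._series
        return s[s < 0]

    def total_losses(self):
        # Sum of the negative entries (hypothetical helper for illustration)
        return self.get_losses().sum()


gains_and_losses = pd.Series([5, -10, 50, 20, -35], name="gains_and_losses")
losses = gains_and_losses.finance.get_losses()
total = gains_and_losses.finance.total_losses()
```

Because the accessor is registered on Series, it is also reachable from a DataFrame column, for example df["gains_and_losses"].finance.get_losses().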

Last Words

Pandas-flavor is useful when you want to extend the capabilities of Pandas: it lets you attach domain-specific methods and accessors to DataFrames and Series with very little boilerplate.
I hope you found it interesting and useful. I am open to any kind of constructive feedback.

Towards Data Science


Written by Eyal Trabelsi
