Tuesday, December 22, 2020

Python Jupyter Notebooks in Excel

 

Embed Jupyter into Microsoft Excel and write Python instead of VBA




It used to be an “either/or” choice between Excel and Python Jupyter Notebooks. With the introduction of the PyXLL-Jupyter package now you can use both together, side by side.

In this article I’ll show you how to set up Jupyter Notebooks running inside Excel. Share data between the two and even call Python functions written in your Jupyter notebook from your Excel workbook!

Jupyter Notebooks in Microsoft Excel. Video by Author.

Getting Started

First off, to run Python code in Excel you need the PyXLL add-in. The PyXLL add-in is what lets us integrate Python into Excel and use Python instead of VBA. To install the PyXLL Excel add-in “pip install pyxll” and then use the PyXLL command line tool to install the Excel add-in:

>> pip install pyxll
>> pyxll install

If you’re new to PyXLL then take a look at the online documentation for first time users to help get you started.

Once you have the PyXLL Excel add-in installed the next step is to install the pyxll-jupyter package. This package provides the glue between PyXLL and Jupyter so that we can use our Jupyter notebooks inside of Excel.

The pyxll-jupyter package is installed using pip:

>> pip install pyxll-jupyter

Once both the PyXLL Excel add-in and the PyXLL-Jupyter package are installed start Excel and you will see a new “Jupyter” button in the PyXLL tab.

Image for post
Jupyter Notebooks in Microsoft Excel. Image by Author.

Clicking this button opens the Jupyter notebook in a side panel in your Excel workbook. This panel is part of the Excel interface and can be un-docked or docked in a different location by dragging it.

In the Jupyter panel you can select an existing notebook or create a new one. To create a new notebook select the “New” button followed by “Python 3”.

Image for post
Jupyter Notebooks in Microsoft Excel. Image by Author.
Image for post
Jupyter Notebooks in Microsoft Excel. Image by Author.

How is this useful?

Now you have a complete Jupyter notebook running inside of Excel! But what is this good for? How is this better than running a notebook outside of Excel?

Well, now you can use Excel for working with your data and use Python to work on the same data. Use Excel as an interactive playground for organizing and visualizing your data, seamlessly switching to Python for more sophisticated tools.

Use a Jupyter notebook as a scratch-pad for trying out Python code. Write Excel functions entirely in Python in a Jupyter notebook and test them out in real-time. Once you’ve developed a useful re-usable function add it to your PyXLL Python project. That way you can use the same function every time you use Excel.

In the rest of this article I’ll show you how to:

  • Share data between Excel and Python using your Jupyter notebook
  • Write Excel worksheet functions (UDFs) in your notebook
  • Script Excel with Python instead of VBA

Getting data from Excel into Python

Because PyXLL runs Python in the same process as Excel, accessing Excel data in Python and calling between Python and Excel is fast.

To make things as easy as possible, the pyxll-jupyter package comes with some IPython “magic” functions for you to use in your Jupyter notebooks.

%xl_get

Image for post
Excel data to Pandas DataFrames in Jupyter Notebook. Image by Author.

Use the magic function “%xl_get” to get the current Excel selection in Python. Have a table of data in Excel? Select the top left corner (or the whole range) and type “%xl_get” in your Jupyter notebook and voila! the Excel table is now a pandas DataFrame.

The %xl_get magic function as several options:

  • -c or --cell. Pass the address of the cell(s) to get the value of, eg %xl_get --cell A1:D5.
  • -t or --type. Specify a data type to use when getting the value, eg %xl_get --type numpy_array.
  • -x or --no-auto-resize. Only get the data for the selected or given range. Don’t expand to include the surrounding range of data.

PyXLL has other ways of interacting with Excel to read data into Python. The “%xl_get” magic function is just a shortcut to make things easier! As the Jupyter notebook is running in Excel, all other methods (eg using the XLCell class, Excel’s COM API or even xlwings) are still available.

TIP: You can assign a variable to the result of a magic function! For example, try “df = %xl_get”.

Moving data in Python back to Excel

Transfering data the other way around, from Python to Excel, works just as well. Whether you’ve used Python to load a dataset and want to transfer it to your Excel workbook, or if you’ve manipulated a data set from Excel and want the results back in Excel, copying data to Excel from Python is easy.

%xl_set

Image for post
Pandas Dataframe to Excel range. Image by Author.

The magic function “%xl_set” takes a Python object and writes it to Excel. Have a dataframe “df” that you want in Excel? No problem, just use “%xl_set df” and it will be written to the current selection in Excel.

Like %xl_get, %xl_set has a range of options to control its behaviour. You can even use PyXLL’s cell formatting feature to automatically apply formatting at the same time as writing results to Excel.

  • -c or --cell. Address of cell(s) to write the value to, eg %xl_set VALUE --cell A1.
  • -t or --type. Datatype specifier to use when writing the value to Excel, eg %xl_set VALUE --type dataframe<index=False>.
  • -f or --formatter. PyXLL cell formatter object, eg %xl_set VALUE --formatter DataFrameFormatter(). See cell formatting.
  • -x or --no-auto-resize. Don’t auto-resize the range to fit the data. Only write values to the current selection or specified range.

As with %xl_get, %xl_set is meerly a shortcut and all the other ways of writing back to Excel that you might have used with PyXLL will still work in a Jupyter notebook.

Use Python plots (matplotlib/plotly etc) in Excel

One of the great things about working with data is the powerful plotting packages available. Being able to plot a pandas DataFrame with a simple “df.plot()” is awesome!

PyXLL has integration with all of the main plotting libraries so you can make the most of these in Excel too. This includes matplotlib (used by pandas), plotlybokeh and altair.

%xl_plot

Image for post
Plotting charts in a Jupyter Notebook and Excel. Image by Author.

Use “%xl_plot” to draw any Python chart in Excel. Pass it any figure object from one of the supported plotting libraries, or use the last pyplot figure. Using pandas plot works great, eg. %xl_plot df.plot(kind='scatter').

The %xl_plot magic function has some options to control how it works:

  • -n or --name. Name of the picture object in Excel. If using a name of a picture that already exists that picture will be replaced.
  • -c or --cell. Cell address to use as the location for the new picture. If the picture already exists this has no effect.
  • -w or --width. Width of the picture in Excel in points. This has no effect if updating an existing picture.
  • -h or --height. Height of the picture in Excel in points. This has no effect if updating an existing picture.

%xl_plot is a shortcut for the pyxll.plot function.

Call Python functions from Excel

Rather than constantly moving data between Excel and Jupyter and then running some Python code, you can call Python function directly from the Excel workbook!

One of the main use-cases for PyXLL is writing custom Excel worksheet functions (or “UDFs”) in Python. This is used for building models in Excel built from Python functions, which can of course themselves use other Python toolkits like pandas and scipy.

How to Call Python Functions in Excel. Video by Author.

You can write Excel worksheet functions in your Jupyter notebook too. This is a really great way of trying out ideas without leaving Excel to go to a Python IDE.

Try it out for yourself. Write a simple function and then add the “pyxll.xl_func” decorator to your function:

After you’ve entered the code and run the cell in Jupyter that Python function will immediately be available to call from the Excel workbook.

It’s not just for simple functions. You can pass whole ranges of data to your function as pandas DataFrames and return any Python type, including numpy arrays and DataFrames! You can tell PyXLL what types to expect by giving the @xl_func decorator a signature string.

For example, try out the following:

Now you can write complex Python functions to do data transformation and analysis, but orchestrate how those functions are called or sequenced in Excel. Changing the inputs results in the functions be called and the calculated outputs update in real-time, just as you would expect!

Script Excel with Python instead of VBA

Did you know that everything you can do in VBA can also be done in Python? The Excel Object Model is what you use when writing VBA, but the same API is available in Python as well.

See Python as a VBA Replacement from the PyXLL documentation for details of how this is possible.

In a Jupyter notebook running in Excel the entire Excel Object Model is available and so you can script Excel in exactly the same way you might do in the Excel VBA Editor.

Because PyXLL runs Python inside the Excel process there is no performance penalty for calling into Excel from Python. It is also possible to call into Excel from an external Python process, but this is generally much slower. Having a Jupyter notebook running in Excel makes everything more convenient too!

Use PyXLL’s xl_app function to get the “Excel.Application” object, which is equivalent to the Application object in VBA. Try something like getting the current selection and changing the cell interior color. A great way of figuring out how to do something with the Excel Object Model is to record a VBA Macro and then translate that macro into Python! The PyXLL docs page Python as a VBA Replacement has some tips on how to do that.

Image for post
Scripting Excel using Python in a Jupyter Notebook. Image by Author.

Summary

Python makes a poweful alternative to VBA. With PyXLL you can write fully featured Excel add-ins entirely in Python. Excel is an amazing tool for interactive computation. Adding Python and Jupyter takes Excel to a whole new level.

Code written in Jupyter notebooks can easily be refactored into standalone Python packages to create Excel tool-kits to power intutitive workbooks and dashboards. Any Excel user will be able to take advantage of Python tools written using PyXLL without needing any knowledge of Python.

If you want to find out more then download and try PyXLL for yourself or contact me. I’ll be happy to answer any questions you might have and find out how PyXLL can help you!

Links

No comments:

Must Watch YouTube Videos for Databricks Platform Administrators

  While written word is clearly the medium of choice for this platform, sometimes a picture or a video can be worth 1,000 words. Below are  ...