Many people use Python with notebooks, so let’s take a look at one possible way to enable a Jupyter external tool for Power BI Desktop. The following stepwise approach begins with simply opening Jupyter. It then progresses to creating and opening a notebook that includes Power BI’s server and database arguments. Finally, it downloads a notebook definition contained in a GitHub gist and connects to Power BI’s tabular model, which is where this approach starts to become genuinely useful.
This post continues a series of posts related to Python and Power BI. The first three parts of this blog series introduced some possible uses for Python connected to a Power BI model, how to set up a basic Python external tool, and how to both use it with a virtual environment and connect to the Tabular Object Model.
Open Jupyter: Foundational But Not Useful
These examples use the Anaconda distribution, and the first step is simply starting and opening vanilla Jupyter Notebook (not JupyterLab). We only need a pbitool.json file at this stage because the goal is simply to launch Jupyter–not even a notebook file. It’s not a useful example on its own, but it provides a foundation on which we’ll build.
As discussed in Part 2, pbitool.json files define Power BI Desktop’s external tools. If you look at the Jupyter shortcut, it provides the inspiration for how to define Power BI’s external tool.
The JSON in this gist would need to be saved as a [tool].pbitool.json file and placed in C:\Program Files (x86)\Common Files\Microsoft Shared\Power BI Desktop\External Tools.
Note that the path points to python.exe, and the arguments load the default Python environment and run the jupyter-notebook-script.py script to launch Jupyter.
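For reference, here is a rough sketch of what such a [tool].pbitool.json could look like. The Anaconda paths and the icon value are placeholders for your own install (the `<you>` segments and the truncated iconData are not real values); the gist linked above has a working version.

```json
{
    "version": "1.0",
    "name": "Jupyter Notebook",
    "description": "Launch Jupyter Notebook from Power BI Desktop",
    "path": "C:/Users/<you>/anaconda3/python.exe",
    "arguments": "C:/Users/<you>/anaconda3/cwp.py C:/Users/<you>/anaconda3 C:/Users/<you>/anaconda3/python.exe C:/Users/<you>/anaconda3/Scripts/jupyter-notebook-script.py",
    "iconData": "data:image/png;base64,..."
}
```

The `path` property must point at an executable, which is why python.exe is the tool and everything else rides along in `arguments`.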
There’s no value in simply opening Jupyter though. At this stage, it’s nothing more than a replacement for a shortcut. The real value is integrating Power BI Desktop more closely with the notebook.
Create and Open a Notebook (Hardcoded)
Taking it a step further, let’s create an empty notebook and write the currently open Power BI Desktop connection information to cells in the notebook. In this case, we need a Python script along with a new [tool].pbitool.json file.
The [tool].pbitool.json file has a few differences from the previous version. Most notably, it executes a Python script that I’ve named powerbinotebook.py and placed in my C:\Temp directory. You’ll also see the server and database arguments that Power BI Desktop uses to pass the current local connection (server:port and database GUID).
```
C:/Temp/powerbinotebook.py "%server%" "%database%"
```
The rest of the arguments start the default Python environment and launch Jupyter, but it’s important that the powerbinotebook.py script runs before Jupyter launches. Why is that?
The Python script defines a nearly empty notebook and then launches it in Jupyter. Pay close attention to notebook_core, which contains the JSON definition for the notebook and writes sys.argv[1] and sys.argv[2] into the notebook for the Power BI server and database respectively. Each item in the cells array will appear as a separate cell in the sample notebook. As required by the pbitool.json definition above, the Python script is located at C:\Temp\powerbinotebook.py, and you could obviously name and save your own to your location of choice.
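The script itself lives in the gist, but the core idea can be sketched as follows. The notebook file name, the C:/Temp location, and the variable names written into the cells are my assumptions for illustration, not the exact contents of the gist:

```python
import json
import subprocess
import sys

def build_notebook(server, database):
    """Return a minimal nbformat-4 notebook dict whose first cells capture the
    server (localhost:port) and database (GUID) passed by Power BI Desktop."""
    def code_cell(source):
        return {"cell_type": "code", "execution_count": None,
                "metadata": {}, "outputs": [], "source": [source]}
    return {
        "cells": [
            code_cell(f"power_bi_server = '{server}'"),
            code_cell(f"power_bi_database = '{database}'"),
        ],
        "metadata": {},
        "nbformat": 4,
        "nbformat_minor": 4,
    }

if __name__ == "__main__" and len(sys.argv) >= 3:
    server, database = sys.argv[1], sys.argv[2]  # %server% and %database%
    # One notebook per Power BI Desktop session, keyed by the local port
    notebook_path = f"C:/Temp/powerbi_{server.split(':')[-1]}.ipynb"
    with open(notebook_path, "w", encoding="utf-8") as f:
        json.dump(build_notebook(server, database), f, indent=1)
    # Launch Jupyter on the freshly written notebook
    subprocess.Popen(["jupyter", "notebook", notebook_path])
```

Because the script writes the .ipynb file before handing it to Jupyter, the server and port are already baked into the first cells when the notebook opens.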
The following video shows the basic script execution, where the Python script creates the notebook, launches Jupyter, and Jupyter launches the notebook.
Create and Open a Notebook Defined in a GitHub Gist
Taking it a step further, let’s shift the notebook definition to a GitHub gist instead of hardcoding it into the Python script. The gist contains the JSON along with placeholders for <<PowerBIServer>> and <<PowerBIDatabase>>, which the Python script will replace with the actual sys.argv[1] and sys.argv[2] values.
Sample gist (not embedded in the post): https://gist.github.com/deldersveld/fdada06b82621233d6e40fac13cceb55
I used the same [tool].pbitool.json file since only the contents of the powerbinotebook.py script change. Note how the script is much shorter now that it no longer contains the notebook definition. This uses the requests library to get the JSON notebook template from the URL defined in gist_url. Note as well how replace swaps in the current values of the Power BI Server and Database.
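The shortened script might look something like this sketch. The raw gist URL shown is a placeholder (the real one comes from the gist linked above), and the output path mirrors the earlier example:

```python
import sys

def render_template(template_text, server, database):
    """Swap the gist template's placeholders for the values
    Power BI Desktop passed on the command line."""
    return (template_text
            .replace("<<PowerBIServer>>", server)
            .replace("<<PowerBIDatabase>>", database))

if __name__ == "__main__" and len(sys.argv) >= 3:
    import requests  # third-party: pip install requests
    # Placeholder raw URL for the notebook template gist
    gist_url = ("https://gist.githubusercontent.com/deldersveld/"
                "fdada06b82621233d6e40fac13cceb55/raw")
    template = requests.get(gist_url, timeout=30).text
    notebook_json = render_template(template, sys.argv[1], sys.argv[2])
    notebook_path = f"C:/Temp/powerbi_{sys.argv[1].split(':')[-1]}.ipynb"
    with open(notebook_path, "w", encoding="utf-8") as f:
        f.write(notebook_json)
```

Keeping the notebook definition in a gist means you can revise the template without touching the script registered with Power BI Desktop.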
Extending the Notebook Definition
Finally, let’s build a slightly more complex notebook using an updated powerbinotebook.py script. As before, no modification is required for [tool].pbitool.json. Only the Python script updates.
Here is the updated notebook Gist for reference (not embedded in post):
As with the last example, the Python script downloads the new notebook definition contained in a different GitHub gist. Also as before, it replaces the <<PowerBIServer>> and <<PowerBIDatabase>> placeholders with the actual values that Power BI Desktop passes in through sys.argv.
What’s new here is that this script downloads the ssas_api.py module mentioned in the Part 3 post, which is one among many possible ways to connect Python to AMO / TOM and other .NET libraries. The updated sample notebook loads the module and demonstrates basic model connectivity.
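As a rough illustration of what "basic model connectivity" can look like from a notebook cell, here is a sketch using pythonnet and the .NET Tabular Object Model. It assumes the AMO/TOM client libraries are installed and loadable (the same prerequisite as Part 3's ssas_api.py); the helper and function names are mine:

```python
def tabular_connection_string(server):
    """Connection string for the local model; `server` is the
    localhost:<port> value Power BI Desktop passes as %server%."""
    return f"Provider=MSOLAP;Data Source={server};"

def connect_to_model(server):
    """Connect to the Tabular Object Model via pythonnet.
    Assumes the AMO/TOM client DLLs are available on this machine."""
    import clr  # pythonnet; imported lazily so the helper above has no .NET dependency
    clr.AddReference("Microsoft.AnalysisServices.Tabular")
    from Microsoft.AnalysisServices.Tabular import Server
    tom_server = Server()
    tom_server.Connect(tabular_connection_string(server))
    # tom_server.Databases[0].Model exposes tables, columns, measures, etc.
    return tom_server
```

From the returned server object, the notebook can enumerate tables or measures in the model that is currently open in Power BI Desktop.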
NOTE that the script intentionally does not overwrite a notebook once it has been created for a specific port number. If you are testing this out and want a fresh template, simply close and re-open Power BI Desktop. Connecting to TOM in this sample also requires a Power BI Desktop model with at least one existing table.
Here’s a look at the sample notebook in action. It illustrates two things: a basic connection to TOM, and the evaluation of a DAX query against the model open in Power BI Desktop.
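For the DAX side, a notebook cell would typically go through the ADOMD.NET client rather than TOM. The following is a hedged sketch, again assuming pythonnet and the Microsoft.AnalysisServices.AdomdClient DLL are available; the function names are mine:

```python
def dax_evaluate(table_expression):
    """Wrap a DAX table expression in EVALUATE so it can run as a query."""
    return f"EVALUATE {table_expression}"

def run_dax(server, query):
    """Run a DAX query against the local model and return rows as dicts.
    Assumes pythonnet plus the ADOMD.NET client library."""
    import clr  # pythonnet
    clr.AddReference("Microsoft.AnalysisServices.AdomdClient")
    from Microsoft.AnalysisServices.AdomdClient import (AdomdCommand,
                                                        AdomdConnection)
    conn = AdomdConnection(f"Data Source={server}")
    conn.Open()
    try:
        reader = AdomdCommand(query, conn).ExecuteReader()
        rows = []
        while reader.Read():
            rows.append({reader.GetName(i): reader.GetValue(i)
                         for i in range(reader.FieldCount)})
        return rows
    finally:
        conn.Close()
```

A call such as `run_dax(power_bi_server, dax_evaluate("VALUES('Date'[Year])"))` would return the distinct years from a Date table in the open model, assuming such a table exists.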
With a live connection to the Power BI Desktop model from Python, you can unlock a lot of possibilities. As described in Part 1, there are many examples ranging from data visualization to updating the model. I’m curious to hear your thoughts on how you might want to use Python in Power BI.
Let me know in the comments, and don’t forget to subscribe to the DataVeld blog.