Python as an “External Tool” for Power BI Desktop: Part 3 – Virtual Environments and TOM Connection

Python as an “External Tool” for Power BI Desktop: Part 3 – Virtual Environments and TOM Connection
Reading Time: 5 minutes

In an earlier post, I introduced some possible uses for Python as an external tool for Power BI Desktop and stepped through the core setup. This post extends the basic External Tool setup found in Part 2.

There are two goals for this post:
1) Show how to run a Python script inside of a virtual environment
2) Connect to a .NET library called the Tabular Object Model (TOM) to get data about a currently open Power BI Desktop data model

Modify [tool].pbitool.json to run in a Python environment

Part 2 showed how to execute Python.exe and run a .py script, but it did not run in an environment. This limitation allows basic scripts, but more complex scripts use code from other modules and packages.

To extend the basic example found in Part 2, I created a sample [tool].pbitool.json file (gist) that shows how to run a Python script within a given environment.

Base (root) environment

I’m using Anaconda, which has a base environment location of C:\ProgramData\Anaconda3.

{
  "version": "1.0",
  "name": "PyTOM",
  "description": "Use Python with Tabular Object Model",
  "path": "C:\\ProgramData\\Anaconda3\\python.exe",
  "arguments": "C:/ProgramData/Anaconda3/cwp.py C:/ProgramData/Anaconda3 C:\\ProgramData\\Anaconda3\\python.exe C:/[PATH TO PYTHON SCRIPT].py \"%server%\" \"%database%\"",
  "iconData": ""
} 

Note that nothing needs to change from the example in Part 2 except for the addition of some arguments prior to the main Python script, server, and database values. This took me some time to work through with a lot of trial and error, so pay attention!

What Changes?

  1. The core path remains your python.exe
  2. Instead of running your intended script first, the arguments define cwp.py first, then the location of your environment, then another python.exe (I tried many variations but never got it to work properly without this structure).
  3. After that sequence, the arguments continue with your main script and the Power BI external tools references for server and database.

Named virtual environment

For a virtual environment apart from base, you need to modify the arguments to use the correct environment. This ensures that the correct packages that are part of that environment are available to you for use with your Power BI script.

Here’s an example for a conda environment named PowerBI. Note that the location of python.exe doesn’t have to change to the conda environment (although it could), but changing the environment portion of the argument is critical.

{
  "version": "1.0",
  "name": "PyTOM",
  "description": "Use Python with Tabular Object Model",
  "path": "C:\\ProgramData\\Anaconda3\\python.exe",
  "arguments": "C:/ProgramData/Anaconda3/cwp.py  C:/Users/DavidEldersveld/.conda/envs/PowerBI C:\\ProgramData\\Anaconda3\\python.exe C:/[PATH TO PYTHON SCRIPT].py \"%server%\" \"%database%\"",
  "iconData": ""
} 

Connect to TOM

The Tabular Object Model (TOM) library for .NET opens Power BI’s data model to external tools. Two pieces are required to allow Python to interface with .NET:
1) Pythonnet package for .NET CLR (pip install pythonnet)
2) Python-SSAS module (ssas_api.py placed in the same folder as the main script you’d like to run)

The python-ssas (ssas_api.py) Python module that facilitates the TOM connection is all the work of Josh Dimarsky–originally for querying and processing Analysis Services. I simply repurposed it for use with Power BI Desktop or the XMLA endpoint in Power BI Premium and extended it with some relevant examples. Everything relies on Josh’s Python module, which has functions to connect to TOM, run DAX queries, etc.

This particular module is only one possible way to use the .NET CLR with Python. For example, you might try alternatives such as .NetBridge to run the pyDotNet package and code your own solution to access the TOM DLL. I have not taken inventory of different methods available to know what’s best or most effecient. The ssas_api.py module worked directly for what I needed though, so I’ve stuck with it.

As long as Power BI Desktop is installed, you should not have to manually obtain the required DLLs. You could get them from Microsoft directly though if needed.

Josh’s code in ssas_api.py loads the Microsoft.AnalysisServices.Tabular.dll and Microsoft.AnalysisServices.AdomdClient.dll from the GAC.

Get Power BI model metadata from TOM

With the various prerequisites in place, you can code against TOM. The complete code is at the bottom of this post and this gist, but the main elements in the first 32 lines:
1) import the ssas_api.py module
2) define a connection to the Power BI Desktop model using the server argument passed from clicking on the icon in External Tools (conn = “Provider=MSOLAP;Data Source=” + str(sys.argv[1]) + “;Initial Catalog=”;” )
3) define a “fake” DAX query that helps load .NET (dax_string = ‘EVALUATE ROW(“Loading .NET assemblies”,1)’
4) runs the DAX query (df = powerbi.get_DAX(connection_string=conn, dax_string=dax_string)
5) completes imports of TOM and AMOMD for Python once they’re loaded and connects to TOM

global System, DataTable, AMO, ADOMD

import System
from System.Data import DataTable
import Microsoft.AnalysisServices.Tabular as TOM
import Microsoft.AnalysisServices.AdomdClient as ADOMD

TOMServer = TOM.Server()
TOMServer.Connect(conn) 

Once connected to the model, the remainder of my sample Python script shows how to load various objects from the model metadata into pandas DataFrames. It’s only meant to be a sample to illustrate what could be possible, but it could be extended to document all of the available metadata if desired.

Coming in Part 4: Executing DAX queries using Python

You may have noticed the get_DAX function used earlier to help load the required .NET assemblies. It’s much more powerful than what’s shown here though. You can evaluate any DAX query that you would want to write manually, copy and paste out of Power BI Desktop’s Performance Analyzer, or copy and paste out of DAX Studio.

The advantages and imaginative scenarios of using Python with Power BI start to increase with the ability to run DAX from Python.

************************************************************************

FULL CODE for PowerBITOMSample.py 
import sys

import ssas_api as powerbi
import pandas as pd

print("Hit Enter to connect and say 'Hi TOM!'")
input()

print('Power BI Desktop Connection')
print(str(sys.argv[1]))
print(str(sys.argv[2]))

conn = "Provider=MSOLAP;Data Source=" + str(sys.argv[1]) + ";Initial Catalog='';"
print(conn)
print()

dax_string = 'EVALUATE ROW("Loading .NET assemblies",1)'
df = powerbi.get_DAX(connection_string=conn, dax_string=dax_string)

print("Crossing the streams...")
global System, DataTable, AMO, ADOMD

import System
from System.Data import DataTable
import Microsoft.AnalysisServices.Tabular as TOM
import Microsoft.AnalysisServices.AdomdClient as ADOMD

print("Reticulating splines...")
print()

TOMServer = TOM.Server()
TOMServer.Connect(conn)

print("Hi TOM...")
print()

# Database info
for item in TOMServer.Databases:
    print("Database: ", item.Name)
    print("Compatibility Level: ", item.CompatibilityLevel) 
    print("Created: ", item.CreatedTimestamp)

DatabaseId = str(sys.argv[2])
PowerBIDatabase = TOMServer.Databases[DatabaseId]

print()

# Define measure dataframe
dfMeasures = pd.DataFrame(
    columns=['Table',
             'Name', 
             'Description', 
             'DataType', 
             'DataCategory',
             'Expression',
             'FormatString',
             'DisplayFolder',
             'Implicit',
             'Hidden',
             'ModifiedTime',
             'State'])

# Define column dataframe
dfColumns = pd.DataFrame(
    columns=['Table',
             'Name'])

# Tables
print("Listing tables...")
for table in PowerBIDatabase.Model.Tables:
    print(table.Name)

    # Assign current table by name
    CurrentTable = PowerBIDatabase.Model.Tables.Find(table.Name)

    # print(type(CurrentTable))
    # print(type(CurrentTable.Measures))

    # Measures
    for measure in CurrentTable.Measures:
        new_row = {'Table':table.Name,
                'Name':measure.Name, 
                'Description':measure.Description, 
                'DataType':measure.DataType,
                'DataCategory':measure.DataCategory,
                'Expression':measure.Expression,
                'FormatString':measure.FormatString,
                'DisplayFolder':measure.DisplayFolder,
                'Implicit':measure.IsSimpleMeasure,
                'Hidden':measure.IsHidden,
                'ModifiedTime':measure.ModifiedTime,
                'State':measure.State}
        #print(new_row)
        dfMeasures = dfMeasures.append(new_row, ignore_index=True)

    # Columns
    for column in CurrentTable.Columns:
        new_row = {'Table':table.Name,
                'Name':column.Name}
        #print(column.Name)
        dfColumns = dfColumns.append(new_row, ignore_index=True)

print(dfMeasures)
print(dfColumns)

input()
Paid Links

As an Amazon Associate I earn from qualifying purchases.

2 Comments

  1. That’s cool ! David.
    So we can use python to get the metadata of Power BI Analysis Services model
    But what I’m curious about is how we can write-back the change to the model using python,
    just like what Tabular Editor doing, such as create a new measure or create a new calculation group.
    Any solution about this? 🙂

    1. Yes, still two posts away. Covering DAX querying in Part 4 and then model write in Part 5 to close out the series.

Leave a Reply

%d bloggers like this: