Step by Step: Using Power Query to Replace Text in Table Headers

Reading Time: 4 minutes

Recently, I came across a dataset involving traffic accidents that I really liked and wanted to clean up for a demonstration.  As with most datasets originally pulled from the web or from text files, there were various issues before it could be used.  Some items would have prevented the type of analysis that I wanted.  These included having units stored with potential measures, i.e. “70 miles per hour” instead of 70.  Other changes were preferred more out of convenience, such as the desire to remove spaces from the column headers.

While I have used Power Query for a number of projects, I had only previously replaced text in table records using Table.ReplaceValue() and various Text functions.  These functions do not operate on headers directly.  As an aside, I rue the lack of M support for regular expressions every time I replace text in Power Query.  In any case, I wanted to document a way to change text in table headers only.  Here is a walkthrough of my M query:

  1. Load the source data into Power Query.  Note the ultimate target of elimination–the space in the column headers.
  2. Demote the headers.
  3. At this point, there is a choice.  Many people may choose to transpose the table now and put the column headers into Column1.  This would be followed by a Table.ReplaceValue function to remove the whitespace and another transpose to put the column headers back.  On a particularly wide table, this route also happens to lead to an Out of Memory Exception.  Sometimes transposing an entire table with tens of thousands of records or more as well as 53 columns is a really bad idea.  Trust me.PQ-OutOfMemory
  4. Pending some magical M function that I am (hopefully) missing, what is a better route?  Instead of transposing the entire table, transpose only the row of headers that is needed.  To get there, create a copy of the original table.  While not ideal, it is a refreshing alternative to an out of memory exception.  In the Advanced Editor, simply add a new step referencing the previous one, i.e “HeaderTable = DemoteHeaders”.
  5. Next, keep only the target row of headers in the new table using the Table.FirstN() function.
  6. The header row is now ready to be transposed using the Table.Transpose() function.
  7. Here is the key step where the whitespace disappears using the Table.ReplaceValue() function.
  8. Once the whitespace has been removed, transpose the table again to restore the headers as a row.
  9. At this point, the table with the single header row needs to be combined with the original table.
  10. Once the tables are combined, you can see the altered headers and the original headers.  To remove the duplicate header, promote the first row to headers using the Table.PromoteHeaders() function.
  11. Finally, use the Table.Skip() function to remove the original headers that now appear in the first row.

There you have it!  Perhaps there is a better way.  Hopefully, there is a better way, and I am neglecting a simple function somewhere.  I would appreciate it if someone shared an improvement compared to these steps.

    Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
    DemoteHeaders = Table.DemoteHeaders(Source),
    HeaderTable = DemoteHeaders,
    HeaderFirstRow = Table.FirstN(HeaderTable,1),
    HeaderTransposed = Table.Transpose(HeaderFirstRow),
    HeaderRemoveWhitespace = Table.ReplaceValue(HeaderTransposed," ","",Replacer.ReplaceText,{"Column1"}),
    HeaderTransposed2 = Table.Transpose(HeaderRemoveWhitespace),
    CombinedTables = Table.Combine({HeaderTransposed2,DemoteHeaders}),
    PromoteHeaders = Table.PromoteHeaders(CombinedTables),
    RemoveFirstRowFormerHeaders = Table.Skip(PromoteHeaders,1)


1 Comment

Leave a Reply

%d bloggers like this: