Integrating Power BI

Expand your Power BI Report capabilities

Pulling Multi-Page Website Data into Power BI with Power Query

When I was building a report for a Microsoft Data Challenge, I wanted to analyze Esports data over a 10-year timespan. I found a website, esportsearnings.com, that had what I needed. The site tracked yearly tables of the top players and games, showing such data as player earnings, rankings, game tournaments, etc. The problem was that the data was split between different webpages for each year and data category.

By default, this could have meant going to each page to extract the data. Another option I considered was web scrapping but I wanted to avoid this. The site did have an API but time was of the essence. I found an easy and simple solution using Power Query and the web connector. This allowed me to connect to a specific page to pull the data. But by expanding the M code slightly, I could use a looping query that pulled in every year’s page automatically and combined the results into one clean table.

And this is where ChatGPT really came in handy. It did most of the work creating the looping query and I just adjusted it for each category I wanted to pull.

Exploring the Data Source

The website had a consistent structure which made looping with Power Query a perfect solution. This included:

  • A table for each year, showing the top 100 items by category (separated by players, games, countries, etc.).
  • A URL pattern that followed a predictable format, for example:
    • https://www.esportsearnings.com/history/2014/top_players
    • https://www.esportsearnings.com/history/2015/top_players
    • https://www.esportsearnings.com/history/2016/top_players

That predictable URL pattern was the key to automation. Below is an example from the page showing top players for 2024.

Connecting the First Page

I started by using the Web connector in Power Query and pasting the URL for a specific category and year. See image below:

Power Query will display the table(s) available. In my case, there were several options available, but the first one was the one I needed.

This will pull in the table of data which covered the first 100 players for 2024.

Looping Through All Years

I wanted to avoid having to repeat the process for ever year and for every category. I knew Power Query could loop through this and this is where ChatGPT really shined. I simply copied the full M code from my initial query (described above) and asked ChatGPT to create a loop for me. Here was my chat:

ChatGPT was able to quickly create a replacement query that would loop through all 10 years and then combine them into one dataset.

The result was a complete dataset covering Esports player earnings from 2014 through 2024 — all stacked into a single table. I could then duplicate this same process for the other categories that I needed.

Conclusion

In summary, what started as a potentially tedious task, pulling data across many web pages, turned into simple solution with just a few lines of M code in Power Query. And, leveraging the power of AI made this even easier to create.