
Track Data Factory scheduled triggers inside PowerBI Report

I’m assuming you already have:
1. A DevOps organization with the project’s service connection set up (i.e. permissions to view your Data Factory).
2. PowerBI Desktop installed, or a license to use PowerBI online.

Currently there is no way to see all scheduled triggers in Data Factory at a glance. You can check them one by one, but if you have a large number of triggers and they change from time to time, it’s best to have a solution in place for monitoring the schedules. This is especially important when you deploy your Data Factory, because deployment requires all triggers to be in a stopped state. The solution is to get the scheduled trigger information from a PowerShell script run by a scheduled DevOps pipeline, save it to a storage account as JSON, then read this file from PowerBI, create a dashboard, publish it and schedule the dataset refresh.

Alternatively, if you’re not comfortable using the mentioned tools and services, you can use Bash instead of PowerShell, an Automation Account instead of a DevOps pipeline and Excel instead of PowerBI. The idea is the same regardless of toolkit.

1.1 Go to your DevOps organization, select a project, then go to Pipelines > Releases, click on Create new pipeline and choose Empty job.

1.2 Select “Add an artifact” and point it to where your DevOps scripts are located. (Or, if you don’t have a specific place for that, you can create a new repo and start adding your scripts there.)

Then click on “Schedule not set” and set up a schedule for when you’d like the pipeline to run in order to refresh the information.

1.3 Now that you’re ready with the artifact setup, go to your Stage and add a new “Azure PowerShell” task.

Now you need to grant permissions to your Azure subscription. If you’re not an admin of the required subscription, you can ask your admin to create a new service principal that can access the required Data Factory (or Data Factories), then go to the project’s service connections and add that service principal there. This way you’ll be able to use the service principal’s authorizations.
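If you need to create that service principal yourself and you have the rights to do so, a minimal sketch could look like this (the display name and the IDs in the scope are placeholders you’d replace with your own):

# Create a service principal for the pipeline (hypothetical display name)
$sp = New-AzADServicePrincipal -DisplayName "adf-trigger-monitor"

# Give it read access to the Data Factory only; Reader is enough to list triggers
New-AzRoleAssignment -ObjectId $sp.Id -RoleDefinitionName "Reader" `
    -Scope "/subscriptions/<subscription-id>/resourceGroups/<resource-group>/providers/Microsoft.DataFactory/factories/<data-factory-name>"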

Select the Script File Path of the script that gets the required information, and for Script Arguments pass the resource group name and the Data Factory name.

Then choose to use the latest installed Azure PowerShell version.

2. Finally, add the PowerShell script below at the selected Script File Path.

param
(
    [parameter(Mandatory = $true)] [String] $resourcegroup,
    [parameter(Mandatory = $true)] [String] $datafactory
)

# Export all started schedule triggers of the given Data Factory as JSON
Get-AzDataFactoryV2Trigger -ResourceGroupName $resourcegroup -DataFactoryName $datafactory |
    Where-Object { $_.RuntimeState -eq "Started" } |
    Where-Object { $_.Properties -like "*schedule*" } |
    ConvertTo-Json -Depth 5 |
    Out-File adftriggers.json

# Upload the file with azcopy (D:\a\r1\a is the default working directory of a hosted release agent, where Out-File just wrote the file)
azcopy copy 'D:\a\r1\a\adftriggers.json' '{address to storage account + SAS token}'

Where the address is the place where you’d like to put the JSON file containing the Data Factory scheduled trigger information. It should look something like this:

https://ivotalkstech.blob.core.windows.net/mycointainer/adftriggers.json{SAS TOKEN}

If you’re unsure how to get the SAS token, go to your storage account, locate the directory/file, click on the context menu (i.e. the three dots), select Generate SAS, give all permissions, configure an expiry date and click on “Generate SAS token and URL”.

The generated “Blob SAS URL” is the location you’ll use for azcopy.
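If you’d rather create the SAS token from PowerShell than from the portal, something like this should do it (the resource group name is a placeholder; the storage account and container match the example URL above):

# Get the account key and build a storage context
$key = (Get-AzStorageAccountKey -ResourceGroupName "<resource-group>" -Name "ivotalkstech")[0].Value
$ctx = New-AzStorageContext -StorageAccountName "ivotalkstech" -StorageAccountKey $key

# Create a full SAS URI scoped to just the adftriggers.json blob, valid for 6 months
New-AzStorageBlobSASToken -Container "mycointainer" -Blob "adftriggers.json" `
    -Permission "rcw" -ExpiryTime (Get-Date).AddMonths(6) -Context $ctx -FullUri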

You can also store this as a Key Vault secret and retrieve it from there.
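For example, with the Az.KeyVault module (the vault and secret names below are placeholders, and -AsPlainText needs a recent module version), the script could fetch the SAS URL at runtime instead of hard-coding it:

# Pull the SAS URL from Key Vault so it never lives in the repository (hypothetical names)
$sasUrl = Get-AzKeyVaultSecret -VaultName "<your-key-vault>" -Name "adf-triggers-sas-url" -AsPlainText
azcopy copy 'D:\a\r1\a\adftriggers.json' $sasUrl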

3. Open PowerBI Desktop and select Get data > Other > Web. It’s important to use Web rather than the built-in Azure Storage connectors, because those connectors authorize with the account access key, and we don’t want to grant permissions to the whole storage account, only to this one specific file. That’s why I highly recommend using a SAS token with the Web connection.

3.1 Now, in the connection configuration, add the link to the file and then append the SAS token.
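If you want to sanity-check that the file link and SAS token work before pointing PowerBI at them, a quick PowerShell call against the same URL is enough:

# Returns the exported trigger objects if the URL and SAS token are valid
Invoke-RestMethod -Uri 'https://ivotalkstech.blob.core.windows.net/mycointainer/adftriggers.json{SAS TOKEN}'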

3.2 Now connect to the JSON file and, once it’s loaded, go to the “Properties.Recurrence.Schedule.WeekDays” column, click on Extract Values and select the Comma delimiter.

3.3 Then right-click the column, select “Replace Values” and replace the numbers with day names, where 0 is Sunday, 1 is Monday and so on.
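For reference, the full mapping is below, written as a PowerShell hashtable in case you’d prefer to do the replacement in the export script rather than in PowerBI:

# Numeric weekday values and the day names to replace them with
$weekDayNames = @{
    0 = "Sunday"; 1 = "Monday"; 2 = "Tuesday"; 3 = "Wednesday"
    4 = "Thursday"; 5 = "Friday"; 6 = "Saturday"
}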

3.4 Now remove the unnecessary columns depending on your preference; for instance, I’m removing everything except:

  • Name
  • RuntimeState
  • Properties.Recurrence.Frequency
  • Properties.Recurrence.Schedule.WeekDays
  • Properties.Recurrence.Schedule.Hours.0
  • Properties.Recurrence.Schedule.Minutes.0

3.5 Click on Close & Apply, then go to any reported errors and click on “Keep Rows > Keep Errors” to inspect them.

In my case I’m dashboarding the weekly triggers, but I have 3 triggers that run on a daily or monthly basis instead of weekly: the daily ones run on all days and the monthly ones are based on a date of the month, so their WeekDays column shows errors. I’m ignoring those errors and keeping the rest of the information; I don’t want to remove these rows.

3.6 Finally go ahead and click Publish. This will publish the report online in the PowerBI service.

4. As a next step, you’d like to refresh this dashboard after every DevOps pipeline run so the information stays up to date. Sign in to the PowerBI online service, find your report, select your dataset and click on Settings:

Finally, go to the dataset’s Scheduled refresh and configure it. You’d like the scheduled refresh to run at least a few minutes after the scheduled run of your DevOps pipeline.
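Alternatively, if you’d rather not depend on a fixed time offset, you could have the pipeline trigger the refresh itself through the Power BI REST API; a sketch using the MicrosoftPowerBIMgmt module, with placeholder workspace and dataset IDs:

# Sign in and ask the Power BI service to refresh the dataset right after the export
Connect-PowerBIServiceAccount
Invoke-PowerBIRestMethod -Method Post -Url "groups/<workspace-id>/datasets/<dataset-id>/refreshes"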

Now you have a PowerBI report that refreshes on a weekly basis and has all the information about the scheduled triggers in Data Factory.

Stay Awesome,
Ivelin
