Missing RulesKey values when loading the data warehouse database from Microsoft’s Data Migration Assistant
I have been recently working on preparing a migration readiness assessment for a customer’s on-prem SQL servers to Azure. As is often the case, Microsoft provides a variety of tools that help our day-to-day job. In this case, we were using the Database Migration Assistant (DMA). The tool itself does what it says on the can – assesses the SQL databases for migration. It does give you the answers, but it has no visual representation of the SQL estate and its readiness for migration. One of the chapters related to DMA talks about consolidating assessment reports and presenting them in a form of PowerBI Dashboards.
The process itself is a bit more involved than just simply running the DMA itself, but once the individual parts are in place, it’s much more simple to re-run the assessments if needed to see the progress of the readiness. The initial tests looked good, and the result from the test database server provided the information as promised. But once we started to add production database servers, we came across an issue when trying to load the data from the reporting database to the warehouse database.
The problem we found was when we were stepping through the LoadWarehouse.sql script. In step 3, the INSERT command, which relates to a complex SELECT with JOIN query, is commented out, and you are supposed to review the results of the SELECT query, before enabling the INSERT command. In the notes for this step, it says to ensure there are no NULL keys (with the exception of one specific column). After adding some production databases, we found that we actually do have NULL values in the RulesKey column.
Values in the RulesKey column reference another table in the database – dimRules, which as the name suggests, contains a list rules the assessment tool is checking against. As there seem to be duplicate RuleIDs, the JOIN that looks up the records in the dimRules table uses both the ID and the Title.
The first step we did was to look into the ReportData table in the Reporting database and to list the different RuleId/Title combinations present in our assessment. We could see that truly there are duplicate RuleIds, but with different Title. As the following picture shows, for example, there are three different rules with Id 71627. Note that in this example, the Title of the rule is prepended with its Id.
Next, we looked into the dimRules table in the Warehouse database and looked up the rules with Id 71627. As soon as the results were displayed, it was clear where the problem lies – in the rules definition table, the Title does not contain the Id prefix, as shown in the following screenshot.
We tried to check the same database in the DMA, and could see that this is displaying the rules in the same way.
This would be an indication that when the DMA is generating the JSON file, it is saving the Title with the RuleId prefixed to it, and we were able to confirm it looking into the generated JSON output file.
Contrary to the JSON file, the createDMAWarehouse.psm1 PowerShell script that is part of the DMA modules, insert the values into the dimRules table without any pre (or post)-fix.
To get around the issue, we decided to clean up the data in the ReportData table of the Reporting database, using the following SQL command:
UPDATE ReportData SET Title = SUBSTRING(title,9,LEN(title)-8) WHERE Title LIKE '%] %'
After we reviewed the results, we found another Title that had similar features as the previous one, but in this case, the Id was appended to the Title.
Using a variation of the earlier SQL command to remove the Id prefix, we also updated records with the appended Id using the following command.
UPDATE ReportData SET Title = SUBSTRING(title,1,LEN(title)-8) WHERE Title LIKE '%]%'
Once we ran the second SQL command, we re-ran the warehouse select query, and this time the results looked much, better, with only a small handful of rows still having NULL as a value in the RulesKey column.
Addressing these few remaining records took a little bit more work, and we had to narrow down the search in the ReportData table using a combination of InstanceName, DatabaseName and ImpactedObjectName fields. Similar to the previous cases, once we had the list of records that had an incorrect Title, we were able to update the ReportData table and finally achieve that there are no more NULL values in the RulesKey column. Soon after we connected the PowerBI template to the data source, we could enjoy the end result of this work.