Under the radar: Unravelling Microsoft Defender’s impact on Azure SQL
Thoroughly testing security configurations before deployment is key to ensuring optimal performance when it comes to SQL Server.
By Iain Evans, Senior Reliability Engineer
Thoroughly testing security configurations before deployment is key to ensuring optimal performance when it comes to SQL Server. This was highlighted by a problem one of our customers had recently where a default setting in Microsoft Defender for SQL Server, designed to protect against advanced threats, inadvertently caused performance degradation across multiple Azure SQL Server instances.
This particular customer was grappling with timeouts on multiple SQL Server instances. We kicked into troubleshoot mode, following the standard DBA playbook, yet couldn’t find an obvious cause. Something was definitely off-kilter because:
- Page Latches were shooting up, stealing the spotlight from CXPACKET as the primary wait stat. There was no tempdb contention, and Azure-based disks weren’t showing any I/O bottlenecks or throttling.
- The SQL Plan hadn’t changed, according to Query store, but the average duration had risen by a significant margin.
- The servers’ average CPU utilisation had spiked, yet there was no uptick in traffic. This was strange, considering overall CPU utilisation was comfortably below 50%, and there were no signs of CPU contention.
- Even after capturing a profile with parameters and giving it a rerun, the performance was inconsistent, meaning parameter sniffing was ruled out as a cause.
All these servers were running a single stored procedure, and the code hadn’t been altered. It was soon clear to us that something beyond SQL was messing with performance.
Uncovering the root cause
To draw out the culprit, we compared server configurations against an Aireforge snapshot. We immediately detected an extended event session called SQLAdvancedThreatProtectionTraffic had recently been deployed, which is done by default with Azure Defender for SQL Server (more here).
The main problem with this extended event session is that it records every single SQL query executed on the server. While extended events are generally lightweight and asynchronous by design, meaning it’s not usually an issue, this approach proved to be overwhelming for a high-volume transaction system. The excessive load generated by capturing and transmitting every single SQL query to Azure ultimately overloaded the servers. In addition, because extended events do not appear in query plans or process explorer, the focus of our initial troubleshooting on reviewing query plans and parameters sent us down a rabbit hole.
Performance effects
According to the Microsoft playbook, Azure Defender for SQL normally takes a modest 3% of CPU during peak times. In this case, it was 12%. This seems to be a common performance effect when deploying Microsoft Defender for Azure SQL. What was more worrying was the increase in waits.
Disabling Defender and hitting pause on the extended event session worked like magic. Total waits in SQL server dropped from 1.12 s/s to 0.22 s/s, bringing the average execution time of the stored procedure from seconds down to a snappy 5 ms.
The friendly reminder
Now, before you get the idea that we’re advising a total shutdown on antivirus (we’re not), it’s more about embracing the importance of testing changes before rolling them out. If you’re diving into third-party tools, Microsoft has your back with a detailed list of exclusions to keep antivirus from nudging SQL Server. The cheat sheet can be found here.
To sum it up – a misconfigured antivirus can play tricks on your SQL server performance, even if it’s a Microsoft creation running on Microsoft hardware. Take a moment to scrutinise your antivirus setup and keep an eye out for any performance head-scratchers.
Question?
Our specialists have the answer