Directions EMEA 2019 – Session “NAV/SQL Performance in a Nutshell” (NSFW)

Title:
NAV/SQL Performance in a Nutshell

Warning:
This session is going to be rude, insulting and slightly pornographic!
NOT SAFE FOR WORK!

Skill:
100-200, Business

Abstract:
No worries, this session will not bother you with annoying technical details about performance optimization. I will try to describe – in simple terms – the general challenges, approaches and necessity of tuning and troubleshooting.
This should help you – Dynamics Partners, Resellers and Consultants – to explain to your customers, why there actually are “performance issues” and how they could be resolved (mostly).
I will also somewhat spoil your “Cloud-enthusiasm”, explaining the risks and limitations in terms of performance-optimization.

Slides:
NAVSQLPerformance_Nutshell_DirectionsEMEA2019

Directions EMEA 2019 – Workshop “SQL Server Configuration & The Magic 7”

Title:
SQL Server Configuration & The Magic 7

Skill:
200-300, Technical

Abstract:
First we will configure an SQL Server instance according to common best-practice, to establish a solid basis for good performance.
Then I will walk you through just 7 generic* TSQL scripts which will help you to easily identify bottlenecks and to fix several crucial problems.
*) these scripts work (more or less) from SQL version 2005 to 2017 independent of any application (no matter if NAV, BC, AX, CRM, … you name it)!

Prerequisites:
Bring your own Laptop! You need to have a locally pre-installed up-running SQL Server installation – any SQL Version (2005 to 2017) with an application of your choice (preferably NAV version 4.0 SP3 to BC).
The SQL “Edition” should be at least STANDARD, DEVELOPER or ENTERPRISE (not EXPRESS); incl. SQL Agent, Management Studio; maybe also SQL Profiler and “Books Online”.
You probably will not be able to access any remote installation through internet etc. but you could give it a try.

You’ll find all the required scripts here:
https://sourceforge.net/projects/magic7/
https://sourceforge.net/projects/ixadder/
https://sourceforge.net/projects/maintenancetrident/

Slides:
Maint3dent_and_Magic_7_DirectionsEMEA2019

Magic 7

Long time no see – it’s about time!

Today – 1st August 2019 – I decided to proudly publish the “Magic 7” scripts to the public!

7 Generic TSQL Scripts to Troubleshoot SQL Performance Problems

The M7 bunch of scripts should help the EXPERIENCED Troubleshooter to quickly identify bottlenecks and to develop/implement appropriate solutions.

Download Magic 7

The M7 scripts are published under GNU General Public License (V3, 29 June 2007)

It’s a collection of some basic but essential stuff I use daily (among other tools) for many years to get my job – troubleshooting & performance optimization – properly done.
Admittedly, these thingies are just doing the “grunt work”, of course there’s much more to do in terms of database tuning …
These scripts are more or less permanently “under construction” and constantly expanded and enhanced, e.g. targeting new issues or dealing with new releases of SQL Server.

I want to encourage you to share your experiences with the “Magic 7” and maybe contributing to it, making it better with every mission!

Cheers,
Jörg

NAV TechDays 2017: C/AL – Coding for Performance

Abstract:

There are several misunderstandings, confusions or “fake-news” around, when it is about C/AL programming and how it affects performance on SQL Server. Time to clean this up!

In this session we will explain some “DOs & DONTs” of programming NAV; showing best-practices in using certain methods and how to avoid common mistakes.

Further we will give some guidance about how to identify and troubleshoot query and code bottlenecks. Make your C/AL code rock!

Presented by Jasminka Thunes and Jörg Stryk

Presentation Slides & Scripts: NAV TechDays 2017.zip

Download Session Recording from mibuso.com: https://mibuso.com/downloads/nav-techdays-2017-c-al-coding-for-performance

Watch Session on YouTube:

Directions EMEA 2017: NAV/SQL Performance in a Nutshell

Please find attached the presentations-slides (“Safe for Work” Edition 😉 ):

Abstract:
No worries, this session will not bother you with annoying technical details about performance optimization! I will try to describe – in simple terms – the general challenges, approaches and necessity of tuning and troubleshooting. This should help you – Dynamics Partners, Resellers and Consultants – to explain to your customers, why there actually are “performance issues” and how they could be resolved (mostly); also highlighting the value of upgrading and potential risks with Cloud scenarios.

EMEA2017_PerformanceNutshell.zip

General network issues with Windows and VMXNET3

Some time ago I blogged about a strange CPU/Thread problem which affected various NAV versions (2013-2016) in virtual environments;

see: http://blog.stryk.info/2016/12/06/navsql-is-completely-stalled-because-of-too-many-cpu-threads/

This especially affected VMWare machines, specifically running the vmxnet3 network-adapter.
I recently got some info, that this is also a general issue of Windows with this adapter!
Changing some settings of the network-adapter seem to help, stabilizing the system and boosting performance!

Details here:
Windows Server 2012: http://lifeofageekadmin.com/network-performance-vmxnet3-windows-server-2012-r2/
Windows Server 2016: http://lifeofageekadmin.com/network-performance/

TEST THOROUGHLY! NO GUARANTEE. NO WARRANTY. NO SUPPORT.

NAV Talk – NAV/SQL Server Sizing & Konfiguration

Teaser zum NAV Talk “NAV/SQL Server Sizing & Konfiguration” vom 26.01.2016

Inhalte:

  • SQL Server & Dynamics NAV Service Tier: Sizing & Monitoring
  • Konfiguration SQL Server
  • Konfiguration Dynamics NAV Service Tier

(c) 2016 MBS Training, Rene Gayer

Der ganze NAV Talk ist im Rahmen eines Abonnements bei http://www.learn4nav.com/ verfügbar.

NAV/SQL is completely „stalled“ because of too many CPU Threads

Well, I was confronted with the following issue about a dozen times just in the past two months, so I decided that’s probably worth a BLOG article (hey, it’s “Santa Claus” today 🙂 )!

Symptom:

NAV and SQL Server stop responding, the system completely hangs, you cannot even access SQL Management Studio anymore. IF you can access SSMS you’ll see lots of processes/connections waiting on ASYNC_NETWORK_IO, seemingly blocking themselves.
With some luck, the system recovers within several minutes; if not, you need to reboot the Dynamics NAV services.

This only happens on NAV 2015 or 2016, running on builds released after April 2016 and before November 2016. So far I‘ve seen that, the problem only(?) happens on VMWare running the network-adapter “vmxnet3”, but I can imagine other adapters could have the same issue.

Cause:

There are more CPU Threads running than Worker Threads are available.

Now you might ask: Huh???

Explanations:

OK, this is going to be a longer shot; starting from the beginning …

The CPU is the component of a NAV/SQL system which is actually doing all the work. In terms of SQL Server, it’s the thingy processing the queries, the “questions”, so to speak, asked by NAV. Hence, the number of required CPU depends on the number of questions asked which depends on the number of Users, asking those questions. The more Users, the more CPU. A common and pretty good Best Practice for a SQL Server running NAV is:

1 logical CPU (Core/Thread) per 25 Users/Sessions

For example, you run NAV with 100 Users you’ll need 4 CPU (as it is with Best Practices, that’s just an educated guess, which may need to be adjusted to the individual requirements).

(BTW: the same formula applies to the CPU sizing of a NST; details not discussed here)

Now there’s something called “Expensive Queries” (XQ); that’s queries which put pressure on RAM and CPU, taking long time, giving bad User Experience. If such an XQ is processed, SQL Server might decide to invoke multiple CPU to process the query with parallel CPU Threads.

The SQL Instance setting “Max. Degree Of Parallelism” (MAXDOP) defines, how many CPU could be invoked. The default value 0 (zero) means “all” CPU. Thus, this means, that just one XQ could invoke “all” CPU of that server, which has the risk, that this single query could become a “global killer” by occupying too many CPU so other tasks could not be processed anymore. That’s why MAXDOP always (on any SQL Server) must be limited! Again, there are Best Practices, but we have to distinguish several things:

Scenario MAXDOP
Rest-Of-The-SQL-World-(non-NAV) Half number of CPU
NAV 2009 R2 and lower 1
NAV 2013 and higher 2

To make a long story short: NAV queries are not really capable to run on parallel threads.
If the setting is OK could be seen from the “Wait Statistics”, e.g. the CXPACKET waits.

Then there’s another setting, affecting the CPU: “Max. Worker Threads” (WT). Imagine this as some kind of “entry channel” to the CPU. The default setting is 0 (zero) which means, SQL Server is running on a built-in matrix, defining the number of WT created:

Number of CPU 32bit Computer 64bit Computer
<= 4 256 512
8 288 576
16 352 704
32 480 960

Usually this is absolutely OK and there’s no need to change anything.

For example:

We have 100 Users/Session, thus 4 CPU, then having 512 WT. Even assuming MAXDOP is still at default 0 (zero), assuming that all 100 Users at the very same time are executing such XQ, so even if all these 100 XQ are invoking “parallelism” with 4 threads each, this would generate a total of 400 CPU Threads, which is still below the “Max. Worker Threads” setting of 512. Even this would not completely kill the CPU. Hence, in real life hardly ever a problem will arise from this.

But when our problem occurs – everything’s stalled – you will see, that way more CPU Threads are running! Actually, there are more Threads running than WT are configured!
If that happens, thus all “entry channels” to the CPU are occupied, the CPU is overwhelmed, does not accept any further processes/task and gets totally unresponsive. The system is completely stalled, you cannot even connect SSMS or stuff.

You can check the configured WT and the Current Threads with this query:

SELECT max_workers_count as [Configured_Worker_Threads] FROM sys.dm_os_sys_info
SELECT SUM(current_workers_count) as [Running_Worker_Threads] FROM sys.dm_os_schedulers

So, how come? Assuming the SQL Server is properly sized and configured, how can there be more Threads than Workers defined??? Shouldn’t SQL Server limit the number of Threads?

Well, reason is, that also within the network-layer multiple CPU threads can be invoked (remember the good old OSI-model!?). If large result-sets are communicated over the network, the net-adapter could also invoke multiple CPU to handle the traffic quicker – the feature is called “Large Receive Offload” (LRO).
So it looks like – and the exact reasons are not fully clear to me, as this happens really deep in the communication layer – that a certain combination of NAV build (and the used “Multiple Active Result-Set” (MARS) technology) and network adapter is generating this huge number of threads (MARS ==> ASYN_NETWORK_IO), if large result-sets are transmitted between NST and SQL. As a result, all these threads totally overwhelm the CPU and the whole system goes “south” …

So, what can we do about this?

Workaround:

Completely suppress parallelism of the SQL Server instance:

Max. Degree Of Parallelism = 1
Cost Threshold For Parallelism = 60

Increase the number of configured Worker Threads:

Max. Worker Threads = 2048

EXEC sys.sp_configure N'max degree of parallelism', N'1'
GO
EXEC sys.sp_configure N'cost threshold for parallelism', N'60'
GO
EXEC sys.sp_configure N'max worker threads', N'2048'
GO
RECONFIGURE WITH OVERRIDE
GO

Plan B:
Change your network adapter; e.g. in VMWare this only seem to happen with “vmxnet3”; using the older “E1000” should not cause that problem.

This is dealing with the symptoms, not the cause. So additionally you need to identify the “Expensive Queries” which are triggering this mess! Hence, query- and index-tuning is required; maybe programming changes.

Resolution (supposedly):

Well, as this problem only affects certain NAV builds, we could already guess this whole mess is yet another bug in the NST. Well, it is … uh … was: it should be fixed with these Cumulative Updates:

CU25 for NAV 2015 (Build 47254) of 07.11.2016
https://blogs.msdn.microsoft.com/nav/2016/11/07/cumulative-update-25-for-microsoft-dynamics-nav-2015-has-been-released/

CU13 for NAV 2016 (Build 47256) of 07.11.2016
https://blogs.msdn.microsoft.com/nav/2016/11/07/cumulative-update-13-for-microsoft-dynamics-nav-2016-has-been-released/

Hope this could help you!

No warranties, no guaranties, no support.

Cheers & Merry Christmas!

Directions EMEA 2016 – SmartQuery Tuning

NAV/SQL Performance – Optimizing [not-so?] Smart-Queries

Abstract:

Query- and Index-Tuning in general are the most important aspects in optimizing performance in Dynamics NAV. Since NAV 2013 here a new “challenge” was introduced: SmartQueries! These SQ often turn out not being so smart at all, then causing seriously degraded performance and user-experience.
In this session it will be explained how to troubleshoot such issues, discussing options about diagnosis and resolution; also highlighting the special challenges in Cloud based scenarios.


 

Please find attached the related download-package containing the PowerPoint slides, Examples and essential Tuning Scripts.

Directions EMEA 2016.zip