Windows and DAWs !
Real World DAW Multiprocessor X-Scaling Performance !
Following up on the recent DAWbench reports that Pete @ Scan completed featuring the AMD Threadripper Pro CPUs, which showed the huge potential overhead available with the HEDT Workstation platforms in the standard DAWbench empirical saturation test results, I thought it would be a good idea to extend the testing across multiple DAW platforms. In particular, I wanted to use a test session that is not part of the standard DAWbench test results, which internally I have dubbed the deal breaker.
I have made the point on numerous occasions that the available overhead is only relevant if the DAWs can utilize the extended logical core counts, which is largely dependent on the respective audio engines and their thread management routines. Session logistics are also a key factor in overall DAW audio engine performance, especially when Groups/Busses are heavily utilized, which has been shown to expose audio engines to disproportionately high stress.
This is a very common session layout for many end users, and is something that I focused on when developing a DAWbench test session to better replicate a Real World production session. Enter DAWbench DSP MIX and MIX-EXT.
DAWbench DSP MIX/MIX-EXT
The DAWbench DSP MIX sessions were developed as a result of an investigation I initiated following up on a series of reports by a close knowledgeable friend who was experiencing odd threading and MP performance scaling on his Real World Mix sessions. The reports are quite detailed and too extensive to cover here, but I will give a summary and an overview in this article to get everyone up to speed. For those that want to dive right down the rabbit hole, you can read the reports at my DAWbench FB page, but it is probably best to read them in better context as I summarized them on my GS thread here, starting at post #1359.
There has always been a vocal sector in the gallery that has continually dismissed the DAWbench test sessions as non-representative of Real World application, as if that is some constant and a given. I have always maintained that the DAWbench sessions are an empirical test regimen that can be used to test scaling of CPU architectures in DAW environments, and that Real World application is dependent on session logistics, but that didn’t quieten the sector.
One complaint was that the tests didn’t represent single core / serial performance, so I included a BUS extension to the DSP test, which only revealed that the respective DAWs arbitrate the serial processing in different ways, not necessarily landing the process on a single core.
All the above still didn’t cover the varied dynamics of Real World sessions regarding tracks/bussing/FX sends etc. Instead of attempting an incremental/empirical Real World type session, I approached a new angle of testing after following and participating in the above noted thread, which was showing comparative performance of a Real World Mix session across a lower core count M1 Max Laptop and a higher core count AMD 7950X desktop system.
This of course opened up some discussion regarding what was happening with the thread management across the respective platforms on an identical RW Mix session. In the course of investigating, I requested a template of the RW session with all audio removed (copyright) but all track and bussing logistics left intact, which I could use to develop a RW test session using easily available Freeware Plugins, replacing the musical and audio elements with the available resources in DAWbench DSP.
Initial versions of the DAWbench DSP MIX sessions in Cubase showed the freshly ported session, developed on my humble 12600K dev system, delivering very similar performance and thread management dynamics to the 7950X. This made no sense. The 12600K, with 6 P-Cores/12 Threads + 4 E-Cores for a total of 16 Threads, was primarily favouring the 6/12 P-Cores, as it theoretically should via Thread Director/Task Scheduler, yet the same dynamic was also showing on an all P-Core 16 Core/32 Thread Ryzen ?!
To add even further to the developing mystery, we discovered there was a trigger point that wakes the E-Cores and also the upper logical cores 17-32 of the Ryzen. The problem was that we hadn’t pinpointed the trigger as yet, and the deeper we tested with ports to other DAWs, the more curves we encountered.
The original RW sessions were also ported to both Reaper and Studio One. These sessions became the standard DAWbench DSP MIX sessions, which have now been ported to additional DAWs: Cakewalk Sonar, Tracktion Waveform and Bitwig Studio.
On top of the standard session templates, I also developed an additional expanded version of the test session with even higher initial DSP overhead, which also includes the facility to apply additional incremental load across busses to test the respective DAW audio engines' threading and scaling capability to break point.
This is DAWbench DSP MIX-EXT, which is the session that is utilised in this testing and report.
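The incremental load to break point idea can be sketched in a few lines. This is only an illustration of the logic, not a DAWbench tool: in practice the plugins are activated by hand in the DAW and the break point is judged by ear and by the DAW's own meters. The names `find_break_point`, `plays_cleanly` and `fake_engine`, and all of the numbers used, are hypothetical.

```python
from typing import Callable


def find_break_point(
    plays_cleanly: Callable[[int], bool],
    step: int,
    max_plugins: int,
) -> int:
    """Return the highest tested count of additional plugins that still played cleanly.

    plays_cleanly(n) is a hypothetical check: True if the session plays back
    without dropouts with n additional plugins active on the busses.
    A result of 0 means only the base session preload played cleanly.
    """
    last_good = 0
    for active in range(step, max_plugins + 1, step):
        if not plays_cleanly(active):
            break  # first failure is the break point, stop adding load
        last_good = active
    return last_good


def fake_engine(active: int) -> bool:
    # Stand-in for a real playback check: this pretend engine copes with
    # up to 130 additional plugins (an invented figure).
    return active <= 130


if __name__ == "__main__":
    # Activate plugins 8 at a time, up to an invented ceiling of 320.
    print(find_break_point(fake_engine, step=8, max_plugins=320))  # prints 128
```

The result is the last count that passed, not the count that failed, which is how a result like "X plugins at a given buffer setting" would normally be read.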
Multiprocessor X-Scaling !
To streamline the report I have only included a selection of CPUs to highlight the delivered X-Scaling when moving from the more widely used higher end Enthusiast level CPUs from Intel and AMD to the HEDT Threadripper. I did however expand the results to include an additional 2 DAWs - Cubase 14 and Studio One 7, as well as the default Reaper 7. I felt this would give a better indication of how the respective DAWs were arbitrating the session logistics involved.
Regarding the CPUs, the most direct head to head comparison is the 9950X v 9750WX, as they are essentially the same Zen 5 core architecture, all P-Cores, and will show the direct results of the respective DAW X-Scaling capability going from 32 to 64 threads (X being the scaling via the increased number of cores). I have also included the Intel 285K, as that is the direct competitor to the Ryzen 9950X.
Total Core / Thread Counts.
AMD Ryzen 9 9950X : 16 P-Cores/SMT : Total 32 Threads
AMD Threadripper Pro 9750WX : 32 P-Cores/SMT : Total 64 Threads
Intel Core Ultra 9 285K : 8 P-Cores/16 E-Cores : Total 24 Threads
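As a quick sanity check on the thread totals listed above, here is a minimal sketch of the arithmetic. It assumes two threads per P-Core when SMT/Hyper-Threading is present and one thread per E-Core; the 285K is entered without Hyper-Threading, which is what its 24 thread total implies. The `Cpu` class and its field names are purely illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cpu:
    name: str
    p_cores: int
    e_cores: int = 0
    smt: bool = True  # True when each P-Core exposes two threads

    @property
    def threads(self) -> int:
        # P-Cores contribute two threads each with SMT, otherwise one.
        # E-Cores always contribute a single thread.
        per_p_core = 2 if self.smt else 1
        return self.p_cores * per_p_core + self.e_cores


CPUS = [
    Cpu("AMD Ryzen 9 9950X", p_cores=16),
    Cpu("AMD Threadripper Pro 9750WX", p_cores=32),
    Cpu("Intel Ultra 9 285K", p_cores=8, e_cores=16, smt=False),
    Cpu("Intel 12600K (dev system)", p_cores=6, e_cores=4),
]

if __name__ == "__main__":
    for cpu in CPUS:
        print(f"{cpu.name}: {cpu.threads} threads")
```

The same arithmetic gives the 16 threads of the 12600K dev system mentioned earlier (6 x 2 + 4).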
DAWbench DSP MIX-EXT
The listed number of Analog Obsession FrankCS plugins is the total number of additional plugins that can be incrementally activated across the configured Pre-Master Busses, over and above the session processing Preload.
Lots to unpack, and for those that haven’t read the Scan Threadripper DAWbench report linked above, it might be an idea to have a browse across that before I dive in.
The first thing that is clearly evident is that the results for Reaper are substantially above those of both Studio One and Cubase. Now to reiterate, the MIX-EXT session is a very resource heavy RW styled mix session with lots of grouping/bussing and FX sends, and a large number of processing plugins across the tracks/groups/FX sends and busses. All DAWs were able to load and play the base session template successfully. Where we sorted the wheat from the chaff is in how many additional plugins could be activated once the session was loaded.
So regarding the DAWs, there is a sliding scale from Reaper to Studio One and finally to Cubase, where ASIOGuard was heavily loaded and prematurely overrun by the processing flow of the session. Studio One is better, but once placed in context against Reaper, there is still a lot of room for improvement. In both these instances, total CPU resources used in Task Manager would have been at best 20% on the 9950X/285K, and lower again on the TR.
This is where it gets messier, because even using the best case scenario of Reaper, the X-Scaling from 32 to 64 threads between the 9950X and the 9750WX ranges from 1%-8%, Studio One/Cubase managed 0-1%, and by 512 the Intel 285K had actually managed 5% more plugins than the TR 9750WX, which is pretty wild when you think of the discrepancy in the total number of cores/threads available.
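For anyone wanting to run the same comparison on their own results, the scaling percentage is simply the gain in plugin count over the baseline CPU. The sketch below also expresses that gain as a fraction of the ideal linear gain implied by the extra threads. The function names and the plugin counts used are hypothetical and are not taken from the results above.

```python
def x_scaling_percent(baseline_plugins: int, scaled_plugins: int) -> float:
    """Percentage gain in plugin count over the baseline CPU."""
    if baseline_plugins <= 0:
        raise ValueError("baseline_plugins must be positive")
    return (scaled_plugins - baseline_plugins) * 100.0 / baseline_plugins


def scaling_efficiency(
    baseline_plugins: int,
    scaled_plugins: int,
    baseline_threads: int,
    scaled_threads: int,
) -> float:
    """Fraction of the ideal (linear) gain that was actually delivered.

    1.0 means plugin count grew in step with thread count, 0.0 means no gain.
    """
    if scaled_threads <= baseline_threads:
        raise ValueError("scaled_threads must exceed baseline_threads")
    ideal_gain = scaled_threads / baseline_threads - 1.0
    actual_gain = scaled_plugins / baseline_plugins - 1.0
    return actual_gain / ideal_gain


if __name__ == "__main__":
    # Invented counts: 200 plugins on a 32 thread CPU, 216 on a 64 thread CPU.
    print(f"{x_scaling_percent(200, 216):.1f}% X-Scaling")
    print(f"{scaling_efficiency(200, 216, 32, 64):.2f} of ideal scaling")
```

With those invented counts, an 8% gain from doubling the thread count works out to well under a tenth of what linear scaling would deliver, which is the scale of shortfall being described here.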
When More is Not More !
Now some may be thinking, well that’s only on that very specific session layout/logistic, I could get far better scaling depending on my own working environment and session layout.
True, Maybe !
The session layout is similar to those used in Real World mixing environments, because it is literally based on a RW mix session. I have also developed several other sessions based on clients' RW sessions that differ from this one in track count, groupings, bussing, sends, FX used, etc, and the threading / audio engine dynamic is very close if not identical.
Large Virtual Instrument-Only templates may fare better, and that is still under investigation.
So what conclusions can we draw from the results using this specific session layout over and above the empirical saturation tests ?
Well quite simply, from the experience of navigating these test sessions, none of the current DAWs are actually able to utilize and scale efficiently to the additional cores/threads above 32 once the sessions involve commonly used session logistics like grouping/busses.
For many if not most users, the Threadrippers would be of no value performance wise. With the large investment overhead involved in making the leap from, say, a Ryzen to the TR platform for potentially only low single digit % improvements, I would reserve that option for those with very deep pockets and/or very specific use case scenarios.
For some additional background, for several years now I have been navigating numerous Cubendo clients contacting me specifically about ASIOGuard spiking and/or being prematurely overrun with 80+% of resources still available. I really don’t have any consensus on a solution for them past changing the session logistics if possible, or the inevitable freezing/rendering of tracks, which feels like a cop out to be honest. But there really is only so much that can be done while the devs are in a holding pattern at best, or in a mode of denial and dismissal at worst.
That’s not to say you can’t get work done on the current higher end CPUs, which thankfully override some of the shortcomings with the brute force combination of clockspeed and IPC, but we are still leaving 80+% of available resources on the table.
Some may ask if this is actually a Windows MP scheduling issue over a DAW thread management issue, which I suppose could be explored if all the DAW’s behaved in a similar manner, which they don’t. I am not ruling out that there are some curves that need to be navigated at the Windows end as well, but I’ll defer that for the next round.
Until then.



