Many thanks for your detailed explanation of the need for an AVX implementation, which makes total sense to me.
As you know I am a keen user of MAP65 and fully support this AVX development for use on Q65 decoding and other areas of the WSJT-X / MAP65 applications.
Keep up the great work
On Fri, 11 Jun 2021 at 01:05, Bill Somerville <g4wjs@...
On 09/06/2021 00:37, Bill Somerville wrote:
> Hi all WSJT-X users,
> we are looking into some performance enhancements that will take
> advantage of some parallel processing features of modern CPU
> architectures. In order to gauge how much backwards compatibility for
> older CPUs we will have to implement it would help to know who is
> using such older processors. Please don't turn this thread in to a
> mine is better than yours conversation, all I need to know is who or
> how many of you are using the older CPU architectures. Note that this
> applies to MS Windows, Intel Linux, and Intel macOS users, it is about
> CPUs not operating systems.
> The technology we will use is called AVX and that is present on all
> Intel CPUs branded Core i3/i5/i7/i9 (circa 2010 to present), it is
> also present on AMD CPUs since the Jaguar or Puma based CPU models
> (some late Athlon-II CPUs, all Zen based CPUs, including Ryzen) circa
> 2013 to present.
> Notably Intel CPUs branded Celeron, Pentium, or Atom do not support
> the AVX technology.
> So in summary, look up your CPU and if it **does not support AVX**
> (https://en.wikipedia.org/wiki/Advanced_Vector_Extensions) then let me
Thanks to all those who took the time to check their systems and report
those without AVX support. That has helped to get a broad picture of the
numbers with older CPUs that pre-date this feature. Not surprisingly, a
significant proportion lack AVX; this is almost certainly due to systems
being acquired second-hand, repurposed from other uses, or kept for
extended periods as they are more than adequate for the average shack
PC. Some have suggested that we should not abandon owners of these
older PCs. Don't worry, that has never been the intention of this
exercise. Here is some background that should help to clarify what may
The MAP65 application, as of WSJT-X v2.5.0 RC1, has been updated to
decode Q65 signals. This is because we feel certain that Q65 is superior
for EME use on all bands and the prior JT65 decoding ability will be
superseded by Q65. The MAP65 decoder is able to decode many signals
across a wide pass-band, and also implements polarization diversity with
suitably equipped stations. Automatic linear drift compensation has also
been added to compensate for less well specified stations. This all
requires a lot of signal processing effort, but users expect signals to
be decoded in the short interval between the end of transmission and the
start of the next period (note with EME the path delay means that up to
2 1/2 seconds of that interval is lost compared with terrestrial paths).
The first use of hand coded micro-optimizations using AVX instructions
on suitable CPUs will be aimed at getting Q65 decodes done faster in
MAP65. Because the Q65 decoder is shared by WSJT-X and MAP65, the same
optimizations will be there for WSJT-X Q65 users. None of this is
particularly relevant to the survey of CPUs done here as I am sure that
PC costs are such a small part of the typical EME station investment
that users will find a way to upgrade their PCs if necessary.
So why did I ask the question about AVX? Once we start using AVX for
some parts of WSJT-X it makes sense to find other opportunities for
similar hand coded micro-optimizations elsewhere in our code base. Not
only that, but once they are implemented we may well choose to increase
the decoding depth of other decoders by taking advantage of such
performance gains. The net effect would be that those with AVX equipped
PCs will see faster and deeper decoding, while those with older PCs will
see the same extra depth but overall decoding will take longer than
before. My aim was to
judge what proportion of users might suffer this speed degradation
versus those that will see both faster and deeper decoding.
To reassure those that may have misunderstood, there is no intention to
exclude users from the latest WSJT-X enhancements just because they have
older CPUs. We would provide AVX implementations of critical
algorithms alongside their current linear (non-vectorized)
implementations, and the choice of which to use would be made at runtime
according to the available CPU features. Note that exactly this already
happens in FFTW3, the FFT library we use, so WSJT-X and MAP65 users have
always had AVX specific
algorithm implementations for FFT calculations if the CPU they run on
supports them. We are investigating coding other critical algorithms in
a similar fashion. Equally, we have no intention of dropping support for
ARM based platforms like the Raspberry Pi. We do not plan similar hand
coded micro-optimizations for that platform since the required tools do
not exist, so our linear implementations would still be used there, just
like on non-AVX Intel or AMD CPUs.
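
For illustration only, here is a minimal sketch of the kind of runtime
selection described above. It is not taken from the WSJT-X or MAP65
sources. It assumes GCC or Clang on an Intel/AMD CPU, and the names
dot_scalar, dot_avx and select_dot are hypothetical; a simple dot product
stands in for a real decoder kernel. On any other platform (for example
ARM) only the linear version is built and used.

```c
#include <stddef.h>

#if (defined(__x86_64__) || defined(__i386__)) && defined(__GNUC__)
#include <immintrin.h>
#define HAVE_AVX_PATH 1   /* hypothetical macro: an AVX build is possible */
#endif

/* Linear (scalar) implementation: available on every platform. */
static float dot_scalar(const float *a, const float *b, size_t n)
{
    float sum = 0.0f;
    for (size_t i = 0; i < n; ++i)
        sum += a[i] * b[i];
    return sum;
}

#ifdef HAVE_AVX_PATH
/* AVX implementation: processes 8 floats per loop iteration.
   The target attribute enables AVX code generation for this one
   function, so the rest of the program still runs on older CPUs. */
__attribute__((target("avx")))
static float dot_avx(const float *a, const float *b, size_t n)
{
    __m256 acc = _mm256_setzero_ps();
    size_t i = 0;
    for (; i + 8 <= n; i += 8) {
        __m256 va = _mm256_loadu_ps(a + i);   /* unaligned loads */
        __m256 vb = _mm256_loadu_ps(b + i);
        acc = _mm256_add_ps(acc, _mm256_mul_ps(va, vb));
    }

    /* Add up the 8 partial sums held in the vector register. */
    float lanes[8];
    _mm256_storeu_ps(lanes, acc);
    float sum = 0.0f;
    for (int k = 0; k < 8; ++k)
        sum += lanes[k];

    /* Handle the remaining 0..7 elements with scalar code. */
    for (; i < n; ++i)
        sum += a[i] * b[i];
    return sum;
}
#endif

typedef float (*dot_fn)(const float *, const float *, size_t);

/* Choose an implementation once, at runtime, from the CPU features. */
dot_fn select_dot(void)
{
#ifdef HAVE_AVX_PATH
    __builtin_cpu_init();
    if (__builtin_cpu_supports("avx"))
        return dot_avx;
#endif
    return dot_scalar;
}
```

A caller would do something like "dot_fn dot = select_dot();" once at
start up and then call dot(a, b, n) wherever the result is needed. Both
versions compute the same quantity, although floating point sums can
differ in the last bits because the additions happen in a different
order.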