Decode cycle dropped #FT8


Steve Miller
 

This is with WSJT-X version 2.1.2 in FT8 mode, but it has been experienced on previous WSJT-X versions as well.

 

WSJT-X will skip decoding a cycle for no apparent reason. Signals are present as observed in the waterfall but nothing in that Rx cycle gets decoded. It happens randomly while just monitoring. Also running JTAlert, 2.15.6.

 

Anyone else seen this?


Dick
 

 > WSJT-X will skip decoding a cycle for no apparent reason.

> Signals are present as observed in the waterfall but nothing in that Rx cycle gets decoded.

> It happens randomly while just monitoring. Also running JTAlert, 2.15.6.

 

> Anyone else seen this?

Yes, in my case it was always related to (periods of) high CPU usage.

 

73, Dick PA-2015


Bill Somerville
 

On 14/01/2020 03:49, Steve Miller wrote:

This is with WSJT-X version 2.1.2 in FT8 mode, but it has been experienced on previous WSJT-X versions as well.

 

WSJT-X will skip decoding a cycle for no apparent reason. Signals are present as observed in the waterfall but nothing in that Rx cycle gets decoded. It happens randomly while just monitoring. Also running JTAlert, 2.15.6.

 

Anyone else seen this?

Hi Steve,

there are many possible causes for this, some examples:

1) clock synchronization issues. Some third-party clock sync applications adjust the time in large steps and when that happens during an Rx period it is unlikely to decode successfully. One notable cause of this issue is that on Windows 10 some updates have gratuitously re-enabled the Windows Internet Time Sync application which then gets into a fight with the third-party application about the clock adjustments. You must only have one clock synchronization application active at any time.

2) high CPU resource spikes during Rx periods. Particularly disruptive are Deferred Procedure Calls (DPC) where, usually badly coded, operating system interrupt handlers block processing of critical functions like audio handling. There are a few free DPC checkers that can be run to determine if your system is suitable for near real-time audio handling. Reducing system load from other applications or devices may improve matters.

3) any form of audio level overload, even for short time periods, may disrupt the decoder. When noise spikes or very strong signals exceed the maximum input level of the sound card ADC, the resulting clipping creates discontinuities in the received waveform. These discontinuities represent considerable wide-band noise added to the waveform that can mask all signals present. Make sure that you have sufficient headroom for strong signals by ensuring that the quiet band noise is at 30 dB or so on the WSJT-X thermometer type Rx level indicator (note that the indicator turns red if clipping is likely to occur due to excessive levels).

One indicator of problems with the incoming audio stream is a failure to save a .WAV file even when "Menu->Save->Save all" is checked. On the other hand, if a .WAV file is available for an Rx period that apparently failed to decode good signals, then that file should be shared with us for analysis.

73
Bill
G4WJS.
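
To test the audio overload explanation (point 3) against a saved file, a rough sketch like the one below counts how many samples sit at full scale. It assumes the saved .WAV file is 16-bit PCM; the function name and the `margin` parameter are made up for illustration and are not part of WSJT-X.

```python
import array
import sys
import wave


def clipped_fraction(path, margin=0):
    """Return the fraction of samples at (or within `margin` of) 16-bit full scale."""
    with wave.open(path, "rb") as wf:
        if wf.getsampwidth() != 2:
            raise ValueError("expected 16-bit PCM samples")
        frames = wf.readframes(wf.getnframes())
    samples = array.array("h")
    samples.frombytes(frames)
    if sys.byteorder == "big":
        samples.byteswap()  # WAV sample data is little-endian
    if not samples:
        return 0.0
    limit = 32767 - margin
    # A sample counts as clipped if it reaches the positive or negative rail.
    clipped = sum(1 for s in samples if s >= limit or s <= -limit - 1)
    return clipped / len(samples)
```

A result well above zero for a period that failed to decode would point towards overload; a result of zero does not rule out clipping earlier in the audio chain (for example in the rig or interface) before the sound card.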


K9RX - Gary
 

I've seen this issue for some time now - a good number of versions ... AND it happens in JTDX at the same time/same sequence! So thanks Bill for the explanation. Although I'm not sure which of these would be the cause at my end.

- I will check the clock to see if Windows reset this. However, if that were the case, it makes no sense that it would be just one sequence out of the blue: working fine, miss one sequence, go back to working fine.
- High CPU is not likely with an i7-8770 and CPU typically sub 10%, but I suppose it's possible...
- Not an overload issue, as I've seen this happen on quiet bands...

Gary 
K9RX


Al Groff
 

Open the Windows 'Resource Monitor' and observe CPU usage while running WSJT-X. You may notice that the missed decode periods occur when the total CPU usage is near 100%. You may be able to shut down some apps and reduce this occurrence.

AL, K0VM

On 1/14/2020 3:25 AM, Dick wrote:

> WSJT-X will skip decoding a cycle for no apparent reason.

> Signals are present as observed in the waterfall but nothing in that Rx cycle gets decoded.

> It happens randomly while just monitoring. Also running JTAlert, 2.15.6.

> Anyone else seen this?

Yes, in my case it was always related to (periods of) high CPU usage.

73, Dick PA-2015



    


Steve Miller
 

For the clock, I'm using the Dimension 4 utility. I will have to check whether the W10 time sync is disabled. My understanding is that the W10 clock sync happens once a week. Either way, I'll check this.
About high CPU, I only see it spike during the decode process. By spike I mean my CPU will jump from 20-25% up to 75-80% at decode time, then fall back to the 20s.
I'm usually only running my HRD interface, WSJT-X with waterfall, JTAlert and HRD logging software. This is pretty standard during all my FT8 sessions. No Chrome or IE windows are open. I will check on the .WAV files too next time this issue is observed.

If a session fails to decode, would the receive data still be put into the ALL.TXT file? I'm at work now but can check that once I'm back at the QTH.
Thanks Bill.


VE5DP <ve5dp@...>
 

Bill - I have many decode failures and have followed all your suggestions.  I had been using Windows 7 on this machine and had several decode failures.  After upgrading to Windows 10 the failures have increased.   I have a WAV file demonstrating these failures and wonder where you want me to send it so you may analyze it.

Don - W7/VE5DP


Bill Somerville
 

On 19/01/2020 23:03, VE5DP via Groups.Io wrote:
Bill - I have many decode failures and have followed all your suggestions.  I had been using Windows 7 on this machine and had several decode failures.  After upgrading to Windows 10 the failures have increased.   I have a WAV file demonstrating these failures and wonder where you want me to send it so you may analyze it.

Don - W7/VE5DP
Hi Don,

I guess you addressed your message to me. Yes, a .WAV file that demonstrates a failure to decode apparently good signals is the best way to diagnose such issues. If you upload the .WAV file to a cloud storage service like Dropbox, Box, Google Drive, Microsoft OneDrive, etc., you can then post a link to it here.

73
Bill
G4WJS.


K9RX - Gary
 

I checked on my station whether or not the 'failure to decode' happened coincident with processor loading - and indeed I found that the opposite was the case. I have an i7-8770 that runs at about 8% with all the 'things' I have in play ... including PowerSDR as well as log, Chrome, VAC, etc ... the processor has 12 cores. I am running both WSJT and JTDX simultaneously. I see a spike in every one of those cores each time the sequence ends (i.e. heavy processing in WS/JT) ... BUT when the 'event' happens where nothing is decoded even though signals on the waterfall look no different than previously shown - there is NO spike in any of the cores! As if a decision was made NOT to process! I have a jpeg of that if anyone wants to see it. So for me at least it's not processor overload. And I know it's not time (time is well controlled and is not seen as aberrant either immediately before or after the "non-decode" sequence). And there's no apparent audio overload at the time this is happening on the waterfall.

Now - what I don't know - and I'll check on - is if at that very moment in this non-decode sequence the time 'controller' (don't remember which program I'm using - but one of the 2 recommended) is doing an adjustment - perhaps THAT is what is causing it. So what I'll do is when I see this happen again I will immediately bring up that program and see when its last 'sync' was done. If it literally just happened then indeed it would be 'time related' ... not time being off per se - but time correction. That might be the cause. 

Gary 
K9RX
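
One way to catch a clock correction in the act, rather than checking the sync program's history afterwards, is to compare the wall clock against a clock that is not adjusted. The sketch below is only an illustration of that idea: it relies on Python's `time.monotonic()` not being affected by clock steps, and the function names and the 50 ms threshold are arbitrary choices, not anything taken from WSJT-X or Dimension 4.

```python
import time


def find_clock_steps(samples, threshold=0.05):
    """Given (wall, monotonic) pairs, return (wall_time, step_seconds) for each jump."""
    steps = []
    for (w0, m0), (w1, m1) in zip(samples, samples[1:]):
        # If the wall clock advanced by more or less than the monotonic
        # clock did, something adjusted the wall clock in between.
        step = (w1 - w0) - (m1 - m0)
        if abs(step) >= threshold:
            steps.append((w1, step))
    return steps


def watch(duration=60.0, interval=0.5):
    """Collect samples for `duration` seconds and report any steps seen."""
    samples = []
    end = time.monotonic() + duration
    while time.monotonic() < end:
        samples.append((time.time(), time.monotonic()))
        time.sleep(interval)
    return find_clock_steps(samples)
```

Leaving `watch()` running across a few Rx periods and comparing the reported step times with the periods that failed to decode would show whether the two coincide. Gradual slewing below the threshold will not be reported.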


Bill Somerville
 

On 20/01/2020 16:18, K9RX - Gary wrote:
I checked on my station whether or not the 'failure to decode' happened coincident with processor loading - and indeed I found that the opposite was the case. I have an i7-8770 that runs at about 8% with all the 'things' I have in play ... including PowerSDR as well as log, Chrome, VAC, etc ... the processor has 12 cores. I am running both WSJT and JTDX simultaneously. I see a spike in every one of those cores each time the sequence ends (i.e. heavy processing in WS/JT) ... BUT when the 'event' happens where nothing is decoded even though signals on the waterfall look no different than previously shown - there is NO spike in any of the cores! As if a decision was made NOT to process! I have a jpeg of that if anyone wants to see it. So for me at least it's not processor overload. And I know it's not time (time is well controlled and is not seen as aberrant either immediately before or after the "non-decode" sequence). And there's no apparent audio overload at the time this is happening on the waterfall.

Now - what I don't know - and I'll check on - is if at that very moment in this non-decode sequence the time 'controller' (don't remember which program I'm using - but one of the 2 recommended) is doing an adjustment - perhaps THAT is what is causing it. So what I'll do is when I see this happen again I will immediately bring up that program and see when its last 'sync' was done. If it literally just happened then indeed it would be 'time related' ... not time being off per se - but time correction. That might be the cause.

Gary
K9RX
Hi Gary,

assuming you are using a third-party clock synchronization application, make sure that the MS Windows Internet Time Sync facility is not enabled. Some recent Windows 10 updates have rudely re-enabled the MS Internet Time Sync without asking. This can result in unexpected clock jumps as the two facilities get into a fight about how to set the clock. The result of these time steps will be truncated Rx periods with missing samples, and WSJT-X will skip decoding if there are not enough audio samples collected for an Rx period. Another symptom of this issue is a failure to save a .WAV file even with "Menu->Save->Save all" checked.

73
Bill
G4WJS.
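
Where a .WAV file was saved for a suspect period, the "missing samples" idea can be checked by comparing the audio duration in the file with the 15 second FT8 period. This is a minimal sketch: the thread does not say how many missing samples make WSJT-X skip a decode, so the `tolerance` value is purely a placeholder, and the function names are illustrative.

```python
import wave


def capture_seconds(path):
    """Return the duration of audio actually stored in a WAV file, in seconds."""
    with wave.open(path, "rb") as wf:
        return wf.getnframes() / wf.getframerate()


def is_truncated(path, period=15.0, tolerance=0.5):
    """True if the file holds noticeably less audio than one Rx period."""
    return capture_seconds(path) < period - tolerance
```

The sample rate is read from the file header, so no particular rate is assumed.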


Al Groff
 

Gary,
  Could it be that the heavy CPU load at the end of a cycle is what is preventing the next cycle from starting? If it didn't start, then there is nothing to decode at the end of the cycle and hence no CPU load at the end of the cycle. With FT8, normally at the start of a cycle the audio stream should start to buffer and no heavy processing occurs until the end of the cycle.
  Does the heavy processing that you see at the end of a cycle approach 100%?
  If you turn on Save > Save all, you may notice that only the decoded .wav files are saved but the failed decode .wav files are missing.

AL, K0VM
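
With Save all turned on, gaps in the saved files can be listed automatically instead of by eye. The sketch below assumes the files are named by period start time in the form `YYMMDD_HHMMSS.wav` (for example `200120_212115.wav`); that naming scheme and the function name are assumptions for illustration, so adjust the format string if your files differ.

```python
from datetime import datetime, timedelta


def missing_periods(filenames, period=15):
    """Return the start times with no saved file between the first and last file seen."""
    seen = set()
    for name in filenames:
        stem = name.rsplit(".", 1)[0]
        try:
            seen.add(datetime.strptime(stem, "%y%m%d_%H%M%S"))
        except ValueError:
            continue  # not a file in the assumed naming scheme
    if not seen:
        return []
    missing = []
    t, last = min(seen), max(seen)
    step = timedelta(seconds=period)
    # Walk every period boundary from the first file to the last.
    while t < last:
        if t not in seen:
            missing.append(t.strftime("%y%m%d_%H%M%S"))
        t += step
    return missing
```

Passing it the result of `os.listdir()` on the save directory gives the list of periods with no file. Periods in which the station was transmitting will also show up as gaps, so this is most useful while just monitoring.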



VE5DP <ve5dp@...>
 

Bill, I have uploaded a WAV file to Google Drive and set you as the addressee. The decodes started at 211745 and ended at 213030. It skipped the decodes at 212115, 212145, 212200, 212215, 212345, 212415, 212430, 212745, 212800, and 212845 to 213000.
CPU usage averaged around 20% this whole time with occasional excursions to 50%. I have the decode set to FAST but it fails no matter where I set it.
My Google Drive address is petersondon908@....
I hope you can access this file and find out why the decodes fail so often.

Don  W7/VE5DP
VE5DP@...


Bill Somerville
 

On 21/01/2020 00:15, VE5DP via Groups.Io wrote:
Bill, I have uploaded a WAV file to Google Drive and set you as the addressee. The decodes started at 211745 and ended at 213030. It skipped the decodes at 212115, 212145, 212200, 212215, 212345, 212415, 212430, 212745, 212800, and 212845 to 213000.
CPU usage averaged around 20% this whole time with occasional excursions to 50%. I have the decode set to FAST but it fails no matter where I set it.
My Google Drive address is petersondon908@....
I hope you can access this file and find out why the decodes fail so often.

Don  W7/VE5DP
VE5DP@...

Hi Don,

I received a file but it was the Google Drive getting started PDF document, no sign of any .WAV files.

73
Bill
G4WJS.


K9RX - Gary
 

Al, no. Simple as that. You can clearly see peaks that go up and come right back down again. There is LOTS of time between those peaks that happen at the time of decode and the next end of sequence. If you go to TASK MANAGER and look at PERFORMANCE and then look at individual core performance you'll see what I mean. I am sure what I am describing is normal. However, when there is this loss of decode there is nothing - it doesn't even attempt to decode.

Bill, I am using D4 and Windows time sync is turned off. Note however if I look at the history in D4 there are these wild excursions of up to 0.4 seconds (24 hour sample - constantly doing this)... if I hit "OK" on D4, which forces a read of time and sync, it will show values like 0.25 seconds and that is literally 2 minutes after having done it before. I decided to turn D4 off - and enable Windows... with that done I now have better DT times displayed, more like 0.1 - 0.2 for most whereas before it was typically 0.3 or so ... I'm going to continue this test today and if I get time (no pun intended) see if I now have missing decode periods.

Last: WHY IS IT that the DT values are almost all positive?! I talked with a couple of friends and they see the same thing. If indeed compensation were being done correctly, and assuming most users are doing compensation as well, then the average should be 0! I should see just as many negative values as positive values - seems only logical.

Gary


VE5DP <ve5dp@...>
 

Bill.......here is a link to the WAV file - or so they tell me.   Hope you can get this one.

https://drive.google.com/open?id=1fdILnu0thkXM-keUqFHwl_iwh3S7OEUV

Don - W7/VE5DP
VE5DP@...


Bill Somerville
 

On 21/01/2020 13:43, K9RX - Gary wrote:
Bill, I am using D4 and Windows time sync is turned off. Note however if I look at the history in D4 there are these wild excursions of up to 0.4 seconds (24 hour sample - constantly doing this)... if I hit "OK" on D4, which forces a read of time and sync, it will show values like 0.25 seconds and that is literally 2 minutes after having done it before. I decided to turn D4 off - and enable Windows... with that done I now have better DT times displayed, more like 0.1 - 0.2 for most whereas before it was typically 0.3 or so ... I'm going to continue this test today and if I get time (no pun intended) see if I now have missing decode periods.

Last: WHY IS IT that the DT values are almost all positive?! I talked with a couple of friends and they see the same thing. If indeed compensation were being done correctly, and assuming most users are doing compensation as well, then the average should be 0! I should see just as many negative values as positive values - seems only logical.

Gary
Hi Gary,

time excursions like you are seeing will cause WSJT-X many problems, you need to stop that happening. Using the MS Windows Internet Time Sync is not sufficiently accurate unless you make several registry changes to adjust the NTP parameters. You are better off installing Meinberg NTP Client, as that, once set up during installation to your local pool.ntp.org pool, will result in much smoother clock control.

DT values will always tend to positive numbers due to latencies. Latencies accumulate in the positive direction, so the sum of the path delay, any sound card delay, buffering between sound card and application, and processing delays all contribute. If your rig is an SDR or has a DSP there will be more latencies added from them too. Latency cannot be corrected by adjusting the PC clock since the result will be a delay to transmitted signals, thus compounding the error in a two way QSO. The software allows plenty of scope for normal latency and small clock errors.

73
Bill
G4WJS.


Bill Somerville
 

On 21/01/2020 16:05, VE5DP via Groups.Io wrote:
Bill.......here is a link to the WAV file - or so they tell me.   Hope you can get this one.

https://drive.google.com/open?id=1fdILnu0thkXM-keUqFHwl_iwh3S7OEUV

Don - W7/VE5DP
VE5DP@...

Hi Don,

that .WAV file decodes without issues for me:

213030   3  0.1  407 ~  CQ WI9SSR EN53     U.S.A.
213030 -22  0.4  512 ~  VA3NL K9IG -05
213030 -13  0.2  728 ~  KB8EE KM6GUO CM88
213030 -24  0.1  861 ~  CQ N5HXR EM40      U.S.A.
213030  -7  0.1  995 ~  ZS6LKF KE0TQI R-16
213030   3  0.1 1129 ~  CQ W9JA EN43       U.S.A.
213030 -13  0.1 1190 ~  CQ W7WMB EN72      U.S.A.
213030  15 -0.4 1268 ~  CQ DX W9GU EN62    U.S.A.
213030  14  0.1 1306 ~  CQ N7RO CN88       U.S.A.
213030 -14  0.3 1429 ~  NR3Y KF8FD -08
213030   8  0.1 1488 ~  KQ6CA W0BLE EN31
213030  -3  0.1 1575 ~  KJ7LEX AI4EY RR73
213030  12  0.1 1673 ~  N8NW WA0LJM EN27
213030  -5  0.3 1816 ~  N0BAV W1SIP R+00
213030 -21  0.1  809 ~  FG5GP AG5HC R-04
213030 -14 -0.2  998 ~  VA7AQ NN1D -03
213030 -11  0.2 1111 ~  CQ K9ZW EN64       U.S.A.
213030   3  0.2 1253 ~  CQ W7NRH DN28      U.S.A.
213030  -7  0.4 1607 ~  CQ CO7DSR FL11     Cuba
213030  -1  0.1 1651 ~  PS8JL KG5ZEL 73
213030 -12  0.1 1851 ~  KQ6CA KF5ZBL EM12
213030 -18  0.1 1957 ~  AB8O KE7C -15
213030  -5  1.3 1285 ~  K7YVR KW4SP EM64


73
Bill
G4WJS.
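
The decode list above also illustrates the earlier point about DT tending positive: 21 of the 23 DT values shown are above zero. For anyone who wants to run the same tally on their own decodes, here is a small sketch. It assumes each line has the layout shown above (time, SNR, DT, audio frequency, then the message); the function names are illustrative only.

```python
from statistics import mean


def dt_values(lines):
    """Extract the DT column (third field) from decode lines like those above."""
    values = []
    for line in lines:
        fields = line.split()
        if len(fields) >= 4 and fields[0].isdigit():
            try:
                values.append(float(fields[2]))
            except ValueError:
                pass  # third field is not a number, skip the line
    return values


def dt_summary(lines):
    """Count positive and negative DT values and compute their mean."""
    values = dt_values(lines)
    return {
        "count": len(values),
        "mean": mean(values) if values else None,
        "positive": sum(1 for v in values if v > 0),
        "negative": sum(1 for v in values if v < 0),
    }
```

A mean that sits a few tenths of a second positive is consistent with accumulated latency; a mean that wanders over time would suggest clock steering problems instead.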


VE5DP <ve5dp@...>
 

Thanks, Bill. I thought that WAV file should cover the time from 211745 to 213030. It decoded at 213030 for me also. It skipped decodes at 211830, 211845, 212115, 212145, 212345, 212415 and 212745.
I am using an IC-7000 with a SignaLink USB interface. There were no transmit cycles during that time so no RFI. The Windows clock sync is disabled. The time syncing program was disabled so there should have been no clock issues. The DECODE button is currently set to Fast but it doesn't make any difference if it is set to Normal or Deep. Using WSJT-X 2.1.2.
Or does the saved WAV file only include the last decode cycle? Am I doing this incorrectly?
Do you ever sleep?

Don - W7/VE5DP


Markku SM5FLM
 

I have the same problem with occasionally dropped decode cycles.

I use the internal audio drivers in my Icom IC-7300, and the receive level is always at about 60 dB. I have no problem with a non-synced clock. I see spikes in the CPU utilization at the end of cycles, to about 70-80%. (But can we be sure the graphic display really shows the maximum value?)

I'm using Windows 10 with a new computer. CPU: AMD A9-9425, with 5 cores, 3.1 GHz. Memory 8 GB. Running applications: WSJT-X, DXLab: Commander, DXKeeper, SpotCollector. I usually also have Firefox running, which takes a lot of memory. But it doesn't make any difference if it's running or not.


Al Groff
 

If you monitor total CPU load in Windows Resource Monitor, you may notice that there is a peak in CPU loading at the end of each decode cycle in which decodes did not fail. If the CPU load approaches 100%, the following cycle may not have any decodes (and no .wav file is saved). A portion of that CPU loading peak is the FT8 decode and a portion of the peak is database lookups in DXKeeper and SpotCollector that occur when there are decodes.

Shut down the WSJT-X feed to DXLab for a while and see what happens to the dropped decode cycles.

AL, K0VM

On 1/22/2020 1:43 AM, marsip@... wrote:

I have the same problem with occasionally dropped decode cycles.

I use the internal audio drivers in my Icom IC-7300, and the receive level is always at about 60 dB. I have no problem with a non-synced clock. I see spikes in the CPU utilization at the end of cycles, to about 70-80%. (But can we be sure the graphic display really shows the maximum value?)

I'm using Windows 10 with a new computer. CPU: AMD A9-9425, with 5 cores, 3.1 GHz. Memory 8 GB. Running applications: WSJT-X, DXLab: Commander, DXKeeper, SpotCollector. I usually also have Firefox running, which takes a lot of memory. But it doesn't make any difference if it's running or not.