WSJT-X 2.4.0 crashes using FT4 #FT4 #IssueReport


Bill Somerville
 

On 21/07/2021 21:59, mike@wu9d.com wrote:
> The main window and waterfall crashed and closed without notice. A process was left running which prevented a restart. The process was JT9 - WSJT slow mode decoder. It was necessary to restart the pc, start WSJT-X and close it successfully to continue. I cannot reproduce the event.

Hi Mike,

unless you can come up with a recipe to reproduce the issue reliably there's not much we can do with this issue report.

If the wsjtx process crashes leaving the jt9 sub-process running, in theory restarting wsjtx will close the jt9 process and start a new one. If that is not happening then perhaps the jt9 process is hanging. Did/Does the issue happen during decoding, i.e. when the Decode button is cyan coloured?

There should be no need to reboot the system; killing the jt9 process manually will allow WSJT-X to be restarted.
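
For readers who have not done this before, the manual clean-up described above might be scripted roughly as follows. This is a minimal sketch and not part of WSJT-X: it assumes the orphaned decoder runs under the image name `jt9.exe` on Windows (`jt9` elsewhere) and relies on the standard `taskkill` and `pkill` utilities. The function names are made up for illustration.

```python
import subprocess
import sys


def kill_command(platform: str, image: str = "jt9") -> list[str]:
    """Build the command that force-terminates every process named `image`."""
    if platform.startswith("win"):
        # taskkill: /IM selects by image name, /F forces termination.
        return ["taskkill", "/IM", image + ".exe", "/F"]
    # pkill -x matches the process name exactly (Linux, macOS).
    return ["pkill", "-x", image]


def kill_orphaned_decoder() -> bool:
    """Run the command; True means the utility reported a successful kill."""
    result = subprocess.run(
        kill_command(sys.platform), capture_output=True, text=True
    )
    return result.returncode == 0
```

Calling `kill_orphaned_decoder()` returns `True` when the utility reports that it terminated something. Ending the jt9 process from Task Manager achieves the same thing interactively.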

73
Bill
G4WJS.


Al Groff
 

Just FYI...
Yesterday I had a similar crash. I think I was running FT4 at the time and, as I recall, was in the middle of an RX cycle. I had to kill the jt9 process before I could restart WSJT-X (2.5.0-rc3).

AL, K0VM



Bill Somerville
 

Hi Al,

if this is on MS Windows, is there any record of the crash in the Applications section of the Windows Event Log?

73
Bill
G4WJS.



Bill Somerville
 

On 22/07/2021 15:32, mike@wu9d.com wrote:
> I don't deal in theory. The app crashed and left a process running. I know this because I tried to restart the app and it wouldn't start due to the process running. If you're going to tell me Windows, any version, doesn't need to be rebooted to fix problems then you're not in the real world. I found that the process was running when the program failed to restart a third time. I killed the process and I could restart but certain things were not functioning properly and I won't belabor the issue here.
>
> I'm a retired IT professional and I know how to troubleshoot. I have given you all the information I can. I was mid-QSO and wasn't looking at the screen at the time. And it's a GOOD thing I cannot reproduce the problem. It really annoys me when I report an issue and people question me with nonsense. I would rather like to hear if others are experiencing the same issue. The developers can trace their programming; I cannot debug as a user.

Mike,

WSJT-X takes action to shut down any orphaned jt9 process left behind by a crashed wsjtx process. If that is not happening then maybe the jt9 process is hanging and is not able to be shut down by the normal inter-process communication mechanism used by the wsjtx and jt9 processes. What was the exact message you got when trying to restart WSJT-X after the crash, including the details? That may be helpful.

I have never seen a case where a Windows restart is required after a WSJT-X failure. I have seen cases where an orphaned jt9 sub-process has to be killed manually using Task Manager, and I have also seen cases where the WSJT-X UI freezes, but that too is fixable by killing the wsjtx process. The times when Windows does need a restart are usually those where a Windows update is pending and some driver update is part completed.

Applications that crash with memory access violations and similar abends leave a record in the Applications section of the Windows Event Log. Those records are not particularly helpful but at least indicate what signal caused the application to exit.
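
As a rough illustration of where to look, the Application log can also be queried from a script with the standard Windows `wevtutil` tool. The sketch below is an assumption-laden example, not a WSJT-X facility: it assumes crash records are written by the `Application Error` provider, that the executable is named `wsjtx.exe`, and that each record in `wevtutil`'s text output starts with `Event[`. The function names are hypothetical.

```python
import subprocess


def crash_query_command(count: int = 10) -> list[str]:
    """Build a wevtutil call listing the newest 'Application Error' records."""
    xpath = "*[System[Provider[@Name='Application Error']]]"
    return [
        "wevtutil", "qe", "Application",  # query the Application log
        "/q:" + xpath,                     # keep only crash records
        "/c:" + str(count),                # at most `count` events
        "/rd:true",                        # newest first
        "/f:text",                         # plain-text output
    ]


def filter_records(output: str, app: str) -> list[str]:
    """Split wevtutil text output into records and keep those naming `app`."""
    chunks = output.split("Event[")
    return ["Event[" + c for c in chunks[1:] if app.lower() in c.lower()]


def recent_crashes(app: str = "wsjtx.exe", count: int = 10) -> list[str]:
    """Return recent crash records that mention `app` (Windows only)."""
    done = subprocess.run(
        crash_query_command(count), capture_output=True, text=True, check=True
    )
    return filter_records(done.stdout, app)
```

An empty list from `recent_crashes()` would match the situation reported later in this thread, where no trace of the crash was found in the Event Log.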

If you cannot reproduce the issue reliably then it is pretty unlikely that we, the developers, will be able to either. Unexplained crashes are near impossible to diagnose if they cannot be reproduced. But of course you know better as you are an IT professional!

73
Bill
G4WJS.


Al Groff
 

Bill,
  I have found no trace of the event in any of the Windows Event Logs...
The only thing I have found so far is in the ALL.TXT...

210721_201607    14.080 Tx FT4      0  0.0 1144 K8PL K0VM -13
210721_201615    14.080 Rx FT4     -7  0.4  346 CQ KB3SQX FN10
210721_201615    14.080 Rx FT4      1  0.1  766 DK3WL K1WY RR73
210721_201615    14.080 Rx FT4     17  0.2 1505 K1ERZ AC8WS EN81
210721_201615    14.080 Rx FT4      9  0.5 1718 W3AW K5CSA R-01
210721_201615    14.080 Rx FT4     -2  0.1 1830 VE3WWR W4FCC EM63
210721_201622    14.080 Tx FT4      0  0.0 1144 K8PL K0VM -13
210721_201630    14.080 Rx FT4    -10  0.4  347 CQ KB3SQX FN10
210721_201630    14.080 Rx FT4     -5  0.1  766 EC1YP K1WY R-11
210721_201630    14.080 Rx FT4     17  0.2 1505 K1ERZ AC8WS R-08
210721_201630    14.080 Rx FT4     10  0.4 1718 W3AW K5CSA R-
210721_201837    14.080 Rx FT4    -10  0.6 1352 K8PL K1ERZ RR73

The crash occurred at 201630+. I should have started a TX at 201637, but it did not occur.
WSJT-X was restarted by 201837... 
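
The two things that stand out in this excerpt, a message that stops after `R-` and a two-minute hole in the timestamps, can be picked out mechanically. The following is a minimal sketch that assumes the column layout seen in the lines above (timestamp, MHz, Tx/Rx, mode, SNR, DT, audio offset, message). The field names and helper functions are invented for illustration and are not part of WSJT-X.

```python
import re
from datetime import datetime

LOG = [
    "210721_201630    14.080 Rx FT4     17  0.2 1505 K1ERZ AC8WS R-08",
    "210721_201630    14.080 Rx FT4     10  0.4 1718 W3AW K5CSA R-",
    "210721_201837    14.080 Rx FT4    -10  0.6 1352 K8PL K1ERZ RR73",
]

# A report prefix with fewer than two digits after the sign, e.g. "R-" or "-".
PARTIAL_REPORT = re.compile(r"R?[+-]\d?")


def parse_line(line: str) -> dict:
    """Split one ALL.TXT line into named fields (layout assumed from the excerpt)."""
    stamp, _mhz, direction, mode, snr, _dt, _offset, message = line.split(None, 7)
    return {
        "time": datetime.strptime(stamp, "%y%m%d_%H%M%S"),
        "direction": direction,
        "mode": mode,
        "snr": int(snr),
        "message": message.strip(),
    }


def looks_truncated(message: str) -> bool:
    """True when the last word starts like a signal report but lacks its digits."""
    words = message.split()
    return bool(words) and PARTIAL_REPORT.fullmatch(words[-1]) is not None


def gaps(lines: list[str], limit_s: float = 60.0) -> list[tuple]:
    """Return (before, after) timestamp pairs more than `limit_s` seconds apart."""
    times = [parse_line(line)["time"] for line in lines]
    return [(a, b) for a, b in zip(times, times[1:])
            if (b - a).total_seconds() > limit_s]
```

Run over the three sample lines, `looks_truncated` flags only the `W3AW K5CSA R-` entry, and `gaps` reports the interval between 201630 and 201837.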

Sorry I did not have 'save all' turned on.  Wish I could supply more clues.

AL, K0VM




Bill Somerville
 

Hi Al,

thanks for that. The incomplete decode print just before the crash may be a clue. That shouldn't happen as the write to ALL.TXT is immediately followed by a flush and file close, so it is quite hard for an incomplete line to be written. I will look around that area in the code base and see if anything obvious stands out.
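
The write, flush, close sequence mentioned here can be illustrated in a few lines. This is only a sketch of the general pattern in Python, under the assumption that each log entry is appended as a single line. WSJT-X itself is not written in Python, and `append_line` and `demo` are hypothetical names.

```python
import os
import tempfile


def append_line(path: str, line: str) -> None:
    """Append one line, then flush and close the file before returning."""
    with open(path, "a", encoding="utf-8") as f:
        f.write(line + "\n")
        f.flush()              # push Python's buffer to the operating system
        os.fsync(f.fileno())   # ask the operating system to commit it to disk
    # Leaving the with-block closes the file.


def demo() -> str:
    """Write two lines to a scratch file and return what ended up on disk."""
    with tempfile.TemporaryDirectory() as folder:
        path = os.path.join(folder, "ALL.TXT")
        append_line(path, "first")
        append_line(path, "second")
        with open(path, encoding="utf-8") as f:
            return f.read()
```

With this pattern a crash between calls leaves only whole lines in the file, which is why a line that ends early is more likely to reflect the text being written than an interrupted write.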

73
Bill
G4WJS.



groups@...
 

On 22/07/2021 17:08, mike@wu9d.com wrote:
> Yes, yes, yes. Thank you for your support. You've been very helpful. Love your condescending tone. Sorry I posted anything here. Big mistake.

Good afternoon Mike

No. No. No.

I'm not happy with your intemperate language here. Bill has done his best to assist and I regard your response to be inappropriate. It's obvious that you are unhappy but have not chosen the politest way to make your feelings known. Bill is a volunteer and donates his time for nothing and deserves better than this.

I had placed you in temporary moderation after seeing your previous message to allow you to cool down but changed my mind after seeing the above and your account has now been placed in full moderation. This means that every message you send will be examined by the moderators for content before being broadcast to other members. You have created unnecessary work for the moderators.

You may feel that I'm being overly strict but experience has taught me that the tone of messages can quickly escalate beyond the moderators' ability to contain the situation.

I repeat that you are still able to post to the group. Please think twice, and then thrice, before posting in future.

73

Roger, G4HZA
moderator, wsjtx.groups.io


Bill Somerville
 

Hi Al, and Mike,

it looks like the odd last message saved in ALL.TXT before the crash was indeed the clue needed to track this issue down. It turns out that a defect in the source encoding routines, which are used during decoding and message analysis as well, caused the crash. We are not sure how someone managed to transmit the message "W3AW K5CSA R-", perhaps by accidentally editing the Tx3 message. The next release of WSJT-X will handle such messages without crashing and they will be rightly sent and received as free text messages.
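
The behaviour described for the next release, treating a malformed message as free text instead of failing, can be sketched as a classifier with a safe fallback. This is a loose illustration only: the patterns below are simplified assumptions about what a standard message looks like and are not the WSJT-X encoding routines, and `classify` is a made-up name.

```python
import re

CALL = r"[A-Z0-9/]{3,11}"        # loose callsign pattern, illustration only
GRID = r"[A-R]{2}[0-9]{2}"       # 4-character Maidenhead locator
REPORT = r"R?[+-][0-9]{2}"       # complete signal report, e.g. -13 or R-01
TAIL = rf"(?:{GRID}|{REPORT}|RRR|RR73|73)"
STANDARD = re.compile(rf"^(?:CQ|{CALL}) {CALL}(?: {TAIL})?$")


def classify(message: str) -> str:
    """Return 'standard' or 'free text'; odd input falls through, never raises."""
    text = " ".join(message.upper().split())  # normalise case and spacing
    if STANDARD.match(text):
        return "standard"
    # Anything that does not fit the structured form is carried as plain text.
    return "free text"
```

Here `classify("W3AW K5CSA R-01")` gives `'standard'`, while the truncated `"W3AW K5CSA R-"` falls through to `'free text'` because the report has no digits.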

Thanks to Steve and Joe for tracking down and fixing the offending code.

73
Bill
G4WJS.


Al Groff
 

Bill,
Thanks for the feedback.
AL, K0VM

