WSJTX crashes on Raspberry Pi 4b - Puzzled....


tony.volpe.1951@...
 

Hi,

I've been running a WSPR decode setup on my Raspberry Pi 4b for a week or so, using WSJTX, GQRX and a Funcube Pro Plus dongle.

For the last few days, the previously working setup has been troubled by random shutdowns of WSJTX. I check the thing in the morning when I get up and often find WSJTX has shut down in the middle of the night. GQRX is working normally, but the decoding programme has given up. I restart it, and it runs faultlessly for a number of hours and then does it again.

I have checked the settings and nothing looks out of order. I found a huge log called WSPR.TXT which I deleted in case it was getting too large. I wonder if anyone has any ideas or solutions.

Thanks


Bill Somerville
 

On 20/01/2020 10:21, tony.volpe.1951@... wrote:
Hi Tony,

if you are not running WSJT-X from a terminal window then try that and see if WSJT-X prints any error message when this happens. If that doesn't help and you have gdb installed (or can install it) then try running WSJT-X under the control of gdb like this:

gdb --args wsjtx

add any command line arguments you use as normal. When WSJT-X unexpectedly exits there will be a message from gdb saying which signal caused the exit. You can also print a stack trace by typing bt and hitting ENTER. Type q and hit enter to exit gdb.

73
Bill
G4WJS.
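
For a station that is left running overnight, Bill's gdb suggestion could also be scripted so that the signal and stack trace are saved without anyone at the keyboard. This is only a sketch, assuming gdb is installed and wsjtx is on the PATH; the function name run_under_gdb and the log file name are made up for illustration and are not part of gdb or WSJT-X.

```shell
#!/bin/sh
# Sketch: run a program under gdb in batch mode and save what gdb prints.
# LOG and run_under_gdb are illustrative names, not part of gdb or WSJT-X.
LOG="${LOG:-wsjtx-gdb.log}"

run_under_gdb() {
    # -batch   : leave gdb once the listed commands have run
    # -ex run  : start the program straight away
    # -ex bt   : print a stack trace after the program stops
    # --args   : everything after this is the program and its arguments
    gdb -batch -ex run -ex bt --args "$@" >"$LOG" 2>&1
}

# Usage (commented out so that sourcing this file starts nothing):
# run_under_gdb wsjtx
# less "$LOG"
```

If the program exits normally, the bt step just reports that there is no stack, so the log is only interesting after a crash.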


tony.volpe.1951@...
 

Thanks Bill. That sounds like a productive route to follow. I will have to do some digging to find out about this 'gdb' thing. Never heard of it, so this will provide a new interest to get me through the dark time of the winter... :))) I am a Linux beginner so I am constantly coming up against issues, one way or another.

I will also re-start wsjtx in a terminal window. I've seen error messages when I was having trouble with GQRX. I've had a variety of problems before this, but managed to solve them easily enough. I almost always have to use the 'killall wsjtx' command to unlock the program after it has stopped, because when it falls over, it doesn't do so tidily and leaves some resource or other tied up with the dead instance of the program.

73s - and thanks for your input. I'll come back if that's OK and tell you what happened...

de G0BZB - OP Tony
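
Since the interesting lines are whatever the terminal shows just before the program dies, one option is to keep a copy of that output in a file. A rough sketch in plain sh; run_logged and the log path are hypothetical names chosen here, not anything provided by WSJT-X.

```shell
#!/bin/sh
# Sketch: run a command, append its stdout and stderr to a log file,
# and record the exit status. run_logged is an illustrative helper name.
run_logged() {
    log="$1"
    shift
    printf '=== started %s UTC ===\n' "$(date -u '+%Y-%m-%d %H:%M:%S')" >>"$log"
    "$@" >>"$log" 2>&1
    rc=$?
    printf '=== exited with status %s ===\n' "$rc" >>"$log"
    return "$rc"
}

# Usage (commented out so that sourcing this file starts nothing):
# run_logged "$HOME/wsjtx-terminal.log" wsjtx
# tail -n 20 "$HOME/wsjtx-terminal.log"
```

In most shells an exit status above 128 means the process was killed by a signal, so 134 would correspond to SIGABRT (128 + 6).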


tony.volpe.1951@...
 

OK Bill - I started wsjtx in a terminal and this is the output. The program starts up and runs fine (apparently anyway) in spite of the warnings. It typically runs for about 12 or 20 hours before falling over, but sometimes stops earlier than that.

Output from terminal:

pi@raspberrypi:~ $ wsjtx
[20200122 09:58:13.106 GMT D] using qt5ct plugin
Cannot connect to server socket err = No such file or directory
Cannot connect to server request channel
jack server is not running or cannot be started
JackShmReadWritePtr::~JackShmReadWritePtr - Init not done for -1, skipping unlock
JackShmReadWritePtr::~JackShmReadWritePtr - Init not done for -1, skipping unlock
ALSA lib pcm_usb_stream.c:508:(_snd_pcm_usb_stream_open) Unknown field hint
ALSA lib pcm_dsnoop.c:638:(snd_pcm_dsnoop_open) unable to open slave
ALSA lib pcm_dmix.c:1043:(snd_pcm_dmix_open) The dmix plugin supports only playback stream
ALSA lib pcm_dsnoop.c:638:(snd_pcm_dsnoop_open) unable to open slave
ALSA lib pcm_usb_stream.c:508:(_snd_pcm_usb_stream_open) Unknown field hint
Cannot connect to server socket err = No such file or directory
Cannot connect to server request channel
jack server is not running or cannot be started
JackShmReadWritePtr::~JackShmReadWritePtr - Init not done for -1, skipping unlock
JackShmReadWritePtr::~JackShmReadWritePtr - Init not done for -1, skipping unlock
ALSA lib pcm_dmix.c:1108:(snd_pcm_dmix_open) unable to open slave
ALSA lib pcm_dmix.c:1108:(snd_pcm_dmix_open) unable to open slave
ALSA lib pcm_dmix.c:1108:(snd_pcm_dmix_open) unable to open slave
ALSA lib pcm_dsnoop.c:575:(snd_pcm_dsnoop_open) The dsnoop plugin supports only capture stream
ALSA lib pcm_dsnoop.c:575:(snd_pcm_dsnoop_open) The dsnoop plugin supports only capture stream
ALSA lib pcm_dsnoop.c:575:(snd_pcm_dsnoop_open) The dsnoop plugin supports only capture stream
ALSA lib pcm_usb_stream.c:508:(_snd_pcm_usb_stream_open) Unknown field hint
ALSA lib pcm_usb_stream.c:508:(_snd_pcm_usb_stream_open) Unknown field hint
[20200122 09:58:26.061 GMT D] D-Bus global menu: no


Cheers

Tony G0BZB


Bill Somerville
 

On 22/01/2020 10:06, tony.volpe.1951@... wrote:
Hi Tony,

those messages are benign. Did the output you showed include a crash of WSJT-X?

73
Bill
G4WJS.


tony.volpe.1951@...
 

No Bill.

That is the start up message. I will let it run and when it next crashes, I will look and see what comes up at the end of the terminal window. If that doesn't present any directions of enquiry, I will install and run it within the gdb C debugging utility.

I will report back when something happens.

Cheers and thanks again for your expertise.

de G0bzb - op Tony


tony.volpe.1951@...
 

Right Bill,

Good morning.

We have some progress. I woke up in an armchair last night and checked, and wsjtx had done its stopping thing. The terminal window it was started from has a clue:

 
[20200122 11:57:54.238 GMT D] D-Bus global menu: no
Assertion 're->data || re->memblock' failed at pulsecore/pstream.c:862, function do_read(). Aborting.
Aborted

Something to do with a memory block failure. Is this a bit of bad memory on the Pi's wee memory card? They are notorious for having defective blocks. 

If so, can I run some utility to scan the card and rule out sending data to bad blocks?

Many thanks

Tony
 


tony.volpe.1951@...
 

By the way Bill, having re-started wsjtx, the program failed again with an identical error eight minutes later. I rebooted the system and restarted gqrx and wsjtx. 

It is 00:41 now, so I will go to bed and see what happens in the morning.

73s


Nc8q-mesh@gelm.net <nc8q-mesh@...>
 

On 1/22/20 7:28 PM, tony.volpe.1951@... wrote:
If so, can I run some utility to scan the card and rule out sending data to bad blocks?

man badblocks

$0.02
IIRC, SD cards have a limited number of write cycles.
Running a 'badblocks' test on an SD card will shorten its life.
However, if there are bad blocks on an SD card, it may be likely that more bad blocks will occur in the near future.
Typically, I use a spinning/magnetic HD for '/' and let the RPi boot from the SD card.



--


Bill Somerville
 

On 23/01/2020 00:28, tony.volpe.1951@... wrote:
Hi Tony,

you seem to be confusing system memory and disk storage. Although system memory can be swapped out to disk temporarily when memory resources are scarce, a bad cluster on a disk (an SD card, for example) will not be the direct cause of a memory read error in an application. The error is more likely a program error than a hardware issue. Since the assertion that triggered is in pulseaudio, I would guess that the defect lies there. All I can suggest is to keep updating your system and look for pulseaudio updates that might fix the issue. You could also try raising an issue with the pulseaudio developers at https://gitlab.freedesktop.org/pulseaudio/pulseaudio/issues .

73
Bill
G4WJS.
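
If the crash does get reported to the PulseAudio developers, it helps to quote the exact assertion line and the installed version. A small sketch, assuming the terminal output was saved to a file (the file name below is hypothetical); find_assertions is just an illustrative helper built on grep.

```shell
#!/bin/sh
# Sketch: pull assertion failures out of a saved terminal log.
# find_assertions is an illustrative helper; the log file name is an assumption.
find_assertions() {
    # -n prefixes each match with its line number in the log
    grep -n "Assertion '.*' failed at" "$1"
}

# Usage (commented out so that sourcing this file runs nothing):
# find_assertions "$HOME/wsjtx-terminal.log"
# pulseaudio --version    # version string to quote in a bug report
```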


tony.volpe.1951@...
 

Hi Bill,

Yes on the confusion. I wrote that remark at about 2AM after waking up on the couch... :))

As soon as I got up this morning, I realised how stupid it was to have said what I did, since my Pi has 4 GB of memory to do its jobs in.

I also scanned around the web for similar error messages, deleting the particular memory address, and there are plenty of other people seeing crashes in different programmes, but all of them are using the terrible pulseaudio and involve that software.

Every problem I have had setting up and running the wspr decoder with GQRX and WSJTX has involved pulseaudio. There are lots of problems people come up against.

Many thanks for your interest in my problem. 

Have a good day.

Thanks also to Chuck NC8Q for his remarks about bad blocks. 

73s....

Tony G0BZB


Brian
 

Hi Tony, 

I recall Bill telling me last year that WSJTX doesn't need pulseaudio. I am going from memory and hopefully am relating correctly what Bill told me. If you don't need pulseaudio for other purposes on your RPi, perhaps remove it from the audio chain by uninstalling it. Then point WSJTX directly to the ALSA audio devices instead. Perhaps that will bring you the stability you require to your wspr node.

regards, 
Brian
VE3IBW

On Thu, Jan 23, 2020 at 8:44 AM <tony.volpe.1951@...> wrote:
Hi Bill,

Yes on the confusion. I wrote that remark at about 2AM after waking up on the couch... :))

As soon as I got up this morning, I realised how stupid it was to have said what I did, since my Pi has 4 GB of memory to do its jobs in.

I also scanned around the web for similar error messages, deleting the particular memory address, and there are plenty of other people seeing crashes in different programmes, but all of them are using the terrible pulseaudio and involve that software.

Every problem I have had setting up and running the wspr decoder with GQRX and WSJTX has involved pulseaudio. There are lots of problems people come up against.

Many thanks for your interest in my problem.

Have a good day.

Thanks also to Chuck NC8Q for his remarks about bad blocks. 

73s....

Tony G0BZB
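
Before trying Brian's suggestion, it may help to see which capture devices ALSA itself can see. The sketch below assumes the alsa-utils package (which provides arecord) is installed; alsa_hw_names is a hypothetical helper that turns the listing into hw:CARD,DEVICE names. Whether a PulseAudio-free chain suits the GQRX to WSJT-X audio link in this particular setup is not settled in the thread.

```shell
#!/bin/sh
# Sketch: convert `arecord -l` output into ALSA hw:CARD,DEVICE names.
# alsa_hw_names is an illustrative helper, not part of alsa-utils.
alsa_hw_names() {
    # Reads the listing on stdin and keeps only lines of the form
    #   card N: ..., device M: ...
    sed -n 's/^card \([0-9][0-9]*\):.*, device \([0-9][0-9]*\):.*/hw:\1,\2/p'
}

# Usage (commented out so that sourcing this file runs nothing):
# arecord -l | alsa_hw_names
```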


 

Have you ruled out any system throttling behavior due to heat? A look at system utilization on tx might shed some light too. Pi 4b should be ok for light duty tx in wspr, but since the most common scenario you report is based on running for "a while", gradual heat buildup might be a factor. IME software error messages are often misleading when taken literally, notwithstanding Bill's advice. Unlike the 3b, the 4b with only passive cooling can heat up and throttle pretty quickly. If you can rig a small 12 V computer fan over the board, that might enable you to rule that out. See the YouTube channel "Explaining Computers" for benchmark tests of the 4b in passive and active cooling scenarios. A search on the Pi foundation board may also reveal other interesting 4b audio reports. HTH CHUCK AB1VL
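
On Raspberry Pi OS the firmware's own throttling flags can be read with vcgencmd, which gives a quick way to test the heat theory. A sketch, assuming vcgencmd is available; decode_throttled and check_bit are illustrative helper names, and the bit meanings follow the Raspberry Pi documentation for get_throttled.

```shell
#!/bin/sh
# Sketch: decode the bit field printed by `vcgencmd get_throttled`.
# check_bit and decode_throttled are illustrative helper names.
check_bit() {
    # $1 = value, $2 = bit mask, $3 = label printed when the bit is set
    if [ $(( $1 & $2 )) -ne 0 ]; then
        echo "$3"
    fi
}

decode_throttled() {
    v="${1#throttled=}"    # accept "throttled=0x50005" or plain "0x50005"
    check_bit "$v" 0x1 "under-voltage now"
    check_bit "$v" 0x2 "ARM frequency capped now"
    check_bit "$v" 0x4 "throttled now"
    check_bit "$v" 0x8 "soft temperature limit active now"
    check_bit "$v" 0x10000 "under-voltage has occurred since boot"
    check_bit "$v" 0x20000 "ARM frequency capping has occurred since boot"
    check_bit "$v" 0x40000 "throttling has occurred since boot"
    check_bit "$v" 0x80000 "soft temperature limit has occurred since boot"
}

# Usage on the Pi (commented out so that sourcing this file runs nothing):
# vcgencmd measure_temp
# decode_throttled "$(vcgencmd get_throttled)"
```

No output from decode_throttled means none of the flags are set, which would point away from heat or power as the cause.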