Process enters uninterruptible sleep during sustained FireWire AV/C operations on Pi 5 #7352

@YanqiHe03

Describe the bug

Sustained FireWire AV/C operations (specifically trickplay and timecode commands sent via dvcont / libavc1394) eventually cause processes to enter uninterruptible sleep (D state). In one instance, the hung task detector showed the process stuck in fw_device_op_release(), but I have not confirmed that every occurrence blocks there. Once the hang is triggered, modprobe -r firewire_ohci also hangs and the system cannot shut down cleanly: systemd-shutdown blocks indefinitely waiting for the stuck processes.

No DMA or OHCI errors appear in dmesg. I'm not sure whether the root cause is in firewire_core, in the interaction between libavc1394 and the kernel, or in something specific to this hardware/platform combination. Whatever the real cause, I don't think ordinary userspace operations should be able to leave processes in an unkillable D state.

Steps to reproduce the behaviour

  1. Connect a MiniDV device (camera or deck) via FireWire to a Pi 5 with a PCIe OHCI card
  2. Run the following loop:
for i in $(seq 1 500); do
  dvcont trickplay 5
  sleep 0.2
  dvcont timecode
  sleep 0.2
  dvcont trickplay -5
  sleep 0.2
  dvcont timecode
  sleep 0.2
  echo "$i"
done
  3. The loop eventually hangs: one dvcont process enters D state and never returns.
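
To record exactly which iteration hangs without losing the terminal, each command can be run in the background and polled. This is only a sketch: run_with_deadline is a hypothetical helper name and the 10-second deadline in the usage comment is an arbitrary choice, neither comes from dvcont or libavc1394. The helper does not try to kill the child, since a task in D state does not respond to signals.

```shell
# run_with_deadline SECONDS CMD [ARGS...]   (hypothetical helper)
# Runs CMD in the background and polls it once per second.
# Returns CMD's exit status if it finishes in time, or 124 if it is
# still alive once SECONDS have elapsed. The child is left running.
run_with_deadline() {
    deadline=$1
    shift
    "$@" &
    pid=$!
    elapsed=0
    while kill -0 "$pid" 2>/dev/null; do
        if [ "$elapsed" -ge "$deadline" ]; then
            # Report the process state (D means uninterruptible sleep).
            state=$(ps -o stat= -p "$pid" 2>/dev/null)
            echo "pid $pid still running after ${deadline}s (state: $state)" >&2
            return 124
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    wait "$pid"
}

# Usage inside the reproduction loop (assumed 10 s deadline):
#   run_with_deadline 10 dvcont trickplay 5 || { echo "stuck at iteration $i"; break; }
```

Because the helper polls once per second, it slows the loop down compared with the original 0.2 s sleeps, which may change how many iterations are needed to trigger the hang.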

In my tests:

  • Clean boot, trickplay+timecode loop: hangs at ~487 iterations
  • After 500× dvcont status (which alone does not hang), trickplay+timecode loop: hangs at ~22 iterations
  • Pure dvcont status ×500: does not hang
  • Pure open()/close() on /dev/fw1 ×500: does not hang, but a single dvcont status after this immediately hangs.
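
For anyone trying to repeat the last case, a bare open/close loop can be approximated from the shell. The report does not say how the open()/close() test was performed, so this is an assumed equivalent: open_close_loop is a hypothetical name, and opening the node read-write is a guess at the access mode.

```shell
# open_close_loop PATH COUNT   (hypothetical helper)
# Opens PATH read-write on fd 3 and closes it again, COUNT times.
# Prints the number of completed iterations; returns 1 if an open fails.
open_close_loop() {
    dev=$1
    count=$2
    i=0
    while [ "$i" -lt "$count" ]; do
        # fd 3 is open only for the duration of the braces, then closed.
        { true; } 3<>"$dev" || return 1
        i=$((i + 1))
    done
    echo "$i"
}

# Usage (device node taken from the report):
#   open_close_loop /dev/fw1 500 && dvcont status
```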

Device(s)

Raspberry Pi 5

System

https://pastebin.com/kV6EecHZ

Logs

Kernel stack trace (from hung task detector):

INFO: task dvcont:4136 blocked for more than 120 seconds.
      Not tainted 6.12.78-v8-16k+ #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:dvcont          state:D stack:0     pid:4136  tgid:4136  ppid:1      flags:0x0000000d
Call trace:
  __switch_to+0xe8/0x148
  __schedule+0x390/0xb68
  schedule+0x3c/0x148
  fw_device_op_release+0x2d4/0x320 [firewire_core]
  __fput+0xd0/0x2e0
  ____fput+0x1c/0x30
  task_work_run+0x88/0x100
  do_exit+0x2e8/0x9a8
  do_group_exit+0x3c/0xa0
  get_signal+0x960/0xa20
  do_signal+0xfc/0x1060
  do_notify_resume+0xd8/0x160
  el0_svc+0xd4/0xf8
  el0t_64_sync_handler+0x120/0x130
  el0t_64_sync+0x190/0x198

dmesg (no firewire errors prior to hang):

[    5.835977] firewire_ohci 0001:02:00.0: enabling device (0000 -> 0002)
[    5.896418] firewire_ohci 0001:02:00.0: added OHCI v1.10 device as card 0, 8 IR + 8 IT contexts, quirks 0x2
[    6.431656] firewire_core 0001:02:00.0: created device fw0: GUID 7856341278563412, S800
[ 1765.948922] firewire_core 0001:02:00.0: phy config: new root=ffc1, gap_count=5
[ 1766.681399] firewire_core 0001:02:00.0: created device fw1: GUID 0080458020f683bb, S100
[ 2146.157674] firewire_core 0001:02:00.0: phy config: new root=ffc1, gap_count=5
[ 2146.899609] firewire_core 0001:02:00.0: created device fw1: GUID 0080458020f683bb, S100

Process state when hung:

yanqihe     4136  0.0  0.0      0     0 pts/0    D    21:46   0:00 [dvcont]
yanqihe     4140  0.0  0.0      0     0 pts/0    D    21:46   0:00 [dvcont]
yanqihe     4144  0.0  0.0      0     0 pts/0    D    21:46   0:00 [dvcont]
yanqihe     4149  0.0  0.0      0     0 pts/0    D    21:46   0:00 [dvcont]
yanqihe     4154  0.0  0.0      0     0 pts/0    D+   21:47   0:00 [dvcont]
root        4170  0.0  0.0  11248  3936 pts/2    D+   21:48   0:00 modprobe -r firewire_ohci
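
A listing like the one above, extended with the kernel wait channel, can be produced with ps and a small filter. This is a sketch: filter_d_state and list_d_state are hypothetical helper names, and the wchan column may show only "-" on kernels that do not expose it.

```shell
# filter_d_state: reads ps rows ("PID STAT ...") on stdin and prints only
# those whose STAT field starts with D (uninterruptible sleep).
filter_d_state() {
    awk '$2 ~ /^D/'
}

# list_d_state: shows PID, state, wait channel and command name for every
# D-state task. The ps header row is dropped by the filter because its
# second field is the literal text "STAT".
list_d_state() {
    ps -eo pid,stat,wchan:32,comm | filter_d_state
}

# Usage:
#   list_d_state
```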

Cascading failure:

Once the hang occurs:

  • sudo modprobe -r firewire_ohci hangs
  • sudo reboot blocks at shutdown, waiting for the stuck processes indefinitely

The attached log:

firewire_hang_dmesg.txt

Additional context

The use case is an art installation that uses dvcont trickplay commands to seek a MiniDV tape to specific timecodes based on real-time audio input. The hang occurs reliably during normal operation after an extended period. I have not been able to test with a second FireWire card or on another platform, so I can't rule out a hardware-specific issue with this particular controller. I'll try another FireWire card later.

lspci -v:

0001:00:00.0 PCI bridge: Broadcom Inc. and subsidiaries BCM2712 PCIe Bridge (rev 30) (prog-if 00 [Normal decode])
        Flags: bus master, fast devsel, latency 0, IRQ 39
        Bus: primary=00, secondary=01, subordinate=02, sec-latency=0
        Memory behind bridge: 80000000-801fffff [size=2M] [32-bit]
        Prefetchable memory behind bridge: [disabled] [64-bit]
        Capabilities: <access denied>
        Kernel driver in use: pcieport
 
0001:01:00.0 PCI bridge: Texas Instruments XIO2213A/B/XIO2221 PCI Express to PCI Bridge [Cheetah Express] (rev 01) (prog-if 00 [Normal decode])
        Subsystem: Device 3412:7856
        Flags: bus master, fast devsel, latency 0
        Memory at 1b80100000 (32-bit, non-prefetchable) [size=4K]
        Bus: primary=01, secondary=02, subordinate=02, sec-latency=0
        I/O behind bridge: [disabled] [32-bit]
        Memory behind bridge: 80000000-800fffff [size=1M] [32-bit]
        Prefetchable memory behind bridge: [disabled] [64-bit]
        Capabilities: <access denied>
 
0001:02:00.0 FireWire (IEEE 1394): Texas Instruments XIO2213A/B/XIO2221 IEEE-1394b OHCI Controller [Cheetah Express] (rev 01) (prog-if 10 [OHCI])
        Subsystem: Device 3412:7856
        Flags: bus master, 66MHz, medium devsel, latency 64, IRQ 39
        Memory at 1b80004000 (32-bit, non-prefetchable) [size=2K]
        Memory at 1b80000000 (32-bit, non-prefetchable) [size=16K]
        Capabilities: <access denied>
        Kernel driver in use: firewire_ohci
        Kernel modules: firewire_ohci
