systemd-udevd SIGKILL'ing kernel processes at bootup
Description
Problem/Justification
None
Impact
None
Activity
Bug Clerk June 2, 2022 at 1:32 PM
22.02.2 PR: https://github.com/truenas/middleware/pull/9098

Caleb June 2, 2022 at 12:36 PM
All my time was spent actually waiting on the box to reboot since it takes ~30-40 mins to do so :|

Bug Clerk June 2, 2022 at 12:36 PM
Complete
Created June 2, 2022 at 11:57 AM
Updated July 1, 2022 at 6:03 PM
Resolved June 2, 2022 at 1:38 PM
Jun 01 07:24:24 m60-100b systemd-udevd[1691]: 0000:3d:00.0: Worker [1822] processing SEQNUM=23088 is taking a long time
Jun 01 07:26:24 m60-100b systemd-udevd[1691]: 0000:3d:00.0: Worker [1822] processing SEQNUM=23088 killed
Jun 01 07:26:45 m60-100b systemd-udevd[1691]: 0000:3d:00.0: Worker [1822] failed
As the logs show, on an M60 with 12x fully populated ES102 JBODs, the default event timeout of 20 seconds isn't long enough, so systemd-udevd is SIGKILL'ing the worker processes responsible for setting up certain hardware. It just so happens that 0000:3d:00.0 is the NTB device, so when the system boots, ntb0 doesn't exist.
NOTE: systemd-udevd is also killing the workers that set up SCSI disks, DAX1 (don't know what this is), etc. After a few reboots, I've found that setting --event-timeout=300 allows everything to be set up properly, so that HA isn't broken on reboot/failover.
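To check after a reboot whether any workers are still being killed, something like the following could be run over saved journal output. This is only a rough sketch: the function name, the regular expression, and the `year` parameter are illustrative assumptions (the short journal timestamp format carries no year), and the pattern is based solely on the three log lines quoted above.

```python
import re
from datetime import datetime

# Matches lines shaped like the systemd-udevd messages quoted in this ticket.
LINE_RE = re.compile(
    r"^(?P<ts>\w{3} \d{2} \d{2}:\d{2}:\d{2}) \S+ systemd-udevd\[\d+\]: "
    r"(?P<dev>\S+): Worker \[(?P<pid>\d+)\] (?P<msg>.+)$"
)

SAMPLE = [
    "Jun 01 07:24:24 m60-100b systemd-udevd[1691]: 0000:3d:00.0: Worker [1822] processing SEQNUM=23088 is taking a long time",
    "Jun 01 07:26:24 m60-100b systemd-udevd[1691]: 0000:3d:00.0: Worker [1822] processing SEQNUM=23088 killed",
    "Jun 01 07:26:45 m60-100b systemd-udevd[1691]: 0000:3d:00.0: Worker [1822] failed",
]


def summarize_killed_workers(lines, year=2022):
    """Return (device, worker_pid, seconds_between_warning_and_kill) tuples.

    The gap is None when no earlier "taking a long time" warning was seen
    for that worker. `year` is an assumption needed to parse the timestamps.
    """
    warned = {}  # (device, pid) -> time of the "taking a long time" warning
    killed = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # not a udevd worker message
        key = (m["dev"], int(m["pid"]))
        when = datetime.strptime(f"{year} {m['ts']}", "%Y %b %d %H:%M:%S")
        msg = m["msg"]
        if msg.endswith("is taking a long time"):
            warned[key] = when
        elif msg.endswith("killed"):
            gap = (when - warned[key]).total_seconds() if key in warned else None
            killed.append((key[0], key[1], gap))
    return killed


if __name__ == "__main__":
    for dev, pid, gap in summarize_killed_workers(SAMPLE):
        print(f"{dev}: worker {pid} killed, {gap} s after the warning")
```

An empty result after booting with --event-timeout=300 would suggest that no device setup is being cut short anymore.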