Description
Steps to Reproduce
Expected Result
Actual Result
Environment
Hardware Health
Error Message (if applicable)
Activity
Bug Clerk last week
This issue has now been closed. Comments made after this point may not be viewed by the TrueNAS Teams. Please open a new issue if you have found a problem or need to re-engage with the TrueNAS Engineering Teams.
Stavros Kois last week
Nope, nothing changed in the last week.
I’ll close this for now; if you can reproduce it, please let me know!
Thanks
Stuart Espey last week
Hmmm, today I have been unable to reproduce. I also tried restarting and moving the docker dataset back to a spinning-rust pool.
These are the current logs:
```
2024-11-20 01:23:28.354952+00:00 s6-rc: info: service s6rc-oneshot-runner: starting
2024-11-20 01:23:28.361735+00:00 s6-rc: info: service s6rc-oneshot-runner successfully started
2024-11-20 01:23:28.362139+00:00 s6-rc: info: service fix-attrs: starting
2024-11-20 01:23:28.369012+00:00 s6-rc: info: service fix-attrs successfully started
2024-11-20 01:23:28.369161+00:00 s6-rc: info: service legacy-cont-init: starting
2024-11-20 01:23:28.375233+00:00 cont-init: info: running /etc/cont-init.d/01-timezone
2024-11-20 01:23:29.156574+00:00 cont-init: info: /etc/cont-init.d/01-timezone exited 0
2024-11-20 01:23:29.156915+00:00 cont-init: info: running /etc/cont-init.d/50-cron-config
2024-11-20 01:23:29.176801+00:00 cont-init: info: /etc/cont-init.d/50-cron-config exited 0
2024-11-20 01:23:29.178608+00:00 s6-rc: info: service legacy-cont-init successfully started
2024-11-20 01:23:29.178872+00:00 s6-rc: info: service legacy-services: starting
2024-11-20 01:23:29.191433+00:00 services-up: info: copying legacy longrun collector-once (no readiness notification)
2024-11-20 01:23:29.195471+00:00 services-up: info: copying legacy longrun cron (no readiness notification)
2024-11-20 01:23:29.198311+00:00 services-up: info: copying legacy longrun influxdb (no readiness notification)
2024-11-20 01:23:29.201443+00:00 services-up: info: copying legacy longrun scrutiny (no readiness notification)
2024-11-20 01:23:29.207430+00:00 s6-rc: info: service legacy-services successfully started
```
And I note that the difference seems to be that 01-timezone is not timing out this time:
```
2024-11-20 01:23:28.375233+00:00 cont-init: info: running /etc/cont-init.d/01-timezone
2024-11-20 01:23:29.156574+00:00 cont-init: info: /etc/cont-init.d/01-timezone exited 0
```
vs
```
2024-11-13 03:15:15.032061+00:00 cont-init: info: running /etc/cont-init.d/01-timezone
2024-11-13 03:15:20.015043+00:00 s6-rc: fatal: timed out
```
I wonder if anything changed in the app to fix it?
Stavros Kois last week
Sorry for the delay.
I’ve been trying to reproduce, even dropped startup to 1s and timeout to 1s as well. It still starts fine.
Can you please set S6_VERBOSITY to 5 via additional environment variables? Let’s see if we can see something there.
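In compose terms that should end up looking roughly like this (assuming the additional environment variables are passed straight through to the container environment):
```
environment:
  S6_VERBOSITY: "5"
```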
Stuart Espey November 14, 2024 at 4:11 AM
Btw, I left it deploying for 2 hours. No change.
I found that when I installed the Scrutiny app from the catalog, if I specified more than one disk, it would time out when starting.
After modifying the healthcheck timeout from 5s to 10s in the YAML, it started correctly, i.e.:
```
healthcheck:
  interval: 10s
  retries: 30
  start_period: 10s
  test: >-
    curl --silent --output /dev/null --show-error --fail
    http://127.0.0.1:8080/api/health
  timeout: 5s
```
`timeout: 5s` -> `timeout: 10s`
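For reference, the working block after that one-line change:
```
healthcheck:
  interval: 10s
  retries: 30
  start_period: 10s
  test: >-
    curl --silent --output /dev/null --show-error --fail
    http://127.0.0.1:8080/api/health
  timeout: 10s
```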
It seems that this timeout is specified in the base library, in healthcheck.py:
```
class Healthcheck:
    def __init__(self, render_instance: "Render"):
        self._render_instance = render_instance
        self._test: str | list[str] = ""
        self._interval_sec: int = 10
        self._timeout_sec: int = 5
        self._retries: int = 30
        self._start_period_sec: int = 10
        self._disabled: bool = False
```
It may be that a default timeout of 5s is too short.
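If so, maybe the fix on the library side is just to raise that default, something along these lines (only a sketch of what I mean, not an actual patch against the base library):
```
class Healthcheck:
    def __init__(self, render_instance: "Render"):
        self._render_instance = render_instance
        self._test: str | list[str] = ""
        self._interval_sec: int = 10
        self._timeout_sec: int = 10  # was 5; probing many disks can take longer
        self._retries: int = 30
        self._start_period_sec: int = 10
        self._disabled: bool = False
```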
When it fails, the app log shows:
```
2024-11-13 03:15:15.012553+00:00 s6-rc: info: service s6rc-oneshot-runner: starting
2024-11-13 03:15:15.018647+00:00 s6-rc: info: service s6rc-oneshot-runner successfully started
2024-11-13 03:15:15.018961+00:00 s6-rc: info: service fix-attrs: starting
2024-11-13 03:15:15.025921+00:00 s6-rc: info: service fix-attrs successfully started
2024-11-13 03:15:15.026223+00:00 s6-rc: info: service legacy-cont-init: starting
2024-11-13 03:15:15.032061+00:00 cont-init: info: running /etc/cont-init.d/01-timezone
2024-11-13 03:15:20.015043+00:00 s6-rc: fatal: timed out
2024-11-13 03:15:20.019007+00:00 s6-sudoc: fatal: unable to get exit status from server: Operation timed out
```
Session ID: f1d8785f-02dc-6cfc-b538-ccdb6b1b3ffd