Replication resume token is not updated without data writes

Description

I’m getting

CRITICAL Replication "truenas-home push hosts" failed: Replication has stuck..

I have tried numerous times, but the replication never finishes. I don’t have clear steps to reproduce it. It only happens on one of my servers, with a single dataset. Logs attached.

Problem/Justification

None

Impact

None

Attachments

3
  • 18 Jan 2024, 11:20 AM
  • 18 Jan 2024, 08:54 AM
  • 13 Jan 2024, 09:03 AM

Activity

Bug Clerk April 2, 2024 at 7:00 PM

This issue has now been closed. Comments made after this point may not be viewed by the TrueNAS Teams. Please open a new issue if you have found a problem or need to re-engage with the TrueNAS Engineering Teams.

Alexander Motin April 2, 2024 at 7:00 PM

We've merged the patch into upcoming SCALE 24.04 and Core 13.3 releases.

Alexander Motin February 23, 2024 at 7:54 PM

I was able to reproduce the scenario, and this slightly hackish patch fixes it for me: https://github.com/openzfs/zfs/pull/15927. Let's see what the community thinks about it.

Marco February 13, 2024 at 5:51 PM

I’m glad you found the issue! I’ve disabled atime now, as I don’t need it. I haven’t touched this setting in years. I may have run a find command on the filesystem, thereby touching every file. I don’t know, but that would be an explanation. However, setting atime should not break replication. But I suppose that’s what you’re fixing right now. Let me know if you need additional information or if I should try something else. Thanks for taking the time to look into this.

Alexander Motin February 13, 2024 at 5:40 PM

I see the problem. ZFS updates receive_resume_token only when it receives some data, but the stream in this case includes no data writes at all. Without token updates, it looks to a third-party observer as if replication is stuck, and after a restart it starts from the beginning again. I need to look for another good point at which to update the token.
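
To illustrate why an unchanged token reads as "stuck" from the outside, here is a minimal Python sketch of a hypothetical observer. It is not the TrueNAS replication code: the function names, the polling model, and the `max_unchanged` threshold are assumptions made for illustration. The only real interface used is `zfs get -H -o value receive_resume_token <dataset>`, which is assumed to print `-` when the dataset has no resume token.

```python
import subprocess
from typing import Optional, Sequence


def read_resume_token(dataset: str) -> Optional[str]:
    """Return the dataset's receive_resume_token, or None if it has none.

    Hypothetical helper: shells out to `zfs get` and assumes that an unset
    property is printed as "-".
    """
    out = subprocess.run(
        ["zfs", "get", "-H", "-o", "value", "receive_resume_token", dataset],
        check=True,
        capture_output=True,
        text=True,
    ).stdout.strip()
    return None if out in ("", "-") else out


def looks_stuck(samples: Sequence[Optional[str]], max_unchanged: int) -> bool:
    """Report True when the token has not changed for max_unchanged polls.

    `samples` holds the token values seen on successive polls, oldest first.
    The observer only compares tokens, so a stream that carries no data
    writes is indistinguishable from a receive that has stopped.
    """
    if max_unchanged < 1:
        raise ValueError("max_unchanged must be at least 1")
    if len(samples) <= max_unchanged:
        return False  # not enough history to decide yet
    tail = samples[-(max_unchanged + 1):]
    return all(token == tail[0] for token in tail)
```

An observer built this way would call `read_resume_token` periodically on the receiving dataset, append each result to a list, and pass that list to `looks_stuck`. With a stream that contains no data writes, every sample is identical, so the check fires even though the receive is still making progress.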

Complete
Details

Assignee

Reporter

Impact

Medium

Components

Affects versions

Priority

Created January 13, 2024 at 9:02 AM
Updated May 2, 2024 at 1:41 PM
Resolved April 2, 2024 at 7:00 PM