Random errors during replication: "Timeout in head()"

Description

PULL replicating a dataset recursively from TN 12.0-U7 to TrueNAS-SCALE-22.02-MASTER-20220118-034447 over LAN, with just 1 switch in between. Around 600 snapshots needed to be transferred. It basically works, but it stops at a random snapshot with the error message above.

Restarting the task works; it resumes at the snapshot where it stopped. A few snapshots later, the same error occurs again.

Of course I could increase the retry count, but I assume this should not happen in the first place. Something must not be working as expected.

Problem/Justification

None

Impact

None

Activity

Bug Clerk January 22, 2022 at 5:05 PM

Vladimir Vinogradenko January 22, 2022 at 5:03 PM

We'll improve the error message, but increasing the retry count is the correct approach here. Your sending system is slow, and the exec syscall for the zfs send invocation sometimes times out.
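
To illustrate why a higher retry count hides a transient timeout, here is a minimal sketch of a retry loop. It is not the TrueNAS replication code: `TransientTimeout`, `run_with_retries`, and `make_flaky_step` are hypothetical names introduced only for this example, and the flaky step merely simulates a send that times out a couple of times before succeeding.

```python
import time


class TransientTimeout(Exception):
    """Hypothetical stand-in for an error such as "Timeout in head()"."""


def run_with_retries(step, retries=5, delay=0.0):
    """Call step() until it succeeds or `retries` extra attempts are used up."""
    attempt = 0
    while True:
        try:
            return step()
        except TransientTimeout:
            if attempt >= retries:
                # Retry budget exhausted: surface the error to the caller.
                raise
            attempt += 1
            time.sleep(delay)


def make_flaky_step(failures):
    """Return a simulated step that times out `failures` times, then succeeds."""
    state = {"left": failures}

    def step():
        if state["left"] > 0:
            state["left"] -= 1
            raise TransientTimeout("Timeout in head()")
        return "snapshot transferred"

    return step


if __name__ == "__main__":
    # Two simulated timeouts are absorbed by a budget of five retries.
    print(run_with_retries(make_flaky_step(2), retries=5))
```

With a budget smaller than the number of consecutive timeouts (for example `retries=1` against two failures), the same call raises `TransientTimeout`, which corresponds to the task stopping and needing a manual restart.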

Bonnie Follweiler January 20, 2022 at 1:50 PM

Thank you for your ticket submission.

This ticket is now in our queue to review.

An engineering representative will update with any further questions or details in the near future.

Metis IT January 19, 2022 at 10:35 PM

FYI: diagnostics and screenshots were attached in the UI but did not end up here in Jira. See also https://jira.ixsystems.com/browse/NAS-112742

Complete

Created January 19, 2022 at 10:25 PM
Updated July 1, 2022 at 2:38 PM
Resolved January 24, 2022 at 10:47 AM