Unable to replace disk
Activity
Caleb June 21, 2021 at 2:11 PM
No response after 11 days.
Caleb June 10, 2021 at 3:08 PM
Hi @Elvin Chng, any update on my previous request? Did you try to reformat the disk and replace it?
Caleb June 8, 2021 at 3:11 PM
@Elvin Chng thanks for the reply.
1. It seems I misinterpreted the ciss(4) driver output in the debug. You are using HBA mode, which is pass-through, so everything is fine there; there is no concern.
2. The screenshot you posted shows ZFS-reported errors. SMART errors are reported by the drive firmware and will not show up in ZFS; ZFS only reports errors when the data being written to or read from the disks is broken. You can view the SMART test results for the drives in the webUI. Please check our documentation on how to do that.
It also seems that "da2" already has a ZFS partition on it, so it would be best to format that disk again and then retry the replacement. I'm not sure how you're formatting the device, but you can run the following from the command line:
1. SSH into your system
2. run "midclt call disk.format da2 2 false"
Once the command completes, try to replace the disk in the webUI again.
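Put together, a minimal sketch of the whole sequence from a shell session (the "gpart show" check is an added sanity step, not part of the original instructions; "da2" is the new disk from this ticket):

# SSH into the TrueNAS system first, then:
gpart show da2                         # optional: inspect the existing partition table on da2
midclt call disk.format da2 2 false    # wipe and format the disk via the middleware, as above
# When the command returns, retry the disk replacement from the webUI.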
Elvin Chng June 6, 2021 at 3:01 AM (edited)
Hi Caleb,
1. Yes, you are correct; I'm using an HP Smart Array P840 controller for both my zpools, which I have set to HBA mode. How do I set it up as an HBA that supports "IT" mode or "pass-through"?
2. Thanks for the heads up. I didn't realise, as the GUI has not been reporting any errors. I have attached a screenshot for you.
1. I'm trying to replace both da1 and da3 with larger disks: da2 for now, and another disk later when it arrives.
Caleb June 5, 2021 at 5:55 PM (edited)
@Elvin Chng thanks for the debug. Let me start with the first problem.
1. Every one of your data disks for both zpools (Home and Jail) is being presented via the ciss(4) driver, which is hardware RAID. The only disks not behind the hardware RAID card are the 2x SATA SSDs that you're using as cache devices on the "Home" zpool. Presenting the disks via hardware RAID is an unsupported configuration, and you're going to have nothing but problems in the future with this type of setup; ZFS expects direct access to the disks. I'd highly encourage you to replace the current controller with an HBA that supports "IT" mode or "pass-through" mode.
2. The vast majority of your disks are reporting a high number of SMART errors. This is not good, and I'd advise you to start replacing these disks sooner rather than later. You can look at the SMART errors for each disk in the "SMART" directory of the debug archive that you attached to this ticket (or query them directly on the system; see the sketch at the end of this comment).
Moving on to the traceback that you're receiving.
1. Which disk are you trying to replace da1 or da3 with? Are you trying to replace those disks with da2?
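As referenced in point 2 above, a quick command-line sketch for pulling the drive-reported SMART data directly on the system (smartctl ships with TrueNAS CORE; da1 is just an example device from this ticket, and disks that really are behind a hardware RAID controller may additionally need a controller-specific -d option):

smartctl -H /dev/da1        # overall SMART health assessment
smartctl -l error /dev/da1  # drive-reported SMART error log
smartctl -a /dev/da1        # full output: attributes, error log, and self-test results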
Hi,
I'm trying to replace a disk in my pool, but I got this error.
Please advise.
Error: concurrent.futures.process._RemoteTraceback:
"""
Traceback (most recent call last):
File "/usr/local/lib/python3.8/site-packages/middlewared/plugins/zfs.py", line 277, in replace
target.replace(newvdev)
File "libzfs.pyx", line 391, in libzfs.ZFS._exit_
File "/usr/local/lib/python3.8/site-packages/middlewared/plugins/zfs.py", line 277, in replace
target.replace(newvdev)
File "libzfs.pyx", line 2060, in libzfs.ZFSVdev.replace
libzfs.ZFSException: already in replacing/spare config; wait for completion or use 'zpool detach'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/lib/python3.8/concurrent/futures/process.py", line 239, in _process_worker
r = call_item.fn(*call_item.args, **call_item.kwargs)
File "/usr/local/lib/python3.8/site-packages/middlewared/worker.py", line 94, in main_worker
res = MIDDLEWARE._run(*call_args)
File "/usr/local/lib/python3.8/site-packages/middlewared/worker.py", line 45, in _run
return self._call(name, serviceobj, methodobj, args, job=job)
File "/usr/local/lib/python3.8/site-packages/middlewared/worker.py", line 39, in _call
return methodobj(*params)
File "/usr/local/lib/python3.8/site-packages/middlewared/worker.py", line 39, in _call
return methodobj(*params)
File "/usr/local/lib/python3.8/site-packages/middlewared/schema.py", line 977, in nf
return f(*args, **kwargs)
File "/usr/local/lib/python3.8/site-packages/middlewared/plugins/zfs.py", line 279, in replace
raise CallError(str(e), e.code)
middlewared.service_exception.CallError: [EZFS_BADTARGET] already in replacing/spare config; wait for completion or use 'zpool detach'
"""
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/usr/local/lib/python3.8/site-packages/middlewared/job.py", line 367, in run
await self.future
File "/usr/local/lib/python3.8/site-packages/middlewared/job.py", line 403, in __run_body
rv = await self.method(*([self] + args))
File "/usr/local/lib/python3.8/site-packages/middlewared/schema.py", line 973, in nf
return await f(*args, **kwargs)
File "/usr/local/lib/python3.8/site-packages/middlewared/plugins/pool_/replace_disk.py", line 122, in replace
raise e
File "/usr/local/lib/python3.8/site-packages/middlewared/plugins/pool_/replace_disk.py", line 102, in replace
await self.middleware.call(
File "/usr/local/lib/python3.8/site-packages/middlewared/main.py", line 1238, in call
return await self._call(
File "/usr/local/lib/python3.8/site-packages/middlewared/main.py", line 1203, in _call
return await self._call_worker(name, *prepared_call.args)
File "/usr/local/lib/python3.8/site-packages/middlewared/main.py", line 1209, in _call_worker
return await self.run_in_proc(main_worker, name, args, job)
File "/usr/local/lib/python3.8/site-packages/middlewared/main.py", line 1136, in run_in_proc
return await self.run_in_executor(self.__procpool, method, *args, **kwargs)
File "/usr/local/lib/python3.8/site-packages/middlewared/main.py", line 1110, in run_in_executor
return await loop.run_in_executor(pool, functools.partial(method, *args, **kwargs))
middlewared.service_exception.CallError: [EZFS_BADTARGET] already in replacing/spare config; wait for completion or use 'zpool detach'
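Note that the error text itself points at the resolution path: either a previous replace operation is still in progress (wait for the resilver to finish), or a stale replacing/spare entry is left in the pool layout and has to be detached. A minimal sketch from the shell, assuming the affected pool is "Home" as discussed above (the GUID in the detach command is a placeholder; use the name or GUID shown in your own zpool status output):

zpool status -v Home              # look for a "replacing-N"/"spare-N" vdev or a resilver in progress
zpool detach Home 1234567890123   # detach the leftover member identified above (placeholder GUID)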