Debugging of suspected ZFS deadlocks
Please first read the following. Please direct your ZFS reports to freebsd-fs@FreeBSD.org.
The best first step is to capture stack traces of all threads in one of the following ways:
- procstat -kk -a in userland if it is still responsive and you are able to save the output somehow
- alltrace in ddb
- thread apply all bt in kgdb
If you see thread(s) with a zio_wait call in their stacks and you also see thread(s) with a zio_done or zio_interrupt call in their stacks, then this is very likely a true ZFS deadlock. Please report.
If you do not see any threads with a zio_wait call, but you see threads with the following calls (or similar):
- zfs_freebsd_read
- zfs_freebsd_write
- dmu_buf_hold_array
- arc_read
- buf_hash_find
- dmu_read_uio
- dmu_write_uio
- zfs_zget
then this is also very likely a true ZFS deadlock. Please report.
If neither of the above is true, that is, you do see zio_wait but you see neither zio_done nor zio_interrupt, then the problem is most likely in the storage layer:
- storage adapter/controller driver
- storage adapter/controller firmware
- storage adapter/controller hardware
Consider reporting this problem, but please be realistic about it: do not expect a resolution in ZFS code.
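The decision rules above can be sketched as a small script that is run over a saved dump of thread stacks. This is only a rough sketch, not an official tool: the names functions_present and classify and the result strings are hypothetical, the check only looks at whether a function name appears anywhere in the dump, and the "(or similar)" calls are not covered. Read the stacks yourself before drawing conclusions.

```python
import re

WAIT = "zio_wait"
COMPLETION = ("zio_done", "zio_interrupt")
# Calls listed on this page for the "no zio_wait" case (not exhaustive).
OTHER = (
    "zfs_freebsd_read", "zfs_freebsd_write", "dmu_buf_hold_array",
    "arc_read", "buf_hash_find", "dmu_read_uio", "dmu_write_uio", "zfs_zget",
)


def functions_present(dump, names):
    """Return the subset of names that occur as whole identifiers in dump."""
    found = set()
    for name in names:
        # \b keeps zio_wait from matching longer names such as zio_wait_foo.
        if re.search(r"\b" + re.escape(name) + r"\b", dump):
            found.add(name)
    return found


def classify(dump):
    """Apply the rules from this page to the text of a stack dump."""
    present = functions_present(dump, (WAIT,) + COMPLETION + OTHER)
    has_wait = WAIT in present
    if has_wait and present & set(COMPLETION):
        return "likely ZFS deadlock"
    if not has_wait and present & set(OTHER):
        return "likely ZFS deadlock"
    if has_wait:
        return "likely storage layer"
    return "inconclusive"


if __name__ == "__main__":
    import sys
    # Usage (hypothetical file name): python classify.py stacks.txt
    with open(sys.argv[1]) as f:
        print(classify(f.read()))
```

The script works on the text output of any of the three capture methods, because it only searches for function names and does not depend on the layout of the dump.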
Some notes:
- try to verify that the threads are really stuck where you see them, as opposed to doing a lot of work and therefore merely being likely to show up in those places
- camcontrol tags disk -v can be used to check the number of queued commands for controllers that work via CAM
- when reporting a problem, always include full information about the thread stacks and don't cherry-pick; the output can be large, so upload it somewhere and post a link
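For the first note, one possible way to check whether threads are really stuck is to save the output of procstat -kk -a twice, some seconds apart, and compare the stacks per thread. The following is a minimal sketch under stated assumptions: the function names parse_procstat_kk and unchanged_threads are hypothetical, and the parser assumes that each line starts with a numeric PID and TID and that stack frames look like name+0xoffset. An identical stack in both samples is only a hint, since idle threads also look the same in every sample.

```python
import re

# A kernel stack frame as printed by procstat -kk, e.g. zio_wait+0x5b
FRAME = re.compile(r"[A-Za-z_][\w.]*\+0x[0-9a-fA-F]+")


def parse_procstat_kk(text):
    """Map thread ID to its tuple of stack frames."""
    stacks = {}
    for line in text.splitlines():
        fields = line.split()
        # Skip the header and anything that does not start with PID and TID.
        if len(fields) < 2 or not (fields[0].isdigit() and fields[1].isdigit()):
            continue
        stacks[int(fields[1])] = tuple(FRAME.findall(line))
    return stacks


def unchanged_threads(first, second):
    """Return sorted thread IDs whose non-empty stack is identical in both samples."""
    a = parse_procstat_kk(first)
    b = parse_procstat_kk(second)
    return sorted(tid for tid, stack in a.items() if stack and b.get(tid) == stack)


if __name__ == "__main__":
    import sys
    # Usage (hypothetical file names): python compare.py sample1.txt sample2.txt
    with open(sys.argv[1]) as f1, open(sys.argv[2]) as f2:
        for tid in unchanged_threads(f1.read(), f2.read()):
            print(tid)
```

Threads that appear in this list and also have ZFS calls in their stacks are the ones worth looking at more closely.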
If you are doing deep debugging, some very useful information can be found in the vdev_t structure associated with each leaf vdev of a pool, for example:
vdev_queue = {
  vq_deadline_tree = {avl_root = 0xfffffe0338dbb248, avl_compar = 0xffffffff816855b0 <vdev_queue_deadline_compare>, avl_offset = 584, avl_numnodes = 116, avl_size = 896},
  vq_read_tree = {avl_root = 0xfffffe019d0b65b0, avl_compar = 0xffffffff81685600 <vdev_queue_offset_compare>, avl_offset = 560, avl_numnodes = 8, avl_size = 896},
  vq_write_tree = {avl_root = 0xfffffe03e3d19230, avl_compar = 0xffffffff81685600 <vdev_queue_offset_compare>, avl_offset = 560, avl_numnodes = 108, avl_size = 896},
  vq_pending_tree = {avl_root = 0xfffffe025e32c230, avl_compar = 0xffffffff81685600 <vdev_queue_offset_compare>, avl_offset = 560, avl_numnodes = 10, avl_size = 896},
avl_numnodes gives the number of requests (zios) in a given queue. vq_deadline_tree is the queue of incoming requests; vq_read_tree and vq_write_tree are sub-queues for read and write requests, respectively. vq_pending_tree is the queue of requests that have been issued to the underlying storage layer and that ZFS is waiting on to complete.
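When the structure is printed for many leaf vdevs, it can help to pull out just the queue depths. The following is a minimal sketch that extracts avl_numnodes for each vq_*_tree member from pasted debugger output; the function name queue_depths is hypothetical, and the regular expression assumes the print format shown above, with no nested braces inside each tree.

```python
import re

# Matches e.g. "vq_read_tree = {..., avl_numnodes = 8, ...}" and captures
# the member name and the node count.
TREE = re.compile(r"(vq_\w+_tree)\s*=\s*\{[^{}]*?avl_numnodes\s*=\s*(\d+)")


def queue_depths(dump):
    """Map each vq_*_tree name found in dump to its avl_numnodes value."""
    return {name: int(count) for name, count in TREE.findall(dump)}


# One tree copied from the dump on this page.
sample = (
    "vq_pending_tree = {avl_root = 0xfffffe025e32c230, "
    "avl_compar = 0xffffffff81685600 <vdev_queue_offset_compare>, "
    "avl_offset = 560, avl_numnodes = 10, avl_size = 896},"
)
print(queue_depths(sample))  # {'vq_pending_tree': 10}
```

In the dump on this page the read and write sub-queues hold 8 and 108 requests, which add up to the 116 requests in vq_deadline_tree, while 10 requests are outstanding at the storage layer.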