I'm using btrfs-image with the following options: btrfs-image -c9 -t4 -w /dev/sdb ~/btrfs_sdb.z It spent a lot of time writing the output file, and it's currently at 1672450048 bytes for a 4-drive RAID1 of 1TB drives. At some point, it stopped doing I/O entirely and is now just using 100% CPU, which has been going on for at least 30 minutes now. I think it's stuck, so below is info from the debugger. Note that although I told it to use four threads, I was watching the CPU and I/O load, and at no point did I see it use any of the worker threads. And as another note, there is nothing that makes it clear whether or not you have to image separate block devices separately if they're in a RAID1 configuration. (gdb) info threads Id Target Id Frame 5 Thread 0x7f4da6c8a700 (LWP 4363) "btrfs-image" pthread_cond_wait () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185 4 Thread 0x7f4da6489700 (LWP 4364) "btrfs-image" pthread_cond_wait () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185 3 Thread 0x7f4da5c88700 (LWP 4365) "btrfs-image" pthread_cond_wait () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185 2 Thread 0x7f4da5487700 (LWP 4366) "btrfs-image" pthread_cond_wait () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185 * 1 Thread 0x7f4da7aa58c0 (LWP 4362) "btrfs-image" 0x00007f4da70326f3 in pread64 () at ../sysdeps/unix/syscall-template.S:81 (gdb) thread 1 [Switching to thread 1 (Thread 0x7f4da7aa58c0 (LWP 4362))] #0 0x00007f4da70326f3 in pread64 () at ../sysdeps/unix/syscall-template.S:81 81 in ../sysdeps/unix/syscall-template.S (gdb) where #0 0x00007f4da70326f3 in pread64 () at ../sysdeps/unix/syscall-template.S:81 #1 0x0000000000433978 in pread64 (__offset=<optimized out>, __nbytes=<optimized out>, __buf=<optimized out>, __fd=<optimized out>) at /usr/include/bits/unistd.h:117 #2 read_data_extent (md=<optimized out>, async=<optimized out>) at btrfs-image.c:908 #3 flush_pending (md=0x7fff5e4e58e0, done=23238559, done@entry=0) at btrfs-image.c:963 #4 0x0000000000433c08 in add_extent (start=1128534638592, size=262144, md=0x7fff5e4e58e0, data=1) at btrfs-image.c:1042 #5 0x000000000043618f in copy_space_cache (root=0x15db980, path=0x1545890, metadump=0x7fff5e4e58e0) at btrfs-image.c:1220 #6 create_metadump (input=<optimized out>, out=<optimized out>, num_threads=4, compress_level=<optimized out>, sanitize=<optimized out>, walk_trees=<optimized out>) at btrfs-image.c:1403 #7 0x000000000040821f in main (argc=2, argv=0x7fff5e4e5c78) at btrfs-image.c:2825 (gdb) thread 2 [Switching to thread 2 (Thread 0x7f4da5487700 (LWP 4366))] #0 pthread_cond_wait () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185 185 ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S: No such file or directory. (gdb) where #0 pthread_cond_wait () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185 #1 0x0000000000431ceb in dump_worker (data=0x7fff5e4e58e0) at btrfs-image.c:648 #2 0x00007f4da702b204 in start_thread (arg=0x7f4da5487700) at pthread_create.c:310 #3 0x00007f4da6d7257d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109 (gdb) thread 3 [Switching to thread 3 (Thread 0x7f4da5c88700 (LWP 4365))] #0 pthread_cond_wait () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185 185 in ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S (gdb) where #0 pthread_cond_wait () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185 #1 0x0000000000431ceb in dump_worker (data=0x7fff5e4e58e0) at btrfs-image.c:648 #2 0x00007f4da702b204 in start_thread (arg=0x7f4da5c88700) at pthread_create.c:310 #3 0x00007f4da6d7257d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109 (gdb) thread 4 [Switching to thread 4 (Thread 0x7f4da6489700 (LWP 4364))] #0 pthread_cond_wait () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185 185 in ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S (gdb) where #0 pthread_cond_wait () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185 #1 0x0000000000431ceb in dump_worker (data=0x7fff5e4e58e0) at btrfs-image.c:648 #2 0x00007f4da702b204 in start_thread (arg=0x7f4da6489700) at pthread_create.c:310 #3 0x00007f4da6d7257d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109 (gdb) thread 5 [Switching to thread 5 (Thread 0x7f4da6c8a700 (LWP 4363))] #0 pthread_cond_wait () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185 185 in ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S (gdb) where #0 pthread_cond_wait () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185 #1 0x0000000000431ceb in dump_worker (data=0x7fff5e4e58e0) at btrfs-image.c:648 #2 0x00007f4da702b204 in start_thread (arg=0x7f4da6c8a700) at pthread_create.c:310 #3 0x00007f4da6d7257d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109 I attached strace, and I see this over and over again: pread(-1, 0x16297a0, 65536, 310867132416) = -1 EBADF (Bad file descriptor) pread(5, "\0", 1, 314051657727) = 1 pread(-1, 0x16297a0, 65536, 310867132416) = -1 EBADF (Bad file descriptor) pread(5, "\0", 1, 314051657727) = 1 pread(-1, 0x16297a0, 65536, 310867132416) = -1 EBADF (Bad file descriptor) pread(5, "\0", 1, 314051657727) = 1 pread(-1, 0x16297a0, 65536, 310867132416) = -1 EBADF (Bad file descriptor) pread(5, "\0", 1, 314051657727) = 1 pread(-1, 0x16297a0, 65536, 310867132416) = -1 EBADF (Bad file descriptor) pread(5, "\0", 1, 314051657727) = 1 pread(-1, 0x16297a0, 65536, 310867132416) = -1 EBADF (Bad file descriptor) pread(5, "\0", 1, 314051657727) = 1 pread(-1, 0x16297a0, 65536, 310867132416) = -1 EBADF (Bad file descriptor) pread(5, "\0", 1, 314051657727) = 1 pread(-1, 0x16297a0, 65536, 310867132416) = -1 EBADF (Bad file descriptor) pread(5, "\0", 1, 314051657727) = 1 pread(-1, 0x16297a0, 65536, 310867132416) = -1 EBADF (Bad file descriptor) pread(5, "\0", 1, 314051657727) = 1 pread(-1, 0x16297a0, 65536, 310867132416) = -1 EBADF (Bad file descriptor) There's nothing in dmesg about any of this.
Hi Timothy, Curious about whether all the 4 devices are online? Seems one of the device is not online? And btrfs-image fails to handle such case as it record wrong(-1, error return value from open), and cause the error. BTW that's a bug of btrfs-progs, not sure if is appropriate to address a kernel bug report here. Thanks, Qu
Thanks for the fix, added to the devel branch.