Bug 78761

Summary: HFS+ unable to access random files after a while -- "ls : Invalid argument"
Product: File System Reporter: Luc Pi (pionchon.luc)
Component: HFS/HFSPLUS Assignee: fs_hfs (fs_hfs)
Status: NEW ---    
Severity: high CC: dreamcat4, menghan412, saproj, szg00000
Priority: P1    
Hardware: i386   
OS: Linux   
Kernel Version: 4.1.0-040100rc1 Subsystem:
Regression: Yes Bisected commit-id:
Attachments: Fix

Description Luc Pi 2014-06-23 09:17:38 UTC
I also reported this bug at Ubuntu [1]. I am posting here too to get some insight.

I have a dual-boot OSX+Ubuntu setup on a MacBook with an SSD.
I use a partition to share files between both OSes.
The partition is HFS+, non-journaled, case-sensitive.

This has been working perfectly for years (since 2006).

For some weeks now, accessing some files has been failing. It affects only some files, and I see no obvious reason why these ones and not others. Unmounting and remounting usually gives access back, but not always. After a while, some files become inaccessible again, not necessarily the same files as earlier. Checking the filesystem on either OSX or Ubuntu sometimes gives errors, but not always.

I was using linux-image-3.11.0-18-generic, which had a bug on suspend/resume. Then I tried linux-image-3.14.0-031400rc6-generic from [2]. It is possible that the HFS+ partition started to go wrong at that time, but I am not 100% sure. Now I have linux-image-3.13.0-29-generic from the Ubuntu 14.04 release, and the issue is still here.



The type of error I get is like:

$ ls -l .
total 0
drwxr-xr-x 1 me me 14 Jun 20 13:09 foo

$ ls foo/
ls: reading directory foo/: Invalid argument

$ touch foo/bar
touch: setting times of ‘foo/bar’: Invalid argument

Or in nautilus, I get errors like:
Sorry, could not display all the contents of “foo”: Error when getting information for file '/media/share/foo/baz': Invalid argument


Sometimes I can list the parent directory and some files appear with no attributes, like this:

$ ls -l /media/share/abc/
ls: cannot access /media/share/abc/maps: Invalid argument
ls: cannot access /media/share/abc/rewire.jpg: Invalid argument
d????????? ? ? ? ? ? maps
-????????? ? ? ? ? ? rewire.jpg


Another type of error with nautilus: I opened a file "more.html" and could see its content. Then I wanted to delete it, and I got this error:

“more.html” can't be put in the trash. Do you want to delete it immediately?
Error trashing file: Cannot allocate memory

(and I cannot open the file anymore)




See various system files and info at [1] 

[1] https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1332950
[2] http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.14-rc6-trusty/
Comment 1 Luc Pi 2014-07-07 10:09:04 UTC
Let me know if I can take any action to help figure out what's going wrong.
Comment 2 Luc Pi 2014-07-13 06:49:43 UTC
This is also present with 3.16.0-rc4 (generic, i386) from:

http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.16-rc4-utopic/
Comment 3 Luc Pi 2014-07-13 07:00:56 UTC
I will try older kernels,
although some had other problems
(like crashes, suspend/resume issues, or corruption of HFS+ extended file attributes).

- longterm: 3.12.24 http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.12.24-trusty/
- longterm: 3.10.48 http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.10.48-saucy/
Comment 4 Luc Pi 2014-07-14 15:58:45 UTC
- 3.12.24 seems to have the issue too (?)
- I cannot boot 3.10.48
- now trying 3.11.10 http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.11.10.8-saucy/
Comment 5 Luc Pi 2015-04-30 08:22:07 UTC
(In reply to Luc Pionchon from comment #4)
> - 3.12.24 seem to have the issue too (?)

After long use, I retract that: 3.12.24 seems to work fine.

3.13 and 3.16 are buggy.

The issue still exists in 4.1.0.
For example, when trying to save a file, I got this dialog:

----
**Could not read the contents of my-folder**
Error when getting information for file '/path/to/file/foo': 
Cannot allocate memory
----



As far as I can tell, it appeared between 3.12 and 3.13.
Comment 6 Luc Pi 2015-04-30 08:47:03 UTC
So maybe it comes from this diff?

https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/diff/fs/hfsplus/?id=v3.12&id2=v3.13



Can anybody comment?
Comment 7 Luc Pi 2015-05-05 11:01:09 UTC
Other errors I get with 4.1.0:

- "Error when getting information for file 'foo': invalid argument"

- "Error when getting information for file 'foo': input/output error"
Comment 8 Luc Pi 2015-05-06 09:29:14 UTC
When the issue is present, neither OSX Disk Utility
nor fsck.hfsplus reports any issue:


$ ls /media/share/
ls: cannot access /media/share/foo: Cannot allocate memory
foo bar

$ sudo fsck.hfsplus /dev/sda7
** /dev/sda7
** Checking HFS Plus volume.
** Detected a case-sensitive catalog.
** Checking Extents Overflow file.
** Checking Catalog file.
** Checking Catalog hierarchy.
** Checking Extended Attributes file.
** Checking volume bitmap.
** Checking volume information.
** The volume share appears to be OK.
Comment 9 Sergei 2015-06-07 14:19:06 UTC
Created attachment 178961 [details]
Fix

Please, check this patch out.
Comment 10 dreamcat4 2015-07-03 06:50:54 UTC
Hi!
I tried this fix on the "3.19.0-21-generic" kernel. For that kernel version, only the first line of the patch was needed, because the second part of Sergei's patch was already disabled with #if 0.

I then re-ran my copy of large folders (many files) from the HFS+ drive to an ext4 drive, and there were no errors.

Previously had these error messages:

 Cannot allocate memory
 Invalid argument

These would not go away: I tried 3 times without this patch and always got the same error messages.

So I think it worked! When I use rsync --dry-run afterwards (to look for missing files), it shows nothing more to copy.
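
For anyone wanting to repeat this kind of test without copying data, here is a minimal Python sketch. It walks a directory tree and records every entry that cannot be listed or stat'ed, together with the errno name (for example EINVAL or ENOMEM, the errors seen in this report). The function name scan and the path /media/share in the usage note are assumptions for illustration only.

```python
import errno
import os


def scan(root):
    """Walk root; return a list of (path, errno name) for each failing entry."""
    failures = []

    def record(exc):
        # Translate the numeric errno into its symbolic name, e.g. 22 -> EINVAL.
        name = errno.errorcode.get(exc.errno, str(exc.errno))
        failures.append((exc.filename, name))

    # onerror is called when a directory itself cannot be listed.
    for dirpath, dirnames, filenames in os.walk(root, onerror=record):
        for entry in dirnames + filenames:
            path = os.path.join(dirpath, entry)
            try:
                # lstat fails for entries that "ls -l" shows as "?????????".
                os.lstat(path)
            except OSError as exc:
                record(exc)
    return failures
```

Calling scan("/media/share") and printing the result gives one line per failing path. An empty list only means that no error showed up during that run, since the failures described here come and go.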
Comment 11 Sergei 2015-07-03 11:18:56 UTC
The code in #if 0 is re-enabled by the patch. You see, the patch removes "#if 0" and "#endif" (and also adds curly brackets for readability). With only the first part of the patch, your tests were expected to work too. But the second part of the patch is there to return unneeded memory pages back to the system.
Comment 12 Luc Pi 2015-08-19 08:07:13 UTC
(In reply to dreamcat4 from comment #10)
> Previously had these error messages:
>  Cannot allocate memory
>  Invalid argument

It's nice to hear that I am not the only one!


(In reply to Sergei from comment #9)
> Created attachment 178961 [details]
> Fix
> 
> Please, check this patch out.

I finally managed to test your patch. It worked flawlessly for a few days.

*Many* thanks Sergei for catching this!
Comment 13 Luc Pi 2015-08-19 08:09:48 UTC
Apparently the patch was merged in Andrew Morton's -mm tree and hence linux-next.

It might be merged into 4.3 kernels.