Bug 69021 - Nondeterministic output from kallsyms ... maybe? Cause for inconsistent kallsyms data
Status: NEW
Alias: None
Product: Other
Classification: Unclassified
Component: Other
Hardware: All Linux
Importance: P1 normal
Assignee: other_other
Reported: 2014-01-20 09:31 UTC by Hrvoje
Modified: 2014-01-22 16:35 UTC
Kernel Version: 3.12.8
Regression: No


Attachments
System map from first pass (936.89 KB, application/octet-stream)
2014-01-20 09:31 UTC, Hrvoje
System map from second pass (936.89 KB, application/octet-stream)
2014-01-20 09:31 UTC, Hrvoje
Kernel .config file for build (53.14 KB, text/plain)
2014-01-20 17:44 UTC, Hrvoje
Patch fixing found issues (974 bytes, patch)
2014-01-22 16:35 UTC, Hrvoje

Description Hrvoje 2014-01-20 09:31:24 UTC
Created attachment 122681
System map from first pass

Hi.

First, I'm not sure if this is the right place to put this. I'm also not sure if this is a bug or just a feature.

In short, I have two very similar symbol maps (system.map). Both are generated while building the kernel, and they come from stage 1 and stage 2 of embedding the symbol table in the kernel.

Both are processed by the kallsyms utility to produce "compressed" output. And of course, the outputs are different. Not only in content, but also in size!

Now, because of the comparison in the last stage of inserting the symbol table into the kernel, the build fails with "Inconsistent kallsyms data". Because of the change in size, all additional stages of the build fail as well, so "KALLSYMS_EXTRA_PASS=1" does _not_ help.

Attached you will find two files, the symbol maps. You can run them through the latest version of kallsyms, and it will produce different output for each. The only difference between them is in sorting, so this _should not_ cause such a big difference in output.

Or it should?

Regards,

H.
Comment 1 Hrvoje 2014-01-20 09:31:55 UTC
Created attachment 122691
System map from second pass
Comment 2 Alan 2014-01-20 16:43:40 UTC
Can you attach the ".config" and the sequence of build instructions used?

Not sure if it's a bug - I would ask linux-kernel@vger.kernel.org
Comment 3 Hrvoje 2014-01-20 17:43:48 UTC
Hi.

It will take some time for me to get the build log. Also, those symbol tables were produced by compiling a heavily patched 3.2.26, built for ARM (OpenWrt build system). Some of the patches are not publicly available, so I guess it is not possible for anyone else to reproduce the build environment.

Still, I think that this is a general problem and can affect builds for any platform. BUT, as with other "Inconsistent kallsyms data" failures, _any_ change in .config or in any related tool yields a symbol table that no longer triggers this issue (that is, symbols change location/order, and thus the issue goes away).

Just in case, I'm attaching the .config file. The build process goes as it should: first .tmp_vmlinux1 is built and its symbols are extracted; then .tmp_vmlinux2 is built using this table and a symbol table is created again; then vmlinux is built, and its symbol table (system.map) is compared to the one from .tmp_vmlinux2.

And there the build fails with "Inconsistent kallsyms data".

Do you want me to mail the linux-kernel list, or will you do it?

Regards,

H.
Comment 4 Hrvoje 2014-01-20 17:44:21 UTC
Created attachment 122781
Kernel .config file for build
Comment 5 Alan 2014-01-20 17:49:29 UTC
If it's using non-public patches then I think you are on your own. To chase it down someone would need a clean way to reproduce it.

Still worth asking - perhaps the ARM list might be better, given it's a build funny and they tend to be platform specific?
Comment 6 Hrvoje 2014-01-20 17:59:06 UTC
Hi.

I can reproduce it every time, and this is the reason why I managed to track it down to kallsyms.

I guess that the root problem is in code alignment, that is, in how the linker scripts (lds files) tell the linker to align code.

Since the kallsyms util produces output of a different size, it changes the location of symbols, and it can happen that alignment then causes the code (symbols) to be reordered again.

As I said, this is the root cause of all "Inconsistent kallsyms data" errors, and it can happen on any architecture and with any .config file - it just depends on how lucky you are. :-)
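The alignment argument above can be illustrated with a minimal C sketch. The helper names (`align_up`, `next_section_start`), the addresses and the 4096-byte alignment are hypothetical and chosen only for illustration; they are not taken from the attached maps or from any real linker script. The sketch only shows that a table growing by a single byte can leave the next aligned address unchanged or push it a whole alignment unit further, so a small size change does not necessarily move all symbols by the same amount.

```c
#include <stdint.h>

/* Round addr up to the next multiple of align (align must be a power of two). */
static uint64_t align_up(uint64_t addr, uint64_t align)
{
    return (addr + align - 1) & ~(align - 1);
}

/*
 * Hypothetical layout: a table of table_size bytes starts at base, and the
 * next section is placed at the following align-byte boundary.
 */
static uint64_t next_section_start(uint64_t base, uint64_t table_size,
                                   uint64_t align)
{
    return align_up(base + table_size, align);
}
```

With a base of 0x1000 and 4096-byte alignment, a table of 0x0ff0 or 0x1000 bytes puts the next section at 0x2000, while a table of 0x1001 bytes moves it to 0x3000.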

I would just like to know whether this behaviour of the "kallsyms" util is intended or not. If it is not intended or desired (and I think it is not!), then this would become a bug report.

I'll send a mail to the linux-kernel list and point it to this bug report.

In any case, I would like the kallsyms util to behave in a "predictable" way: symbol reordering _alone_ should not cause the symbol table to change size after compression.

Regards,

H.
Comment 7 Hrvoje 2014-01-21 17:35:30 UTC
Update.

After some more investigation, it seems that one of the symbols is _removed_ from the symbol table, and this causes the change in size of the compressed table.

This piece of code does the "dropping":

                /* Corner case.  Discard any symbols with the same value as
                 * _etext _einittext; they can move between pass 1 and 2 when
                 * the kallsyms data are added.  If these symbols move then
                 * they may get dropped in pass 2, which breaks the kallsyms
                 * rules.
                 */
                if ((s->addr == text_range_text->end &&
                                strcmp((char *)s->sym + offset, text_range_text->etext)) ||
                    (s->addr == text_range_inittext->end &&
                                strcmp((char *)s->sym + offset, text_range_inittext->etext)))

I'm under the impression that this could be wrong ... I'm not sure that a symbol of _any_ type can be "removed" just because it is aligned with the end of those text sections.
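To make the effect of the quoted condition concrete, here is a simplified, self-contained C sketch. It is not the kallsyms source: `struct sym`, `keep_symbol` and `count_kept` are hypothetical stand-ins, the addresses are invented, and it assumes (the quote does not show the body of the `if`) that a matching symbol is rejected. It shows that a symbol sharing the end-of-text address is dropped, while the same symbol a few bytes further on is kept, so the number of emitted symbols can differ between two passes.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Simplified stand-in for a symbol table entry (not the kernel's struct). */
struct sym {
    uint64_t addr;
    const char *name;
};

/*
 * Return 1 if the symbol is kept, 0 if it is dropped.  A symbol is dropped
 * when its address equals the end of the text range but its name is not the
 * end marker itself (strcmp() != 0), mirroring the quoted condition.
 * Assumption: the body of the quoted if statement rejects the symbol.
 */
static int keep_symbol(const struct sym *s, uint64_t text_end,
                       const char *end_marker)
{
    if (s->addr == text_end && strcmp(s->name, end_marker) != 0)
        return 0;
    return 1;
}

/* Count the symbols that survive the filter. */
static size_t count_kept(const struct sym *syms, size_t n, uint64_t text_end,
                         const char *end_marker)
{
    size_t kept = 0;

    for (size_t i = 0; i < n; i++)
        kept += (size_t)keep_symbol(&syms[i], text_end, end_marker);
    return kept;
}
```

For example, with the end of text at 0x2000, a made-up symbol "bar" at 0x2000 is dropped (only "_etext" itself survives at that address), but the same "bar" at 0x2004 is kept, which changes the number of entries handed to the compression step.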

Regards,

H.
Comment 8 Hrvoje 2014-01-22 16:34:02 UTC
Hi.

As a test, I compiled a few kernels (3.2.26, 3.3.8, 3.8.13 and 3.12.8), patched and vanilla, and compared the system.map (symbol table) from "pass 1" and "pass 2" of inserting the symbol table. I tried them for the x86 and ARM architectures.

In all these builds, I did not find a single symbol that is present in pass 1 and missing in pass 2, or added in pass 2. That is, symbols do not disappear or appear between stages, regardless of their alignment.
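A comparison of this kind can be sketched in C as follows. This is only an illustration of the idea, not the procedure actually used for the builds above: `has_name` and `count_missing` are hypothetical helpers, the symbol names in the usage note are invented, and the quadratic search is chosen for brevity rather than speed.

```c
#include <stddef.h>
#include <string.h>

/* Return 1 if name occurs in the list names[0..n-1], otherwise 0. */
static int has_name(const char *const *names, size_t n, const char *name)
{
    for (size_t i = 0; i < n; i++)
        if (strcmp(names[i], name) == 0)
            return 1;
    return 0;
}

/*
 * Count names that are in list a but not in list b.  Order is ignored, so two
 * maps that differ only in sorting give 0 in both directions.
 */
static size_t count_missing(const char *const *a, size_t na,
                            const char *const *b, size_t nb)
{
    size_t missing = 0;

    for (size_t i = 0; i < na; i++)
        if (!has_name(b, nb, a[i]))
            missing++;
    return missing;
}
```

Calling `count_missing(pass1, n1, pass2, n2)` and then `count_missing(pass2, n2, pass1, n1)` on the name columns of the two maps reports symbols that disappeared or appeared between passes; both results being 0 means the passes contain the same set of names.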

From this I would conclude that the comment quoted in the previous post is no longer true, and that this part of the code should be removed.

I made a small patch that does two things:
1. removes the statement which deletes symbols if they are aligned on an etext* boundary
2. initializes the token_profit variable, which seems to be used uninitialized

I would like this patch to be tested on more builds, to be sure that it does not introduce new issues. Also, this should completely remove the false "Inconsistent kallsyms data" error, permanently.

Regards,

H.
Comment 9 Hrvoje 2014-01-22 16:35:48 UTC
Created attachment 123031
Patch fixing found issues
