Bug 8501 - udivdi3 absence with gcc-4.3.0 on kernels 2.6.20.11 & 2.6.22.-rc1
Summary: udivdi3 absence with gcc-4.3.0 on kernels 2.6.20.11 & 2.6.22.-rc1
Status: CLOSED CODE_FIX
Alias: None
Product: Other
Classification: Unclassified
Component: Other (show other bugs)
Hardware: i386 Linux
: P2 normal
Assignee: other_other
URL:
Keywords:
: 8500 8530 (view as bug list)
Depends on:
Blocks:
 
Reported: 2007-05-18 08:56 UTC by Rainer Malitzke-Goes
Modified: 2008-03-14 05:56 UTC (History)
6 users (show)

See Also:
Kernel Version: 2.6.20.11, 2.6.22-rc1
Subsystem:
Regression: ---
Bisected commit-id:


Attachments
time.s of kernel-2.6.22-rc1 (35.15 KB, application/octet-stream)
2007-05-18 12:18 UTC, Rainer Malitzke-Goes
Details
timekeping.s of kernel-2.6.22-rc1 (43.80 KB, application/octet-stream)
2007-05-18 12:42 UTC, Rainer Malitzke-Goes
Details

Description Rainer Malitzke-Goes 2007-05-18 08:56:13 UTC
Most recent __gcc__ compiler where this bug did *NOT* occur: gcc-4.2.0
Distribution:
Hardware Environment: x86 (will try on MAC G4 machine)
Software Environment: gcc compilers
Problem Description: "make V=1"  output 

kernel/built-in.o: In function `getnstimeofday':
(.text+0x1e48d): undefined reference to `__udivdi3'
kernel/built-in.o: In function `do_gettimeofday':
(.text+0x1e5b5): undefined reference to `__udivdi3'
kernel/built-in.o: In function `update_wall_time':

(.text+0x1e9e8): undefined reference to `__udivdi3'
make: *** [.tmp_vmlinux1] Error 1

Steps to reproduce: use gcc-4.3.0 (experimental)

Dose not occur with gcc-4.2.0 or earlier

This is just a cautionary advisory for experts from the kernel, gcc (and
perhasps binutils) to hash out. I do not belive it is urgent, but it concerns
different organizations.
Comment 1 Andrew Morton 2007-05-18 09:04:54 UTC
darn.  Please do

make kernel/time.s
make kernel/time/timekeeping.s

and attach time.s and timekeeping.s to this report, thanks.
Comment 2 Rainer Malitzke-Goes 2007-05-18 09:16:26 UTC
The correspondig PR on "gcc.gnu.org" is

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31990
Comment 3 Dave Jones 2007-05-18 09:47:42 UTC
*** Bug 8500 has been marked as a duplicate of this bug. ***
Comment 4 Rainer Malitzke-Goes 2007-05-18 12:18:03 UTC
Created attachment 11539 [details]
time.s of kernel-2.6.22-rc1

First of two requested
Comment 5 Rainer Malitzke-Goes 2007-05-18 12:42:05 UTC
Created attachment 11540 [details]
timekeping.s of kernel-2.6.22-rc1

Second requested file

Observation:

Even if you people at osdl make changes to time.c and timekeeping.c you might
still not be able to compile kernel-2.6.22-rc1 with gcc-4.3.0 because of a
bitfield addressing problem I reported earlier to gcc as PR31541 and somebody
else reported an equivalent problem with kernel-2.6.20.7 also having bitfield
problems and reported under PR31654.
Comment 6 Adrian Bunk 2007-05-18 16:09:20 UTC
On Fri, May 18, 2007 at 03:29:35PM -0700, Andrew Morton wrote:
>...
> I expect that this optimisation will remain in gcc-4.3 and we'll end up
> having major kernel releases which don't build on i386 with major gcc
> releases, which isn't altogether desirable.  I suspect we'll need to fix this
> fairly urgently, and to backport the fix into a number of kernel releases.
>...

Building old kernels with the latest gcc has never worked [1].

First of all, gcc 4.3 is still one year away from being released, so 
there's zero urgency.

And considering that compile errors are the trivial things, and the real 
problems are new runtime errors caused by incorrect kernel code combined 
with gcc optimization [2] I'd even consider backporting only the compile 
fixes before the real issues have been shaken out a bad idea.

cu
Adrian

[1] the backported gcc 4 support in kernel 2.4 is an exception
[2] look e.g. at commit 8690ba446defe2e2b81803756c099d2943dfd5fd

Comment 7 Thomas Gleixner 2007-05-18 16:10:09 UTC
On Fri, 2007-05-18 at 15:29 -0700, Andrew Morton wrote:
> We use the above idiom in several places.  A suitable fix might be to hunt
> down those various sites and then make them call a helper function which
> does
> 
> 	if (unlikely(ns >= NSEC_PER_SEC)) {
> 		do_div(...)
> 	}
> 
> (Better would be to inline the comparison and to uninline the do_div(),
> if it's a 32-bit arch.  Doing all this in a backportable fashion may
> prove tricky)

A suitable fix would be to slap the gcc maintainers with a large trout
to fix this nonsense. At least they should provide a command line option
to prevent this extra intelligent optimization.

	tglx


Comment 8 Anonymous Emailer 2007-05-18 16:44:20 UTC
Reply-To: segher@kernel.crashing.org

> gcc-4.3 appears to have cunningly converted this:
>
> static inline void timespec_add_ns(struct timespec *a, u64 ns)
> {
> 	ns += a->tv_nsec;
> 	while(unlikely(ns >= NSEC_PER_SEC)) {
> 		ns -= NSEC_PER_SEC;
> 		a->tv_sec++;
> 	}
> 	a->tv_nsec = ns;
> }
>
> into a divide-by-1000000000 operation, so it emits a call to udivdi3 
> and we
> don't link.

Exactly.  It obviously is a bug in the kernel that it depends
on certain compiler optimisations that it doesn't have direct
control over to happen or not.  OTOH, GCC's behaviour here is
probably a non-optimal code issue; it doesn't seem to take the
unlikely() into account when doing the loop transform.

> I expect that this optimisation will remain in gcc-4.3

If someone files a *useable* problem report it most likely
will be taken care of, actually.

> and we'll end up
> having major kernel releases which don't build on i386 with major gcc
> releases, which isn't altogether desirable.

Yeah, like 4.2.0 with powerpc.  Seems like no one tested it :-(

> I suspect we'll need to fix this
> fairly urgently, and to backport the fix into a number of kernel 
> releases.

If it is 4.3 only, you could instead try to work *with* the GCC
people.  It _is_ very fragile code of course, it wouldn't hurt
to replace it with something better.

> We use the above idiom in several places.  A suitable fix might be to 
> hunt
> down those various sites and then make them call a helper function 
> which
> does
>
> 	if (unlikely(ns >= NSEC_PER_SEC)) {
> 		do_div(...)
> 	}
>
> (Better would be to inline the comparison and to uninline the do_div(),
> if it's a 32-bit arch.  Doing all this in a backportable fashion may
> prove tricky)

Perhaps putting a compiler barrier in there would be enough -- like
an empty asm() that takes the loop variable as input.


Segher

Comment 9 Adrian Bunk 2007-05-24 11:19:48 UTC
*** Bug 8530 has been marked as a duplicate of this bug. ***
Comment 10 Damir Islamov 2008-01-17 09:47:44 UTC
Hi,

I have an workaround, may be it will be good fix:

--- include/linux/time.h.orig 2008-01-17 23:09:16.000000000 +0600
+++ include/linux/time.h      2008-01-17 23:13:20.000000000 +0600
@@ -2,6 +2,7 @@
 #define _LINUX_TIME_H

 #include <linux/types.h>
+#include <asm/div64.h>

 #ifdef __KERNEL__
 # include <linux/cache.h>
@@ -171,10 +172,13 @@
  */
 static inline void timespec_add_ns(struct timespec *a, u64 ns)
 {
+       time_t delta;
        ns += a->tv_nsec;
-       while(unlikely((long)ns >= NSEC_PER_SEC)) {
+       delta = div64_64(ns, NSEC_PER_SEC);
+       while(unlikely(delta >= 0 )) {
                ns -= NSEC_PER_SEC;
                a->tv_sec++;
+               delta--;
        }
        a->tv_nsec = ns;
 }


The second solution might be better:

--- time.h.orig 2008-01-17 23:09:16.000000000 +0600
+++ time.h      2008-01-17 23:40:05.000000000 +0600
@@ -2,6 +2,7 @@
 #define _LINUX_TIME_H

 #include <linux/types.h>
+#include <asm/div64.h>

 #ifdef __KERNEL__
 # include <linux/cache.h>
@@ -171,11 +172,11 @@
  */
 static inline void timespec_add_ns(struct timespec *a, u64 ns)
 {
+       time_t delta;
        ns += a->tv_nsec;
-       while(unlikely((long)ns >= NSEC_PER_SEC)) {
-               ns -= NSEC_PER_SEC;
-               a->tv_sec++;
-       }
+       delta = div64_64(ns, NSEC_PER_SEC);
+       a->tv_sec +=delta;
+       ns -= delta*NSEC_PER_SEC;
        a->tv_nsec = ns;
 }
 #endif /* __KERNEL__ */



With this patches the kernel can be build on i386 with gcc-4.3.

These two solutions cover time period more that 136 years in 'ns' variable, while the original code covers about 585 years. To my mind 136 years is more that enough.
Comment 11 Adrian Bunk 2008-01-19 06:31:58 UTC
Thanks for your suggestions.

We already have suggestions for workarounds that are less invasive than the patches you suggested.

What we actually need long-term are not workarounds but an agreement between the gcc developers and the kernel developers whether or not the kernel has to link against libgcc.
Comment 12 Pierre Ossman 2008-03-03 12:55:39 UTC
Any progress on this? It's getting a bit annoying as Fedora is now shipping gcc 4.3.0 in rawhide.
Comment 13 Adrian Bunk 2008-03-14 05:56:39 UTC
fixed by commit 38332cb98772f5ea757e6486bed7ed0381cb5f98

Note You need to log in before you can comment on or make changes to this bug.