This is essentially a cross-post of my gcc.gnu.org/bugzilla posting, restricted to my own work product. The original posting is:

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32044

Because of an (intentionally) offensive refusal to cross-post (which, coming from the individual involved, I took as a compliment), I hesitate to provide here anything but my own work and the final outcome. There are surely people within the Linux organization who have access to the gcc bugzilla.

The result is a new optimization effort to make udivdi3 much less likely to be emitted where introducing it would be counterproductive, as demonstrated below. More important was the revelation, after much prodding, that there exists a gcc flag: "note this can be worked around with -fno-tree-scev-cprop as well." I was only able to verify this flag on my test program below. I am just not sufficiently versed in the kernel Makefile arrangement to have this flag apply only to selected compilations such as kernel/time.c and kernel/time/timekeeping.c. I would be more than happy to test this flag with the help of some guidance.

Now to my initial submission:

According to my (admittedly second-hand) reading of the Fifth Edition of "C: A Reference Manual" by Samuel P. Harbison III & Guy L. Steele Jr., GCC, by not providing a means to disable the use of libgcc (including udivdi3), is not in strict conformance with the C standard for free-standing use through C99. __udivdi3 is a reserved identifier and hence non-conforming. The irony is that, besides being non-conforming and prejudicing free-standing applications that aim for maximum portability, it is highly counterproductive in its own right. Also, the forced and silent use of libgcc (ldd does not show it being used) violates one of the fundamental principles of both UNIX and C: namely, that the user (certainly root) is to be in full and absolute command of the system, without hidden reinterpretation of his commands or MS-type questions.
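For anyone who has not run into this before, here is a minimal sketch of the two forms in question. The helper names are hypothetical and mine; they come from neither the kernel nor gcc. On a 32-bit target without a 64-bit hardware divide, the second form is the kind of code that typically ends up calling __udivdi3 (and __umoddi3) from libgcc, while the first needs only compares and subtracts.

```c
#include <stdint.h>

#define NSEC_PER_SEC 1000000000UL

/* Hypothetical helper: carry whole seconds out of *nsec by repeated
 * subtraction, the way the test loop is written. Cheap when the loop
 * body runs zero or one time, which is the expected case. */
unsigned normalize_by_loop(uint64_t *nsec)
{
    unsigned sec = 0;

    while (*nsec >= NSEC_PER_SEC) {
        *nsec -= NSEC_PER_SEC;
        ++sec;
    }
    return sec;
}

/* Hypothetical helper: the arithmetically equivalent closed form.
 * A 64-bit divide and modulo; on 32-bit targets this is typically
 * lowered to library calls rather than a single instruction. */
unsigned normalize_by_divide(uint64_t *nsec)
{
    unsigned sec = (unsigned)(*nsec / NSEC_PER_SEC);

    *nsec %= NSEC_PER_SEC;
    return sec;
}
```

For an input of 2500000000 both helpers return 2 and leave 500000000 behind; the difference is only in the code generated, which is the whole complaint.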
As a practical matter, the use of __builtin_expect could be taken as a signal that, in the cases marked unlikely, only reordering of instructions (to avoid pipeline stalls and reloading of caches) is allowed. Any fundamental change, like exchanging a while loop and a subtraction for a non-hardware divide, should not occur.

If anybody at GCC wants to know what others (including L. Torvalds and A. Morton) think, checking Google on udivdi3 might be instructive.

What follows are the results of tests using current versions of gcc-4.3 and gcc-4.2.1. I believe the results speak for themselves. Besides the data for x86, I also have quite similar data for PowerPC G4, which I will make available as a follow-on. Yes, I did use long long to place the shortcomings more in evidence.

Program (gcc-4.3.0 first, followed by gcc-4.2.1 compilations)

    #define NSEC_PER_SEC 1000000000UL

    int rmg(void);

    int main(void)
    {
        /* int sec; */
        return rmg();
    }

    int rmg(void)
    {
        static unsigned long long nsec = 0;
        static int sec = 0;

        while (sec < 1) {
            nsec++;
            while (__builtin_expect(nsec >= NSEC_PER_SEC, 0)) {
                nsec -= NSEC_PER_SEC;
                ++sec;
            }
        }
        return sec;
    }

gcc_43 -O0
    -rwxr-xr-x 1 root root  8478 2007-05-22 08:23 rmgg_O0
    -rw-r--r-- 1 root root  1238 2007-05-22 08:18 rmgg_O0.s
    real 0m27.613s
    user 0m27.607s
    sys  0m0.003s

gcc_43 -O1
    -rwxr-xr-x 1 root root 12586 2007-05-22 08:25 rmgg_O1
    -rw-r--r-- 1 root root  1572 2007-05-22 08:25 rmgg_O1.s
    real 0m12.776s
    user 0m12.775s
    sys  0m0.003s

gcc_43 -O2
    -rwxr-xr-x 1 root root 12586 2007-05-22 08:27 rmgg_O2
    -rw-r--r-- 1 root root  1874 2007-05-22 08:27 rmgg_O2.s
    real 0m16.415s
    user 0m16.414s
    sys  0m0.004s

gcc_43 -Os
    -rwxr-xr-x 1 root root 12586 2007-05-22 08:29 rmgg_Os
    -rw-r--r-- 1 root root  1925 2007-05-22 08:29 rmgg_Os.s
    real 2m8.817s
    user 2m8.831s
    sys  0m0.003s

Program: same source as above, compiled with gcc-4.2.1.

gcc_42 -O0
    -rwxr-xr-x 1 root root  8471 2007-05-21 16:46 rmgg_O0
    -rw-r--r-- 1 root root  1236 2007-05-21 16:41 rmgg_O0.s
    time ./rmgg_O0
    real 0m27.678s
    user 0m27.680s
    sys  0m0.002s
    Script done on Mon 21 May 2007 04:53:29 PM EDT

gcc_42 -O1
    -rwxr-xr-x 1 root root  8471 2007-05-21 16:41 rmgg_O1
    -rw-r--r-- 1 root root  1572 2007-05-22 09:39 rmgg_O1.s
    Script started on Mon 21 May 2007 04:56:20 PM EDT
    time ./rmgg_O1
    real 0m12.771s
    user 0m12.767s
    sys  0m0.003s
    Script done on Mon 21 May 2007 04:56:55 PM EDT

gcc_42 -O2
    -rwxr-xr-x 1 root root  8471 2007-05-21 16:41 rmgg_O2
    -rw-r--r-- 1 root root  1262 2007-05-21 17:41 rmgg_O2.s
    Script started on Mon 21 May 2007 04:57:14 PM EDT
    time ./rmgg_O2
    real 0m12.532s
    user 0m12.531s
    sys  0m0.003s
    Script done on Mon 21 May 2007 04:58:18 PM EDT

gcc -Os
    -rwxr-xr-x 1 root root  8471 2007-05-21 16:41 rmgg_Os
    -rw-r--r-- 1 root root  1017 2007-05-21 16:40 rmgg_Os.s
    Script started on Mon 21 May 2007 04:58:30 PM EDT
    time ./rmgg_O2
    real 0m12.571s
    user 0m12.562s
    sys  0m0.004s
    Script done on Mon 21 May 2007 04:59:11 PM EDT

Thank you, Mr. Taylor; your suggestion to use volatile certainly works in this drastically reduced test case. Whether it will work when nsec is part of a kernel structure I will leave to the experts. I certainly know better than to argue with you, who wrote uucp and has been active on gcc for probably 15 years or more.

The volatile fix would be fine, but (at least for me) it does not work with the kernel. There is that little message:

    kernel/time.c:479: warning: passing argument 3 of 'div_long_rem_signed' discards qualifiers from pointer target type

and others like it, and udivdi3 reappears.

Mr. Park! The patch you kindly included in comment 3 presented two difficulties:

1. I could not extricate it cleanly enough from the HTML encoding apparently standard with bugzilla (this is a Mozilla product).
2. After some editing, patch accepted just 1 hunk, and upon checking it turned out that the svn-derived tree-scalar-evolution.c did not match the enclosing lines around the lines to be added. I added those lines by hand (possibly imperfectly, even on careful checking). The file compiled OK, but on running the gcc check sequence I got a stream of errors. These errors disappeared on using a sequestered, unpatched copy of the file. Hence, udivdi3 reappeared.

If you see fit for me to test this not only on the reduced test case but on the actual kernel, I suggest sending me an updated patch as a text attachment to my email. Thanks to all for trying to help.

Thank you (muchas gracias) for looking at the matter from a user's point of view and for considering my arguments concerning __builtin_expect. You seem to be the first to look at the timings and the amount of code generated. If you are interested, I have equivalent data taken on a Mac with dual G4s. I did not send it so far because, until you intervened, I got mostly legalistic arguments and proposed fixes that do not solve the real problem of avoiding both udivdi3 and, more importantly, libgcc.

I have *.s files taken with both gcc-4.2.1 and gcc-4.3.0 that are attachments to PR 32044. I also have equivalent data, taken with a Linux kernel on a Mac G4, that nobody at gcc requested and that shows similar counterproductive trends.
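On the volatile suggestion discussed above, here is a reduced sketch of what I understand it to mean. It is an assumption on my part that the qualifier goes on the counter itself, and carry_seconds() is a hypothetical variant of rmg() that takes the increment as an argument so it finishes quickly. Because every read and write of a volatile object has to be carried out as written, the compiler cannot fold the subtraction loop into a divide. __builtin_expect is a gcc extension, so this needs gcc or a compatible compiler.

```c
#define NSEC_PER_SEC 1000000000UL

/* Assumption: volatile is applied to the counter itself; the exact
 * placement is not spelled out in the bugzilla thread. */
volatile unsigned long long nsec = 0;

/* Hypothetical variant of rmg(): add delta nanoseconds to the counter
 * and return the number of whole seconds carried out of it. */
int carry_seconds(unsigned long long delta)
{
    int sec = 0;

    nsec += delta;
    /* Each test and each subtraction touches the volatile object,
     * so the loop has to stay a loop. */
    while (__builtin_expect(nsec >= NSEC_PER_SEC, 0)) {
        nsec -= NSEC_PER_SEC;
        ++sec;
    }
    return sec;
}
```

Passing the address of such a counter to a function that takes a plain (non-volatile) pointer is what produces the "discards qualifiers from pointer target type" warning quoted above, which is why the approach does not carry over cleanly to the kernel sources.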
*** This bug has been marked as a duplicate of 8501 ***