Bug 1669
Summary: | ACPI global lock macros | ||
---|---|---|---|
Product: | ACPI | Reporter: | Luming Yu (luming.yu) |
Component: | Power-Other | Assignee: | Luming Yu (luming.yu) |
Status: | CLOSED PATCH_ALREADY_AVAILABLE | ||
Severity: | normal | CC: | acpi-bugzilla |
Priority: | P2 | ||
Hardware: | i386 | ||
OS: | Linux | ||
Kernel Version: | 2.4 2.6 | Subsystem: | |
Regression: | --- | Bisected commit-id: | |
Attachments: |
a patch for fixing this issue on i386 platform
c version x86_64 version |
Description
Luming Yu
2003-12-10 23:18:07 UTC
Created attachment 1658 [details]
a patch for fixing this issue on i386 platform
This patch is based on paul 's proposal patch, and correct one error in that
proposal.
The existing code is definately incorrect, and will result in ACPI_ACQUIRE_GLOBAL_LOCK() to always succeed -- even if the lock was held. The patch assembles correctly and matches the ACPI spec. But Luming, I'm curious why you didn't include Paul's suggestion of letting gcc choose the output register and instead left it hard-coded to eax? eg. "sbbl %0,%0" \ :"=r"(Acq) ... I've decoded most of the asm syntax, but still don't understand what the last parameter means. ie. - ...:"dx"); + ...:"dx", "ax"); Is it _really_ impossible to do this in C? ia64 and x86_64 seem to be able to pull it off... thanks, -Len Patch filed here is not the latest one, please take a look at http://bugzilla.kernel.org/show_bug.cgi?id=1781#c10 Len, Regarding hardcode register in Inline assembly code, I just prefer programmer has more control over register allocation. Otherwise, there could be some potential bugs introduced by GCC. Regarding why not use C code, I think of Inline assembly code is unable to get removed cleanly. Because we need instructions like "lock; cmpxchgl.." Of cause, we can minimalize the usage of Inlie assemly code like IA64. Thanks, Luming Luming, This isn't performance critical -- it is "correctness" critical;-) So I'd like to see a solution in C like ia64 and x86_64, with the assembly only used for the exotic atomic instructions. thanks, -Len Created attachment 2238 [details]
c version
This is c version for i386. It is a little bit different with current X86_64
version. I try to copy code from include/asm-x86_64/acpi.h,
static inline __u32 cmpxchg4_locked(__u32 *ptr, __u32 old, __u32 new)
{
asm volatile("lock ; cmpxchgl %k1,%2" :
"=r" (new) : "0" (old), "m" (*(__u32 *)ptr) : "memory");
return new;
}
static inline int
__acpi_acquire_global_lock (unsigned int *lock)
{
unsigned int old, new, val;
do {
old = *lock;
new = (((old & ~0x3) + 2) + ((old >> 1) & 0x1));
val = cmpxchg4_locked(lock, new, old);
} while (unlikely (val != old));
return (new < 3) ? -1 : 0;
}
but gcc generate the following bad code.
c01fc0d8: 8b 13 mov (%ebx),%edx
c01fc0da: 89 d0 mov %edx,%eax
c01fc0dc: 89 d1 mov %edx,%ecx
c01fc0de: d1 e8 shr %eax
c01fc0e0: 83 e1 fc and $0xfffffffc,%ecx
c01fc0e3: 83 e0 01 and $0x1,%eax
c01fc0e6: 8d 4c 08 02 lea 0x2(%eax,%ecx,1),%ecx
c01fc0ea: 89 c8 mov %ecx,%eax
c01fc0ec: f0 0f b1 03 lock cmpxchg %eax,(%ebx)
c01fc0f0: 39 d0 cmp %edx,%eax
Maybe current X86_64 version is wrong. We need confirm it using X86_64
compiler.
> -----Original Message-----
> From: Brown, Len
> Suresh,
> Any chance you can dis-assemble an invocation of
> __acpi_acquire_global_lock() on an x86_64 machine -- as built
> with the native compiler?
Disassembly of x86_64 portion:
ffffffff8023a952: 8b 16 mov (%rsi),%edx
ffffffff8023a954: 89 d0 mov %edx,%eax
ffffffff8023a956: 89 d1 mov %edx,%ecx
ffffffff8023a958: d1 e8 shr %eax
ffffffff8023a95a: 83 e1 fc and $0xfffffffffffffffc,%ecx
ffffffff8023a95d: 83 e0 01 and $0x1,%eax
ffffffff8023a960: 8d 4c 08 02 lea 0x2(%rax,%rcx,1),%ecx
ffffffff8023a964: 89 c8 mov %ecx,%eax
ffffffff8023a966: f0 0f b1 06 lock cmpxchg %eax,(%rsi)
ffffffff8023a96a: 39 d0 cmp %edx,%eax
ffffffff8023a96c: 75 e4 jne ffffffff8023a952
<acpi_ev_acquire_global_lock+0x7b>
ffffffff8023a96e: 83 f9 03 cmp $0x3,%ecx
ffffffff8023a971: 19 c0 sbb %eax,%eax
My question is around here: ffffffff8023a964: 89 c8 mov %ecx,%eax ffffffff8023a966: f0 0f b1 06 lock cmpxchg %eax,(%rsi) According to IA32 instruction reference, eax should hold old value. Created attachment 2239 [details]
x86_64 version
this is patch for X86_64. Who can help testing?
Ok. It now generates correct code. ffffffff8023a956: 8b 0e mov (%rsi),%ecx ffffffff8023a958: 89 c8 mov %ecx,%eax ffffffff8023a95a: 89 ca mov %ecx,%edx ffffffff8023a95c: d1 e8 shr %eax ffffffff8023a95e: 83 e2 fc and $0xfffffffffffffffc,%edx ffffffff8023a961: 83 e0 01 and $0x1,%eax ffffffff8023a964: 8d 7c 10 02 lea 0x2(%rax,%rdx,1),%edi ffffffff8023a968: 89 c8 mov %ecx,%eax ffffffff8023a96a: 89 fa mov %edi,%edx ffffffff8023a96c: f0 0f b1 16 lock cmpxchg %edx,(%rsi) ffffffff8023a970: 39 c1 cmp %eax,%ecx ffffffff8023a972: 75 e2 jne ffffffff8023a956 And kernel boots fine on my system. The following is disassembly code from IA64: a000000100317910: 0b 10 00 22 10 10 [MMI] ld4 r2=[r17];; a000000100317916: b0 e0 0b 58 44 00 and r11=-4,r2 a00000010031791c: 22 10 00 52 extr.u r16=r2,1,1;; a000000100317920: 0a 50 2c 20 00 20 [MMI] add r10=r11,r16;; a000000100317926: 00 00 00 02 00 e0 nop.m 0x0 a00000010031792c: 21 50 00 84 adds r15=2,r10 a000000100317930: 0b 00 08 40 2a 04 [MMI] mov.m ar.ccv=r2;; a000000100317936: 90 78 44 22 20 00 cmpxchg4.acq r9=[r17],r15,ar.ccv a00000010031793c: 00 00 04 00 nop.i 0x0;; a000000100317940: 10 00 00 00 01 00 [MIB] nop.m 0x0 a000000100317946: 90 48 08 10 71 04 cmp4.eq p9,p8=r9,r2 a00000010031794c: d0 ff ff 4b (p08) br.cond.dpnt.few a000000100317910 <acpi_ev_global_lock_handler+0xf0> applied to 2.4.26 and 2.6.5 -- please close. |