Bug 11074

Summary: /dev/cpu/*/cpuid malfunctioning
Product: Platform Specific/Hardware
Reporter: Peter Ganzhorn (peter.ganzhorn)
Component: i386
Assignee: platform_i386
Status: CLOSED INVALID
Severity: normal
CC: alan
Priority: P1
Hardware: All
OS: Linux
Kernel Version: 2.6.25.10
Subsystem:
Regression: No
Bisected commit-id:

Description Peter Ganzhorn 2008-07-12 09:01:36 UTC
I recently discovered that a tool I wrote produces weird output on one of my machines. Apparently the /dev/cpu/*/cpuid interface is malfunctioning on this machine (P4 Xeon).

Here's the code of a testing program I wrote that confirms the malfunction:

---
#include <stdio.h>
#include <string.h>
#include <fcntl.h>
#include <sys/types.h>
#include <stdint.h>
#include <errno.h>
#include <unistd.h>

void cpuid(int *a, int *b, int *c, int *d)
{
    asm volatile (" mov %4, %%eax;"
                  " mov %5, %%ecx;"
                  " cpuid;"
                  " mov %%eax, %0;"
                  " mov %%ebx, %1;"
                  " mov %%ecx, %2;"
                  " mov %%edx, %3;"
  /* Output */    : "=a" (*a), "=r" (*b), "=c" (*c), "=d" (*d)
  /* Input  */    : "a" (*a), "c" (*c)
  /* Clobbered */ : "cc");
}

int cpuidx(int *ra, int *rb, int *rc, int *rd)
{
  int fd;
  char cpuid_filename[64];
  int cpu=0;

  uint32_t data[4];
  uint32_t adr[2];
  uint64_t address;

  adr[0]=*ra;
  adr[1]=*rc;

  if ( *ra != 4 ) *rc=0;

  memcpy(&address,adr,8);

  //address=adr[0];
  //printf("%X\n",address);

  sprintf(cpuid_filename, "/dev/cpu/%d/cpuid", cpu);

  if ( (fd = open(cpuid_filename, O_RDONLY)) < 0 )
  {
    printf("Error opening %s errno=%d (%s)\n", cpuid_filename, errno, strerror(errno));
    return errno;
  }

  lseek(fd, address, SEEK_SET);
  read(fd, &data, 16);

  close(fd);

  *ra=data[0];
  *rb=data[1];
  *rc=data[2];
  *rd=data[3];

  return 0;
}

int main()
{
  int a=0,b=0,c=0,d=0;
  a=0x80000000;
  cpuid(&a,&b,&c,&d);
  printf("%X %X %X %X\n",a,b,c,d);
  a=0x80000000;
  cpuidx(&a,&b,&c,&d);
  printf("%X %X %X %X\n",a,b,c,d);

  return 0;
}
---

Output on my laptop (OK):
# ./test 
80000008 0 0 0
80000008 0 0 0

Output on the P4 Xeon:
# ./test 
80000004 0 0 0
B7F50FF4 B7F514F8 1 BFF4F140

Running the tool again produces different values in the second line (read through the /dev interface). The first line comes from executing the cpuid instruction directly via inline assembly and is obviously correct.
I noticed that although the values differ, the "4" at the end of the first integer is always printed and does not change. In most cases the integers start with "BF" or "B7", everything else seems random.

Please investigate this matter and tell me whether I made an incredibly dumb mistake here or if /dev/cpu/*/cpuid needs some serious fixing...
Comment 1 Andrew Morton 2008-07-12 13:03:09 UTC
I reassigned this to platform_i386@kernel-bugs.osdl.org
Comment 2 Peter Ganzhorn 2008-07-13 11:22:04 UTC
I have found yet another machine that shows the same erratic behaviour, and it's even worse:
On the P4 Xeon only cpuid calls with EAX >= 0x80000000 returned random values; on this machine (P4 Prescott) even EAX = 0x0 returns random values. I am unable to get a single correct reading from /dev/cpu/0/cpuid on this machine.
The testing OS was Knoppix 5.1, running a 2.6.18 kernel.

My workstation, which returns correct data, is running a 2.6.26-rc9 kernel. I'll do some testing on my workstation with 2.6.25.10 (the version running on the P4 Xeon), although I doubt this will change much, as there are no major changes in the kernel's cpuid.c since 2.6.25.10.

Both machines returning random data have a hyper-threading CPU; my workstation has a multi-core CPU. All CPUs are Intel CPUs. I haven't tested any AMD CPUs so far.
Comment 3 Thomas Gleixner 2008-09-05 05:42:06 UTC
Can you please add return value checks to these two calls?

  lseek(fd, address, SEEK_SET);
  read(fd, &data, 16);

Thanks,
         tglx
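
For illustration, such checks might look roughly like the sketch below. It assumes the same file descriptor and 16-byte buffer as the test program; the helper name checked_cpuid_read is hypothetical and not part of the original code. It only adds error reporting and does not change how the offset is passed.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: seek to `offset` and read one 16-byte CPUID record
 * (EAX, EBX, ECX, EDX). Returns 0 on success, -1 on any failure, and
 * reports which call failed so the result buffer is never used blindly. */
int checked_cpuid_read(int fd, off_t offset, uint32_t data[4])
{
    const size_t want = 4 * sizeof(uint32_t);

    assert(data != NULL);

    /* lseek returns (off_t)-1 on error, e.g. EINVAL for a negative offset */
    if (lseek(fd, offset, SEEK_SET) == (off_t)-1) {
        fprintf(stderr, "lseek failed: errno=%d (%s)\n", errno, strerror(errno));
        return -1;
    }

    ssize_t n = read(fd, data, want);
    if (n < 0) {
        fprintf(stderr, "read failed: errno=%d (%s)\n", errno, strerror(errno));
        return -1;
    }
    if ((size_t)n != want) {
        fprintf(stderr, "short read: got %zd of %zu bytes\n", n, want);
        return -1;
    }
    return 0;
}
```

If either call fails, the caller should skip printing the buffer, since its contents are then whatever happened to be on the stack.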
Comment 4 H. Peter Anvin 2008-09-05 09:05:57 UTC
Additionally, to work correctly on 32-bit platforms, you need to either compile with -D_FILE_OFFSET_BITS=64 or use open64() and pread64(). In general it is better to use pread() than lseek+read, if for no other reason than that it saves a system call.
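
A minimal sketch of that approach, assuming the offset layout the test program already uses (EAX in the low 32 bits of the file position, ECX in the high 32 bits). The helper names cpuid_offset and cpuid_pread are hypothetical. The macro is defined in the source here; passing -D_FILE_OFFSET_BITS=64 on the compiler command line is equivalent.

```c
#define _FILE_OFFSET_BITS 64   /* must precede every system header */

#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Fail the build if off_t is still 32 bits wide: in that case an offset
 * such as 0x80000000 would not fit as a non-negative value. */
_Static_assert(sizeof(off_t) == 8, "off_t must be 64 bits wide");

/* Hypothetical helper: build the file position for a given EAX/ECX pair. */
uint64_t cpuid_offset(uint32_t eax, uint32_t ecx)
{
    return ((uint64_t)ecx << 32) | eax;
}

/* Hypothetical helper: read one 16-byte CPUID record with a single pread().
 * Returns 0 on success, -1 if the device cannot be opened or the read
 * fails or comes back short. */
int cpuid_pread(int cpu, uint32_t eax, uint32_t ecx, uint32_t regs[4])
{
    char path[64];

    assert(regs != NULL);
    snprintf(path, sizeof path, "/dev/cpu/%d/cpuid", cpu);

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    ssize_t n = pread(fd, regs, 4 * sizeof(uint32_t),
                      (off_t)cpuid_offset(eax, ecx));
    close(fd);

    return n == (ssize_t)(4 * sizeof(uint32_t)) ? 0 : -1;
}
```

With this, the test program's second query would become something like cpuid_pread(0, 0x80000000, 0, regs), with the return value checked before regs is printed.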

Oh, and do run strace on your programs for debugging this class of problems.