Bug 6613
| Field | Value | Field | Value |
|---|---|---|---|
| Summary | iptables broken on 32-bit PReP (ARCH=ppc) | | |
| Product | Networking | Reporter | Meelis Roos (mroos) |
| Component | Netfilter/Iptables | Assignee | Harald Welte (laforge) |
| Status | RESOLVED CODE_FIX | | |
| Severity | normal | CC | protasnb, stelian |
| Priority | P2 | | |
| Hardware | i386 | | |
| OS | Linux | | |
| Kernel Version | 2.6.17-rc4 | Subsystem | |
| Regression | --- | Bisected commit-id | |
| Attachments | .config for 2.6.17-rc5-git; Patch hopefully fixing the problem | | |
Description
Meelis Roos
2006-05-25 03:02:11 UTC
Additionally, when stracing the failed iptables -A ..., iptables is killed with SIGSEGV (not while in a syscall) and does not yield an Invalid Argument error. This might of course be another bug, in ppc ptrace or the iptables userspace program or whatever.
bugme-daemon@bugzilla.kernel.org wrote:
>
> http://bugzilla.kernel.org/show_bug.cgi?id=6613
>
> Summary: iptables broken on 32-bit PReP (ARCH=ppc)
> Kernel Version: 2.6.17-rc4
> Status: NEW
> Severity: normal
> Owner: laforge@gnumonks.org
> Submitter: mroos@linux.ee
>
>
> Most recent kernel where this bug did not occur: none known, this is a fresh
> install
> Distribution: Debian unstable
> Hardware Environment: 32-bit PowerPC 604 with PReP subarch (using old
> ARCH=ppc)
> Software Environment: usual 32-bit ppc userspace, gcc 4.0.3
> Problem Description: iptables operations usually just give "Invalid
> operation". modprobe iptable_filter and adding rules to the nat table have
> failed in testing while iptable_nat can be modprobed and listed.
>
> Steps to reproduce:
> modprobe iptable_filter (errors out with Invalid Argument)
> iptables -t nat -A POSTROUTING -s 10.0.0.0/8 -j SNAT --to 192.168.1.1 (usually
> errors out with Invalid Argument, sometimes succeeds, when succeeds then the
> rule works fine)
>
Andrew Morton wrote:
> bugme-daemon@bugzilla.kernel.org wrote:
>
>>http://bugzilla.kernel.org/show_bug.cgi?id=6613
>>
>> Summary: iptables broken on 32-bit PReP (ARCH=ppc)
>> Kernel Version: 2.6.17-rc4
>> Status: NEW
>> Severity: normal
>> Owner: laforge@gnumonks.org
>> Submitter: mroos@linux.ee
>>
>>
>>Most recent kernel where this bug did not occur: none known, this is a fresh
>>install
>>Distribution: Debian unstable
>>Hardware Environment: 32-bit PowerPC 604 with PReP subarch (using old
>>ARCH=ppc)
>>Software Environment: usual 32-bit ppc userspace, gcc 4.0.3
>>Problem Description: iptables operations usually just give "Invalid
>>operation". modprobe iptable_filter and adding rules to the nat table have
>>failed in testing while iptable_nat can be modprobed and listed.
>>
>>Steps to reproduce:
>>modprobe iptable_filter (errors out with Invalid Argument)
>>iptables -t nat -A POSTROUTING -s 10.0.0.0/8 -j SNAT --to 192.168.1.1 (usually
>>errors out with Invalid Argument, sometimes succeeds, when succeeds then the
>>rule works fine)
Meelis, it would really help if you could try 2.6.16 and in case
that doesn't work 2.6.15 to give an idea about whether this is a
recent regression or an old problem. We had a number of changes
in this area in the last two kernel versions that could be related.
> Meelis, it would really help if you could try 2.6.16 and in case
> that doesn't work 2.6.15 to give an idea about whether this is a
> recent regression or an old problem. We had a number of changes
> in this area in the last two kernel versions that could be related.
Yes, I'm still compiling 2.6.16, since just before sending the report.
Will let you know ASAP.
>>> http://bugzilla.kernel.org/show_bug.cgi?id=6613
>
> Meelis, it would really help if you could try 2.6.16 and in case
> that doesn't work 2.6.15 to give an idea about whether this is a
> recent regression or an old problem. We had a number of changes
> in this area in the last two kernel versions that could be related.
2.6.16 doesn't work either. Tried 2.6.8-3 from the sarge package, and it works.
Compiling 2.6.15 now...
> Meelis, it would really help if you could try 2.6.16 and in case
> that doesn't work 2.6.15 to give an idea about whether this is a
> recent regression or an old problem. We had a number of changes
> in this area in the last two kernel versions that could be related.
Unfortunately, 2.6.15 does not boot on this machine so I'm locked out
remotely at the moment. Will see if I can find the boot cure - there
used to be a Motorola Powerstack-specific patch to make it boot that
Debian 2.6.18 and IIRC 2.6.12 packages included and that was integrated
somewhere later - maybe it's missing from 2.6.15.
>>> modprobe iptable_filter (errors out with Invalid Argument)
>>> iptables -t nat -A POSTROUTING -s 10.0.0.0/8 -j SNAT --to 192.168.1.1 (usually
>>> errors out with Invalid Argument, sometimes succeeds, when succeeds then the
>>> rule works fine)
>
> Meelis, it would really help if you could try 2.6.16 and in case
> that doesn't work 2.6.15 to give an idea about whether this is a
> recent regression or an old problem. We had a number of changes
> in this area in the last two kernel versions that could be related.
Have not gotten 2.6.15 to work with one evening of tinkering - the irq patch was not sufficient, there is something more broken in booting that I didn't figure out yet. So no test results for 2.6.15 yet.
Meelis Roos wrote:
>> Meelis, it would really help if you could try 2.6.16 and in case
>> that doesn't work 2.6.15 to give an idea about whether this is a
>> recent regression or an old problem. We had a number of changes
>> in this area in the last two kernel versions that could be related.
>
>
> Have not gotten 2.6.15 to work with one evening of tinkering - the irq
> patch was not sufficient, there is something more broken in booting that
> I didn't figure out yet. So no test results for 2.6.15 yet.
Then let's try something different. Please enable the
DEBUG_IP_FIREWALL_USER define in net/ipv4/netfilter/ip_tables.c and
post the results, if any.
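For readers unfamiliar with this kind of request, here is a minimal user-space sketch of how a compile-time debug define typically gates diagnostic output. It assumes the usual pattern of a commented-out `#define` controlling a printf-style macro. The names `DEMO_DEBUG_USER`, `demo_duprintf` and `demo_check_size` are hypothetical and are not taken from ip_tables.c.

```c
#include <assert.h>
#include <stdio.h>

/* Remove the comment markers on the next line to compile the debug output in. */
/* #define DEMO_DEBUG_USER */

#ifdef DEMO_DEBUG_USER
#define demo_duprintf(...) fprintf(stderr, __VA_ARGS__)
#else
#define demo_duprintf(...) ((void)0) /* expands to a no-op statement */
#endif

/* Hypothetical validation step: rejects a table smaller than min_size.
 * The reason for the rejection is only printed when DEMO_DEBUG_USER is
 * defined; otherwise the caller sees nothing but the error return. */
static inline int demo_check_size(unsigned int size, unsigned int min_size)
{
    assert(min_size > 0);
    demo_duprintf("translate_table: size %u\n", size);
    if (size < min_size) {
        demo_duprintf("table too small: %u < %u\n", size, min_size);
        return -1;
    }
    return 0;
}
```

With the define left commented out, a failing check returns only an error code, which is consistent with the bare Invalid Argument seen from userspace in this report.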
> Then let's try something different. Please enable the
> DEBUG_IP_FIREWALL_USER define in net/ipv4/netfilter/ip_tables.c and
> post the results, if any.
On bootup I get this in dmesg (one Bad offset has been added):
ip_tables: (C) 2000-2006 Netfilter Core Team
Netfilter messages via NETLINK v0.30.
ip_conntrack version 2.4 (1536 buckets, 12288 max) - 224 bytes per conntrack
translate_table: size 632
Bad offset cb437924
ip_nat_init: can't setup rules.
And on iptables -t nat -L
translate_table: size 632
Bad offset cb4368f4
ip_nat_init: can't setup rules.
translate_table: size 632
Bad offset cb4368f4
ip_nat_init: can't setup rules.
Seems iptable_nat does not load at all this time.
Modprobe iptable_filter still fails, dmesg contains
translate_table: size 632
Finished chain 1
Finished chain 2
Finished chain 3
Next modprobe iptable_nat gives
translate_table: size 632
Bad offset c8e01944
ip_nat_init: can't setup rules.
Meelis Roos wrote:
>> Then let's try something different. Please enable the
>> DEBUG_IP_FIREWALL_USER define in net/ipv4/netfilter/ip_tables.c and
>> post the results, if any.
>
>
> On bootup I get this in dmesg (one Bad offset has been added):
>
> ip_tables: (C) 2000-2006 Netfilter Core Team
> Netfilter messages via NETLINK v0.30.
> ip_conntrack version 2.4 (1536 buckets, 12288 max) - 224 bytes per
> conntrack
> translate_table: size 632
> Bad offset cb437924
> ip_nat_init: can't setup rules.
>
> And on iptables -t nat -L
>
> translate_table: size 632
> Bad offset cb4368f4
> ip_nat_init: can't setup rules.
> translate_table: size 632
> Bad offset cb4368f4
> ip_nat_init: can't setup rules.
>
> Seems iptable_nat does not load at all this time.
>
> Modprobe iptable_filter still fails, dmesg contains
> translate_table: size 632
> Finished chain 1
> Finished chain 2
> Finished chain 3
>
> Next modprobe iptable_nat gives
>
> translate_table: size 632
> Bad offset c8e01944
> ip_nat_init: can't setup rules.
Very strange, this means that the initial table data must somehow
be wrong, but for some reason it still seems to get past the
size and offset checks for the filter table. I can't see how
loading the filter table could fail after the "Finished chain .."
messages without another message. Which kernel version did you
perform these tests on?
> Very strange, this means that the initial table data must somehow
> be wrong, but for some reason it still seems to get past the
> size and offset checks for the filter table. I can't see how
> loading the filter table could fail after the "Finished chain .."
> messages without another message. Which kernel version did you
> perform these tests on?
Yesterday's 2.6.17-rc5+git.
Meelis Roos wrote:
>> Very strange, this means that the initial table data must somehow
>> be wrong, but for some reason it still seems to get past the
>> size and offset checks for the filter table. I can't see how
>> loading the filter table could fail after the "Finished chain .."
>> messages without another message. Which kernel version did you
>> perform these tests on?
>
>
> Yesterday's 2.6.17-rc5+git.
Please enable DEBUG_IP_FIREWALL_USER in net/netfilter/x_tables.c as well
and retry. Results of the raw or mangle table would also be interesting
because they contain a different number of built-in chains.
> Please enable DEBUG_IP_FIREWALL_USER in net/netfilter/x_tables.c as well
> and retry. Results of the raw or mangle table would also be interesting
> because they contain a different number of built-in chains.
Sorry it took so long, I was away. Adding this define does not seem to
do much (table->private->number prints only):
On boot (1 nat rule):
ip_tables: (C) 2000-2006 Netfilter Core Team
Netfilter messages via NETLINK v0.30.
ip_conntrack version 2.4 (1536 buckets, 12288 max) - 224 bytes per conntrack
translate_table: size 632
Finished chain 0
Finished chain 3
Finished chain 4
table->private->number = 4
t->private->number = 4
translate_table: size 800
Bad offset cba528d4
modprobe iptable_nat succeeded in manual modprobe.
modprobe iptable_filter:
translate_table: size 632
Bad offset cbbd910c
modprobe iptable_mangle:
translate_table: size 936
Bad offset cbbd80dc
modprobe iptable_raw:
translate_table: size 480
Bad offset cb8abd44
Retrying ifup and ifdown that tried to do iptables -D and iptables -I:
t->private->number = 4
t->private->number = 4
t->private->number = 4
translate_table: size 800
Bad offset cbbd80dc
t->private->number = 4
And retrying it more (succeeded this time):
t->private->number = 4
t->private->number = 4
translate_table: size 800
Finished chain 0
Finished chain 3
Finished chain 4
ip_tables: Translated table
do_replace: oldnum=4, initnum=4, newnum=5
t->private->number = 5
Hmm, I think I'm bitten by this same bug, on an Apple Powerbook (ARCH=powerpc) here. Running latest git as of now.
All the relevant netfilter options are set to 'y', however /proc/net/ip_tables_names shows only the 'raw' table and all I'm able to find in the kernel logs are those init messages:
Netfilter messages via NETLINK v0.30.
ip_conntrack version 2.4 (8192 buckets, 65536 max) - 204 bytes per conntrack
ip_tables: (C) 2000-2006 Netfilter Core Team
ip_nat_init: can't setup rules.
ipt_recent v0.3.1: Stephen Frost <sfrost@snowman.net>. http://snowman.net/projects/ipt_recent/
arp_tables: (C) 2002 David S. Miller
I'm attaching the full .config.
Stelian.
Created attachment 8255
.config for 2.6.17-rc5-git
I've examined my logs, and I believe this problem was not in the original 2.6.17-rc4...
After some more research, I found out that the problem exists in both 2.6.17-rc4 and -rc3, but it only happens when you compile with CONFIG_DEBUG_SLAB.
In 2.6.16 (with DEBUG_SLAB), /proc/net/ip_tables_names shows 'raw' and 'filter', but no 'nat' table. Same message (ip_nat_init: can't setup rules) in dmesg.
2.6.15 is ok.
Created attachment 8265
Patch hopefully fixing the problem
The problem was caused by an alignment problem. On PowerPC, vmalloc() does not
always return an __alignof__(struct _xt_align) aligned address, causing some
test to fail later.
The fix is crude, it is probably possible to do better than that but right now
I need some sleep :)
Please make sure this patch (or a better version of it) hits Linus ASAP,
hopefully before final 2.6.17 is released.
Stelian.
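To illustrate the alignment explanation, here is a small user-space sketch of the kind of check involved. It is not the attached patch. The struct `demo_xt_align` is a hypothetical stand-in for `struct _xt_align`, with an assumed mix of 8-, 16-, 32- and 64-bit members, and the helper names are invented for this example. Every "Bad offset" address logged earlier in this report ends in hex 4 or c, so each is 4-byte aligned but not 8-byte aligned. That fits a check that wants 8-byte alignment, assuming 64-bit members need it on this platform.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct _xt_align: it exists only so that its
 * alignment equals the strictest alignment of any member type. */
struct demo_xt_align {
    uint8_t  u8;
    uint16_t u16;
    uint32_t u32;
    uint64_t u64;
};

/* Alignment of the stand-in type on the machine compiling this sketch. */
#define DEMO_XT_ALIGN _Alignof(struct demo_xt_align)

/* Returns 1 if addr is a multiple of align (align must be a power of two). */
static inline int demo_is_aligned(uintptr_t addr, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (addr & (align - 1)) == 0;
}

/* Rounds addr up to the next multiple of align (a power of two). */
static inline uintptr_t demo_align_up(uintptr_t addr, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (addr + align - 1) & ~((uintptr_t)align - 1);
}
```

For example, `demo_is_aligned(0xcb437924u, 4)` returns 1 while `demo_is_aligned(0xcb437924u, 8)` returns 0, and `demo_align_up(0xcb437924u, 8)` gives 0xcb437928.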
> The problem was caused by an alignment problem. On PowerPC, vmalloc() does
> not
> always return an __alignof__(struct _xt_align) aligned address, causing some
> test to fail later.
This patch works for me too.
Stelian, what's the status on this patch, has it been submitted?
Thanks.
(In reply to comment #22)
> Stelian, what's the status on this patch, has it been submitted?
> Thanks.
I must say I don't know, it's been a long time...
Since I haven't activated CONFIG_DEBUG_SLAB since, I haven't been affected by the bug, should it still be present. This should be retested.
Meelis,
I don't see Stelian's patch in the git tree, but still some other changes could've fixed your problem. Can you please test the new kernel 2.6.22+ and confirm the problem is still there or otherwise.
> I don't see Stelian's patch in the git tree, but still some other changes
> could've fixed your problem. Can you please test the new kernel 2.6.22+ and
> confirm the problem is still there or otherwise.
Tried it with SLAB debug and everything worked fine (2.6.23-rc1+git). So
it seems to have been fixed meanwhile by something else.
OK, thanks, let's close the report.