Latest working kernel version:
Earliest failing kernel version: 2.6.23
Distribution: Mandriva 2007.1 and 2008.0
Hardware Environment: It happens on many different servers
Software Environment:
Problem Description:
I have close to 200 servers, most with Mandriva 2006.0 using kernel 2.6.15, some with Mandriva 2008.0 using kernel 2.6.23 or 2.6.25. On some of them (kernels 2.6.23 and 2.6.25 confirmed) the server hangs at random (some servers hang more than once a day, some once a month).

The hardware differs from server to server, and I have about 5 servers with exactly the same configuration (processor, memory, Ethernet, and so on); one of them hangs every day while the others run fine, all with the same traffic shaping rules (tc using htb).

I think it is related to tc because last week I accessed a server, and when I typed tc del to remove the shaping it hung. My client restarted the server, and about 10 minutes later I did it again with the same effect. No kernel panic, no oops, it just hangs.

I've read some posts and bug reports, but they point to a specific Ethernet driver (like sk98lin), whereas this is happening on several servers with different hardware.

I have some servers with kernel 2.6.15 and, as far as I know, it doesn't happen on them, but some of them use a different set of tc rules (a few fewer rules, actually) or none at all.

I don't use the kernel shipped with the Mandriva distro; I always get the kernel from kernel.org and compile it myself.

Steps to reproduce:
Hard to reproduce on demand, since it is random: it takes minutes or weeks to happen, but always with some change in tc (start or stop).
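For anyone trying to trigger the hang deliberately, the start/stop cycle described under "Steps to reproduce" could be sketched roughly as below. This is only an illustration: the actual tc rules are not shown in this report, so the device name `eth0`, the `1mbit` rate, the class IDs and the helper names are all hypothetical. By default it only builds and prints the commands.

```python
"""Dry-run sketch of an HTB start/stop cycle (names and rates are hypothetical)."""
import shlex
import subprocess


def htb_start_commands(dev, rate="1mbit"):
    """Return tc commands that attach a minimal HTB tree to `dev`."""
    return [
        f"tc qdisc add dev {dev} root handle 1: htb default 10",
        f"tc class add dev {dev} parent 1: classid 1:1 htb rate {rate}",
        f"tc class add dev {dev} parent 1:1 classid 1:10 htb rate {rate}",
    ]


def htb_stop_commands(dev):
    """Return the tc command that removes the root qdisc (the 'tc del' step)."""
    return [f"tc qdisc del dev {dev} root"]


def cycle(dev, rounds, execute=False):
    """Build `rounds` start/stop cycles; run them only if execute=True (needs root)."""
    issued = []
    for _ in range(rounds):
        for cmd in htb_start_commands(dev) + htb_stop_commands(dev):
            issued.append(cmd)
            if execute:
                # check=True stops the loop at the first failing tc call
                subprocess.run(shlex.split(cmd), check=True)
    return issued


if __name__ == "__main__":
    for line in cycle("eth0", rounds=2):
        print(line)
```

Passing `execute=True` on an affected kernel may hang the machine, as happened with the manual `tc del` described above, so it is best tried only on a box with console access.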
Reply-To: akpm@linux-foundation.org

(switched to email. Please respond via emailed reply-to-all, not via the bugzilla web interface).

On Mon, 4 Aug 2008 08:01:05 -0700 (PDT) bugme-daemon@bugzilla.kernel.org wrote:

> http://bugzilla.kernel.org/show_bug.cgi?id=11249
>
> Summary: TC HTB hanging problem
> Product: Networking
> Version: 2.5
> KernelVersion: 2.6.23 and 2.6.25
> Platform: All
> OS/Version: Linux
> Tree: Mainline
> Status: NEW
> Severity: high
> Priority: P1
> Component: Other
> AssignedTo: acme@ghostprotocols.net
> ReportedBy: lansoweb@hotmail.com
Hello Andrew!

Just to add: 2 weeks ago one of my clients had to reboot the server 3 times during the day, so I disabled the QoS and it worked fine for 3 days. After those days I started the QoS again, and 10 hours later the server hung again, so it has been disabled since then without hanging.

Another piece of information: I took another server that was hanging randomly and moved its users to a router, but kept the server on, connected to the internet and with QoS running, and it hasn't hung in the last 10 days. So I guess it's not the QoS alone, but something to do with QoS combined with usage by users.

I have another client with the same kernel and the same rules running for more than 2 months, with more than 4 times the internet usage of the others, and it has never hung, so it's not only high usage.

I really don't know what is happening.

Thanks a lot for any advice,
Leandro

-----Original Message-----
From: Andrew Morton [mailto:akpm@linux-foundation.org]
Sent: Monday, August 4, 2008 14:55
To: netdev@vger.kernel.org
Cc: bugme-daemon@bugzilla.kernel.org; lansoweb@hotmail.com
Subject: Re: [Bugme-new] [Bug 11249] New: TC HTB hanging problem
Leandro Oliveira da Silva wrote, On 08/04/2008 08:44 PM:
...
> Just to add, 2 weeks ago one of my clients had to reboot the server 3 times
> during the day, so I disabled the qos and it worked fine for 3 days. After
> those days I started the qos again and 10 hours later the server hung
> again, so it's disabled until now without hanging.

Hi,

There were a few bugs found in these kernels, but alas not all stable versions were fixed. The best thing would be to try e.g. 2.6.25.14 or 2.6.26. Otherwise, you could check these two patches in particular:

http://git.kernel.org/?p=linux/kernel/git/stable/linux-2.6.25.y.git;a=commit;h=066a3b5b2346febf9a655b444567b7138e3bb939
http://git.kernel.org/?p=linux/kernel/git/stable/linux-2.6.25.y.git;a=commit;h=734bf48fe5276f319464fd30dc4a046a29d2b94a

Alas, some problems in HTB (or around it) are still being diagnosed, so this might not be enough.

Regards,
Jarek P.
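With many servers to check, one way to see which machines already run one of the versions suggested above is to compare each kernel release string against them. This is a minimal sketch, assuming 2.6.25.14 (within the 2.6.25.y series) and 2.6.26 are treated as minimum versions. That reading of the suggestion and the helper names are assumptions, not something stated in the thread, and other stable series are simply reported as not matching.

```python
import re


def parse_release(release):
    """Turn '2.6.25.14-custom' into (2, 6, 25, 14).

    Anything after the leading dotted number is ignored, so an '-rc'
    suffix is NOT treated as a pre-release here.
    """
    match = re.match(r"\d+(?:\.\d+)*", release)
    if not match:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def meets_suggestion(release):
    """True for 2.6.25.y with y >= 14, or for anything from 2.6.26 onwards."""
    version = parse_release(release)
    if version[:3] == (2, 6, 25):
        # Inside the 2.6.25.y stable series the fourth component decides.
        return version >= (2, 6, 25, 14)
    return version >= (2, 6, 26)
```

On a given server the input could come from `platform.release()` (the same string `uname -r` prints), for example `meets_suggestion(platform.release())`.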
Hello Jarek!

Many thanks for your response! Are these two patches already included in version 2.6.26?

Thanks a lot again,
Leandro

-----Original Message-----
From: Jarek Poplawski [mailto:jarkao2@gmail.com]
Sent: Monday, August 4, 2008 18:59
To: Leandro Oliveira da Silva
Cc: 'Andrew Morton'; netdev@vger.kernel.org; bugme-daemon@bugzilla.kernel.org
Subject: Re: RES: [Bugme-new] [Bug 11249] New: TC HTB hanging problem
Hello!

I've just checked the 2.6.26 kernel and this patch is there. I'll give it a try and put it on some critical clients.

Thanks,
Leandro
On Tue, Aug 05, 2008 at 09:13:15AM -0300, Leandro Oliveira da Silva wrote:
> Hello Jarek!
>
> Many thanks for your response! Are these two patches already included in
> 2.6.26 version?

Yes.

> Thanks a lot again,

Not at all... at least until we know if it works!

Jarek P.
Hi Jarek!

Good news. I downloaded version 2.6.26 to one of my clients' servers and compiled it, and before the reboot to activate it I ran my script and the server hung. When they restarted the server, kernel 2.6.26 booted and I ran the script several times without a problem. I guess in my case it was one of the two bugs, but I'm putting this kernel on 3 other servers, so let's see if they work fine now. I'll send an email on Friday with the status.

Thanks a lot,
Leandro
Hi Jarek!

Some clients of mine have been using version 2.6.26 without a problem for 1 week now, and some of them used to hang every day, so I guess it's solved. Thanks a lot for the advice!

Leandro

> Date: Tue, 5 Aug 2008 12:29:16 +0000
> From: jarkao2@gmail.com
> To: lansoweb@hotmail.com
> CC: akpm@linux-foundation.org; netdev@vger.kernel.org; bugme-daemon@bugzilla.kernel.org
> Subject: Re: RES: RES: [Bugme-new] [Bug 11249] New: TC HTB hanging problem
On Wed, Aug 13, 2008 at 09:36:35AM -0300, Leandro Oliveira da Silva wrote:
>
> Hi Jarek!
>
> Some clients of mine are using the 2.6.26 version without a problem for 1
> week now, and some of them used to hang every day, i guess it's solved.
> thanks a lot for the advice!
>

Hi Leandro!

Very nice to "hear" this!

Thanks for testing,
Jarek P.

PS: When you're sure there is nothing more around this you could probably close this bugzilla report.