Bug 16558 - iSCSI Connection / Stability Problems
Summary: iSCSI Connection / Stability Problems
Status: RESOLVED OBSOLETE
Alias: None
Product: IO/Storage
Classification: Unclassified
Component: SCSI
Hardware: All Linux
Importance: P1 normal
Assignee: linux-scsi@vger.kernel.org
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2010-08-10 23:31 UTC by peepstein
Modified: 2012-08-13 15:45 UTC

See Also:
Kernel Version: 2.6.32-02063215
Subsystem:
Regression: No
Bisected commit-id:


Attachments
Output of lspci -vv and dmesg (84.48 KB, text/plain)
2010-08-10 23:31 UTC, peepstein

Description peepstein 2010-08-10 23:31:12 UTC
Created attachment 27400
Output of lspci -vv and dmesg

Hi All,  I am using the Atto Xtend SAN Initiator on a couple of Mac OS X computers (one desktop, one laptop).  I've also tried using the globalSAN initiator on the laptop and with both initiators I've had the same problem.

The target is running on a fresh Ubuntu Server 10.04.1 install, and I have the problem regardless of whether I run the Ubuntu kernel (2.6.32-24-server) or the mainline kernel from the Ubuntu Mainline package (2.6.32-02063215).

The server hardware is a new build of new components and I am trying to rule out flaky hardware as well so if you see anything that might indicate that please let me know.

I have the same problem regardless of whether I use my PCI Intel Gigabit NIC or the PCI-Express built-in Realtek NIC. I've also installed and tried with the latest drivers downloaded from the Realtek site.

I've tried with both the standalone iscsi-target 1.4.20.2 kernel module and with the tgt userspace iSCSI tool without the iscsi-target module; in the latter case it uses whatever kernel hooks exist for iSCSI, which I believe have been in the kernel since 2.6.20 (according to stgt.sourceforge.net).

Along with the symptom of losing my iSCSI connection, of course, I see this sort of thing in my kern.log:

Aug  7 10:37:21 robot kernel: [49647.372558] iscsi_trgt: check_cmd_sn(549) sequence error: cmd_sn(200) exp_cmd_sn(202) max_cmd_sn(0)
Aug  7 10:37:21 robot kernel: [49647.383599] iscsi_trgt: check_cmd_sn(549) sequence error: cmd_sn(201) exp_cmd_sn(202) max_cmd_sn(0)
Aug  7 15:36:15 robot kernel: [67173.945360] iscsi_trgt: check_cmd_sn(549) sequence error: cmd_sn(1863) exp_cmd_sn(1864) max_cmd_sn(0)
Aug  7 15:36:18 robot kernel: [67176.440027] iscsi_trgt: check_cmd_sn(549) sequence error: cmd_sn(21064) exp_cmd_sn(21066) max_cmd_sn(0)
Aug  7 15:36:18 robot kernel: [67176.441095] iscsi_trgt: check_cmd_sn(549) sequence error: cmd_sn(21065) exp_cmd_sn(21066) max_cmd_sn(0)
Aug  7 16:27:53 robot kernel: [70272.008883] iscsi_trgt: check_cmd_sn(549) sequence error: cmd_sn(2817) exp_cmd_sn(2818) max_cmd_sn(0)
Aug  7 16:30:44 robot kernel: [70443.114556] iscsi_trgt: check_cmd_sn(549) sequence error: cmd_sn(2855) exp_cmd_sn(2856) max_cmd_sn(0)
Aug  7 19:36:57 robot kernel: [81616.310002] iscsi_trgt: check_cmd_sn(549) sequence error: cmd_sn(38980) exp_cmd_sn(38993) max_cmd_sn(0)
Aug  7 19:36:57 robot kernel: [81616.320817] iscsi_trgt: check_cmd_sn(549) sequence error: cmd_sn(38981) exp_cmd_sn(38993) max_cmd_sn(0)
Aug  7 19:36:57 robot kernel: [81616.332909] iscsi_trgt: check_cmd_sn(549) sequence error: cmd_sn(38982) exp_cmd_sn(38993) max_cmd_sn(0)
Aug  7 19:36:57 robot kernel: [81616.343656] iscsi_trgt: check_cmd_sn(549) sequence error: cmd_sn(38983) exp_cmd_sn(38993) max_cmd_sn(0)
Aug  7 19:36:58 robot kernel: [81616.352255] iscsi_trgt: check_cmd_sn(549) sequence error: cmd_sn(38984) exp_cmd_sn(38993) max_cmd_sn(0)
Aug  7 19:36:58 robot kernel: [81616.373330] iscsi_trgt: check_cmd_sn(549) sequence error: cmd_sn(38985) exp_cmd_sn(38993) max_cmd_sn(0)
Aug  7 19:36:58 robot kernel: [81616.382273] iscsi_trgt: check_cmd_sn(549) sequence error: cmd_sn(38986) exp_cmd_sn(38993) max_cmd_sn(0)
Aug  7 19:36:58 robot kernel: [81616.382273] iscsi_trgt: check_cmd_sn(549) sequence error: cmd_sn(38987) exp_cmd_sn(38993) max_cmd_sn(0)
Aug  7 19:36:58 robot kernel: [81616.382273] iscsi_trgt: check_cmd_sn(549) sequence error: cmd_sn(38988) exp_cmd_sn(38993) max_cmd_sn(0)
Aug  7 19:36:58 robot kernel: [81616.382273] iscsi_trgt: check_cmd_sn(549) sequence error: cmd_sn(38989) exp_cmd_sn(38993) max_cmd_sn(0)
Aug  7 19:36:58 robot kernel: [81616.382273] iscsi_trgt: check_cmd_sn(549) sequence error: cmd_sn(38990) exp_cmd_sn(38993) max_cmd_sn(0)
Aug  7 19:36:58 robot kernel: [81616.382273] iscsi_trgt: check_cmd_sn(549) sequence error: cmd_sn(38991) exp_cmd_sn(38993) max_cmd_sn(0)
Aug  7 19:36:58 robot kernel: [81616.382273] iscsi_trgt: check_cmd_sn(549) sequence error: cmd_sn(38992) exp_cmd_sn(38993) max_cmd_sn(0)
Aug  7 19:39:05 robot kernel: [81744.120085] iscsi_trgt: check_cmd_sn(549) sequence error: cmd_sn(71184) exp_cmd_sn(71185) max_cmd_sn(0)
Aug  7 20:00:13 robot kernel: [83011.830886] iscsi_trgt: check_cmd_sn(549) sequence error: cmd_sn(1) exp_cmd_sn(0) max_cmd_sn(0)
Aug  7 20:03:19 robot kernel: [83197.481900] iscsi_trgt: check_cmd_sn(549) sequence error: cmd_sn(85037) exp_cmd_sn(85038) max_cmd_sn(0)
Aug  7 20:03:47 robot kernel: [83225.874178] iscsi_trgt: check_cmd_sn(549) sequence error: cmd_sn(85047) exp_cmd_sn(85048) max_cmd_sn(0)
Aug  7 20:03:47 robot kernel: [83225.874199] iscsi_trgt: check_cmd_sn(549) sequence error: cmd_sn(85047) exp_cmd_sn(85048) max_cmd_sn(0)
Aug  7 20:04:04 robot kernel: [83243.068396] iscsi_trgt: check_cmd_sn(549) sequence error: cmd_sn(85052) exp_cmd_sn(85053) max_cmd_sn(0)
Aug  7 20:04:04 robot kernel: [83243.068396] iscsi_trgt: check_cmd_sn(549) sequence error: cmd_sn(85052) exp_cmd_sn(85053) max_cmd_sn(0)
Aug  7 20:09:27 robot kernel: [83565.852475] iscsi_trgt: check_cmd_sn(549) sequence error: cmd_sn(85159) exp_cmd_sn(85160) max_cmd_sn(0)
Aug  7 20:14:30 robot kernel: [83868.657851] iscsi_trgt: check_cmd_sn(549) sequence error: cmd_sn(85265) exp_cmd_sn(85266) max_cmd_sn(0)
Aug  7 20:16:29 robot kernel: [83987.606834] iscsi_trgt: check_cmd_sn(549) sequence error: cmd_sn(85305) exp_cmd_sn(85306) max_cmd_sn(0)
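
To make the pattern in the sequence errors above easier to see, here is a minimal Python sketch that tallies how far cmd_sn is from exp_cmd_sn in each logged line. The function name summarize_sequence_errors and the regular expression are illustrative assumptions based only on the message format shown in this log; they are not part of iscsitarget or any other tool.

```python
import re
from collections import Counter

# Matches the iscsi_trgt "sequence error" lines in the format quoted above.
# The pattern is derived from this log only (an assumption, not a spec).
SEQ_ERR = re.compile(
    r"check_cmd_sn\(\d+\) sequence error: "
    r"cmd_sn\((\d+)\) exp_cmd_sn\((\d+)\) max_cmd_sn\((\d+)\)"
)


def summarize_sequence_errors(lines):
    """Count errors by gap, where gap = exp_cmd_sn - cmd_sn.

    A positive gap means the received command number is behind what the
    target expected; a negative gap means it is ahead.
    """
    gaps = Counter()
    for line in lines:
        match = SEQ_ERR.search(line)
        if match:
            cmd_sn = int(match.group(1))
            exp_cmd_sn = int(match.group(2))
            gaps[exp_cmd_sn - cmd_sn] += 1
    return gaps


# Three lines copied from the log above.
sample = [
    "Aug  7 10:37:21 robot kernel: [49647.372558] iscsi_trgt: check_cmd_sn(549) "
    "sequence error: cmd_sn(200) exp_cmd_sn(202) max_cmd_sn(0)",
    "Aug  7 10:37:21 robot kernel: [49647.383599] iscsi_trgt: check_cmd_sn(549) "
    "sequence error: cmd_sn(201) exp_cmd_sn(202) max_cmd_sn(0)",
    "Aug  7 20:00:13 robot kernel: [83011.830886] iscsi_trgt: check_cmd_sn(549) "
    "sequence error: cmd_sn(1) exp_cmd_sn(0) max_cmd_sn(0)",
]

for gap, count in sorted(summarize_sequence_errors(sample).items()):
    print(f"gap {gap:+d}: {count} line(s)")
```

In the excerpt above, almost every line has cmd_sn behind exp_cmd_sn, which would be consistent with the target seeing command numbers it has already moved past, though the log alone does not say why.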



Aug  9 21:07:23 robot kernel: [ 1007.350013] iscsi_trgt: scsi_cmnd_start(949) 1002 0
Aug  9 21:07:23 robot kernel: [ 1007.350013] iscsi_trgt: cmnd_skip_pdu(459) 1002 1c 0 0
Aug  9 21:07:38 robot kernel: [ 1022.050061] iscsi_trgt: scsi_cmnd_start(949) 1002 0
Aug  9 21:07:38 robot kernel: [ 1022.050061] iscsi_trgt: cmnd_skip_pdu(459) 1002 1c 0 0
Aug  9 21:07:46 robot kernel: [ 1030.450119] iscsi_trgt: scsi_cmnd_start(949) 1002 0
Aug  9 21:07:46 robot kernel: [ 1030.450119] iscsi_trgt: cmnd_skip_pdu(459) 1002 1c 0 0
Aug  9 21:08:07 robot kernel: [ 1051.320081] iscsi_trgt: scsi_cmnd_start(949) 1002 0
Aug  9 21:08:07 robot kernel: [ 1051.320081] iscsi_trgt: cmnd_skip_pdu(459) 1002 1c 0 0
Aug  9 21:08:21 robot kernel: [ 1065.170089] iscsi_trgt: scsi_cmnd_start(949) 1002 0
Aug  9 21:08:21 robot kernel: [ 1065.170089] iscsi_trgt: cmnd_skip_pdu(459) 1002 1c 0 0
Aug  9 21:08:26 robot kernel: [ 1069.730090] iscsi_trgt: scsi_cmnd_start(949) 1002 0
Aug  9 21:08:26 robot kernel: [ 1069.730090] iscsi_trgt: cmnd_skip_pdu(459) 1002 1c 0 0
Aug  9 21:08:51 robot kernel: [ 1095.600121] iscsi_trgt: scsi_cmnd_start(949) 1002 0
Aug  9 21:08:51 robot kernel: [ 1095.600121] iscsi_trgt: cmnd_skip_pdu(459) 1002 1c 0 0

Aug  9 21:10:27 robot kernel: [ 1191.160556] iscsi_trgt: scsi_cmnd_start(1045) Unsupported 5a
Aug  9 21:10:27 robot kernel: [ 1191.160556] iscsi_trgt: cmnd_skip_pdu(459) 101f 1c 5a 0
Aug  9 21:14:57 robot kernel: [ 1461.144821] iscsi_trgt: scsi_cmnd_start(1045) Unsupported 5a
Aug  9 21:14:57 robot kernel: [ 1461.144821] iscsi_trgt: cmnd_skip_pdu(459) 101f 1c 5a 0
Aug  9 21:19:25 robot kernel: [ 1729.057073] iscsi_trgt: cmnd_rx_start(1849) 1 117a -7
Aug  9 21:19:25 robot kernel: [ 1729.070381] iscsi_trgt: cmnd_skip_pdu(459) 117a 1 2a 8192
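
For the "Unsupported 5a" lines above, a small sketch for naming the rejected command follows. It assumes the hex value printed after "Unsupported" is a SCSI operation code; that reading is not confirmed anywhere in this report. The table holds only three standard opcodes, and unsupported_opcode_name is a hypothetical helper.

```python
import re

# A few standard SCSI operation codes. Treating the logged hex value as a
# SCSI opcode is an assumption about the iscsi_trgt message format.
SCSI_OPCODES = {
    0x00: "TEST UNIT READY",
    0x2A: "WRITE(10)",
    0x5A: "MODE SENSE(10)",
}

UNSUPPORTED = re.compile(r"scsi_cmnd_start\(\d+\) Unsupported ([0-9a-fA-F]+)")


def unsupported_opcode_name(line):
    """Return a readable name for the opcode in an 'Unsupported' log line.

    Returns None if the line is not an 'Unsupported' message.
    """
    match = UNSUPPORTED.search(line)
    if match is None:
        return None
    code = int(match.group(1), 16)
    return SCSI_OPCODES.get(code, f"unknown (0x{code:02x})")


line = ("Aug  9 21:10:27 robot kernel: [ 1191.160556] iscsi_trgt: "
        "scsi_cmnd_start(1045) Unsupported 5a")
print(unsupported_opcode_name(line))
```

Under that assumption, the initiator is sending MODE SENSE(10) and the target is rejecting it; whether that is related to the connection drops is not established here.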


I've had quite a few other problems surrounding my server, but they all relate to iSCSI usage (that is the only use of the server).

I've had to boot with notsc, hpet=disable and clocksource=acpi_pm because of other timing-related errors and wonkiness.  I've also had some weirdness related to a Promise SATA 300TX4 PCI card, so I've removed it from the system for now.

Let me know if you think I should create a kernel bug as well at the kernel bugzilla.

Helpful output attached.
Comment 1 Anonymous Emailer 2010-08-11 03:11:44 UTC
Reply-To: fujita.tomonori@lab.ntt.co.jp

On Tue, 10 Aug 2010 23:31:19 GMT
bugzilla-daemon@bugzilla.kernel.org wrote:

> https://bugzilla.kernel.org/show_bug.cgi?id=16558
> 
>            Summary: iSCSI Connection / Stability Problems
>            Product: IO/Storage
>            Version: 2.5
>     Kernel Version: 2.6.32-02063215
>           Platform: All
>         OS/Version: Linux
>               Tree: Mainline
>             Status: NEW
>           Severity: normal
>           Priority: P1
>          Component: SCSI
>         AssignedTo: linux-scsi@vger.kernel.org
>         ReportedBy: peepstein@gmail.com
>         Regression: No
> 
> 
> Created an attachment (id=27400)
>  --> (https://bugzilla.kernel.org/attachment.cgi?id=27400)
> Output of lspci -vv and dmesg
> 
> Hi All,  I am using the Atto Xtend SAN Initiator on a couple of Mac OS X
> computers (one desktop, one laptop).  I've also tried using the globalSAN
> initiator on the laptop and with both initiators I've had the same problem.
> 
> The target is running on a fresh Ubuntu Server 10.04.1 install, and I have the
> problem regardless of whether I run the Ubuntu kernel (2.6.32-24-server) or the
> mainline kernel from the Ubuntu Mainline package (2.6.32-02063215).
> 
> The server hardware is a new build of new components and I am trying to rule
> out flaky hardware as well so if you see anything that might indicate that
> please let me know.
> 
> I have the same problem regardless of whether I use my PCI Intel Gigabit NIC or
> the PCI-Express built-in Realtek NIC. I've also installed and tried with the
> latest drivers downloaded from the Realtek site.
> 
> I've tried with both the standalone iscsi-target 1.4.20.2 kernel module and
> with the tgt userspace iSCSI tool without the iscsi-target module-- so using

iscsi-target 1.4.20 is an out-of-tree kernel module. So reporting the
problem to linux-scsi doesn't help. Use
iscsitarget-devel@lists.sourceforge.net instead.


> whatever kernel hooks exist for iSCSI that, I believe, have been in the kernel
> since 2.6.20 (according to stgt.sourceforge.net).

You confuse two different iSCSI implementations. Seems that Ubuntu
supports two different implementations:

iscsitarget.sourceforge.net
stgt.sourceforge.net

The log says that you use the former. If you use the latter and hit a
problem, please report it to stgt@vger.kernel.org.
Comment 2 peepstein 2010-08-11 03:22:35 UTC
(In reply to comment #1)
> Reply-To: fujita.tomonori@lab.ntt.co.jp
> 
> On Tue, 10 Aug 2010 23:31:19 GMT
> bugzilla-daemon@bugzilla.kernel.org wrote:
> 
> > I've tried with both the standalone iscsi-target 1.4.20.2 kernel module and
> > with the tgt userspace iSCSI tool without the iscsi-target module-- so using
> 
> iscsi-target 1.4.20 is an out-of-tree kernel module. So reporting the
> problem to linux-scsi doesn't help. Use
> iscsitarget-devel@lists.sourceforge.net instead.
> 
> 
> > whatever kernel hooks exist for iSCSI that, I believe, have been in the kernel
> > since 2.6.20 (according to stgt.sourceforge.net).
> 
> You confuse two different iSCSI implementations. Seems that Ubuntu
> supports two different implementations:
> 
> iscsitarget.sourceforge.net
> stgt.sourceforge.net
> 
> The log says that you use the former. If you use the latter and hit a
> problem, please report it to stgt@vger.kernel.org.


I have the same problem with the latter as with the former; it makes no difference.

I should also mention one particular symptom: if I have an ssh connection open to the server, it will stall. Keypresses aren't registered in the ssh session. However, if I bang on the console keyboard, which is connected directly to the server via USB, the ssh session comes back to life, spewing my keypresses back to me. When I bang on the console USB keyboard, the console also spews the sequence error messages that I posted in my description, and the iSCSI connection seems to come back to life too. However, I can't be sitting at the console and banging on the keyboard every two minutes to ensure that my connection stays up.  :)

The behaviour is the same regardless of whether I use tgt or iscsitarget.
Comment 3 Anonymous Emailer 2010-08-11 03:35:40 UTC
Reply-To: fujita.tomonori@lab.ntt.co.jp

On Wed, 11 Aug 2010 03:22:36 GMT
bugzilla-daemon@bugzilla.kernel.org wrote:

> I have the same problem with the latter as the former, it makes no
> difference. 

Ok, please report it to both mailing lists respectively. But the log
that you sent is not about the latter. So please report it with the
proper log to the latter.


> I should also mention that one particular symptom is that if I have an ssh
> connection open to the server,  it will stall. Keypresses aren't registered in
> the ssh session--- however,  if I bang on the console keyboard which is
> connected directly to the server via USB,  the ssh session comes back to life,
> spewing my keypresses back to me.

Hmm, sounds like a different problem. We need to find out what the root
problem is.
