Bug 218152 - net rx interrupts causing a high load average on system
Summary: net rx interrupts causing a high load average on system
Status: NEW
Alias: None
Product: Networking
Classification: Unclassified
Component: Other
Hardware: ARM Linux
Importance: P3 high
Assignee: Stephen Hemminger
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2023-11-16 06:20 UTC by Monthero Ronald
Modified: 2024-12-23 11:01 UTC
1 user

See Also:
Kernel Version:
Subsystem:
Regression: No
Bisected commit-id:


Attachments
kconfig file (103.69 KB, text/plain)
2023-11-16 06:23 UTC, Monthero Ronald
Kernel trace RX irq processing (10.05 KB, text/plain)
2024-12-23 10:41 UTC, Monthero Ronald

Description Monthero Ronald 2023-11-16 06:20:25 UTC
-> Specifically, the net RX load is observed to cause a high load average on the system running kernel 5.15.93-rt58 compared to 4.9.214.
-> When checking the net RX interrupts, they seem to be quite high compared to similar workloads running on 4.9.214-rt137-g1fc72f4 (I tested with the same hardware for both kernels).

=> Note: When all ifaces are brought down or disconnected from the network, the load average drops from 1.4 to 0.1-0.2, so most of the load is induced in some manner by net RX interrupts.

Is there any known issue reported on 5.15.93-xx specific to net RX handling that causes this load?

Environment: kernel 5.15.93-rt58
Hardware: ARM 
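
For reference, a simple way to watch the correlation between the RX interrupt rate and the load average is a small sampling loop like the one below (a minimal sketch, not necessarily the exact commands used; the IRQ name eth0_g0_rx is taken from the /proc/interrupts output further down):

# Sample the eth0 RX interrupt counter and the load average once per second;
# the RX counter climbing together with the load average points at RX IRQ work.
while sleep 1; do
    date '+%T'
    grep eth0_g0_rx /proc/interrupts
    cat /proc/loadavg
done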


Observations of high load on 5.15.93-rt58

# cat /proc/interrupts
           CPU0       CPU1       
 25:          0          0     GICv2  29 Level     arch_timer
 26:    3696358    3059507     GICv2  30 Level     arch_timer
 31:          0          0     GICv2 196 Edge      arm-irq1
 32:          0          0     GICv2 197 Edge      arm-irq2
 35:      96168          0     GICv2 107 Level     fsl-ifc           
 36:        128          0     GICv2 126 Level     mmc0
 39:       5343          0     GICv2 120 Level     2180000.i2c
 41:          0          0     GICv2 130 Level     gpio-cascade
 42:          0          0     GICv2 131 Level     gpio-cascade
 43:          0          0     GICv2 132 Level     gpio-cascade
 44:          0          0     GICv2 198 Level     gpio-cascade
 46:        428          0     GICv2 112 Level     fsl-lpuart
 47:      45740          0     GICv2 113 Level     fsl-lpuart       
 49:          0          0     GICv2 147 Level     2ad0000.watchdog
 50:          0          0     GICv2 167 Level     eDMA
 54:      17218          0     GICv2 158 Level     can0
 55:          0          0     GICv2 203 Level     fsl-usb2-udc
 56:       4436          0     GICv2 125 Level     xhci-hcd:usb1
 58:          0          0     GICv2 123 Level     aerdrv
 59:    1251756          0       MSI 524288 Edge      xdma
 60:       6987          0     GICv2 176 Level     eth0_g0_tx
 61:     389169          0     GICv2 177 Level     eth0_g0_rx          <<<<< RX interrupts 
 62:          0          0     GICv2 178 Level     eth0_g0_er
 63:       1838          0     GICv2 179 Level     eth0_g1_tx
 64:          0          0     GICv2 180 Level     eth0_g1_rx
 65:          0          0     GICv2 181 Level     eth0_g1_er
 72:        263          0     GICv2 135 Level     1710000.jr
 73:          0          0     GICv2 136 Level     1720000.jr
 74:          0          0     GICv2 137 Level     1730000.jr
 75:          0          0     GICv2 138 Level     1740000.jr
IPI0:          0          0  CPU wakeup interrupts
IPI1:          0          0  Timer broadcast interrupts
IPI2:    1254818    1669575  Rescheduling interrupts                   <<<<<<   
IPI3:      97630      21898  Function call interrupts
IPI4:          0          0  CPU stop interrupts
IPI5:     109629      83567  IRQ work interrupts
IPI6:          0          0  completion interrupts



# top 

top - 06:46:03 up 32 min,  1 user,  load average: 1.44, 1.36, 1.19
                                                  ^^^^^^^^^--------------->>>>>>>> load averages over 1, 5 and 15 mins (2-CPU system)
Tasks: 142 total,   1 running, 141 sleeping,   0 stopped,   0 zombie
%Cpu(s):  0.5 us,  2.5 sy,  0.0 ni, 97.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
MiB Mem :   1001.9 total,    336.0 free,    521.5 used,    144.4 buff/cache
MiB Swap:      0.0 total,      0.0 free,      0.0 used.    446.4 avail Mem 


# cat /proc/cpuinfo
processor       : 0
model name      : ARMv7 Processor rev 5 (v7l)
BogoMIPS        : 25.00
Features        : half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 idiva idivt vfpd32 lpae evtstrm
CPU implementer : 0x41
CPU architecture: 7
CPU variant     : 0x0
CPU part        : 0xc07
CPU revision    : 5

processor       : 1
model name      : ARMv7 Processor rev 5 (v7l)
BogoMIPS        : 25.00
Features        : half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 idiva idivt vfpd32 lpae evtstrm
CPU implementer : 0x41
CPU architecture: 7
CPU variant     : 0x0
CPU part        : 0xc07
CPU revision    : 5

Hardware        : Freescale LS1021A
Revision        : 0000
Serial          : 0000000000000000
#

==> Nothing cumbersome in the iptables rules; they are minimal:

root@ls1021atwr:~# iptables -S
-P INPUT ACCEPT
-P FORWARD ACCEPT
-P OUTPUT ACCEPT
-A INPUT -i eth0 -p tcp -m tcp --dport 80 -j DROP
-A INPUT -i eth1 -p tcp -m tcp --dport 80 -j DROP
-A INPUT -i bt0 -p tcp -m tcp --dport 80 -j DROP
-A INPUT -i rsi_wlan -p tcp -m tcp --dport 80 -j DROP
-A INPUT -i ppp0 -p tcp -m tcp --dport 80 -j DROP
root@ls1021atwr:~#


==== LOAD AVERAGE WHEN NET IFACES ARE BROUGHT DOWN

Load average dropped from 1.44 to 0.06 


top - 06:20:09 up  1:59,  1 user,  load average: 0.06, 0.45, 0.86              <<<<<< 
Tasks: 149 total,   1 running, 148 sleeping,   0 stopped,   0 zombie
%Cpu(s):  4.8 us, 11.9 sy,  0.0 ni, 83.3 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
MiB Mem :   1003.9 total,    396.0 free,    529.6 used,     78.2 buff/cache
MiB Swap:      0.0 total,      0.0 free,      0.0 used.    452.9 avail Mem

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
 5865 root      20   0    3008   1716   1484 R  16.7   0.2   0:00.05 top
  760 root      -3   0    9716   3916   3424 S   5.6   0.4   0:17.46 smp
    1 root      20   0    1780    476    428 S   0.0   0.0   0:09.23 init
    2 root      20   0       0      0      0 S   0.0   0.0   0:00.02 kthreadd
    3 root       0 -20       0      0      0 I   0.0   0.0   0:00.00 rcu_gp
    4 root       0 -20       0      0      0 I   0.0   0.0   0:00.00 rcu_par_gp
    5 root       0 -20       0      0      0 I   0.0   0.0   0:00.00 slub_flushwq
    7 root       0 -20       0      0      0 I   0.0   0.0   0:00.00 kworker/0:0H-events_highpri
    9 root       0 -20       0      0      0 I   0.0   0.0   0:00.00 mm_percpu_wq
   10 root      20   0       0      0      0 S   0.0   0.0   0:00.00 rcu_tasks_kthre
   11 root      20   0       0      0      0 S   0.0   0.0   0:00.00 rcu_tasks_rude_
Comment 1 Monthero Ronald 2023-11-16 06:23:33 UTC
Created attachment 305412 [details]
kconfig file
Comment 2 Monthero Ronald 2023-11-16 06:25:09 UTC
My observation while investigating this issue,
while running iperf tests and checking RX and TX IRQ processing:

IRQ processing for RX takes ~55% CPU
IRQ processing for TX takes ~11% CPU

RX CPU % usage is about 5 times that of TX.


root@ls1021atwr:/var/nand/debugtools# top | head -16; echo ; sleep 1;  top | head -16; echo ; sleep 1;  top | head -16; echo ; sleep 1; 



The net RX interrupt thread is taking about 5x more CPU % than the net TX interrupt thread:

top - 02:09:21 up 20 min,  3 users,  load average: 1.85, 1.26, 0.93
Tasks: 139 total,   2 running, 137 sleeping,   0 stopped,   0 zombie
%Cpu(s):  2.6 us, 55.2 sy,  0.0 ni, 21.1 id,  0.0 wa,  0.0 hi, 21.1 si,  0.0 st
MiB Mem :   1001.9 total,    392.8 free,    524.6 used,     84.4 buff/cache
MiB Swap:      0.0 total,      0.0 free,      0.0 used.    455.9 avail Mem 

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND                             
 1138 root     -51   0       0      0      0 S  55.3   0.0   0:18.70 irq/61-eth0_g0_          << IRQ 61 is for net_rx  (net RX CPU % is at least 5 times higher than net TX CPU %)
 1415 root      20   0    5308   2428   2124 R  53.9   0.2   0:04.88 iperf3                              
 1419 root      20   0    3008   1832   1604 R  15.8   0.2   0:00.27 top                                 
 1137 root     -51   0       0      0      0 S  11.8   0.0   0:01.08 irq/60-eth0_g0_          << IRQ 60 is net tx                 
  918 root     -91   0    6976   3752   3364 S   3.9   0.4   0:39.25 prot                                
   12 root      20   0       0      0      0 S   1.3   0.0   0:05.77 ksoftirqd/0                         
   22 root      -2   0       0      0      0 S   1.3   0.0   0:01.15 rcuc/1                              
   23 root      20   0       0      0      0 S   1.3   0.0   0:17.29 ksoftirqd/1                         
  114 root     -93   0       0      0      0 S   1.3   0.0   0:10.47 irq/59-xdma
Comment 3 Monthero Ronald 2023-11-16 06:29:28 UTC
Pasting the iperf test, TX/RX interrupt and CPU % usage output here again, as the previous output is a bit mangled.

Triage during iperf tests of RX and TX processing:

IRQ processing for RX takes ~55% CPU
IRQ processing for TX takes ~11% CPU

RX CPU % usage is about 5 times that of TX.


root@ls1021atwr:/var/nand/debugtools# top | head -16; echo ; sleep 1;  top | head -16; echo ; sleep 1;  top | head -16; echo ; sleep 1; 

top - 02:09:21 up 20 min,  3 users,  load average: 1.85, 1.26, 0.93
Tasks: 139 total,   2 running, 137 sleeping,   0 stopped,   0 zombie
%Cpu(s):  2.6 us, 55.2 sy,  0.0 ni, 21.1 id,  0.0 wa,  0.0 hi, 21.1 si,  0.0 st
MiB Mem :   1001.9 total,    392.8 free,    524.6 used,     84.4 buff/cache
MiB Swap:      0.0 total,      0.0 free,      0.0 used.    455.9 avail Mem 

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND                             
 1138 root     -51   0       0      0      0 S  55.3   0.0   0:18.70 irq/61-eth0_g0_    <<<< IRQ 61  is for net_rx 
                                                ^^^^^---------------------------------------------->          
 1415 root      20   0    5308   2428   2124 R  53.9   0.2   0:04.88 iperf3                              
 1419 root      20   0    3008   1832   1604 R  15.8   0.2   0:00.27 top                                 
 1137 root     -51   0       0      0      0 S  11.8   0.0   0:01.08 irq/60-eth0_g0_   <<<< IRQ 60 is for net_tx                 
                                                ^^^^^----------------------------------------------->
  918 root     -91   0    6976   3752   3364 S   3.9   0.4   0:39.25 prot                                
   12 root      20   0       0      0      0 S   1.3   0.0   0:05.77 ksoftirqd/0                         
   22 root      -2   0       0      0      0 S   1.3   0.0   0:01.15 rcuc/1                              
   23 root      20   0       0      0      0 S   1.3   0.0   0:17.29 ksoftirqd/1                         
  114 root     -93   0       0      0      0 S   1.3   0.0   0:10.47 irq/59-xdma                         

< ...snipped .. >
Comment 4 Bagas Sanjaya 2023-11-16 11:13:17 UTC
(In reply to Monthero Ronald from comment #0)
> -> Specifically the net rx load is observed to cause a high load average on
> the system running kernel 5.15.93-rt58 as compared to 4.9.214.   
> -> When checking the net rx interrupts they seems to quite high as compared
> to similar workloads running on 4.9.214-rt137-g1fc72f4. ( I mean I tested
> with same hardware for both kernels).
> 

First, you need to check whether vanilla v5.15.93 also has this regression. If
not, it's an rt-specific bug. Then you also need to test the latest mainline
(currently v6.7-rc1) to see whether this regression has already been fixed.
Comment 5 Monthero Ronald 2023-11-16 14:06:51 UTC
(In reply to Bagas Sanjaya from comment #4)
> (In reply to Monthero Ronald from comment #0)
> > -> Specifically the net rx load is observed to cause a high load average on
> > the system running kernel 5.15.93-rt58 as compared to 4.9.214.   
> > -> When checking the net rx interrupts they seems to quite high as compared
> > to similar workloads running on 4.9.214-rt137-g1fc72f4. ( I mean I tested
> > with same hardware for both kernels).
> > 
> 
> First, you need to check if vanilla v5.15.93 also have this regression. If
> not, that's rt-specific bug. Then, you also need to test latest mainline
> (currently v6.7-rc1) to see if this regression has been fixed already.

Bagas, could you elaborate a bit more on why you think this could be rt-specific?
It's not easy to set up the vanilla kernel in our environment: several of the processes are RT,
and a lot of setup changes would be needed in the BSP, Yocto build setup, etc.

IMO the rt scheduler is not a suspect here, because RX interrupt handling is clearly taking more cycles than TX. Even during iperf tests we see quite a sizable difference between TX and RX IRQ handling; the iperf client/server tests pump data in both directions, and the rt scheduler makes no distinction between TX and RX IRQs. RX is taking 55% CPU while TX is taking 11%,
and the number of RX IRQs is quite high.
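
For reference, the kind of iperf3 run used to exercise the RX and TX paths separately looks like the sketch below (the exact options and the peer address 192.168.1.10 are placeholders, not the actual test parameters):

# On the peer machine (traffic generator):
iperf3 -s
# On the LS1021A board, RX-heavy run (-R reverses direction, so the board receives):
iperf3 -c 192.168.1.10 -R -t 60
# TX-heavy run for comparison (board sends):
iperf3 -c 192.168.1.10 -t 60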

Also, running other fio or iozone tests does not cause similar loads; if this were rt-specific, it should have an impact on all I/O, filesystem or network transfers. I also ran rngtest (random number generation tests, which depend on the entropy pool), and it runs fine without such load.

Other users on the Manjaro Linux forum have also reported such a high-load issue, so IMO it's probably not related to the rt kernel:
https://forum.manjaro.org/t/kernel-5-15-causes-very-high-load-averages/92224
Comment 6 Stephen Hemminger 2023-11-16 18:34:13 UTC
What is the network driver being used?
It doesn't appear to be any of the in-kernel drivers?

The RT patches do change where networking is scheduled. RT does not have softirq.
Comment 7 Stephen Hemminger 2023-11-17 03:40:24 UTC
Looking at names, this looks like the gianfar driver.
Comment 8 Monthero Ronald 2023-11-19 06:02:16 UTC
(In reply to Stephen Hemminger from comment #6)
> What is the network driver being used?

> It doesn't appear to be any of the in kernel drivers?

Thanks for taking a look. It's the gianfar driver, indicated below; its kernel source files are under

./drivers/net/ethernet/freescale/

freescale$ ls 
dpaa   enetc  fec_main.c     fec_mpc52xx.h      fec_ptp.c  fs_enet        gianfar.c          gianfar.h  Makefile    ucc_geth_ethtool.c  xgmac_mdio.c
dpaa2  fec.h  fec_mpc52xx.c  fec_mpc52xx_phy.c  fman       fsl_pq_mdio.c  gianfar_ethtool.c  Kconfig    ucc_geth.c  ucc_geth.h
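
(For completeness, a quick way to confirm which driver is bound to eth0 on the target; a small sketch, both commands should point at the gianfar driver here:)

ethtool -i eth0                              # prints driver/version/bus-info for eth0
readlink /sys/class/net/eth0/device/driver   # sysfs link to the bound platform driver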


Module                  Size  Used by
xhci_plat_hcd          16384  0
xhci_hcd              163840  1 xhci_plat_hcd
sdhci_of_esdhc         24576  0
sdhci_pltfm            16384  1 sdhci_of_esdhc
sdhci                  49152  2 sdhci_of_esdhc,sdhci_pltfm
mmc_block              36864  1
mmc_core              131072  3 sdhci,sdhci_of_esdhc,mmc_block
usb_f_ecm              16384  1
g_ether                16384  0                                  \
usb_f_rndis            28672  2 g_ether                           | 
u_ether                24576  3 usb_f_ecm,g_ether,usb_f_rndis     |  
libcomposite           53248  3 usb_f_ecm,g_ether,usb_f_rndis    /

< .. snipped .. > 

> 
> The RT patches do change where networking is scheduled. RT does not have
> softirq.

RT kernels also use softirqs, as shown below:

# uname -r
5.15.93-rt58+ge0f69a158d5b

root@ls1021atwr:# cat /proc/softirqs
                    CPU0       CPU1
          HI:         16          3
       TIMER:      15574     110873
      NET_TX:       2410         53
      NET_RX:     209403          0    <<<< net RX softirqs on cpu{0,1}
       BLOCK:       1284        159
    IRQ_POLL:          0          0
     TASKLET:       5276         46   
       SCHED:     308979     298884
     HRTIMER:     536692    1008360
         RCU:          0          0
root@ls1021atwr:/proc#
Comment 9 Monthero Ronald 2023-11-19 06:03:50 UTC
(In reply to Stephen Hemminger from comment #7)
> Looking at names, this looks like the gianfar driver.

Correct, it is the Freescale gianfar driver:

./drivers/net/ethernet/freescale/

freescale$ ls 
dpaa   enetc  fec_main.c     fec_mpc52xx.h      fec_ptp.c  fs_enet        gianfar.c          gianfar.h  Makefile    ucc_geth_ethtool.c  xgmac_mdio.c
dpaa2  fec.h  fec_mpc52xx.c  fec_mpc52xx_phy.c  fman       fsl_pq_mdio.c  gianfar_ethtool.c  Kconfig    ucc_geth.c  ucc_geth.h

Kindly suggest whether this needs the attention of a different mailing list.
Comment 10 Monthero Ronald 2023-11-19 06:06:33 UTC
linux-git$ ./scripts/get_maintainer.pl /home/penguin32/GPL/linux-git/drivers/net/ethernet/freescale/gianfar.c 
Claudiu Manoil <claudiu.manoil@nxp.com> (maintainer:FREESCALE eTSEC ETHERNET DRIVER (GIANFAR))
"David S. Miller" <davem@davemloft.net> (maintainer:NETWORKING DRIVERS)
Eric Dumazet <edumazet@google.com> (maintainer:NETWORKING DRIVERS)
Jakub Kicinski <kuba@kernel.org> (maintainer:NETWORKING DRIVERS)
Paolo Abeni <pabeni@redhat.com> (maintainer:NETWORKING DRIVERS)
netdev@vger.kernel.org (open list:FREESCALE eTSEC ETHERNET DRIVER (GIANFAR))
linux-kernel@vger.kernel.org (open list)
linux-git$
Comment 11 Monthero Ronald 2024-12-23 10:36:44 UTC
Just an update on some of the kernel debug and triage experiments I tried in order to isolate the issue:

a) Running similar workloads on a non-real-time kernel is fine; there is no issue (i.e. with PREEMPT_RT disabled and the kernel and gianfar driver rebuilt, the high-load issue from RX interrupts is not observed).

b) The high-load issue is observed only when using the RT kernel (fsl gianfar driver; the RX processing load is high).

Below is a kernel ftrace of RX IRQ processing. I noticed a recurring pattern in the RX IRQ path where the time from irq_enter to irq_exit is up to 50 microseconds before it hands over to softirqs for processing; that is, hard-IRQ processing of about 50 microseconds, and only then the handover to softirqs. This is seen continuously, suggesting some additional overhead of up to 50 microseconds for RX IRQ processing of packets.

X1 (irq_enter)  ==>  X2 (irq_exit)      - 50 microseconds
Y1 (irq_enter)  ==> Y2 (irq_exit)     -  51 microseconds

In the example below, for the kthread irq/61-eth0_g0_-871 (interrupt 61 for RX on the eth0 iface),
I see around 50 to 55 microseconds spent in hard-IRQ processing of the RX path. The softirq processing of the deferred IRQ work looks okay. Note the 'h' and 's' flags marking hard and soft IRQ context for the kthread irq/61-eth0 below.
Below is a shortened snippet of the trace pattern. The system is 95% idle, but the load average is about 1.4 on this 2-CPU system, mostly due to RX IRQ processing (and the load average drops to almost 0.2 when the network iface is down or disconnected).
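
The trace uses ftrace's function tracer; a rough sketch of one way to capture such a trace is below (paths assume tracefs at /sys/kernel/tracing; this is not necessarily the exact setup used for the attached trace):

cd /sys/kernel/tracing                      # or /sys/kernel/debug/tracing
echo 0 > tracing_on                         # stop tracing while configuring
echo function > current_tracer              # plain function tracer
echo 1 > options/irq-info                   # keep the irq/softirq/preempt flag columns
# optional: follow only the RX IRQ thread (PID of irq/61-eth0_g0_)
pgrep 'irq/61-eth0' > set_ftrace_pid
echo > trace                                # clear the ring buffer
echo 1 > tracing_on
sleep 5                                     # let some RX traffic arrive
echo 0 > tracing_on
cat trace > /tmp/rx_irq_trace.txt           # irq_enter -> irq_exit gaps give the hard-IRQ time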

kernel trace for gianfar RX irq processing 

#                                _-------=> irqs-off
#                               / _------=> need-resched
#                              | / _-----=> need-resched-lazy
#                              || / _----=> hardirq/softirq
#                              ||| / _---=> preempt-depth
#                              |||| / _--=> preempt-lazy-depth
#                              ||||| / _-=> migrate-disable
#                              |||||| /     delay
#           TASK-PID     CPU#  |||||||  TIMESTAMP  FUNCTION
#              | |         |   |||||||      |         |
              sh-1138    [001] d..h211   907.775570: trigger_load_balance <-update_process_times
              sh-1138    [001] d..h211   907.775571: run_posix_cpu_timers <-tick_sched_timer
              sh-1138    [001] d..h211   907.775574: hrtimer_forward <-tick_sched_timer
              sh-1138    [001] d..h211   907.775576: _raw_spin_lock_irq <-__run_hrtimer
              sh-1138    [001] d..h211   907.775578: preempt_count_add <-_raw_spin_lock_irq
              sh-1138    [001] d..h311   907.775580: enqueue_hrtimer <-__run_hrtimer

< snipped output>

 irq/61-eth0_g0_-871     [000] d..s.13  2108.019569: irq_enter <-handle_domain_irq
 irq/61-eth0_g0_-871     [000] d..s.13  2108.019571: irq_enter_rcu <-handle_domain_irq
 irq/61-eth0_g0_-871     [000] d..s.13  2108.019572: preempt_count_add <-irq_enter_rcu         { X1 } <<<<< irq_enter( )
 irq/61-eth0_g0_-871     [000] d..H.13  2108.019574: __irq_resolve_mapping <-handle_domain_irq      
 irq/61-eth0_g0_-871     [000] d..H.13  2108.019576: __rcu_read_lock <-__irq_resolve_mapping
 irq/61-eth0_g0_-871     [000] d..H.13  2108.019577: __rcu_read_unlock <-__irq_resolve_mapping
 irq/61-eth0_g0_-871     [000] d..H.13  2108.019579: handle_irq_desc <-handle_domain_irq
 irq/61-eth0_g0_-871     [000] d..H.13  2108.019580: handle_fasteoi_irq <-handle_irq_desc
 irq/61-eth0_g0_-871     [000] d..H.13  2108.019582: _raw_spin_lock <-handle_fasteoi_irq
 irq/61-eth0_g0_-871     [000] d..H.13  2108.019583: preempt_count_add <-_raw_spin_lock
 irq/61-eth0_g0_-871     [000] d..H113  2108.019585: irq_may_run <-handle_fasteoi_irq
 irq/61-eth0_g0_-871     [000] d..H113  2108.019586: mask_irq <-handle_fasteoi_irq
 irq/61-eth0_g0_-871     [000] d..H113  2108.019588: gic_eoimode1_mask_irq <-mask_irq
 irq/61-eth0_g0_-871     [000] d..H113  2108.019589: gic_mask_irq <-gic_eoimode1_mask_irq
 irq/61-eth0_g0_-871     [000] d..H113  2108.019591: gic_poke_irq <-gic_eoimode1_mask_irq
 irq/61-eth0_g0_-871     [000] d..H113  2108.019592: handle_irq_event <-handle_fasteoi_irq
 irq/61-eth0_g0_-871     [000] d..H113  2108.019594: preempt_count_sub <-handle_irq_event
 irq/61-eth0_g0_-871     [000] d..H.13  2108.019596: handle_irq_event_percpu <-handle_irq_event
 irq/61-eth0_g0_-871     [000] d..H.13  2108.019597: __handle_irq_event_percpu <-handle_irq_event_percpu
 irq/61-eth0_g0_-871     [000] d..H.13  2108.019599: irq_default_primary_handler <-__handle_irq_event_percpu
 irq/61-eth0_g0_-871     [000] d..H.13  2108.019600: __irq_wake_thread <-__handle_irq_event_percpu
 irq/61-eth0_g0_-871     [000] d..H.13  2108.019602: wake_up_process <-__handle_irq_event_percpu
 irq/61-eth0_g0_-871     [000] d..H.13  2108.019603: try_to_wake_up <-__handle_irq_event_percpu                                                                        
 irq/61-eth0_g0_-871     [000] d..H.13  2108.019605: preempt_count_add <-try_to_wake_up
 irq/61-eth0_g0_-871     [000] d..H113  2108.019606: preempt_count_sub <-try_to_wake_up
 irq/61-eth0_g0_-871     [000] d..H.13  2108.019608: add_interrupt_randomness <-handle_irq_event_percpu
 irq/61-eth0_g0_-871     [000] d..H.13  2108.019609: read_current_timer <-add_interrupt_randomness
 irq/61-eth0_g0_-871     [000] d..H.13  2108.019611: arch_timer_read_counter_long <-read_current_timer
 irq/61-eth0_g0_-871     [000] d..H.13  2108.019613: fast_mix <-add_interrupt_randomness
 irq/61-eth0_g0_-871     [000] d..H.13  2108.019614: note_interrupt <-handle_irq_event_percpu
 irq/61-eth0_g0_-871     [000] d..H.13  2108.019616: _raw_spin_lock <-handle_irq_event
 irq/61-eth0_g0_-871     [000] d..H.13  2108.019617: preempt_count_add <-_raw_spin_lock
 irq/61-eth0_g0_-871     [000] d..H113  2108.019619: gic_eoimode1_eoi_irq <-handle_fasteoi_irq
 irq/61-eth0_g0_-871     [000] d..H113  2108.019621: preempt_count_sub <-__raw_spin_unlock
 irq/61-eth0_g0_-871     [000] d..H.13  2108.019622: irq_exit <-handle_domain_irq
 irq/61-eth0_g0_-871     [000] d..H.13  2108.019624: preempt_count_sub <-__irq_exit_rcu     { X2 }  <<<<< irq_exit ( )
 irq/61-eth0_g0_-871     [000] ...s.13  2108.019626: rt_spin_unlock <-gfar_poll_rx_sq
 irq/61-eth0_g0_-871     [000] ...s.13  2108.019627: migrate_enable <-rt_spin_unlock
 irq/61-eth0_g0_-871     [000] ...s.12  2108.019629: __rcu_read_unlock <-rt_spin_unlock
 irq/61-eth0_g0_-871     [000] ...s.12  2108.019631: hrtimer_run_softirq <-__do_softirq
 irq/61-eth0_g0_-871     [000] ...s.12  2108.019633: rt_spin_lock <-hrtimer_run_softirq
 irq/61-eth0_g0_-871     [000] ...s.12  2108.019634: __rcu_read_lock <-rt_spin_lock
 
==== SECOND snippet of RX irq processing 

< snipped output>  
 irq/61-eth0_g0_-871     [000] d...1..  2108.019096: preempt_count_sub <-migrate_enable
 irq/61-eth0_g0_-871     [000] .......  2108.019097: wake_threads_waitq <-irq_thread
 irq/61-eth0_g0_-871     [000] .......  2108.019099: __wake_up <-irq_thread
 irq/61-eth0_g0_-871     [000] d...2..  2108.019101: gic_handle_irq <-__irq_svc
 irq/61-eth0_g0_-871     [000] d...2..  2108.019103: handle_domain_irq <-gic_handle_irq
 irq/61-eth0_g0_-871     [000] d...2..  2108.019104: irq_enter <-handle_domain_irq       { Y1 }   <<<<<<< irq_enter( )
 irq/61-eth0_g0_-871     [000] d...2..  2108.019106: irq_enter_rcu <-handle_domain_irq
 irq/61-eth0_g0_-871     [000] d...2..  2108.019107: preempt_count_add <-irq_enter_rcu
 irq/61-eth0_g0_-871     [000] d..h2..  2108.019109: __irq_resolve_mapping <-handle_domain_irq
 irq/61-eth0_g0_-871     [000] d..h2..  2108.019111: __rcu_read_lock <-__irq_resolve_mapping
 irq/61-eth0_g0_-871     [000] d..h2..  2108.019112: __rcu_read_unlock <-__irq_resolve_mapping
 irq/61-eth0_g0_-871     [000] d..h2..  2108.019114: handle_irq_desc <-handle_domain_irq
 irq/61-eth0_g0_-871     [000] d..h2..  2108.019115: handle_fasteoi_irq <-handle_irq_desc
 irq/61-eth0_g0_-871     [000] d..h2..  2108.019116: _raw_spin_lock <-handle_fasteoi_irq
 irq/61-eth0_g0_-871     [000] d..h2..  2108.019118: preempt_count_add <-_raw_spin_lock
 irq/61-eth0_g0_-871     [000] d..h3..  2108.019119: irq_may_run <-handle_fasteoi_irq
 irq/61-eth0_g0_-871     [000] d..h3..  2108.019121: mask_irq <-handle_fasteoi_irq
 irq/61-eth0_g0_-871     [000] d..h3..  2108.019122: gic_eoimode1_mask_irq <-mask_irq
 irq/61-eth0_g0_-871     [000] d..h3..  2108.019124: gic_mask_irq <-gic_eoimode1_mask_irq
 irq/61-eth0_g0_-871     [000] d..h3..  2108.019125: gic_poke_irq <-gic_eoimode1_mask_irq
 irq/61-eth0_g0_-871     [000] d..h3..  2108.019127: handle_irq_event <-handle_fasteoi_irq
 irq/61-eth0_g0_-871     [000] d..h3..  2108.019128: preempt_count_sub <-handle_irq_event
 irq/61-eth0_g0_-871     [000] d..h2..  2108.019130: handle_irq_event_percpu <-handle_irq_event
 irq/61-eth0_g0_-871     [000] d..h2..  2108.019131: __handle_irq_event_percpu <-handle_irq_event_percpu
 irq/61-eth0_g0_-871     [000] d..h2..  2108.019132: irq_default_primary_handler <-__handle_irq_event_percpu
 irq/61-eth0_g0_-871     [000] d..h2..  2108.019134: __irq_wake_thread <-__handle_irq_event_percpu
 irq/61-eth0_g0_-871     [000] d..h2..  2108.019135: wake_up_process <-__handle_irq_event_percpu
 irq/61-eth0_g0_-871     [000] d..h2..  2108.019137: try_to_wake_up <-__handle_irq_event_percpu
 irq/61-eth0_g0_-871     [000] d..h2..  2108.019138: preempt_count_add <-try_to_wake_up
 irq/61-eth0_g0_-871     [000] d..h3..  2108.019140: preempt_count_sub <-try_to_wake_up
 irq/61-eth0_g0_-871     [000] d..h2..  2108.019141: add_interrupt_randomness <-handle_irq_event_percpu
 irq/61-eth0_g0_-871     [000] d..h2..  2108.019143: read_current_timer <-add_interrupt_randomness
 irq/61-eth0_g0_-871     [000] d..h2..  2108.019144: arch_timer_read_counter_long <-read_current_timer
 irq/61-eth0_g0_-871     [000] d..h2..  2108.019146: fast_mix <-add_interrupt_randomness
 irq/61-eth0_g0_-871     [000] d..h2..  2108.019148: note_interrupt <-handle_irq_event_percpu
 irq/61-eth0_g0_-871     [000] d..h2..  2108.019149: _raw_spin_lock <-handle_irq_event
 irq/61-eth0_g0_-871     [000] d..h2..  2108.019151: preempt_count_add <-_raw_spin_lock
 irq/61-eth0_g0_-871     [000] d..h3..  2108.019152: gic_eoimode1_eoi_irq <-handle_fasteoi_irq
 irq/61-eth0_g0_-871     [000] d..h3..  2108.019154: preempt_count_sub <-__raw_spin_unlock
 irq/61-eth0_g0_-871     [000] d..h2..  2108.019155: irq_exit <-handle_domain_irq           { Y2 }   <<<<<  irq_exit ( )
 irq/61-eth0_g0_-871     [000] d..h2..  2108.019157: preempt_count_sub <-__irq_exit_rcu
 irq/61-eth0_g0_-871     [000] .......  2108.019158: __wake_up_common_lock <-__wake_up
 irq/61-eth0_g0_-871     [000] .......  2108.019160: rt_spin_lock <-__wake_up_common_lock
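
To quantify the hard-IRQ time, the irq_enter/irq_exit pairs for the RX IRQ thread can be pulled out of a saved trace and turned into durations; a rough sketch, assuming the trace was saved to /tmp/rx_irq_trace.txt as in the earlier sketch:

# Print the hard-IRQ duration (irq_enter -> irq_exit) in microseconds for the RX IRQ thread.
grep 'irq/61-eth0' /tmp/rx_irq_trace.txt |
grep -E ' (irq_enter|irq_exit) <-' |
awk '{ t = $4; sub(/:$/, "", t);            # timestamp field, strip trailing ":"
       if ($5 == "irq_enter") enter = t;
       else if (enter != "") { printf "hardirq %.1f us\n", (t - enter) * 1e6; enter = "" } }'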
Comment 12 Monthero Ronald 2024-12-23 10:41:32 UTC
Created attachment 307391 [details]
Kernel trace RX irq processing
Comment 13 Monthero Ronald 2024-12-23 10:59:26 UTC
>>> Note: A final experiment that solved and showed a 10x performance
>>> improvement. I experimented modifying the kernel timer interrupts per
>>> second, changed the timer ticks frequency from HZ 100 to HZ 1000 and
>>> rebuilt kernel and drivers. With this timer ticks freq of HZ 1000, the rx
>>> irq processing is really good, and the load average of the system is down
>>> to 0.1 to 0.3. That's almost a 8x to 10x performance improvement noted in
>>> my Real Time kernel workload and response to fsl gianfar drivers processing
>>> the RX and TX irqs.
Comment 14 Monthero Ronald 2024-12-23 11:01:07 UTC
Note: A final experiment that resolved the issue and showed roughly a 10x performance improvement: I experimented with the number of kernel timer interrupts per second, changing the timer tick frequency from HZ=100 to HZ=1000 and rebuilding the kernel and drivers. With a tick frequency of HZ=1000, RX IRQ processing is really good and the system load average is down to 0.1 to 0.3. That is roughly an 8x to 10x improvement in my real-time kernel workload and in the responsiveness of the fsl gianfar driver processing the RX and TX IRQs.
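
For reference, the config change boils down to switching CONFIG_HZ from 100 to 1000 and rebuilding; a sketch of doing that from a kernel tree is below (the exact build invocation for the Yocto/BSP setup, cross-compile variables included, will differ):

# Switch the timer tick frequency from 100 Hz to 1000 Hz in the kernel config.
scripts/config --disable CONFIG_HZ_100 \
               --enable  CONFIG_HZ_1000 \
               --set-val CONFIG_HZ 1000
make olddefconfig
make zImage modules dtbs                    # ARM build targets; adjust for the BSP/cross setup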
