[RP-PPPoE] Very high CPU usage by the pppoe-server

Dardan Behluli Dardan.Behluli at ipko.com
Mon Apr 4 09:12:27 EDT 2011


Hi,
You are right, 'gettimeofday' and NTP don't "go hand in hand", but CPU usage has definitely been different since I removed the NTP servers from the configuration. I cannot say that NTP caused the difference, only that there is one. I'm watching the situation closely and will keep you informed.

1) I ran the tcpdump and captured 1000 packets in about 7 minutes. I couldn't find anything suspicious in them;
2) I shared the strace output file with you on Dropbox;
3) Here is the output of ps auxwf:

[PPPOE08]# ps auxwf
USER       PID %CPU %MEM   VSZ  RSS TTY      STAT START   TIME COMMAND
root         1  0.0  0.0  1304  476 ?        S    Apr01   0:06 init [3]
root         2  0.0  0.0     0    0 ?        SW   Apr01   0:00 [keventd]
root         3  0.0  0.0     0    0 ?        SW   Apr01   0:19 [kapmd]
root         4  0.2  0.0     0    0 ?        RWN  Apr01  10:55 [ksoftirqd_CPU0]
root         5  0.0  0.0     0    0 ?        SW   Apr01   0:00 [kswapd]
root         6  0.0  0.0     0    0 ?        SW   Apr01   0:00 [bdflush]
root         7  0.0  0.0     0    0 ?        SW   Apr01   0:01 [kupdated]
root       116  0.0  0.0     0    0 ?        SW   Apr01   0:00 [loop0]
root       581  0.0  0.0  1356  560 ?        S    Apr01   0:24 syslogd -m 0
root       586  0.0  0.0  1296  432 ?        S    Apr01   0:00 klogd -x -c 1
ntp        621  0.0  0.0  1812 1804 ?        SL   Apr01   0:03 ntpd -U ntp
root       631  0.0  0.0  3432 1480 ?        S    Apr01   0:14 /usr/sbin/sshd
root     19896  0.0  0.0  6664 1968 ?        S    14:32   0:00  \_ /usr/sbin/sshd
root     19904  0.0  0.0  2040 1132 pts/0    S    14:32   0:00      \_ -bash
root     24605  0.0  0.0  3852 1984 pts/0    R    15:08   0:00          \_ ps auxwf
root       648  0.0  0.0  1984  800 ?        S    Apr01   0:00 xinetd -stayalive -reuse -pidfile /var/run/xinetd.pid
root       659  0.0  0.0  1340  564 ?        S    Apr01   0:00 crond
root       689 18.2  0.1  5812 2708 ?        S    Apr01 863:00 pppoe-server -k -u -r -s -I -C PPPOE08 -L 46.99.88.1 -R 46.99.88.2 -N
root       725  0.0  0.0  1848  900 ?        S    Apr01   0:06  \_ pppd plugin /usr/lib/plugins/rp-pppoe.so eth1 rp_pppoe_sess 2724:
.......
root     24568  0.2  0.0  1848  896 ?        S    15:07   0:00  \_ pppd plugin /usr/lib/plugins/rp-pppoe.so eth1 rp_pppoe_sess 1925:
root     24588  0.5  0.0  1848  896 ?        S    15:07   0:00  \_ pppd plugin /usr/lib/plugins/rp-pppoe.so eth1 rp_pppoe_sess 2650:
root       696  0.0  0.0  1272  372 ?        S    Apr01   0:03 /etc/session_check 192.168.10.121
root       718  0.0  0.0  1284  404 ttyS0    S    Apr01   0:00 /sbin/agetty -h ttyS0 9600 vt100
root       720  0.0  0.0  1276  376 tty2     S    Apr01   0:00 /sbin/mingetty tty2
root      6103  0.0  0.0  5232 2096 ?        S    Apr01   1:33 pppoe-server
root      6337  0.0  0.0  1276  376 tty1     S    Apr01   0:00 /sbin/mingetty tty1
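
For scale, the tcpdump figure in point 1 can be turned into an average packet rate. This is only a minimal sketch of the arithmetic, assuming "some 7 minutes" means exactly 7 minutes and that the capture used the control-traffic filter suggested earlier in the thread; the function name is made up for illustration.

```python
def packets_per_second(packets: int, minutes: float) -> float:
    """Average packet rate over a capture of the given length."""
    return packets / (minutes * 60.0)


# Figures from point 1 above; the 7 minutes is an approximation.
rate = packets_per_second(1000, 7)
print(f"{rate:.2f} packets/s")  # prints "2.38 packets/s"
```

A rate in that range would not, by itself, look like a flood of control packets, which is consistent with finding nothing suspicious in the capture.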

Thank you very much again,
Dardan

-----Original Message-----
From: rp-pppoe-bounces at lists.roaringpenguin.com [mailto:rp-pppoe-bounces at lists.roaringpenguin.com] On Behalf Of Insane Laughing Clown
Sent: Friday, April 01, 2011 5:51 PM
To: For users of RP-PPPoE client/server software
Subject: Re: [RP-PPPoE] Very high CPU usage by the pppoe-server


Hello,

        You are more than likely confused - ntp is not 'gettimeofday'. The
references here to 'gettimeofday' are system calls that take almost no time
to complete and are irrelevant, so your statement about removing 'ntp'
making a difference is moot.

        1) Get a tcpdump of the control traffic so we can see how many times
the main loop of pppoe-server is being hit.

        2) Do an strace like I showed you and give us some better snippets. I
would like to amend my earlier suggestion; the command should look like
this:

        strace -xx -tt -f -p 3024 -vvv -s256

        where the number after '-p' (3024 here) is your pppoe-server process id.


        I would redirect this to a file, let it run for 60 seconds, and then
post the results somewhere (Dropbox or such), since the output is likely big.
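
A capture like this is easier to read once the calls are tallied. The following is a minimal sketch, not part of strace or rp-pppoe, that counts system calls in such a log and measures the time span it covers. It assumes the '-tt' option from the command above, so each line starts with an optional '[pid N]' prefix followed by an HH:MM:SS.microseconds timestamp. The function name summarize_strace and the timestamps in the sample lines are made up for illustration.

```python
import re
from collections import Counter

# Matches lines such as:
#   [pid   689] 14:46:57.693009 select(15, [5 7 9], NULL, NULL, {2, 0}) = 1
#   14:46:57.693813 recv(11, "..."..., 1520, 0) = 1437
LINE_RE = re.compile(
    r"^(?:\[pid\s+\d+\]\s+)?"        # optional pid prefix added by -f
    r"(\d+):(\d+):(\d+)\.(\d+)\s+"   # wall-clock timestamp added by -tt
    r"(\w+)\("                       # system call name
)


def summarize_strace(lines):
    """Return (Counter of syscall names, seconds between first and last call).

    Lines that do not start a system call (signals, "<... resumed>" lines,
    exit messages) are skipped. A capture that crosses midnight is not handled.
    """
    counts = Counter()
    first = last = None
    for line in lines:
        m = LINE_RE.match(line)
        if not m:
            continue
        hours, minutes, seconds, fraction, name = m.groups()
        t = (int(hours) * 3600 + int(minutes) * 60 + int(seconds)
             + float("0." + fraction))
        if first is None:
            first = t
        last = t
        counts[name] += 1
    duration = (last - first) if first is not None else 0.0
    return counts, duration


# Made-up sample in the assumed format; the calls mirror the earlier snippet.
sample = [
    '14:46:57.693009 gettimeofday({1301602017, 693009}, NULL) = 0',
    '14:46:57.693120 select(15, [5 7 9 10 11 13 14], NULL, NULL, {2, 147618}) = 1',
    '14:46:57.693813 gettimeofday({1301602017, 693813}, NULL) = 0',
    '14:46:57.693900 recv(11, "..."..., 1520, 0) = 1437',
]
counts, duration = summarize_strace(sample)
for name, n in counts.most_common():
    print(f"{name:15s} {n}")
print(f"span: {duration:.6f} s")
```

To run it over a real capture, pass an open file object (for example a hypothetical strace.out) instead of the sample list. Dividing the recv count by the span then gives a rough figure for how often the main loop is being woken.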


        3) For added fun -

        ps auxwf

        and what do you get?

-ILC

On 04/01/2011 01:44 AM, Dardan Behluli wrote:
> Hi,
> Thank you for your efforts. I got this output when I did the strace -f -p:
>
> gettimeofday({1301602017, 693009}, NULL) = 0
> select(15, [5 7 9 10 11 13 14], NULL, NULL, {2, 147618}) = 1 (in [11], left {2, 150000})
> gettimeofday({1301602017, 693813}, NULL) = 0
> recv(11, "\0000H\217\335\221\0\fn\265\4\4\210d\21\0\5 \5\211\0!E"..., 1520, 0) = 1437
> gettimeofday({1301602017, 693967}, NULL) = 0
> select(15, [5 7 9 10 11 13 14], NULL, NULL, {2, 146660}) = 1 (in [11], left {2, 150000})
> gettimeofday({1301602017, 694176}, NULL) = 0
> recv(11, "\0000H\217\335\221\0\37\341\212.<\210d\21\0\7H\5\230\0"..., 1520, 0) = 1452
> gettimeofday({1301602017, 694995}, NULL) = 0
> select(15, [5 7 9 10 11 13 14], NULL, NULL, {2, 145632}) = 1 (in [11], left {2, 150000})
> gettimeofday({1301602017, 695253}, NULL) = 0
> recv(11, "\0000H\217\335\221\0\34#%\220}\210d\21\0\3/\0w\0!E\0\0"..., 1520, 0) = 139
> gettimeofday({1301602017, 695481}, NULL) = 0
> select(15, [5 7 9 10 11 13 14], NULL, NULL, {2, 145146}) = 1 (in [11], left {2, 150000})
> gettimeofday({1301602017, 695882}, NULL) = 0
> recv(11, "\0000H\217\335\221\0\24\"$V\371\210d\21\0\0/\0*\0!E\0\0"..., 1520, 0) = 62
>
> I deleted the NTP servers from the configuration and the CPU utilization dropped drastically. At the moment there are 1000 users online on that server and the CPU is 58% idle. I'll keep you updated on how it goes today.
> Thanks again,
> Dardan
>
> -----Original Message-----
> From: rp-pppoe-bounces at lists.roaringpenguin.com [mailto:rp-pppoe-bounces at lists.roaringpenguin.com] On Behalf Of Insane Laughing Clown
> Sent: Thursday, March 31, 2011 4:58 PM
> To: For users of RP-PPPoE client/server software
> Subject: Re: [RP-PPPoE] Very high CPU usage by the pppoe-server
>
> On 03/31/2011 02:31 AM, Dardan Behluli wrote:
>> Hi,
>>
>> One of our several PPPoE servers has very high CPU usage. This basically occurs when the number of user sessions terminated on it reaches 1300 - 1500. With more than 1500 users connected the server is practically inaccessible. When we run top we see that the pppoe-server process uses most of the CPU. The parameters of the pppoe-server are: pppoe-server -k -u -r -s -I -C Name -L local_IP -R remote_IP -N 3000.
>>
>> The difference in configuration between this and the other PPPoE servers is that this one is not doing NAT; the RADIUS server gives public IP addresses to the clients connected to this NAS.
>>
>> We tried MSS clamping, synchronous PPP, etc., but it didn't help much.
>>
>> We would very much appreciate any insight on this issue.
>>
>>
>
> Hi,
>
>          As David noted in this thread, I would also recommend an 'strace' on
> the pppoe-server process to see what it's doing, since that is the
> process that you say is the top consumer of CPU ('strace -f -p <pid of
> server process>'). Really, pppoe-server does practically nothing - it
> sits idly in a select() (using no CPU at all) waiting for PPPoE session
> requests, quickly kicks off a new pppd in response to each one, and then
> goes idle again waiting for the next one.
>
>          Off the top of my head, without strace, I'd hazard a guess that it's a
> DoS of some kind - some bad client flooding you with PPPoE PADI or PADR,
> perhaps. Have you done a tcpdump on the ethernet interface feeding you
> PPPoE traffic? ('tcpdump -lni <interface> ether proto 0x8863'). This will
> show you all of the control messages that pppoe-server has to concern
> itself with.
>
>          Please follow up on the list and let us know what your results are.
>
> -ILC
>
> _______________________________________________
> RP-PPPoE mailing list
> RP-PPPoE at lists.roaringpenguin.com
> http://lists.roaringpenguin.com/cgi-bin/mailman/listinfo/rp-pppoe
