[RP-PPPoE] Very high CPU usage by the pppoe-server

Dardan Behluli Dardan.Behluli at ipko.com
Fri Apr 1 04:44:32 EDT 2011


Hi,
Thank you for your efforts. This is the output I got from running strace -f -p:

gettimeofday({1301602017, 693009}, NULL) = 0
select(15, [5 7 9 10 11 13 14], NULL, NULL, {2, 147618}) = 1 (in [11], left {2, 150000})
gettimeofday({1301602017, 693813}, NULL) = 0
recv(11, "\0000H\217\335\221\0\fn\265\4\4\210d\21\0\5 \5\211\0!E"..., 1520, 0) = 1437
gettimeofday({1301602017, 693967}, NULL) = 0
select(15, [5 7 9 10 11 13 14], NULL, NULL, {2, 146660}) = 1 (in [11], left {2, 150000})
gettimeofday({1301602017, 694176}, NULL) = 0
recv(11, "\0000H\217\335\221\0\37\341\212.<\210d\21\0\7H\5\230\0"..., 1520, 0) = 1452
gettimeofday({1301602017, 694995}, NULL) = 0
select(15, [5 7 9 10 11 13 14], NULL, NULL, {2, 145632}) = 1 (in [11], left {2, 150000})
gettimeofday({1301602017, 695253}, NULL) = 0
recv(11, "\0000H\217\335\221\0\34#%\220}\210d\21\0\3/\0w\0!E\0\0"..., 1520, 0) = 139
gettimeofday({1301602017, 695481}, NULL) = 0
select(15, [5 7 9 10 11 13 14], NULL, NULL, {2, 145146}) = 1 (in [11], left {2, 150000})
gettimeofday({1301602017, 695882}, NULL) = 0
recv(11, "\0000H\217\335\221\0\24\"$V\371\210d\21\0\0/\0*\0!E\0\0"..., 1520, 0) = 62
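
The escaped bytes in the recv() buffers can be decoded by hand. The following is a minimal Python sketch, assuming fd 11 delivers whole Ethernet frames (an inference from the byte layout, not something the trace states). SAMPLE is copied from the first recv() line above; the helper name decode_pppoe_header is hypothetical and is not part of rp-pppoe.

```python
import struct

# First 23 bytes of the first recv() buffer above, copied from the strace
# output (strace prints non-printable bytes as octal escapes).
SAMPLE = b"\0000H\217\335\221\0\fn\265\4\4\210d\21\0\5 \5\211\0!E"


def mac(raw):
    """Format 6 raw bytes as a colon-separated MAC address."""
    return ":".join(f"{b:02x}" for b in raw)


def decode_pppoe_header(buf):
    """Decode the 14-byte Ethernet header and the 6-byte PPPoE header."""
    dst, src, ethertype = struct.unpack("!6s6sH", buf[:14])
    ver_type, code, session_id, length = struct.unpack("!BBHH", buf[14:20])
    return {
        "dst": mac(dst),
        "src": mac(src),
        "ethertype": ethertype,
        "version": ver_type >> 4,
        "type": ver_type & 0x0F,
        "code": code,
        "session_id": session_id,
        "payload_length": length,
    }


hdr = decode_pppoe_header(SAMPLE)
print(hex(hdr["ethertype"]), hdr["session_id"], hdr["payload_length"])
# prints: 0x8864 1312 1417
```

Under that assumption the EtherType reads as 0x8864 (PPPoE session stage) rather than 0x8863 (discovery), and 14 + 6 + 1417 bytes equals the 1437 that recv() returned, so the frames arriving on fd 11 in this trace appear to be session data.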

I deleted the NTP servers from the configuration and the CPU utilization dropped drastically. At the moment there are 1000 users online on that server and the CPU is 58% idle. I'll keep you updated on how it goes today.
Thanks again,
Dardan

-----Original Message-----
From: rp-pppoe-bounces at lists.roaringpenguin.com [mailto:rp-pppoe-bounces at lists.roaringpenguin.com] On Behalf Of Insane Laughing Clown
Sent: Thursday, March 31, 2011 4:58 PM
To: For users of RP-PPPoE client/server software
Subject: Re: [RP-PPPoE] Very high CPU usage by the pppoe-server

On 03/31/2011 02:31 AM, Dardan Behluli wrote:
> Hi,
>
> One of our several PPPoE servers has very high CPU usage. This occurs when the number of users terminated on it reaches 1300 - 1500. With more than 1500 users connected, the server is practically inaccessible. When we run top, we see that the pppoe-server process uses most of the CPU. The pppoe-server parameters are: pppoe-server -k -u -r -s -I -C Name -L local_IP -R remote_IP -N 3000.
>
> The difference in configuration between this and the other PPPoE servers is that this one is not doing NAT; the RADIUS server gives public IP addresses to the clients connected to this NAS.
>
> We tried MSS clamping, synchronous PPP, etc., but it didn't help much.
>
> We would very much appreciate any insight on this issue.
>
>

Hi,

        As David noted in this thread, I would also recommend running 'strace'
on the pppoe-server process to see what it's doing, since that is the
process you say is the top consumer of CPU ('strace -f -p <pid of
server process>'). Really, pppoe-server does practically nothing: it
sits idly in a select() (using no CPU at all) waiting for PPPoE session
requests, quickly kicks off a new pppd in response to one, and then
goes idle again waiting for the next one.
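
The idle-in-select() pattern described above can be sketched roughly as follows. This is an illustration in Python, not the actual pppoe-server code (which is written in C); serve_once, the handle callback, and the local socket pair standing in for the discovery socket are all hypothetical.

```python
import select
import socket


def serve_once(listen_socks, handle, timeout=None):
    """Block in select() until a socket is readable, then dispatch once.

    Returns the number of sockets handled (0 if the timeout expired).
    While blocked in select() the process is asleep and uses no CPU.
    """
    readable, _, _ = select.select(listen_socks, [], [], timeout)
    for sock in readable:
        handle(sock.recv(2048))
    return len(readable)


# Demo: a local socket pair stands in for the discovery socket.
a, b = socket.socketpair()
seen = []
idle = serve_once([b], seen.append, timeout=0.1)  # nothing pending: returns 0
a.send(b"request")
busy = serve_once([b], seen.append, timeout=0.1)  # one message pending: returns 1
print(idle, busy, seen)
# prints: 0 1 [b'request']
a.close()
b.close()
```

A loop like this only burns CPU when select() keeps returning immediately, which is why a steady stream of incoming packets on one of the watched descriptors is the first thing to look for.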

        Off the top of my head, without an strace, I'd hazard a guess at a DoS
of some kind: some bad client flooding you with PPPoE PADI or PADR
packets, perhaps. Have you done a tcpdump on the Ethernet interface
feeding you PPPoE traffic ('tcpdump -lni <interface> ether proto
0x8863')? This will show you all of the control messages that
pppoe-server has to concern itself with.
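
To spot a flooding client in such a capture, one option is to tally discovery packets per source MAC. The sketch below assumes the raw Ethernet frames are already available as bytes objects (how they are captured is left out); tally_discovery and make_frame are hypothetical helper names, and the MAC addresses in the demo are made up. The discovery codes are the ones defined in RFC 2516.

```python
import struct
from collections import Counter

ETH_P_PPPOE_DISC = 0x8863  # PPPoE discovery stage, as in the tcpdump filter
# Discovery codes from RFC 2516.
CODES = {0x09: "PADI", 0x07: "PADO", 0x19: "PADR", 0x65: "PADS", 0xA7: "PADT"}


def tally_discovery(frames):
    """Count (source MAC, discovery code name) pairs over raw Ethernet frames."""
    counts = Counter()
    for frame in frames:
        if len(frame) < 20:
            continue  # too short for Ethernet + PPPoE headers
        (ethertype,) = struct.unpack("!H", frame[12:14])
        if ethertype != ETH_P_PPPOE_DISC:
            continue  # not a discovery packet
        src = ":".join(f"{b:02x}" for b in frame[6:12])
        code = frame[15]  # byte 14 is version/type, byte 15 is the code
        counts[(src, CODES.get(code, hex(code)))] += 1
    return counts


def make_frame(src, code, ethertype=ETH_P_PPPOE_DISC):
    """Build a minimal demo frame: broadcast dst, ver/type 0x11, session 0, length 0."""
    return b"\xff" * 6 + src + struct.pack("!HBBHH", ethertype, 0x11, code, 0, 0)


# Demo with made-up addresses: one noisy client, one quiet client, and one
# session-stage frame that should be ignored.
noisy = bytes.fromhex("020000000001")
quiet = bytes.fromhex("020000000002")
frames = [make_frame(noisy, 0x09)] * 5
frames.append(make_frame(quiet, 0x19))
frames.append(make_frame(noisy, 0x00, ethertype=0x8864))
counts = tally_discovery(frames)
print(counts.most_common(1))
# prints: [(('02:00:00:00:00:01', 'PADI'), 5)]
```

A single MAC address dominating the PADI or PADR counts would be consistent with the flooding-client guess; an even spread would point elsewhere.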

        Please follow up on the list and let us know what your results are.

-ILC

