[RP-PPPoE] Very high CPU usage by the pppoe-server

Insane Laughing Clown mike-rppppoe at tiedyenetworks.com
Fri Apr 1 11:51:13 EDT 2011


Hello,

	I think there is some confusion here: NTP is not 'gettimeofday'. The 
'gettimeofday' lines in your trace are system calls that complete almost 
instantly and are irrelevant to the CPU load, so your statement about 
removing 'ntp' making a difference is moot.
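
	If anything, the gettimeofday() results in your snippet are useful as 
a clock. Below is a rough Python sketch that uses them to estimate how often 
recv() fires. The function name wakeup_rate is made up for illustration, and 
the code assumes the plain 'strace -f -p' line format you posted (no -tt 
timestamps needed):

```python
import re

# Matches the {seconds, microseconds} pair strace prints for gettimeofday().
STAMP = re.compile(r"gettimeofday\(\{(\d+), (\d+)\}")


def wakeup_rate(lines):
    """Return recv() calls per second, timed by the gettimeofday() stamps.

    Hypothetical helper: it only looks at the first and last stamp, so the
    result is a coarse estimate, not a measurement.
    """
    stamps_us = []  # each stamp converted to integer microseconds
    recvs = 0
    for line in lines:
        m = STAMP.search(line)
        if m:
            stamps_us.append(int(m.group(1)) * 1_000_000 + int(m.group(2)))
        elif "recv(" in line:
            recvs += 1
    if len(stamps_us) < 2:
        return 0.0
    span_us = stamps_us[-1] - stamps_us[0]
    return recvs * 1_000_000 / span_us if span_us > 0 else 0.0
```

	Pass it any iterable of lines, for example an open file holding the 
trace. On the lines you posted there are four recv() calls in roughly 
2.9 ms, which is far too short a sample to conclude anything from, hence the 
request for a longer trace.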

	1) Get a tcpdump of the control traffic so we can see how many times 
the main loop of pppoe-server is being hit.

	2) Run strace as I suggested earlier and send us a longer sample. I 
would like to amend my earlier suggestion; the command should now look 
like this:

	strace -xx -tt -f -p 3024 -vvv -s256

	where the argument to '-p' (3024 in this example) is your 
pppoe-server process id.


	I would send the output to a file (strace writes to stderr, so use 
'-o <file>' or '2> <file>'), let it run for 60 seconds, and then post the 
results somewhere since it's likely big (dropbox or such).


	3) For added fun -

	ps auxwf

	and what do you get?
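
	Once you have the 60-second strace file, a quick tally of which calls 
dominate is a reasonable first look before posting it. Here is a minimal 
Python sketch. The name tally_syscalls is hypothetical, and the regular 
expression is only my assumption of what strace lines look like with the 
'-tt' and '-f' prefixes:

```python
import re
from collections import Counter

# First "name(" token on a line. Skipping ahead to it tolerates the
# "[pid N]" and HH:MM:SS.usec prefixes that -f and -tt add.
CALL = re.compile(r"\b([a-z_][a-z0-9_]*)\(")


def tally_syscalls(lines):
    """Count how many times each system call starts in strace output.

    Lines without a "name(" token (signals, exit notices, and
    "<... resumed>" continuations) are ignored.
    """
    counts = Counter()
    for line in lines:
        m = CALL.search(line)
        if m:
            counts[m.group(1)] += 1
    return counts
```

	Pass it the open file and look at the result's most_common(10). If 
select() and recv() dwarf everything else, the packet capture from step 1 
becomes the more interesting artifact.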

-ILC

On 04/01/2011 01:44 AM, Dardan Behluli wrote:
> Hi,
> Thank you for your efforts. I got this output when I did the strace -f -p:
>
> gettimeofday({1301602017, 693009}, NULL) = 0
> select(15, [5 7 9 10 11 13 14], NULL, NULL, {2, 147618}) = 1 (in [11], left {2, 150000})
> gettimeofday({1301602017, 693813}, NULL) = 0
> recv(11, "\0000H\217\335\221\0\fn\265\4\4\210d\21\0\5 \5\211\0!E"..., 1520, 0) = 1437
> gettimeofday({1301602017, 693967}, NULL) = 0
> select(15, [5 7 9 10 11 13 14], NULL, NULL, {2, 146660}) = 1 (in [11], left {2, 150000})
> gettimeofday({1301602017, 694176}, NULL) = 0
> recv(11, "\0000H\217\335\221\0\37\341\212.<\210d\21\0\7H\5\230\0"..., 1520, 0) = 1452
> gettimeofday({1301602017, 694995}, NULL) = 0
> select(15, [5 7 9 10 11 13 14], NULL, NULL, {2, 145632}) = 1 (in [11], left {2, 150000})
> gettimeofday({1301602017, 695253}, NULL) = 0
> recv(11, "\0000H\217\335\221\0\34#%\220}\210d\21\0\3/\0w\0!E\0\0"..., 1520, 0) = 139
> gettimeofday({1301602017, 695481}, NULL) = 0
> select(15, [5 7 9 10 11 13 14], NULL, NULL, {2, 145146}) = 1 (in [11], left {2, 150000})
> gettimeofday({1301602017, 695882}, NULL) = 0
> recv(11, "\0000H\217\335\221\0\24\"$V\371\210d\21\0\0/\0*\0!E\0\0"..., 1520, 0) = 62
>
> I deleted the NTP servers from the configuration and the CPU utilization dropped drastically. At the moment there are 1000 users online on that server and the CPU is 58% idle. I'll keep you updated on how it goes today.
> Thanks again,
> Dardan
>
> -----Original Message-----
> From: rp-pppoe-bounces at lists.roaringpenguin.com [mailto:rp-pppoe-bounces at lists.roaringpenguin.com] On Behalf Of Insane Laughing Clown
> Sent: Thursday, March 31, 2011 4:58 PM
> To: For users of RP-PPPoE client/server software
> Subject: Re: [RP-PPPoE] Very high CPU usage by the pppoe-server
>
> On 03/31/2011 02:31 AM, Dardan Behluli wrote:
>> Hi,
>>
>> One of our several PPPoE servers has very high CPU usage. This occurs basically when the number of users terminated reaches 1300 - 1500. With more than 1500 users connected the server is practically inaccessible. When we do top we see that the process pppoe-server uses most of the CPU. The parameters of the pppoe-server are: pppoe-server -k -u -r -s -I -C Name -L local_IP -R remote_IP -N 3000.
>>
>> The difference in the configuration between this and the other PPPoE servers is that this one is not doing NAT, the RADIUS is giving public IP addresses to the clients connected to this NAS.
>>
>> We tried MSS clamping, synchronous PPP, etc., but it didn't help much.
>>
>> We would very much appreciate any insight on this issue.
>>
>>
>
> Hi,
>
>          As David noted in this thread, I also would recommend an 'strace' on
> the pppoe-server process to see what it's doing, since that is the
> process that you say is the top consumer of CPU ('strace -f -p <pid of
> server process>'). Really, pppoe-server does practically nothing - it
> sits idly in a select() (using no cpu at all) waiting for pppoe session
> requests, and quickly kicks out a new pppd in response to that, and then
> goes idle again waiting for the next one.
>
>          Off the top of my head without strace, I'd hazard a guess that this is
> a DoS of some kind - some bad client flooding you with PPPoE PADI or PADR
> perhaps. Have you done a tcpdump on the ethernet feeding you pppoe
> traffic? ('tcpdump -lni <interface> ether proto 0x8863'). This will show
> you all of the control messages that pppoe-server has to concern itself
> with.
>
>          Please follow up on the list and let us know what your results are.
>
> -ILC
>
> _______________________________________________
> RP-PPPoE mailing list
> RP-PPPoE at lists.roaringpenguin.com
> http://lists.roaringpenguin.com/cgi-bin/mailman/listinfo/rp-pppoe


