BMC Atrium Discovery Community Forum

This forum is now closed. Please check sticky posts and announcements for further information.

Reasoning performance
Posted: 28 October 2011 06:30 PM
BMC ADDM Staff
Administrator
Total Posts:  53
Joined:  2008-01-23

This thread is a follow-on from this one. Please post your experiences of a slowdown in scanning over time, especially if you find it is resolved by restarting the reasoning service.

Posted: 28 October 2011 07:18 PM   [ # 1 ]
Guru
Total Posts:  597
Joined:  2011-03-16

I found that it helps to modify the JVM settings through the UI for Topology, UI, and Export. We have set the max to 1024 for each. This has meant the system does not need to be rebooted as often (restarting services only helps a little). Without the heap change the system was at first using minimal CPU, but now it looks like it is mainly using the CPU. From our standpoint, we believe that garbage collection in the JVMs had a low priority, so discovery was eating much of the memory and the JVMs weren’t able to clean up after each endpoint was completed. This is just our thought process, as we have seen this with other applications. Is there a way to tweak the JVMs to have a heap size higher than 1024?

FYI, our consolidator does it as well when it tries to do the consolidation.

Posted: 29 October 2011 07:51 AM   [ # 2 ]
BMC ADDM Staff
Administrator
Total Posts:  2089
Joined:  2008-02-12

I find it surprising that modifying those JVM settings has any beneficial effect.

UI – this is used only by small parts of the UI and not at all by reasoning.
Export – this is only used if you use the old exporter, again not at all by reasoning.
Topology – will only be used if you use topology runs.

Are you using topology runs?

Posted: 31 October 2011 10:49 AM   [ # 3 ]
Guru
Total Posts:  597
Joined:  2011-03-16

We are not doing topology runs, we do not use the old exporters, and most of the time we stay out of the UI during the scan. We thought it strange as well, but why complain about what works?

It only helps a little: after a few scans without a reboot, the system normally stays at about 7.5 GB used out of 8. We are still open to anything else that might help us. We are doing scans with only Windows and AIX credentials across a few subnets containing blanks and network devices as well. As I have stated in earlier posts, it seemed like garbage collection could not run as required, but now it is at least running a little bit. If we wait for a day or so after the scan has completed before doing anything else, we get a little more RAM back, but not back to the 1.5 GB that the system uses when the services are started.

Any thoughts?

Posted: 31 October 2011 12:51 PM   [ # 4 ]
Guru
Total Posts:  225
Joined:  2010-06-17

I have never restarted the reasoning service on its own and would like to test that. Is the command “sudo /sbin/service reasoning stop” and then “start”, from the command line?

Craig

Posted: 03 November 2011 02:26 PM   [ # 5 ]
Member
Total Posts:  26
Joined:  2008-02-20

Very interested to hear if anyone has seen the performance improve after restarting the reasoning service. We were restarting the appliances weekly to get round the degradation and stopping just a single service sounds a little more attractive and less disruptive.
Weekly restarts were the advice we got from support and professional services some months ago, so I am keen to explore whether this may no longer be the recommended approach.

Are there details about how the reasoning service was identified as causing the problem?

Posted: 03 November 2011 02:58 PM   [ # 6 ]
Guru
Total Posts:  597
Joined:  2011-03-16

To restart the reasoning engine, kill -9 the PID and then run the “sudo /sbin/service tideway start” command.

On a side note, it only returns a little memory and does not free it all back like a full “reboot”.

In my opinion, the reasoning engine is having some sort of problem, and I was under the impression that it is used for consolidation as well. We have seen the same issue on our scanner and consolidator, so I do not think it matters which way you use the reasoning engine, but it looks like it is causing the problem. As I stated above, we tweaked some of the JVM settings, and though they should not have helped… they did. We still have the problem, but not as often between reboots.

Hope this helps you all.

Posted: 03 November 2011 03:05 PM   [ # 7 ]
BMC ADDM Staff
Administrator
Total Posts:  2089
Joined:  2008-02-12

Do NOT kill -9 the reasoning PID. This does not give it a chance to tidy up anything.

There are a couple of ways to restart reasoning.

Before ADDM 8.2 you will need to restart reasoning directly:

tw_svc_reasoning --daemon stop
tw_svc_reasoning --daemon start

From ADDM 8.2 it is possible to do this in a single step:

sudo /sbin/service tideway restart reasoning 
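
For anyone scripting this, the version split described above can be captured in a small helper. This is only a sketch: the function name restart_commands and the way the version string is parsed are illustrative assumptions, and the only commands it uses are the ones quoted in this post. It builds the commands and does not run them.

```python
def restart_commands(addm_version):
    """Return the reasoning restart commands for an ADDM version string.

    Hypothetical helper: only the major.minor part is compared,
    so "8.3 SP2" is treated as (8, 3).
    """
    parts = addm_version.split()[0].split(".")
    major = int(parts[0])
    minor = int(parts[1]) if len(parts) > 1 else 0

    if (major, minor) >= (8, 2):
        # From ADDM 8.2: a single restart step through the tideway service.
        return [["sudo", "/sbin/service", "tideway", "restart", "reasoning"]]

    # Before ADDM 8.2: stop and start the reasoning service directly.
    return [
        ["tw_svc_reasoning", "--daemon", "stop"],
        ["tw_svc_reasoning", "--daemon", "start"],
    ]


if __name__ == "__main__":
    # Print the commands for review instead of executing them.
    for command in restart_commands("8.1"):
        print(" ".join(command))
```

On an appliance, each returned command could then be passed to subprocess.run(command, check=True) so that a failed stop is noticed before the start is attempted.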

Posted: 03 November 2011 03:12 PM   [ # 8 ]
Guru
Total Posts:  597
Joined:  2011-03-16

Nice to know. I wish I could have found this on one of the BMC websites.

Even using that method, it did not free up much… (500 MB out of the 6 GB used).
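
One way to see how much memory the reasoning process itself is holding, as opposed to the appliance as a whole, is to read its resident set size before and after a restart. The following is a minimal sketch, assuming a Linux appliance where /proc/<pid>/status is readable. The helper names rss_mb and read_rss_mb are hypothetical, and the PID has to be looked up separately (for example with ps).

```python
def rss_mb(status_text):
    """Return the resident set size in MB from /proc/<pid>/status text.

    Returns None if there is no VmRSS line (kernel threads have none).
    """
    for line in status_text.splitlines():
        if line.startswith("VmRSS:"):
            # The line looks like "VmRSS:     123456 kB".
            return int(line.split()[1]) / 1024.0
    return None


def read_rss_mb(pid):
    """Read the status file for a live process and return its RSS in MB."""
    with open("/proc/%d/status" % pid) as handle:
        return rss_mb(handle.read())
```

Comparing this figure across a restart shows what the process gave back, independently of what the operating system reports as used overall.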

Posted: 03 November 2011 03:18 PM   [ # 9 ]
BMC ADDM Staff
Administrator
Total Posts:  2089
Joined:  2008-02-12

This is not how modern operating systems work.

After a reboot they don’t have anything cached and hence have free memory. Over time they use this free memory as a cache. Hence restarting something does not necessarily free all the associated memory.
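
To see how much of the "used" memory is really cache, the figures in /proc/meminfo can be split apart. This is a rough sketch, assuming a Linux appliance. The function names are illustrative, and subtracting Buffers and Cached is only an approximation of process memory, similar to the "-/+ buffers/cache" line printed by older versions of free.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo text into a dict of field name -> integer value.

    Most fields are reported in kB; the unit suffix is dropped.
    """
    fields = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if parts:
            fields[name.strip()] = int(parts[0])
    return fields


def memory_summary_mb(text):
    """Split used memory into cache and an estimate of process usage (MB)."""
    info = parse_meminfo(text)
    cache_kb = info.get("Buffers", 0) + info.get("Cached", 0)
    used_kb = info["MemTotal"] - info["MemFree"]
    return {
        "total": info["MemTotal"] // 1024,
        "used_including_cache": used_kb // 1024,
        "cache": cache_kb // 1024,
        "used_by_processes": (used_kb - cache_kb) // 1024,
    }


if __name__ == "__main__":
    with open("/proc/meminfo") as handle:
        print(memory_summary_mb(handle.read()))
```

If "cache" accounts for most of "used_including_cache", the memory is being held by the operating system for caching and can be reclaimed when processes need it, which is why a service restart alone does not return the system to its freshly booted figure.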

Posted: 10 November 2011 08:37 PM   [ # 10 ]
Guru
Total Posts:  597
Joined:  2011-03-16

So has anyone come up with anything else to try for this?

Posted: 10 November 2011 08:42 PM   [ # 11 ]
Guru
Total Posts:  225
Joined:  2010-06-17

Sorry Tim, I have been tied up in 8.3 issues and the ongoing separate SNMP thread.

Posted: 14 November 2011 04:51 PM   [ # 12 ]
Guru
Total Posts:  2740
Joined:  2008-01-25
David Farthing - 03 November 2011 02:26 PM
Are there details about how the reasoning service was identified as causing the problem?

The short answer, David, is that it hasn’t been. We are seeing a spread of symptoms across different accounts that may have one root cause or several, which is why we are suggesting different things to different folks, to try to focus attention on the right area. Whatever is going on is subtle, as it has evaded some pretty close scrutiny so far.

Several scenarios suggest to us the issue may lie in Reasoning (but there are some counter examples too) so in some instances we’re asking folks to restart that instead of the whole appliance as an empirical test.

I can assure you a large effort is being made to get to the bottom of this, which is one of the reasons for my recent absence from these parts.

[ Edited: 14 November 2011 06:32 PM by Charles Oldham]

Posted: 28 November 2011 09:19 PM   [ # 13 ]
Newbie
Total Posts:  3
Joined:  2011-06-23

Any new updates on this? I haven’t opened a support ticket because I noticed most people seem to be stuck with no solution.

Posted: 29 November 2011 03:18 AM   [ # 14 ]
BMC ADDM Staff
Member
Total Posts:  25
Joined:  2009-10-21

Please do open a support ticket if you are having a consolidation performance issue so we can investigate your context. Thank you.

Posted: 05 December 2011 02:16 PM   [ # 15 ]
Guru
Total Posts:  597
Joined:  2011-03-16

I lowered the thread count as Craig posted (http://discovery.bmc.com/community/forum/viewthread/2077/) and it has made our appliance run much better. We have not used over 80% of the memory for some time now, and the scans still take just as long as they used to (this is a good thing, as they were quick). We are not currently doing any SNMP scans (at least we have no credentials in the UI), and we have never done any. I hope this helps you all. I have a conference call with BMC today at 11:30 am EST because support keeps telling us that “This is how Linux works” but then in the same email states that “BMC expects to fix this issue in ADDM 8.3 SP2.” I have found much more help from all the people here on the forum, and during this conference call I am going to reference the forum posts to get them on the right path and to make sure they know everything that has been done with our appliance and what worked.
