SAS Viya 3.4 host group memory usage: One year later

A little more than a year ago, in an earlier blog, we measured the memory footprint of non-CAS host groups and services for a SAS Viya 3.4 order shipped in July 2018 (18w30 ship event, in SAS parlance). One would expect the memory profile to change across subsequent ship events, but by how much? In this blog we look at how memory usage shifted in a deployment of a SAS Viya 3.4 order that went to production in May 2019 (19w21 ship event), compared to the 18w30 order from a year ago.

Memory is cheap. Why do we care?

One of the architect’s tasks is to ensure that each host of SAS Viya services is configured with enough memory to reliably start and run the services under the expected load. In memory-rich environments this is not an issue. But if you read my recent blog about memory requirements, you know that when a machine does not have enough memory, certain servers might not start. Therefore memory allocation is one of many important steps in planning a SAS Viya 3.4 deployment.

One scenario we’ve seen where this can become a factor is when a customer has a limited selection of memory sizes for their virtual machines used for deployment. For example, when only virtual machines configured with 32GB of memory are available for SAS Viya services, the services must be spread across multiple machines. The mechanism for assigning services to machines is called host groups. In this scenario it is important to understand the host group memory profile and to distribute the host groups in a manner that ensures each machine has enough memory for the short and long term.

Most of the key points regarding host groups and topology were highlighted in my blog from a year ago, so we’ll get right into the comparison.

The Comparison

To recap the testing that we discussed in the previous blog, we wanted to determine whether SAS R&D had delivered on a promise to reduce the memory usage of SAS Viya 3.4 (July 2018), after memory requirements had increased in SAS Viya 3.3. We developed a method of finding the service associated with each SAS Viya component. We then mapped each service to its host group, which corresponds to a file name in the group_vars directory in the playbook. Finally, we used the PID for each service to check its memory usage.

Now we want to know whether the May 2019 release of SAS Viya 3.4 resulted in a further reduction in the memory footprint of its services.

Orders

The SAS Viya software orders that we used for comparison were orders that we used for deployment in our SAS workshop environment. Here are the details so that SAS employees can reproduce our tests (the ordering system used for testing is not available outside of the SAS network):

The preproduction 18w30 order from last year included the following products:

[Screen shot: product list of the 18w30 order]

The production order for 19w21 added more products: SAS Intelligent Decisioning, SAS Analytics for IoT and SAS Risk Engine.

The additional products mean that the comparison is not like-for-like, but the differences don’t make our reporting any less useful because the additional products brought along their own host groups.

An item of note is that while 18w30 was a fresh deployment, the 19w21 environment was not. It was based on the original 18w30 deployment, which was subsequently updated to production, updated again at 18w39, and finally updated to 19w21. This should not impact the reporting. In this environment all non-CAS services run on a single host with 96GB of memory.

Host Groups

If we look at the difference in host groups using the diff command, we see the following new host groups in the 19w21 order. (The diff command was used to discern the differences, but only the relevant host group information is displayed.) It is not surprising that there is one host group per added product.

# The AIoTServices host group contains the services for SAS Analytics for IoT.
[AIoTServices]

# Deploys the SAS High-Performance Risk analytics cluster installation script.
[HighPerformanceRiskGridInstaller]

# The subjectcontacts host group contains a service of SAS Intelligent Decisioning that records contacts and responses of treatments.
[subjectcontacts]

Total memory used for all host groups

Now that we know the difference in host groups, let’s look at a summary of overall memory usage. If you recall from the blog last year, the script walks through each service of each host group by scanning files in the group_vars directory of the SAS Viya playbook. It captures the rss (resident set size) and vm (virtual memory) size of each process tracked to a service. Some of the services are manually mapped to a host group as the mapping is not in the playbook. The script was executed not long after the services were all running and before any activity of significance was initiated. Therefore these measurements are a baseline.
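The script itself is not reproduced in this blog, so the following is only a minimal sketch of the measurement step. It assumes that rss and vm correspond to the VmRSS and VmSize fields of /proc/<pid>/status on Linux. The function names and the sample values are hypothetical.

```python
import re
from collections import defaultdict
from pathlib import Path


def parse_status(status_text):
    """Return (rss_kb, vm_kb) parsed from the text of /proc/<pid>/status."""
    fields = {}
    for line in status_text.splitlines():
        match = re.match(r"(VmRSS|VmSize):\s+(\d+)\s+kB", line)
        if match:
            fields[match.group(1)] = int(match.group(2))
    # Kernel threads have no Vm* lines, so report them as zero.
    return fields.get("VmRSS", 0), fields.get("VmSize", 0)


def memory_for_pid(pid):
    """Read rss and vm (in kB) for a running process on Linux."""
    return parse_status(Path(f"/proc/{pid}/status").read_text())


def total_by_host_group(samples):
    """Sum (host_group, rss_kb, vm_kb) samples into per-host-group totals.

    Returns {host_group: (rss_kb, vm_kb, service_count)}.
    """
    totals = defaultdict(lambda: [0, 0, 0])
    for host_group, rss_kb, vm_kb in samples:
        entry = totals[host_group]
        entry[0] += rss_kb
        entry[1] += vm_kb
        entry[2] += 1
    return {name: tuple(values) for name, values in totals.items()}
```

In practice, each service would first be resolved to its PID and to its host group (from the group_vars file names or the manual mapping), and `memory_for_pid` would supply the numbers that `total_by_host_group` rolls up.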

Remember that these numbers change constantly, even on an idle system. An idle Viya system may be idle from a user perspective, but there is always some activity within the system.

In the following table we see the summary of memory used, real and virtual, as well as the total number of services when the two ship events are compared.

Notice that the total memory used is basically the same even though three products were added and the deployment has eleven additional services. Because the totals are roughly equal while the number of products and services increased, it is reasonable to infer that the footprint of the services common to both ship events shrank.

The total memory used is approximately 67GB (68,858 / 1024) but that value includes some double-counting of shared memory. For example, the pgpool processes, of which there are over one thousand, access shared memory, but the script that calculates the numbers is not aware of shared memory. Therefore the actual amount of required memory is slightly less. A little more on this topic later.
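To illustrate why summing per-process numbers overstates the total, here is a small sketch. It makes the simplifying assumption that every process in a group maps the same shared segment, which real processes only approximate. The worker counts and sizes are hypothetical, not measurements.

```python
def mb_to_gb(megabytes):
    """Convert MB to GB using 1024 MB per GB, as in the text."""
    return megabytes / 1024


def naive_total_mb(processes):
    """Sum private + shared per process; shared pages are counted once per process."""
    return sum(p["private_mb"] + p["shared_mb"] for p in processes)


def deduplicated_total_mb(processes):
    """Count private pages per process but the common shared segment only once.

    Assumes all processes share a single segment (a simplification).
    """
    private = sum(p["private_mb"] for p in processes)
    shared = max((p["shared_mb"] for p in processes), default=0)
    return private + shared


# Hypothetical example: four workers, each with 2 MB private and 100 MB shared.
workers = [{"private_mb": 2, "shared_mb": 100} for _ in range(4)]
print(naive_total_mb(workers))          # 408
print(deduplicated_total_mb(workers))   # 108
print(round(mb_to_gb(68858), 1))        # 67.2
```

The gap between the two totals grows with the number of processes that share a segment, which is why a group with over one thousand processes is the obvious example.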

More details

If you read the blog from last year you recall that this information was placed in a spreadsheet to summarize information from the service/process level up to the host group level and then the overall level. The same process was used this time; new tabs were added for 19w21 and then a comparison was made between the two ship events.

The following screen shot from the spreadsheet compares data from the SAS Viya software for the two ship events. The last three columns calculate the difference for each of the three metrics. The differences of significance have been highlighted using conditional highlighting, showing decreases or increases in memory. The highlighting is a little counterintuitive because negative is “good” (memory footprint shrank) and positive is “bad” (memory footprint grew).

A quick review of the chart shows that the host groups with the largest decreases in memory used (>300MB) are CoreServices, espServer, ModelServices, pgpoolc, ReportViewerServices and VisualTextAnalytics. The host group with the largest decrease, espServer, consisted only of the esmagent at 18w30. The SAS Event Stream Manager agent is no longer shipped at 19w21. It should also be noted that the ESP server, although not a service, has the potential to consume a fair amount of memory.

However, in most deployments that include SAS Event Stream Processing, the ESP server will be deployed on a dedicated machine(s).

The host groups that increased the most (>300MB) were AdvancedAnalytics, DataMining, DataServices, ScoringServices and VisualForecasting. Notice that most of the host groups that increased in memory usage also increased in the number of services. Two of the last three, AIoTServices and subjectcontacts, show an increase simply because they are new. And although it is not new, espStreamviewer was not defined as a service at 18w30.

All things considered, the memory footprint has decreased between the two ship events. The difference is noticeable but not too significant.

What about growth?

An idle system is a good place to start, but it doesn’t provide insight into the services that will grow with user activity. The shared VLE environment used for SAS training workshops is a good environment to capture this information. Students regularly enroll in classes and connect to the environment. Once logged on, they run a wide variety of tasks, both in the web interfaces and via open source interfaces such as Python via JupyterHub. Although it doesn’t simulate a customer scenario, the VLE environment provides a wide variety of activity that drives a modest amount of resource usage.

Data was captured from this environment after eight days of usage. Over this period more than 40 users logged on. That data was then compared to the baseline shown above. The following screen shot shows the comparison of an idle versus active shared SAS testing environment.

As we would expect, many of the host groups experienced memory growth. The Grand Total row shows that the total memory used grew nearly 10GB (~15%).
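As a quick sanity check on that percentage, the sketch below assumes the idle 19w21 baseline is the 68,858 MB total quoted earlier and treats "nearly 10GB" as an upper bound of 10 × 1024 MB. Both inputs are approximations taken from the text, not exact spreadsheet values.

```python
def growth_percent(baseline_mb, growth_mb):
    """Return growth as a percentage of the baseline."""
    return 100.0 * growth_mb / baseline_mb


idle_mb = 68858          # idle total quoted earlier in this blog
growth_mb = 10 * 1024    # "nearly 10GB", so this is an upper bound

print(round(growth_percent(idle_mb, growth_mb), 1))  # 14.9
```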

The host groups experiencing the greatest growth can be seen in red-shaded cells. Looks like VisualTextAnalytics is the winner. Its memory usage grew about 1.6GB. Upon drilling into that host group, we see that there is one service, sas-viya-parse-execution-provider-default, that accounts for a significant portion (~800MB) of that amount.

Several host groups had decreased memory usage. These changes are most likely the result of stolen idle pages that are no longer in memory.

Although this data can provide some insight into the host groups whose memory requirements grew the most in a multi-user environment, chances are that it is not entirely representative of a realistic (customer) site.

Even more detail…

I could highlight additional details here, but I think you get the gist of the changes. Now is when the exploring begins. Open the spreadsheet via the link at the end of this blog. It contains nine tabs:

HG_Memory_18w30 – raw data from the script (idle)
HG_Memory_19w21 – raw data from the script (idle)
HG_Memory_19w21_Active – raw data from the script (active)
HG_Memory_Pivot_18w30 – pivot table from data (idle)
HG_Memory_Pivot_19w21 – pivot table from data (idle)
HG_Memory_Pivot_19w21_Active – pivot table from data (active)
CompareHG – host group comparison (screen shot above)
CompareHG_Active – host group comparison of 19w21 idle versus active
CompareHGServices – host group and service comparison of raw data (18w30 vs 19w21)

Within the pivot tables you can expand the host groups and view the services associated with each host group. The last tab, CompareHGServices, is an easy way to view the differences at the services level. The data in the various tabs can be used to group and assign host groups to multiple hosts in a multi-machine deployment of SAS Viya.
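One way to turn per-host-group totals into a machine assignment is a first-fit decreasing packing, sketched below. This is an illustration only, not a SAS-provided method: the host group sizes are hypothetical placeholders rather than values from the spreadsheet, the 15% headroom simply mirrors the idle-to-active growth discussed above, and the sketch ignores any placement constraints between host groups.

```python
def assign_host_groups(host_group_mb, machine_mb, headroom=0.15):
    """Pack host groups onto machines using first-fit decreasing.

    host_group_mb: {host_group_name: memory_in_mb}
    machine_mb:    memory per machine in MB
    headroom:      fraction of each machine kept free for growth
    Returns a list of machines, each a list of host group names.
    """
    usable = machine_mb * (1 - headroom)
    machines = []  # each entry: {"free_mb": float, "host_groups": [names]}
    by_size = sorted(host_group_mb.items(), key=lambda kv: kv[1], reverse=True)
    for name, size in by_size:
        if size > usable:
            raise ValueError(f"{name} needs {size} MB, more than one machine offers")
        for machine in machines:
            if machine["free_mb"] >= size:
                machine["free_mb"] -= size
                machine["host_groups"].append(name)
                break
        else:
            # No existing machine has room, so start a new one.
            machines.append({"free_mb": usable - size, "host_groups": [name]})
    return [m["host_groups"] for m in machines]


# Hypothetical sizes in MB, packed onto 32GB machines.
example = {
    "CoreServices": 20000,
    "DataServices": 9000,
    "VisualTextAnalytics": 6000,
    "ReportViewerServices": 5000,
    "pgpoolc": 1500,
}
for number, groups in enumerate(assign_host_groups(example, 32 * 1024), start=1):
    print(number, groups)
```

First-fit decreasing is a simple heuristic and does not guarantee the smallest number of machines, but it is easy to rerun whenever the measurements change.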

A useful tool

While revisiting this topic and looking at shared versus private memory, I came across a tool that I believe you may find useful if you have an interest in digging into process-level memory detail.

The tool, ps_mem, can be installed via yum or pip on Red Hat Enterprise Linux or CentOS. It is a Python package that provides numerous ways to summarize private and shared memory usage by process, user, etc.

https://github.com/pixelb/ps_mem

First, here is the command help:

ps_mem -h
Usage: ps_mem [OPTION]...
Show program core memory usage

-h, -help                   Show this help
-p <pid>[,pid2,...pidN]     Only show memory usage PIDs in the specified list
-s, --split-args            Show and separate by, all command line arguments
-t, --total                 Show only the total value
-d, --discriminate-by-pid   Show by process rather than by program
-S, --swap                  Show swap information
-w <N>                      Measure and show process memory every N seconds

The first example will show the private and shared memory of every process, sorted by ascending total. The output has been truncated as it is quite lengthy.

ps_mem -s
Private  +   Shared  =  RAM used       Program

 96.0 KiB +   8.5 KiB = 104.5 KiB       erl_child_setup 150000
124.0 KiB +  13.5 KiB = 137.5 KiB       /sbin/agetty --noclear tty1 linux
160.0 KiB +  14.5 KiB = 174.5 KiB       /usr/bin/lsmd -d
168.0 KiB +  25.5 KiB = 193.5 KiB       /usr/bin/rhsmcertd
168.0 KiB +  27.0 KiB = 195.0 KiB       rhnsd
192.0 KiB +  11.5 KiB = 203.5 KiB       /sbin/rngd -f
204.0 KiB +  14.5 KiB = 218.5 KiB       /opt/sas/viya/home/lib64/erlang/erts-9.2/bin/epmd -daemon
228.0 KiB +  32.0 KiB = 260.0 KiB       /usr/sbin/atd -f
236.0 KiB +  25.0 KiB = 261.0 KiB       sasels  34 31 35 2 f9dbe
232.0 KiB +  30.5 KiB = 262.5 KiB       sasels  7 4 8 2 7efd9
. . . .

Now let’s show memory usage for the RabbitMQ pid:

ps_mem -p 19216
Private  +   Shared  =  RAM used       Program

586.1 MiB + 410.0 KiB = 586.5 MiB       beam.smp
---------------------------------
                        586.5 MiB

Here is memory usage for the sas user account by process name, sorted by RAM used. The number following the process name indicates the number of processes.

ps_mem -p $(pgrep -d, -u sas)
Private  +   Shared  =  RAM used       Program

  6.6 MiB +   2.1 MiB =   8.6 MiB       objspawn
  6.8 MiB +   2.3 MiB =   9.1 MiB       cntspawn
  8.1 MiB +   1.3 MiB =   9.4 MiB       launcher
 10.3 MiB +  47.0 KiB =  10.4 MiB       sas-run-launcher
 17.0 MiB +  54.5 KiB =  17.1 MiB       sas-mbagent
 23.1 MiB +  57.5 KiB =  23.1 MiB       sas-alert-track
 33.4 MiB +  78.5 KiB =  33.5 MiB       sas-watch
 35.8 MiB +   7.0 KiB =  35.8 MiB       vault
 40.6 MiB +  76.5 KiB =  40.6 MiB       sas-stream
 66.2 MiB +   4.2 MiB =  70.4 MiB       consul-template (5)
 66.9 MiB +   6.1 MiB =  73.0 MiB       sas-ops-agent (2)
152.5 MiB +   8.5 KiB = 152.5 MiB       consul
514.1 MiB + 158.0 MiB = 672.0 MiB       postgres (220)
  1.3 GiB + 111.0 MiB =   1.4 GiB       pgpool (1027)
 64.3 GiB +  43.0 MiB =  64.3 GiB       java (159)
---------------------------------
                         66.9 GiB

Finally, the following output shows the memory total, in bytes, for each user on this host.

for i in $(ps -e -o user= | sort | uniq); do   printf '%-20s%10s\n' $i $(sudo ps_mem --total -p $(pgrep -d, -u $i)); done
apache                41384960
chrony                  757760
dbus                   1084928
ldap                  61241856
libstoragemgmt          220672
polkitd                8229376
postfix                2945024
root                1115027968
rpc                     771072
rpcuser                1070592
sas                 71785367552
sasrabbitmq          614343168

As you can see, nearly all of the memory is consumed by the sas account (about 66.9 GiB). The root and sasrabbitmq accounts follow at a distance, at roughly 1 GiB and 586 MiB respectively, and the remaining accounts have a very small memory footprint.
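Raw byte counts are hard to compare at a glance. The sketch below converts a few of the rows above to GiB and ranks the accounts. It assumes the totals are in bytes, which is consistent with the 66.9 GiB figure reported for the sas account earlier. The function names are hypothetical.

```python
def parse_user_totals(output):
    """Parse lines of 'user  bytes' into a {user: bytes} dictionary."""
    totals = {}
    for line in output.splitlines():
        parts = line.split()
        # Skip blank lines and anything that is not 'name number'.
        if len(parts) == 2 and parts[1].isdigit():
            totals[parts[0]] = int(parts[1])
    return totals


def to_gib(num_bytes):
    """Convert bytes to GiB."""
    return num_bytes / 2**30


sample = """\
root                1115027968
sas                 71785367552
sasrabbitmq          614343168
"""

totals = parse_user_totals(sample)
for user in sorted(totals, key=totals.get, reverse=True):
    print(f"{user:<15}{to_gib(totals[user]):8.2f} GiB")
```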

When ps_mem is run on an active system, we see processes initiated for individual users (the output has been truncated). Notice that the memory consumption per user varies from about 4MB to about 360MB.

for i in $(ps -e -o user= | sort | uniq); do   printf '%-20s%10s\n' $i $(sudo ps_mem --total -p $(pgrep -d, -u $i)); done
apache               155784704
chrony                  335872
dbus                    784896
gatedemo033          360659456
gatedemo034          271870976
gatedemo102          283167232
gatedemo106            4552192
gatedemo110            4660736
gatedemo115            4550144
gatedemo121          200544256
gatedemo122          341491200
gatedemo126            4544000
. . . .

If we look at user activity for one of the accounts, gatedemo033, we find that this user initiated a compute server. Keep this in mind when planning host group placement. An environment with many SAS Studio users will see at least one session per user. Other processes were the result of Python sessions initiated by JupyterHub.

ps_mem -p $(pgrep -d, -u gatedemo033)
Private  +   Shared  =  RAM used       Program

240.0 KiB +  65.5 KiB = 305.5 KiB       compsrv_start.s
340.4 MiB +   3.3 MiB = 343.7 MiB       compsrv
---------------------------------
                        344.0 MiB
=================================
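For rough planning, the single compsrv measurement above can be scaled by the expected number of concurrent users. This is a back-of-the-envelope sketch: it assumes one compute server session per user at about 343.7 MiB each, and real sessions will grow with the work they do. The user count of 50 is hypothetical.

```python
def compute_server_memory_gib(users, mib_per_session=343.7, sessions_per_user=1):
    """Estimate total compute server memory in GiB for a number of users."""
    return users * sessions_per_user * mib_per_session / 1024


# Hypothetical: 50 concurrent SAS Studio users, one session each.
print(round(compute_server_memory_gib(50), 1))  # 16.8
```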

If you have a SAS Viya environment where you want to know the memory footprint, this tool is a quick way to determine memory usage. We recommend using this Python tool before you plan your host group assignments.

Final Thoughts

This blog revealed a lot of data captured from three different systems, so let’s try to summarize what we discovered.

A SAS Viya 3.4 deployment at 19w21 used about the same amount of memory as an 18w30 order, despite including three additional products and new services that were added to existing host groups. Therefore we conclude that the memory footprint of existing services shrank over that period of time. However, the change (~2-3GB) is likely not significant enough to change the way host groups are distributed across multiple machines.

There is about a 15% increase in memory usage when an active 19w21 environment is compared to an idle system. Twelve of the 37 host groups account for most of this growth. Remember to keep growth in mind when planning how to place host groups on a single machine or across multiple machines. YMMV.

If you want to assess the shared and private memory of SAS Viya processes on a deployed system, be sure to check out the ps_mem Python package.

Finally, the conversion of services to the Go language in SAS Viya 3.5 is expected to dramatically change the memory landscape. We look forward to assessing the memory footprint of this change in the near future.