Kroken
Calcite | Level 5

I have a category variable with age groups, such as 0-10, 11-20, 21-30, 31-, and I want to graphically present the change in percent frequency of each class between two dates. Does anyone have an idea of how I can do this within the Report Designer in SAS Visual Analytics? The desired result (which should be presented graphically somehow) is something like this.

 

The first three columns are percent frequency by date, and the last two are growth in percent frequency relative to 2016-01-01 (for example, the 0-10 group goes from 30% to 20%, a change of (20 - 30) / 30 = -33%).

 

Age group   2016-01-01   2016-06-01   2017-01-01   Growth 2016-01-01 to 2016-06-01   Growth 2016-01-01 to 2017-01-01
0-10        30%          20%          20%          -33%                              -33%
11-20       20%          30%          20%          50%                               0%
21-30       20%          30%          40%          50%                               100%
31-         30%          20%          20%          -33%                              -33%
Tot         100%         100%         100%

 

 

These figures should be calculated from data looking something like this:

 

Date        ID   Age group
2016-01-01   1   0-10
2016-01-01   2   0-10
2016-01-01   3   0-10
2016-01-01   4   11-20
2016-01-01   5   11-20
2016-01-01   6   21-30
2016-01-01   7   21-30
2016-01-01   8   31-
2016-01-01   9   31-
2016-01-01  10   31-
2016-06-01   1   0-10
2016-06-01   2   0-10
2016-06-01   3   11-20
2016-06-01   4   11-20
2016-06-01   5   11-20
2016-06-01   6   21-30
2016-06-01   7   21-30
2016-06-01   8   21-30
2016-06-01   9   31-
2016-06-01  10   31-
2017-01-01   1   0-10
2017-01-01   2   0-10
2017-01-01   3   11-20
2017-01-01   4   11-20
2017-01-01   5   21-30
2017-01-01   6   21-30
2017-01-01   7   21-30
2017-01-01   8   21-30
2017-01-01   9   31-
2017-01-01  10   31-

 

AhmedAl_Attar
Rhodochrosite | Level 12

Hey Kroken,

 

Here is an alternative approach for your consideration:

I fully understand the benefit of loading data in memory and what have you, but sometimes working with purpose-built data can yield a lot more benefit without demanding tons of resources.

 

Build your base percent frequency aggregated table first, then load it into the LASR server; from that point, your calculated columns should be more straightforward.

 

Here is sample code to build the base percent frequency aggregated table:

 

DATA have;
    /* Sample input: one row per person (ID) per date */
    LENGTH Date 8 ID 4 Age_group $5;
    INPUT Date :yymmdd10. ID Age_group;
    FORMAT Date yymmdd10.;
DATALINES;
2016-01-01 1 0-10
2016-01-01 2 0-10
2016-01-01 3 0-10
2016-01-01 4 11-20
2016-01-01 5 11-20
2016-01-01 6 21-30
2016-01-01 7 21-30
2016-01-01 8 31-
2016-01-01 9 31-
2016-01-01 10 31-
2016-06-01 1 0-10
2016-06-01 2 0-10
2016-06-01 3 11-20
2016-06-01 4 11-20
2016-06-01 5 11-20
2016-06-01 6 21-30
2016-06-01 7 21-30
2016-06-01 8 21-30
2016-06-01 9 31-
2016-06-01 10 31-
2017-01-01 1 0-10
2017-01-01 2 0-10
2017-01-01 3 11-20
2017-01-01 4 11-20
2017-01-01 5 21-30
2017-01-01 6 21-30
2017-01-01 7 21-30
2017-01-01 8 21-30
2017-01-01 9 31-
2017-01-01 10 31-
;
RUN;

PROC TABULATE DATA=WORK.HAVE OUT=WORK.SUMMARY;
    CLASS Date      / ORDER=UNFORMATTED MISSING;
    CLASS Age_group / ORDER=UNFORMATTED MISSING;
    TABLE
        /* Row dimension */
        Age_group={LABEL=''},
        /* Column dimension */
        Date={LABEL=''}*ColPctN={LABEL=''}*F=BEST4.
        /* Table options */
        / BOX={LABEL='Age Group'};
RUN;


/* Turn the summary into one column per date, with Age_group as the row key */
PROC TRANSPOSE DATA=WORK.SUMMARY OUT=WORK.WANT(DROP=_NAME_);
    BY Age_group;
    ID Date;
    IDLABEL Date;
    VAR PctN_10;
RUN;
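If you also want the last two columns from your desired table (growth relative to the first date), one way is to compute them from the long WORK.SUMMARY table before transposing, so you don't have to depend on the variable names PROC TRANSPOSE derives from the formatted dates. This is just a minimal sketch, assuming the first date within each age group is the baseline:

PROC SORT DATA=WORK.SUMMARY OUT=WORK.SUMMARY_SRT;
    BY Age_group Date;
RUN;

DATA WORK.GROWTH;
    SET WORK.SUMMARY_SRT;
    BY Age_group;
    RETAIN base_pct;
    /* Keep the first date's percent frequency as the baseline */
    IF FIRST.Age_group THEN base_pct = PctN_10;
    /* Relative change versus the baseline date, in percent */
    ELSE growth_pct = 100 * (PctN_10 - base_pct) / base_pct;
    FORMAT growth_pct 8.;
    DROP base_pct;
RUN;

This keeps the result in long format (one row per Date and Age_group combination), which also tends to be easier to graph in VA than the transposed layout.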

 
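Once the growth table (or the transposed WORK.WANT) is ready, it can be loaded into the LASR server for use in the Report Designer. Here is a sketch using the SASIOLA engine; the host, port, and tag below are placeholders for your site's LASR server settings:

/* HOST=, PORT=, and TAG= are placeholders; substitute your */
/* site's LASR server connection details.                   */
LIBNAME valasr SASIOLA HOST="lasr.example.com" PORT=10010 TAG="vapublic";

DATA valasr.age_group_growth;
    SET WORK.GROWTH;
RUN;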

 

Hope this helps,

Ahmed

Kroken
Calcite | Level 5

Hi Ahmed,

 

Thanks for your reply. Are you an experienced user of VA? I am very new to the tool, and I get the feeling that it lacks the flexibility I was hoping for. It seems that I very often get stuck when I try to build a graphical solution (I am only working in the report builder so far) that requires data to be processed over more than two or three dimensions (or "steps"). For this example, my initial intention was to load the actual age variable into the LASR server, then make custom categories, calculate the percent frequency, and then the development of each category. But it seems it has to be done outside of VA (as you suggest). Other attempts to work with the development of grouped data have also failed, and I have been advised to create purpose-built data outside VA.

 

It seems to me that I should change my mindset and start to work with smaller datasets with purpose-built data instead of trying to work with a large data mart containing raw data. For example, calculating the growth per age group outside VA and adding a maximum of one or maybe two additional dimensions (such as date), and using different source data tables for my reports. It kind of removes the benefits I thought I would find in this tool, but it seems it is the best way to avoid getting stuck and frustrated.

 

Do you think that this is a better approach whenever I want to present something that requires processing raw data more than two or three times within the report builder? 

 

Kind regards,

Kroken

 

AhmedAl_Attar
Rhodochrosite | Level 12

Hi Kroken,

 

Here is my personal opinion, and I'm by no means an advanced user of this tool, but I wish I was.

 

I cannot disagree with you with regard to the lack of functionality, but then again, which tool/software doesn't suffer from this? That's why R&D and software developers keep coming up with updates and enhancements in every new release. That's just the nature of software applications. This is by no means a justification; it's a fact, and that's why you see customers favor one vendor over another, because of what the tools can offer out of the box.

 

When it comes to report design and development, all BI tool vendors will agree that having pre-aggregated data saves resources and speeds delivery of results and visuals to the end user. Their tools might have the ability to calculate these custom aggregates on the fly within the report, via custom calculations and what have you, but when you start overloading/repeating these custom calculations over multiple reports and data sources, the performance starts taking a hit, and users start to notice delays in response times. That leads to frustration, despite all the query caching and parallel query capability they may have put in place!

 

When you watch or listen to their experts talking about techniques to enhance query/report performance, they will all advise pushing the processing down to the database or utilizing pre-aggregated data to minimize run-time processing.

The other thing to consider: most, if not all, of these BI tools provide a data preparation tool of some sort for that purpose.

 

As a report designer, you would probably start with a sketch of the report on a piece of paper, then start putting/building the visual objects together and assembling the final report. During that process, you would discover:

- Report Objective / Focus Area 

- Exact data requirements (details, aggregates, ratios, differences, etc.)

- Visual Objects (native, custom built)

- Annotations

- Links

 

All I can say is: don't get discouraged. No report happens in just a few seconds; even the ones you see in demos are all rehearsed 😉

 

Check this blog for reference: How to design an infographic about volcanic eruptions using SAS Visual Analytics

For additional related blogs, just go to http://blogs.sas.com/content/ and search for "Visual Analytics".

 

Hope this helps,

Ahmed

Kroken
Calcite | Level 5

Thank you for your tips, time, and insight, Ahmed! Much appreciated. I'm going small!

 

Kind regards,

Kroken

AhmedAl_Attar
Rhodochrosite | Level 12

Hi Kroken,

 

I came across this paper and I thought it might be of interest to you.

Designing for Performance – Best Practices for SAS® Visual Analytics Reports

 

Regards,

Ahmed
