Given the following sample data
DATA test; INPUT level :6. name :$100. count :6.; INFILE DATALINES DSD; DATALINES; 1, Top-category 1, . 2, Sub-category 1.1, 10 2, Sub-category 1.2, 20 2, Sub-category 1.3, . 3, Sub-sub category 1.3.1, 5 3, Sub-sub category 1.3.2, 5 1, Top-category 2, . 2, Sub-category 2.1, 10 2, Sub-category 2.2, . 2, Sub-category 2.3, 5 ; RUN;
is there a way to calculate the sums of all sub-categories iteratively?
Examples:
The count-value is only given for the "lowest" categories. If the the value of the "lowest" category is missing "." this should actually be 0 (e.g. sub-category 2.2). This is why top-category 2 should be 15.
Why should top-category 2 have a sum of 10, when 2.1 = 10 and 2.3 = 5? Why would you discard the value of 2.3?
1) on your last input line there is a dot instead of a comma after the level.
2) Use next code to create separate variables to define categories and sub categories:
DATA test;
INFILE DATALINES dlm=',' truncover;
INPUT level :6. name :$30. count :6.;
length cat $10;
catx = scan(name,-1,' ');
array ct cat1-cat5;
do i=1 to countw(catx,'.');
ct(i) = scan(catx,i,'.');
end;
DATALINES;
1, Top-category 1, .
2, Sub-category 1.1, 10
2, Sub-category 1.2, 20
2, Sub-category 1.3, .
3, Sub-sub category 1.3.1, 5
3, Sub-sub category 1.3.2, 5
1, Top-category 2, .
2, Sub-category 2.1, 10
2, Sub-category 2.2, .
2, Sub-category 2.3, 5
;
RUN;
next step run proc summary to calculate totals and subtotals:
proc summary data=test;
class cat1-cat3;
var count;
output out=summary sum=;
run;
Thank you for your solution, which works fine! Unfortunately, I was just giving these numbers in the variable `name` to make my question more understandable here. In reality, my data are like this:
DATA test;
INPUT level :6. name :$100. count :6.;
INFILE DATALINES DSD;
DATALINES;
01, Top, .
0101, Sub, 10
0102, Sub, 20
0103, Sub, .
01031, Sub-sub, 5
01032, Sub-sub, 5
02, Top, .
0201, Sub, 10
0202, Sub, 20
0203, Sub, .
02031, Sub-sub, 10
02032, Sub-sub, .
020321, Sub-sub-sub, 6,
020322, Sub-sub-sub, 4
;
RUN;
So, 0101 should be used instead of 1.1, 02031 instead of 2.3.1, 020311 instead of 2.3.1.1 etc. Do you think you could adapt this for me?
So your categories are defined in the LEVEL variable, then
change next lines to separate the parent level from the sub-levels:
instead:
do i=1 to countw(catx,'.');
ct(i) = scan(catx,i,'.');
end;
use:
1) read level as $6 informat instead a numeric informat
then
array ct cat1-cat3;
do i=1 to length(level)-2 by 2;
ct(i) = substr(level,i,2);
end;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.