Hello,
I have a dataset in SAS that looks like the below but I'm looking to do the following and think an Array statement is the best way to do so...
My current data
SBR_ID Value 1234 A 1234 B 5432 A 6789 A 5432 C 1234 B 5432 B 6789 A 6789 C
I want my data set to end up like this whereby for each SBR_ID it will sum the amount of times that value appears.
SBR_ID A B C 1234 1 2 5432 1 1 1 6789 2 0 1
Thanks
Hi:
Although you could use a DATA step and may or may not need an ARRAY statement, if you want just a report, you could use PROC REPORT or PROC TABULATE. If you want a summary report AND a dataset, then PROC REPORT can do both for you with one pass through the data.
Here's the program:
data fakedata;
infile datalines;
input SBR_ID Value $;
return;
datalines;
1234 A
1234 B
5432 A
6789 A
5432 C
1234 B
5432 B
6789 A
6789 C
;
run;
title 'Report Showing Summary';
proc report data=fakedata
out=work.trans_sum(rename=(_c2_=A _c3_=B _c4_=C)drop=_break_);
column sbr_id n,value;
define sbr_id / group;
define value / across;
define n / ' ';
run;
proc print data=work.trans_sum;
title 'Data set created with PROC REPORT';
run;
And here's the report and a PROC PRINT of the dataset:
The key to understanding how PROC REPORT can do this means that you have to understand that, by default, the ACROSS items get absolute column numbers assigned to them when PROC REPORT designs the report in a pre-processing phase. So, conceptually, SBR_ID is column 1 and then the column for A is _c2_; the column for B is _C3_ and the column for C is _C4_. Usually, in your output dataset, you do NOT want the absolute column names, so the RENAME statement takes care of renaming the variables in work.trans_sum. And, then, PROC REPORT also makes a "helper" variable called _BREAK_ which isn't necessary, so it is dropped.
PROC TABULATE would do a report for you, but not an output dataset in transposed structure. Here's the sample TABULATE code:
title 'PROC TABULATE report';
proc tabulate data=fakedata out=work.tabout;
class sbr_id value;
table sbr_id,
n*value;
keylabel n=' ';
run;
proc print data=work.tabout;
title 'Dataset created by TABULATE';
run;
You can test it out for yourself -- if all you need is a report, then either TABULATE or REPORT would do. If you need a DATASET, then DATA step, or REPORT would do.
cynthia
Hi,
That doesn't work for me.
I just get blanks when I run your code - also, I want it to output a data set and not a table.
Thanks
Given that @Cynthia_sas generated sample data and the code runs on the sample data, your data either differs or you ran the code incorrectly.
WIthout seeing what you did, it's a toss up.
You can use proc freq to get the summaries by ID and then proc transpose to get the dataset desired.
Your want REPORT or want TABLE ?
data fakedata;
infile datalines;
input SBR_ID Value $;
return;
datalines;
1234 A
1234 B
5432 A
6789 A
5432 C
1234 B
5432 B
6789 A
6789 C
;
run;
proc freq data=fakedata noprint;
table sbr_id*value/out=freq list nopercent;
run;
proc transpose data=freq out=want;
by sbr_id;
var count;
id value;
run;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.