11-24-2016 10:39 AM
I have a dataset in SAS that looks like the below but I'm looking to do the following and think an Array statement is the best way to do so...
My current data
SBR_ID Value 1234 A 1234 B 5432 A 6789 A 5432 C 1234 B 5432 B 6789 A 6789 C
I want my data set to end up like this whereby for each SBR_ID it will sum the amount of times that value appears.
SBR_ID A B C 1234 1 2 5432 1 1 1 6789 2 0 1
11-24-2016 11:23 AM
Although you could use a DATA step and may or may not need an ARRAY statement, if you want just a report, you could use PROC REPORT or PROC TABULATE. If you want a summary report AND a dataset, then PROC REPORT can do both for you with one pass through the data.
Here's the program:
data fakedata; infile datalines; input SBR_ID Value $; return; datalines; 1234 A 1234 B 5432 A 6789 A 5432 C 1234 B 5432 B 6789 A 6789 C ; run; title 'Report Showing Summary'; proc report data=fakedata out=work.trans_sum(rename=(_c2_=A _c3_=B _c4_=C)drop=_break_); column sbr_id n,value; define sbr_id / group; define value / across; define n / ' '; run; proc print data=work.trans_sum; title 'Data set created with PROC REPORT'; run;
And here's the report and a PROC PRINT of the dataset:
The key to understanding how PROC REPORT can do this means that you have to understand that, by default, the ACROSS items get absolute column numbers assigned to them when PROC REPORT designs the report in a pre-processing phase. So, conceptually, SBR_ID is column 1 and then the column for A is _c2_; the column for B is _C3_ and the column for C is _C4_. Usually, in your output dataset, you do NOT want the absolute column names, so the RENAME statement takes care of renaming the variables in work.trans_sum. And, then, PROC REPORT also makes a "helper" variable called _BREAK_ which isn't necessary, so it is dropped.
PROC TABULATE would do a report for you, but not an output dataset in transposed structure. Here's the sample TABULATE code:
title 'PROC TABULATE report'; proc tabulate data=fakedata out=work.tabout; class sbr_id value; table sbr_id, n*value; keylabel n=' '; run; proc print data=work.tabout; title 'Dataset created by TABULATE'; run;
You can test it out for yourself -- if all you need is a report, then either TABULATE or REPORT would do. If you need a DATASET, then DATA step, or REPORT would do.
11-24-2016 03:27 PM
Given that @Cynthia_sas generated sample data and the code runs on the sample data, your data either differs or you ran the code incorrectly.
WIthout seeing what you did, it's a toss up.
You can use proc freq to get the summaries by ID and then proc transpose to get the dataset desired.
11-24-2016 10:06 PM
Your want REPORT or want TABLE ?
data fakedata; infile datalines; input SBR_ID Value $; return; datalines; 1234 A 1234 B 5432 A 6789 A 5432 C 1234 B 5432 B 6789 A 6789 C ; run; proc freq data=fakedata noprint; table sbr_id*value/out=freq list nopercent; run; proc transpose data=freq out=want; by sbr_id; var count; id value; run;