DATA Step, Macro, Functions and more

Array statement help

Reply
Contributor
Posts: 57

Array statement help

Hello,

 

I have a dataset in SAS that looks like the below but I'm looking to do the following and think an Array statement is the best way to do so...

 

My current data

SBR_ID	Value
1234	A
1234	B
5432	A
6789	A
5432	C
1234	B
5432	B
6789	A
6789	C

I want my data set to end up like this whereby for each SBR_ID it will sum the amount of times that value appears. 

SBR_ID	A	B	C
1234	1	2	
5432	1	1	1
6789	2	0	1

Thanks 

SAS Super FREQ
Posts: 8,743

Re: Array statement help

Hi:

  Although you could use a DATA step and may or may not need an ARRAY statement,  if you want just a report, you could use PROC REPORT or PROC TABULATE. If you want a summary report AND a dataset, then PROC REPORT can do both for you with one pass through the data.

 

  Here's the program:

data fakedata;
  infile datalines;
  input SBR_ID	Value $;
return;
datalines;
1234	A
1234	B
5432	A
6789	A
5432	C
1234	B
5432	B
6789	A
6789	C
;
run;
  
title 'Report Showing Summary';
proc report data=fakedata 
            out=work.trans_sum(rename=(_c2_=A _c3_=B _c4_=C)drop=_break_);
  column sbr_id n,value;
  define sbr_id / group;
  define value / across;
  define n / ' ';
run;
  
proc print data=work.trans_sum;
  title 'Data set created with PROC REPORT';
run;

  And here's the report and a PROC PRINT of the dataset:

use_proc_report.png

 

The key to understanding how PROC REPORT can do this means that you have to understand that, by default, the ACROSS items get absolute column numbers assigned to them when PROC REPORT designs the report in a pre-processing phase. So, conceptually, SBR_ID is column 1 and then the column for A is _c2_; the column for B is _C3_ and the column for C is _C4_. Usually, in your output dataset, you do NOT want the absolute column names, so the RENAME statement takes care of renaming the variables in work.trans_sum. And, then, PROC REPORT also makes a "helper" variable called _BREAK_ which isn't necessary, so it is dropped.

 

  PROC TABULATE would do a report for you, but not an output dataset in transposed structure. Here's the sample TABULATE code:


title 'PROC TABULATE report';
proc tabulate data=fakedata out=work.tabout;
  class sbr_id value;
  table sbr_id,
        n*value;
  keylabel n=' ';
run;

proc print data=work.tabout;
title 'Dataset created by TABULATE';
run;

You can test it out for yourself -- if all you need is a report, then either TABULATE or REPORT would do. If you need a DATASET, then DATA step, or REPORT would do.

 

cynthia

Contributor
Posts: 57

Re: Array statement help

Hi,

 

That doesn't work for me.

 

I just get blanks when I run your code - also, I want it to output a data set and not a table.

 

Thanks

Super User
Posts: 17,898

Re: Array statement help

Given that @Cynthia_sas generated sample data and the code runs on the sample data, your data either differs or you ran the code incorrectly.

WIthout seeing what you did, it's a toss up.

 

You can use proc freq to get the summaries by ID  and then proc transpose to get the dataset desired.

Super User
Posts: 9,687

Re: Array statement help

Your want REPORT or want TABLE ?

 

data fakedata;
  infile datalines;
  input SBR_ID	Value $;
return;
datalines;
1234	A
1234	B
5432	A
6789	A
5432	C
1234	B
5432	B
6789	A
6789	C
;
run;
proc freq data=fakedata noprint;
 table sbr_id*value/out=freq list nopercent;
run;
proc transpose data=freq out=want;
by sbr_id;
var count;
id value;
run;
Ask a Question
Discussion stats
  • 4 replies
  • 189 views
  • 1 like
  • 4 in conversation