I have created the following dataset for reference :
data readin;
Input X$ Y;
cards;
D 1
B 2
A 2
A 3
A 3
B 3
A 3;
run;
I want to calculate count of X by Y variable. Since X is a character variable, the aggregation cannot be done using PROC MEANS and PROC TABULATE. I dont want to create X into numerical groups. I know this can be easily done via PROC SQL (Code shown below)
PROC SQL;
Create table readin1 as
Select Y, count(X) from
readin
group by Y;
quit;
Is it possible via data step or procedures other than PROC SQL?
Well, first question would be that SQL does the job pretty well, why so reluctant to not use it? I suppose you could do it by creating a format, and then putting your character into that format so the underlying data would be numeric, and displayed as A or B. Then means would work. You could also use rank procedure to derive a value for each one:
data readin;
Input X$ Y;
cards;
D 1
B 2
A 2
A 3
A 3
B 3
A 3
;
run;
data want;
set readin;
numversion=rank(x);
run;
Finally way I can think of is to retain the last value each time:
proc sort data=readin;
by y x;
run;
data want;
set readin;
retain count;
by Y X;
if first.X then count=1;
else count=count+1;
if last.X then output;
run;
You could use proc means without a variable and take the _freq_ output value.
proc sort data=readin;
by y x;
run;
proc means data=readin;
by y x;
output out=want;
run;
Your PROC SQL is counting the non-missing values of X grouped by Y. That can also be done with PROC SUMMARY and a WHERE statement.
I would be interested in seeing the PROC Tabulate that didn't work as I count character variables often using Tabulate.
@ballardw : Thank you for your email. The following PROC TABULATE code doesn't work for me as X is a character variable.
PROC TABULATE DATA = READIN;
CLASS Y;
VAR X;
TABLE X*N*Y;
RUN;
Var variables in tabulate are expected to be numeric. Make X a class variable and it should work.
If i make X a class variable, it will return cross tab. This is what i am not looking for. I want count of X grouped by Y.
Refer the code : -
PROC SQL;
Create table readin1 as
Select Y, count(X) from
readin
group by Y;
quit;
proc tabulate data=readin out=want(where=(x = ' ') drop=_:);
class x y;
table y,x all;
run;
proc print data=want;
run;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Need to connect to databases in SAS Viya? SAS’ David Ghan shows you two methods – via SAS/ACCESS LIBNAME and SAS Data Connector SASLIBS – in this video.
Find more tutorials on the SAS Users YouTube channel.