I have created the following dataset for reference :
data readin;
Input X$ Y;
cards;
D 1
B 2
A 2
A 3
A 3
B 3
A 3;
run;
I want to calculate count of X by Y variable. Since X is a character variable, the aggregation cannot be done using PROC MEANS and PROC TABULATE. I dont want to create X into numerical groups. I know this can be easily done via PROC SQL (Code shown below)
PROC SQL;
Create table readin1 as
Select Y, count(X) from
readin
group by Y;
quit;
Is it possible via data step or procedures other than PROC SQL?
Well, first question would be that SQL does the job pretty well, why so reluctant to not use it? I suppose you could do it by creating a format, and then putting your character into that format so the underlying data would be numeric, and displayed as A or B. Then means would work. You could also use rank procedure to derive a value for each one:
data readin;
Input X$ Y;
cards;
D 1
B 2
A 2
A 3
A 3
B 3
A 3
;
run;
data want;
set readin;
numversion=rank(x);
run;
Finally way I can think of is to retain the last value each time:
proc sort data=readin;
by y x;
run;
data want;
set readin;
retain count;
by Y X;
if first.X then count=1;
else count=count+1;
if last.X then output;
run;
You could use proc means without a variable and take the _freq_ output value.
proc sort data=readin;
by y x;
run;
proc means data=readin;
by y x;
output out=want;
run;
Your PROC SQL is counting the non-missing values of X grouped by Y. That can also be done with PROC SUMMARY and a WHERE statement.
I would be interested in seeing the PROC Tabulate that didn't work as I count character variables often using Tabulate.
@ballardw : Thank you for your email. The following PROC TABULATE code doesn't work for me as X is a character variable.
PROC TABULATE DATA = READIN;
CLASS Y;
VAR X;
TABLE X*N*Y;
RUN;
Var variables in tabulate are expected to be numeric. Make X a class variable and it should work.
If i make X a class variable, it will return cross tab. This is what i am not looking for. I want count of X grouped by Y.
Refer the code : -
PROC SQL;
Create table readin1 as
Select Y, count(X) from
readin
group by Y;
quit;
proc tabulate data=readin out=want(where=(x = ' ') drop=_:);
class x y;
table y,x all;
run;
proc print data=want;
run;
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Need to connect to databases in SAS Viya? SAS’ David Ghan shows you two methods – via SAS/ACCESS LIBNAME and SAS Data Connector SASLIBS – in this video.
Find more tutorials on the SAS Users YouTube channel.