Quartz | Level 8

## Grouping by character variable

I have created the following dataset for reference :

Input X\$ Y;

cards;

D 1

B 2

A 2

A 3

A 3

B 3

A 3;

run;

I want to calculate count of X by Y variable. Since X is a character variable, the aggregation cannot be done using PROC MEANS and PROC TABULATE. I dont want to create X into numerical groups. I know this can be easily done via PROC SQL (Code shown below)

PROC SQL;

Select Y, count(X) from

group by Y;

quit;

Is it possible via data step or procedures other than PROC SQL?

1 ACCEPTED SOLUTION

Accepted Solutions

## Re: Grouping by character variable

where not missing(x);
class y;
tables y;
run;
9 REPLIES 9
Diamond | Level 26

## Re: Grouping by character variable

Well, first question would be that SQL does the job pretty well, why so reluctant to not use it?  I suppose you could do it by creating a format, and then putting your character into that format so the underlying data would be numeric, and displayed as A or B.  Then means would work.  You could also use rank procedure to derive a value for each one:

Input X\$ Y;

cards;

D 1

B 2

A 2

A 3

A 3

B 3

A 3

;

run;

data want;

numversion=rank(x);

run;

Finally way I can think of is to retain the last value each time:

by y x;

run;

data want;

retain count;

by Y X;

if first.X then count=1;

else count=count+1;

if last.X then output;

run;

Super User

## Re: Grouping by character variable

You could use proc means without a variable and take the _freq_ output value.

by y x;

run;

by y x;

output out=want;

run;

## Re: Grouping by character variable

Your PROC SQL is counting the non-missing values of X grouped by Y.  That can also be done with PROC SUMMARY and  a WHERE statement.

Input X\$ Y;
cards;
D 1
B 2
A 2
A 3
A 3
. 3
B 3
A 3
;
run;

PROC SQL;

group by Y;
quit;

run;
proc summary nway;

where not missing(x);
class y;
output out=freq;
run;
proc print;

run;

Super User

## Re: Grouping by character variable

I would be interested in seeing the PROC Tabulate that didn't work as I count character variables often using Tabulate.

Quartz | Level 8

## Re: Grouping by character variable

@ballardw : Thank you for your email. The following PROC TABULATE code doesn't work for me as X is a character variable.

CLASS Y;

VAR X;

TABLE X*N*Y;

RUN;

Super User

## Re: Grouping by character variable

Var variables in tabulate are expected to be numeric. Make X a class variable and it should work.

Quartz | Level 8

## Re: Grouping by character variable

If i make X a class variable, it will return cross tab. This is what i am not looking for. I want count of X grouped by Y.

Refer the code : -

PROC SQL;

Select Y, count(X) from

group by Y;

quit;

## Re: Grouping by character variable

where not missing(x);
class y;
tables y;
run;
Ammonite | Level 13

## Re: Grouping by character variable

proc tabulate data=readin out=want(where=(x = ' ') drop=_:);

class x y;

table y,x all;

run;

proc print data=want;

run;

Discussion stats
• 9 replies
• 4047 views
• 2 likes
• 6 in conversation