SAS Data Integration Studio, DataFlux Data Management Studio, SAS/ACCESS, SAS Data Loader for Hadoop and others

Grouping by character variable

Accepted Solution Solved
Reply
Regular Contributor
Posts: 181
Accepted Solution

Grouping by character variable

I have created the following dataset for reference :

data readin;

Input X$ Y;

cards;

D 1

B 2

A 2

A 3

A 3

B 3

A 3;

run;

I want to calculate count of X by Y variable. Since X is a character variable, the aggregation cannot be done using PROC MEANS and PROC TABULATE. I dont want to create X into numerical groups. I know this can be easily done via PROC SQL (Code shown below)

PROC SQL;

Create table readin1 as

Select Y, count(X) from

readin

group by Y;

quit;

Is it possible via data step or procedures other than PROC SQL?


Accepted Solutions
Solution
‎08-01-2014 03:37 PM
Respected Advisor
Posts: 3,777

Re: Grouping by character variable

proc tabulate data=readin;
   where not missing(x);
   class y;
   tables y;
   run;

View solution in original post


All Replies
Super User
Super User
Posts: 7,407

Re: Grouping by character variable

Well, first question would be that SQL does the job pretty well, why so reluctant to not use it?  I suppose you could do it by creating a format, and then putting your character into that format so the underlying data would be numeric, and displayed as A or B.  Then means would work.  You could also use rank procedure to derive a value for each one:

data readin;

Input X$ Y;

cards;

D 1

B 2

A 2

A 3

A 3

B 3

A 3

;

run;

data want;

  set readin;

  numversion=rank(x);

run;

Finally way I can think of is to retain the last value each time:

proc sort data=readin;

  by y x;

run;

data want;

  set readin;

  retain count;

  by Y X;

  if first.X then count=1;

  else count=count+1;

  if last.X then output;

run;

Super User
Posts: 6,946

Re: Grouping by character variable

You could use proc means without a variable and take the _freq_ output value.

proc sort data=readin;

by y x;

run;

proc means data=readin;

by y x;

output out=want;

run;

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
Respected Advisor
Posts: 3,777

Re: Grouping by character variable

Your PROC SQL is counting the non-missing values of X grouped by Y.  That can also be done with PROC SUMMARY and  a WHERE statement.

data readin;
Input X$ Y;
cards;
D 1
B 2
A 2
A 3
A 3
. 3
B 3
A 3
;
run;

PROC SQL;
  
Select Y, count(X) from readin
  
group by Y;
   quit;
  
run;
proc summary nway;
  
where not missing(x);
   class y;
   output out=freq;
   run;
proc print;
  
run;

8-1-2014 6-46-05 AM.png
Super User
Posts: 10,516

Re: Grouping by character variable

I would be interested in seeing the PROC Tabulate that didn't work as I count character variables often using Tabulate.

Regular Contributor
Posts: 181

Re: Grouping by character variable

@ballardw : Thank you for your email. The following PROC TABULATE code doesn't work for me as X is a character variable.

PROC TABULATE DATA = READIN;                                                                                                           

CLASS Y;                                                                                                                               

VAR X;                                                                                                                                 

TABLE X*N*Y;                                                                                                                           

RUN;

Super User
Posts: 10,516

Re: Grouping by character variable

Var variables in tabulate are expected to be numeric. Make X a class variable and it should work.

Regular Contributor
Posts: 181

Re: Grouping by character variable

If i make X a class variable, it will return cross tab. This is what i am not looking for. I want count of X grouped by Y.

Refer the code : -

PROC SQL;

Create table readin1 as

Select Y, count(X) from

readin

group by Y;

quit;

Solution
‎08-01-2014 03:37 PM
Respected Advisor
Posts: 3,777

Re: Grouping by character variable

proc tabulate data=readin;
   where not missing(x);
   class y;
   tables y;
   run;
Trusted Advisor
Posts: 1,204

Re: Grouping by character variable

proc tabulate data=readin out=want(where=(x = ' ') drop=_Smiley Happy;

class x y;

table y,x all;

run;

proc print data=want;

run;

☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 9 replies
  • 607 views
  • 0 likes
  • 6 in conversation