BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
gsk
Obsidian | Level 7 gsk
Obsidian | Level 7

I want to count number of distinct values of subcategories/subgroups within a group. For example, I have dataset called data:

 

data data;
infile datalines missover delimiter=',';
length Name $6 School $12 Degree $10 ;
input Name $ School $ Degree $ ;
datalines;
David, Brown, CS
David, UC Davis,
David,UC Davis, Psych
Linda,UPenn, Statistics
Robert,U of T , Statistics
Robert, UCLA , Math
Sally, Northwestern, English
Sally, NYU, English
;

 

For this data, group is individual, and first subgroup is school, and second subgroup is major. 

 

David attended two distinct schools Brown and UC Davis, out of 3 of his rows. He studied three distinct fields CS, blank (blank needs to be counted as its own subgroup), and Psych. I want to get his school_count to be 2 and major_count to be 3. 

 

Eventually, I want a dataset like this: 

 

data want;
infile datalines missover delimiter=',';
length Name $6 School $12 Degree $10 school_flag degree_flag 3.;
input Name $ School $ Degree $ school_flag degree_flag 3.;
datalines;
David, Brown, CS, 2, 3
David, UC Davis, , 2, 3
David,UC Davis, Psych, 2, 3
Linda,UPenn, Statistics, 1, 1
Robert,U of T , Statistics, 2, 2
Robert, UCLA , Math, 2, 2
Sally, Northwestern, English, 2, 1
Sally, NYU, English, 2, 1
;

 

 

Is there a way to do this using data steps or proc sql? Thank you! 

1 ACCEPTED SOLUTION

Accepted Solutions
novinosrin
Tourmaline | Level 20

Hi @gsk  Pretty straight forward SQL

 

data data;
infile datalines missover delimiter=',';
length Name $6 School $12 Degree $10 ;
input Name $ School $ Degree $ ;
datalines;
David, Brown, CS
David, UC Davis,
David,UC Davis, Psych
Linda,UPenn, Statistics
Robert,U of T , Statistics
Robert, UCLA , Math
Sally, Northwestern, English
Sally, NYU, English
;

 
proc sql;
create table want as
select *,count(distinct School) as school_flag,count(distinct ifc(degree=' ','9999',degree)) as degree_flag 
from data
group by name;
quit;

View solution in original post

3 REPLIES 3
novinosrin
Tourmaline | Level 20

Hi @gsk  Pretty straight forward SQL

 

data data;
infile datalines missover delimiter=',';
length Name $6 School $12 Degree $10 ;
input Name $ School $ Degree $ ;
datalines;
David, Brown, CS
David, UC Davis,
David,UC Davis, Psych
Linda,UPenn, Statistics
Robert,U of T , Statistics
Robert, UCLA , Math
Sally, Northwestern, English
Sally, NYU, English
;

 
proc sql;
create table want as
select *,count(distinct School) as school_flag,count(distinct ifc(degree=' ','9999',degree)) as degree_flag 
from data
group by name;
quit;
gsk
Obsidian | Level 7 gsk
Obsidian | Level 7

Thank you!! I looked up the syntax but still don't understand this part

 

count(distinct ifc(degree=' ','9999',degree)) as degree_flag

 

Could you please explain what distinct ifc(degree=' ','9999',degree) mean..? 

 

novinosrin
Tourmaline | Level 20

Since you mentioned Missing as valid count,Basically I am resetting missing to a non missing junk value before I count distinct using IFC function. 

sas-innovate-white.png

Special offer for SAS Communities members

Save $250 on SAS Innovate and get a free advance copy of the new SAS For Dummies book! Use the code "SASforDummies" to register. Don't miss out, May 6-9, in Orlando, Florida.

 

View the full agenda.

Register now!

Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 3 replies
  • 1141 views
  • 2 likes
  • 2 in conversation