BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
gsk
Obsidian | Level 7 gsk
Obsidian | Level 7

I want to count number of distinct values of subcategories/subgroups within a group. For example, I have dataset called data:

 

data data;
infile datalines missover delimiter=',';
length Name $6 School $12 Degree $10 ;
input Name $ School $ Degree $ ;
datalines;
David, Brown, CS
David, UC Davis,
David,UC Davis, Psych
Linda,UPenn, Statistics
Robert,U of T , Statistics
Robert, UCLA , Math
Sally, Northwestern, English
Sally, NYU, English
;

 

For this data, group is individual, and first subgroup is school, and second subgroup is major. 

 

David attended two distinct schools Brown and UC Davis, out of 3 of his rows. He studied three distinct fields CS, blank (blank needs to be counted as its own subgroup), and Psych. I want to get his school_count to be 2 and major_count to be 3. 

 

Eventually, I want a dataset like this: 

 

data want;
infile datalines missover delimiter=',';
length Name $6 School $12 Degree $10 school_flag degree_flag 3.;
input Name $ School $ Degree $ school_flag degree_flag 3.;
datalines;
David, Brown, CS, 2, 3
David, UC Davis, , 2, 3
David,UC Davis, Psych, 2, 3
Linda,UPenn, Statistics, 1, 1
Robert,U of T , Statistics, 2, 2
Robert, UCLA , Math, 2, 2
Sally, Northwestern, English, 2, 1
Sally, NYU, English, 2, 1
;

 

 

Is there a way to do this using data steps or proc sql? Thank you! 

1 ACCEPTED SOLUTION

Accepted Solutions
novinosrin
Tourmaline | Level 20

Hi @gsk  Pretty straight forward SQL

 

data data;
infile datalines missover delimiter=',';
length Name $6 School $12 Degree $10 ;
input Name $ School $ Degree $ ;
datalines;
David, Brown, CS
David, UC Davis,
David,UC Davis, Psych
Linda,UPenn, Statistics
Robert,U of T , Statistics
Robert, UCLA , Math
Sally, Northwestern, English
Sally, NYU, English
;

 
proc sql;
create table want as
select *,count(distinct School) as school_flag,count(distinct ifc(degree=' ','9999',degree)) as degree_flag 
from data
group by name;
quit;

View solution in original post

3 REPLIES 3
novinosrin
Tourmaline | Level 20

Hi @gsk  Pretty straight forward SQL

 

data data;
infile datalines missover delimiter=',';
length Name $6 School $12 Degree $10 ;
input Name $ School $ Degree $ ;
datalines;
David, Brown, CS
David, UC Davis,
David,UC Davis, Psych
Linda,UPenn, Statistics
Robert,U of T , Statistics
Robert, UCLA , Math
Sally, Northwestern, English
Sally, NYU, English
;

 
proc sql;
create table want as
select *,count(distinct School) as school_flag,count(distinct ifc(degree=' ','9999',degree)) as degree_flag 
from data
group by name;
quit;
gsk
Obsidian | Level 7 gsk
Obsidian | Level 7

Thank you!! I looked up the syntax but still don't understand this part

 

count(distinct ifc(degree=' ','9999',degree)) as degree_flag

 

Could you please explain what distinct ifc(degree=' ','9999',degree) mean..? 

 

novinosrin
Tourmaline | Level 20

Since you mentioned Missing as valid count,Basically I am resetting missing to a non missing junk value before I count distinct using IFC function. 

sas-innovate-white.png

🚨 Early Bird Rate Extended!

Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9.

 

Lock in the best rate now before the price increases on April 1.

Register now!

Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 3 replies
  • 1097 views
  • 2 likes
  • 2 in conversation