BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
ccp27
Calcite | Level 5

Hello.

I am very new to sas and am hoping someone can help me. I have a longitudinal survey data set and want to conduct simple descriptive statistics. When I run proc surveyfreq, sas is counting each ID as an observation. Is there an easy way to make sure sas only counts the unique subject IDs?

 

For example this is a snippet of my data :

SUBJECT_ID  GENDER  AGE  

1007197           1               45

1007197           1               45

1007197           1               45

1007813           2               62

1007813           2               62

1007925           1               53

1007925           1               53

 

So if I ran a freq on gender it would show total of 7 obs when in reality it should only be a total of 3. 

 

Would it be correct to do a proc sort with nodupkey to create a separate dataset to conduct only the descriptive stats?

 

Thanks in advance!

1 ACCEPTED SOLUTION

Accepted Solutions
ghosh
Barite | Level 11

Lots of great suggestions, here is an SQL approach

 

data have; input 
SUBJECT_ID  GENDER  AGE;  
cards;
1007197           1               45
1007197           1               45
1007197           1               45
1007813           2               62
1007813           2               62
1007925           1               53
1007925           1               53
;
proc sql;
  create table want as
  select distinct SUBJECT_ID, GENDER, AGE, count(SUBJECT_ID) as n_ID 
  from have
  group by SUBJECT_ID
  ;
 quit;

 

 

View solution in original post

5 REPLIES 5
novinosrin
Tourmaline | Level 20

"Would it be correct to do a proc sort with nodupkey to create a separate dataset to conduct only the descriptive stats?"- I really like this idea.

SteveDenham
Jade | Level 19

Other ways might be to sort on subject_id, and SET the data using first. or last. criteria.  Pretty much the same thing as nodupkey, though.

 

SteveDenham

novinosrin
Tourmaline | Level 20

Sir @SteveDenham  What a pleasant surprise. I didn't know God of statistics is still active? Have you incarnated again? Believe it or not, Many of stat mates were missing you so badly. 

ghosh
Barite | Level 11

Lots of great suggestions, here is an SQL approach

 

data have; input 
SUBJECT_ID  GENDER  AGE;  
cards;
1007197           1               45
1007197           1               45
1007197           1               45
1007813           2               62
1007813           2               62
1007925           1               53
1007925           1               53
;
proc sql;
  create table want as
  select distinct SUBJECT_ID, GENDER, AGE, count(SUBJECT_ID) as n_ID 
  from have
  group by SUBJECT_ID
  ;
 quit;

 

 

Tom
Super User Tom
Super User

SQL is a simple way to do this without modifying the data.

You can use the DISTINCT keyword inside the COUNT() function. Examples:

proc sql;
select count(distinct subject_id) as nsubjects from have;
select age,count(distinct subject_id) as nsubjects
  from have
  group by age
;
quit;

hackathon24-white-horiz.png

The 2025 SAS Hackathon Kicks Off on June 11!

Watch the live Hackathon Kickoff to get all the essential information about the SAS Hackathon—including how to join, how to participate, and expert tips for success.

YouTube LinkedIn

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 5 replies
  • 2299 views
  • 2 likes
  • 5 in conversation