BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
ccp27
Calcite | Level 5

Hello.

I am very new to sas and am hoping someone can help me. I have a longitudinal survey data set and want to conduct simple descriptive statistics. When I run proc surveyfreq, sas is counting each ID as an observation. Is there an easy way to make sure sas only counts the unique subject IDs?

 

For example this is a snippet of my data :

SUBJECT_ID  GENDER  AGE  

1007197           1               45

1007197           1               45

1007197           1               45

1007813           2               62

1007813           2               62

1007925           1               53

1007925           1               53

 

So if I ran a freq on gender it would show total of 7 obs when in reality it should only be a total of 3. 

 

Would it be correct to do a proc sort with nodupkey to create a separate dataset to conduct only the descriptive stats?

 

Thanks in advance!

1 ACCEPTED SOLUTION

Accepted Solutions
ghosh
Barite | Level 11

Lots of great suggestions, here is an SQL approach

 

data have; input 
SUBJECT_ID  GENDER  AGE;  
cards;
1007197           1               45
1007197           1               45
1007197           1               45
1007813           2               62
1007813           2               62
1007925           1               53
1007925           1               53
;
proc sql;
  create table want as
  select distinct SUBJECT_ID, GENDER, AGE, count(SUBJECT_ID) as n_ID 
  from have
  group by SUBJECT_ID
  ;
 quit;

 

 

View solution in original post

5 REPLIES 5
novinosrin
Tourmaline | Level 20

"Would it be correct to do a proc sort with nodupkey to create a separate dataset to conduct only the descriptive stats?"- I really like this idea.

SteveDenham
Jade | Level 19

Other ways might be to sort on subject_id, and SET the data using first. or last. criteria.  Pretty much the same thing as nodupkey, though.

 

SteveDenham

novinosrin
Tourmaline | Level 20

Sir @SteveDenham  What a pleasant surprise. I didn't know God of statistics is still active? Have you incarnated again? Believe it or not, Many of stat mates were missing you so badly. 

ghosh
Barite | Level 11

Lots of great suggestions, here is an SQL approach

 

data have; input 
SUBJECT_ID  GENDER  AGE;  
cards;
1007197           1               45
1007197           1               45
1007197           1               45
1007813           2               62
1007813           2               62
1007925           1               53
1007925           1               53
;
proc sql;
  create table want as
  select distinct SUBJECT_ID, GENDER, AGE, count(SUBJECT_ID) as n_ID 
  from have
  group by SUBJECT_ID
  ;
 quit;

 

 

Tom
Super User Tom
Super User

SQL is a simple way to do this without modifying the data.

You can use the DISTINCT keyword inside the COUNT() function. Examples:

proc sql;
select count(distinct subject_id) as nsubjects from have;
select age,count(distinct subject_id) as nsubjects
  from have
  group by age
;
quit;

SAS Innovate 2025: Register Now

Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 5 replies
  • 1854 views
  • 2 likes
  • 5 in conversation