BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
ccp27
Calcite | Level 5

Hello.

I am very new to sas and am hoping someone can help me. I have a longitudinal survey data set and want to conduct simple descriptive statistics. When I run proc surveyfreq, sas is counting each ID as an observation. Is there an easy way to make sure sas only counts the unique subject IDs?

 

For example this is a snippet of my data :

SUBJECT_ID  GENDER  AGE  

1007197           1               45

1007197           1               45

1007197           1               45

1007813           2               62

1007813           2               62

1007925           1               53

1007925           1               53

 

So if I ran a freq on gender it would show total of 7 obs when in reality it should only be a total of 3. 

 

Would it be correct to do a proc sort with nodupkey to create a separate dataset to conduct only the descriptive stats?

 

Thanks in advance!

1 ACCEPTED SOLUTION

Accepted Solutions
ghosh
Barite | Level 11

Lots of great suggestions, here is an SQL approach

 

data have; input 
SUBJECT_ID  GENDER  AGE;  
cards;
1007197           1               45
1007197           1               45
1007197           1               45
1007813           2               62
1007813           2               62
1007925           1               53
1007925           1               53
;
proc sql;
  create table want as
  select distinct SUBJECT_ID, GENDER, AGE, count(SUBJECT_ID) as n_ID 
  from have
  group by SUBJECT_ID
  ;
 quit;

 

 

View solution in original post

5 REPLIES 5
novinosrin
Tourmaline | Level 20

"Would it be correct to do a proc sort with nodupkey to create a separate dataset to conduct only the descriptive stats?"- I really like this idea.

SteveDenham
Jade | Level 19

Other ways might be to sort on subject_id, and SET the data using first. or last. criteria.  Pretty much the same thing as nodupkey, though.

 

SteveDenham

novinosrin
Tourmaline | Level 20

Sir @SteveDenham  What a pleasant surprise. I didn't know God of statistics is still active? Have you incarnated again? Believe it or not, Many of stat mates were missing you so badly. 

ghosh
Barite | Level 11

Lots of great suggestions, here is an SQL approach

 

data have; input 
SUBJECT_ID  GENDER  AGE;  
cards;
1007197           1               45
1007197           1               45
1007197           1               45
1007813           2               62
1007813           2               62
1007925           1               53
1007925           1               53
;
proc sql;
  create table want as
  select distinct SUBJECT_ID, GENDER, AGE, count(SUBJECT_ID) as n_ID 
  from have
  group by SUBJECT_ID
  ;
 quit;

 

 

Tom
Super User Tom
Super User

SQL is a simple way to do this without modifying the data.

You can use the DISTINCT keyword inside the COUNT() function. Examples:

proc sql;
select count(distinct subject_id) as nsubjects from have;
select age,count(distinct subject_id) as nsubjects
  from have
  group by age
;
quit;

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 16. Read more here about why you should contribute and what is in it for you!

Submit your idea!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 5 replies
  • 1260 views
  • 2 likes
  • 5 in conversation