Hello.
I am very new to SAS and am hoping someone can help me. I have a longitudinal survey data set and want to run simple descriptive statistics. When I run PROC SURVEYFREQ, SAS counts every row as an observation, so subjects with repeated records are counted more than once. Is there an easy way to make sure SAS counts each unique subject ID only once?
For example, this is a snippet of my data:
SUBJECT_ID GENDER AGE
1007197 1 45
1007197 1 45
1007197 1 45
1007813 2 62
1007813 2 62
1007925 1 53
1007925 1 53
So if I ran a freq on gender, it would show a total of 7 obs when in reality it should be only 3.
Would it be correct to do a proc sort with nodupkey to create a separate dataset to conduct only the descriptive stats?
Thanks in advance!
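Lots of great suggestions. Here is an SQL approach:
/* Sample data from the question */
data have;
  input SUBJECT_ID GENDER AGE;
  cards;
1007197 1 45
1007197 1 45
1007197 1 45
1007813 2 62
1007813 2 62
1007925 1 53
1007925 1 53
;
run;

/* One row per subject, plus the number of records for that subject.    */
/* GENDER and AGE are not in the GROUP BY, so SAS remerges the count    */
/* onto every row (see the log NOTE); DISTINCT then drops the repeats.  */
proc sql;
  create table want as
  select distinct SUBJECT_ID, GENDER, AGE, count(SUBJECT_ID) as n_ID
  from have
  group by SUBJECT_ID
  ;
quit;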
"Would it be correct to do a proc sort with nodupkey to create a separate dataset to conduct only the descriptive stats?"- I really like this idea.
Another way is to sort on subject_id and then read the data with SET and a BY statement, keeping only the rows where first.subject_id (or last.subject_id) is true. That is pretty much the same thing as NODUPKEY, though.
SteveDenham
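For anyone following along, here is a minimal sketch of the NODUPKEY and FIRST. ideas mentioned above. It is not a definitive recipe: the dataset names subjects, have_sorted and subjects2 are made up for illustration, and the sample data is simply retyped from the question.

```sas
/* Sample data retyped from the question (dataset name HAVE is assumed). */
data have;
  input SUBJECT_ID GENDER AGE;
  cards;
1007197 1 45
1007197 1 45
1007197 1 45
1007813 2 62
1007813 2 62
1007925 1 53
1007925 1 53
;
run;

/* Option 1: NODUPKEY keeps the first row of each SUBJECT_ID.          */
/* OUT= writes to a new dataset, so HAVE itself is left unchanged.     */
proc sort data=have out=subjects nodupkey;
  by subject_id;
run;

/* Option 2: sort, then keep the first row per subject in a DATA step. */
proc sort data=have out=have_sorted;
  by subject_id;
run;

data subjects2;
  set have_sorted;
  by subject_id;
  if first.subject_id;   /* subsetting IF: keep first row of each BY group */
run;

/* Descriptive statistics on one row per subject. */
proc freq data=subjects;
  tables gender;
run;
```

With the sample data, the PROC FREQ table shows 2 subjects with GENDER=1 and 1 subject with GENDER=2, for a total of 3. Both options keep whichever record comes first for each subject, so they suit variables that do not change between survey waves (such as gender). For a value like age that can change over time, decide which record you want to keep before deduplicating.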
Sir @SteveDenham, what a pleasant surprise! I didn't know the god of statistics was still active. Have you reincarnated? Believe it or not, many of us stat mates were missing you badly.
SQL is a simple way to do this without modifying the data.
You can use the DISTINCT keyword inside the COUNT() function. Examples:
proc sql;
  /* total number of unique subjects */
  select count(distinct subject_id) as nsubjects from have;
  /* number of unique subjects at each age */
  select age, count(distinct subject_id) as nsubjects
  from have
  group by age
  ;
quit;