proc freq data=x nlevels;
table patient_id;
run;
I am trying to count the number of unique patients in this data.
Does the resulting output show the number of unique patient IDs, or just the total number of observations with non-missing IDs?
Thank you, I appreciate the help.
Accepted Solutions
If you need a count of unique IDs:
proc sql;
select count(*)
from (select DISTINCT patient_id from x);
quit;
If you need a list of unique IDs:
proc sql;
select distinct Patient_id
from x;
quit;
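One caveat worth illustrating, as a minimal sketch rather than part of the original answer: the subquery form counts a missing patient_id as one distinct value, while COUNT(DISTINCT ...) skips missing values. The data set demo_ids and its values are made up for illustration and are not the poster's x.

```sas
/* Hypothetical demo data: 3 distinct non-missing IDs plus one missing ID */
data demo_ids;
   input patient_id;
   datalines;
101
101
102
103
.
;
run;

proc sql;
   /* SELECT DISTINCT keeps the missing value as its own row, so this returns 4 */
   select count(*) as n_incl_missing
      from (select distinct patient_id from demo_ids);

   /* COUNT(DISTINCT ...) ignores missing values, so this returns 3 */
   select count(distinct patient_id) as n_excl_missing
      from demo_ids;
quit;
```

Which of the two is right depends on whether a missing ID should count as a patient.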
It will give the number of observations for each level of patient_id. So if every patient_id has a frequency of 1, we can conclude that each patient_id is unique.
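One hedged way to run that check directly is to list only the IDs that occur more than once; an empty result would mean every patient_id is unique. The data set visits and its values are made up for illustration.

```sas
/* Hypothetical demo data: ID 102 appears twice */
data visits;
   input patient_id;
   datalines;
101
102
102
103
;
run;

proc sql;
   /* Keep only IDs whose frequency is greater than 1 */
   select patient_id, count(*) as n_obs
      from visits
      group by patient_id
      having count(*) > 1;
quit;
```

With this data the query returns a single row, patient_id 102 with n_obs 2.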
Not quite. NLEVELS does report the number of distinct values, in the Number of Variable Levels table.
Be careful about how it treats missing values, though, and how you want them treated.
proc freq data=sashelp.class nlevels;
table age;
run;
To answer your question, OP: the NLEVELS option counts distinct values, including missing values. The following code breaks several variables down into counts of unique missing and non-missing values, which is much easier than doing the same task with PROC SQL:
ods output nlevels=LEVELS;
proc freq data=dataset nlevels;
tables var1 var2 var3 var4 / noprint ;
run;
proc print data=LEVELS;
run;
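For a version that runs as-is, here is a minimal sketch of the same pattern on made-up data. The data set demo_levels, its variables, and its values are assumptions for illustration only. When at least one variable has missing values, the captured table should include NLevels, NMissLevels, and NNonMissLevels columns alongside TableVar.

```sas
/* Hypothetical demo data; a single period marks a missing value */
data demo_levels;
   input patient_id site $;
   datalines;
101 A
101 A
102 B
.   B
103 .
;
run;

/* Capture the Number of Variable Levels table as a data set */
ods output nlevels=id_levels;
proc freq data=demo_levels nlevels;
   /* NOPRINT suppresses the frequency tables; the levels table is still produced */
   tables patient_id site / noprint;
run;

proc print data=id_levels;
run;
```

Here patient_id has three distinct non-missing values plus a missing level, and site has two plus a missing level, so the non-missing level count is the number to read if missing IDs should not count as patients.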