Hi Community,
I wanted to explore a way to get a # of all the Distinct ID's in all the datasets of a library. The final output should have the list of all the datasets in the library under column A and # of Distinct ID's in Column B. Please share possibilities or approachable logics.
Thank you for you time.
/* Find all data sets in a library named PAIGE */
proc sql;
create table dsnames as select distinct memname from dictionary.tables where libname='PAIGE';
quit;
/* Count distinct IDs in all of these data sets */
data _null_;
set dsnames;
length str $ 1024;
str=cats('proc sql; create table ids as select "' ,memname,'" as memname length=32,count(distinct id) as n from paige.',memname,';quit;');
str2='proc append base=all new=ids; run;';
str3='proc delete data=ids; run;';
call execute(cat(str,str2,str3));
run;
What if a data set in the library does not contain the variable ID?
Ok. Do the data sets share some naming convention?
Yes but do the actual data sets have similar names, like data1, data2 and so on?
Ok. Even though you got your answer, here is an alternative.
data one;
input id var1;
datalines;
1 10
1 20
3 40
3 50
3 60
;
data two;
input id var2;
datalines;
1 10
1 20
4 30
4 40
6 50
6 60
6 70
;
data three;
input id var3;
datalines;
7 10
7 20
7 30
7 40
;
data want(keep = memname distinct);
dcl hash hoh (ordered : "Y");
hoh.definekey("memname", "id");
hoh.definedata("h", "memname", "id");
hoh.definedone();
dcl hiter i ("hoh");
do until (z);
set sashelp.vcolumn(where=(libname = 'WORK' and upcase(name) = 'ID')) end = z;
id=.;
if hoh.find() ne 0 then do;
dcl hash h (dataset : memname, duplicate : 'r');
h.definekey ("id");
h.definedone();
hoh.add();
end;
end;
do while (i.next() = 0);
distinct = h.num_items;
output;
end;
run;
/* Find all data sets in a library named PAIGE */
proc sql;
create table dsnames as select distinct memname from dictionary.tables where libname='PAIGE';
quit;
/* Count distinct IDs in all of these data sets */
data _null_;
set dsnames;
length str $ 1024;
str=cats('proc sql; create table ids as select "' ,memname,'" as memname length=32,count(distinct id) as n from paige.',memname,';quit;');
str2='proc append base=all new=ids; run;';
str3='proc delete data=ids; run;';
call execute(cat(str,str2,str3));
run;
@shasank wrote:
HI Paige, Thank you for your help. I tried your code and I the dsnames came out as an empty dataset. The log has no errors.
Show us. Show us the log, and show us the incorrect results.
Hi Paige,
Here is the code.
Library name must be capital letters:
LABOUR
Thank you. The all caps of Library worked. I am trouble shooting the rest of the code.
/* Count distinct IDs in all of these data sets */
data _null_;
set dsnames;
length str $ 1024;
str=cats('proc sql; create table ids as select "' ,memname,'" as memname length=32,count(distinct ptid) as n from Labor.',memname,';quit;');
str2='proc append base=all new=ids; run;';
str3='proc delete data=ids; run;';
call execute(cat(str,str2,str3));
run;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.