BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
shasank
Quartz | Level 8

Hi Community, 

 

I wanted to explore a way to get a # of all the Distinct ID's in all the datasets of a library. The final output should have the list of all the datasets in the library under column A and # of Distinct ID's in Column B. Please share possibilities or approachable logics. 

 

Thank you for you time.

1 ACCEPTED SOLUTION

Accepted Solutions
PaigeMiller
Diamond | Level 26
/* Find all data sets in a library named PAIGE */
proc sql;
    create table dsnames as select distinct memname from dictionary.tables where libname='PAIGE';
quit;

/* Count distinct IDs in all of these data sets */
data _null_;
    set dsnames;
    length str $ 1024;
    str=cats('proc sql; create table ids as select "' ,memname,'" as memname length=32,count(distinct id) as n from paige.',memname,';quit;');
    str2='proc append base=all new=ids; run;';
    str3='proc delete data=ids; run;';
    call execute(cat(str,str2,str3));
run;
--
Paige Miller

View solution in original post

21 REPLIES 21
PeterClemmensen
Tourmaline | Level 20

What if a data set in the library does not contain the variable ID?

shasank
Quartz | Level 8
Thank you for your question. If the dataset doesn't have the ID variable then the count could be shown as zero or missing. But, the current library has ID in all the datasets.
PeterClemmensen
Tourmaline | Level 20

Ok. Do the data sets share some naming convention?

shasank
Quartz | Level 8
Yes. They have the same name across all the datasets. The ID is the key variable across all the datasets.
PeterClemmensen
Tourmaline | Level 20

Yes but do the actual data sets have similar names, like data1, data2 and so on?

shasank
Quartz | Level 8
Oh no. They donot have a sequence. They are names differently.
PeterClemmensen
Tourmaline | Level 20

Ok. Even though you got your answer, here is an alternative.

 

data one;
input id var1;
datalines;
1 10
1 20
3 40
3 50
3 60
;

data two;
input id var2;
datalines;
1 10
1 20
4 30
4 40
6 50
6 60
6 70
;

data three;
input id var3;
datalines;
7 10
7 20
7 30
7 40
;


data want(keep = memname distinct);

   dcl hash hoh (ordered : "Y");
   hoh.definekey("memname", "id");
   hoh.definedata("h", "memname", "id");
   hoh.definedone();
   dcl hiter i ("hoh");

   do until (z);
      set sashelp.vcolumn(where=(libname = 'WORK' and upcase(name) = 'ID')) end = z;
      id=.;
      if hoh.find() ne 0 then do;
         dcl hash h (dataset : memname, duplicate : 'r');
         h.definekey ("id");
         h.definedone();
         hoh.add();
      end;
   end;

   do while (i.next() = 0);
      distinct = h.num_items;
      output;
   end;

run;
PaigeMiller
Diamond | Level 26
/* Find all data sets in a library named PAIGE */
proc sql;
    create table dsnames as select distinct memname from dictionary.tables where libname='PAIGE';
quit;

/* Count distinct IDs in all of these data sets */
data _null_;
    set dsnames;
    length str $ 1024;
    str=cats('proc sql; create table ids as select "' ,memname,'" as memname length=32,count(distinct id) as n from paige.',memname,';quit;');
    str2='proc append base=all new=ids; run;';
    str3='proc delete data=ids; run;';
    call execute(cat(str,str2,str3));
run;
--
Paige Miller
shasank
Quartz | Level 8
HI Paige, Thank you for your help. I tried your code and I the dsnames came out as an empty dataset. The log has no errors.
PaigeMiller
Diamond | Level 26

@shasank wrote:
HI Paige, Thank you for your help. I tried your code and I the dsnames came out as an empty dataset. The log has no errors.

Show us. Show us the log, and show us the incorrect results.

--
Paige Miller
shasank
Quartz | Level 8

Hi Paige, 

 

Here is the code. 

 

shasank_0-1611594151618.pngshasank_1-1611594291739.png

 

shasank
Quartz | Level 8
Nev And Labor are 2 libraries I am testing.
PaigeMiller
Diamond | Level 26

Library name must be capital letters:

 

LABOUR

--
Paige Miller
shasank
Quartz | Level 8

Thank you. The all caps of Library worked. I am trouble shooting the rest of the code. 


/* Count distinct IDs in all of these data sets */
data _null_;
set dsnames;
length str $ 1024;
str=cats('proc sql; create table ids as select "' ,memname,'" as memname length=32,count(distinct ptid) as n from Labor.',memname,';quit;');
str2='proc append base=all new=ids; run;';
str3='proc delete data=ids; run;';
call execute(cat(str,str2,str3));
run;

shasank_0-1611595193048.png

 

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 21 replies
  • 1701 views
  • 4 likes
  • 6 in conversation