BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
shasank
Quartz | Level 8

Hi Community, 

 

I am trying to solve a unique requirement and need your help if an efficient way exists to get this done. 

I have >10 datasets in my library and each have an ID. I need to create a dataset to as a summary of if an ID is present in each dataset. 

 

 data CLASS(label='Student Data');
   infile datalines dsd truncover;
   input ID:$2. Sex:$1. Age:2. Height:3.2 Weight:2.2;
 datalines;
 1, M, 14, 69, 112.5
 2, F, 13, 56.5, 84
 3, F, 13, 65.3, 98
 4, F, 14, 62.8, 102.5
 5, M, 14, 63.5, 102.5
 ;
 run;


  data SUBCLASS1(label='Student Data');
   infile datalines dsd truncover;
   input ID:$2. Sex:$1. Age:2. Height:3.2 Weight:2.2;
 datalines;
 1, M ,14, 69 ,112.5
 3, F ,13 ,65.3, 98
 5, M ,14 ,63.5, 102.5
 ;
 run;

   data SUBCLASS2(label='Student Data');
   infile datalines dsd truncover;
   input ID:$2. Sex:$1. Age:2. Height:3.2 Weight:2.2;
 datalines;
 2, F, 13, 56.5, 84
 3, F, 13, 65.3, 98
 5, M, 14, 63.5, 102.5
 ;
 run;

Based on the datasets created above. 

I need my output to look like this:

IDCLASSSUBCLASS1SUBCLASS2
1ClassSubclass1 
2Class.Subclass2
3ClassSubclass1Subclass2
4Class..
5ClassSubclass1Subclass2

 

Thank you for your time and effort. 

 

Best, 

SC.

1 ACCEPTED SOLUTION

Accepted Solutions
Kurt_Bremser
Super User

Study this code:

data CLASS(label='Student Data');
infile datalines dsd truncover;
input ID:$2. Sex:$1. Age:2. Height:3.2 Weight:2.2;
datalines;
1, M, 14, 69, 112.5
2, F, 13, 56.5, 84
3, F, 13, 65.3, 98
4, F, 14, 62.8, 102.5
5, M, 14, 63.5, 102.5
;

data SUBCLASS1(label='Student Data');
infile datalines dsd truncover;
input ID:$2. Sex:$1. Age:2. Height:3.2 Weight:2.2;
datalines;
1, M ,14, 69 ,112.5
3, F ,13 ,65.3, 98
5, M ,14 ,63.5, 102.5
;

data SUBCLASS2(label='Student Data');
infile datalines dsd truncover;
input ID:$2. Sex:$1. Age:2. Height:3.2 Weight:2.2;
datalines;
2, F, 13, 56.5, 84
3, F, 13, 65.3, 98
5, M, 14, 63.5, 102.5
;
 
proc sql noprint;
select catx(".",libname,memname) into :datasets separated by " "
from dictionary.tables
where libname = "WORK";
quit;

data all / view=all;
length inname $41 dsname dname $32;
set &datasets. indsname=inname;
dsname = scan(inname,2,".");
dname = dsname;
keep id dsname dname;
run;

proc sort
  data=all
  out=want
  nodupkey
;
by id dsname;
run;

proc report data=want;
column id dsname,dname n;
define id /group;
define dsname / "" display;
define dname / "" across;
define n / noprint;
run;

View solution in original post

2 REPLIES 2
Kurt_Bremser
Super User

Study this code:

data CLASS(label='Student Data');
infile datalines dsd truncover;
input ID:$2. Sex:$1. Age:2. Height:3.2 Weight:2.2;
datalines;
1, M, 14, 69, 112.5
2, F, 13, 56.5, 84
3, F, 13, 65.3, 98
4, F, 14, 62.8, 102.5
5, M, 14, 63.5, 102.5
;

data SUBCLASS1(label='Student Data');
infile datalines dsd truncover;
input ID:$2. Sex:$1. Age:2. Height:3.2 Weight:2.2;
datalines;
1, M ,14, 69 ,112.5
3, F ,13 ,65.3, 98
5, M ,14 ,63.5, 102.5
;

data SUBCLASS2(label='Student Data');
infile datalines dsd truncover;
input ID:$2. Sex:$1. Age:2. Height:3.2 Weight:2.2;
datalines;
2, F, 13, 56.5, 84
3, F, 13, 65.3, 98
5, M, 14, 63.5, 102.5
;
 
proc sql noprint;
select catx(".",libname,memname) into :datasets separated by " "
from dictionary.tables
where libname = "WORK";
quit;

data all / view=all;
length inname $41 dsname dname $32;
set &datasets. indsname=inname;
dsname = scan(inname,2,".");
dname = dsname;
keep id dsname dname;
run;

proc sort
  data=all
  out=want
  nodupkey
;
by id dsname;
run;

proc report data=want;
column id dsname,dname n;
define id /group;
define dsname / "" display;
define dname / "" across;
define n / noprint;
run;
mkeintz
PROC Star

Using the INDSNAME option on a set command will deal with this in a single data step.  If you have a short list of datasets then they can be literally entered into the code.

data CLASS(label='Student Data');
   infile datalines dsd truncover;
   input ID:$2. Sex:$1. Age:2. Height:3.2 Weight:2.2;
 datalines;
 1, M, 14, 69, 112.5
 2, F, 13, 56.5, 84
 3, F, 13, 65.3, 98
 4, F, 14, 62.8, 102.5
 5, M, 14, 63.5, 102.5
 run;


  data SUBCLASS1(label='Student Data');
   infile datalines dsd truncover;
   input ID:$2. Sex:$1. Age:2. Height:3.2 Weight:2.2;
 datalines;
 1, M ,14, 69 ,112.5
 3, F ,13 ,65.3, 98
 5, M ,14 ,63.5, 102.5
run;

   data SUBCLASS2(label='Student Data');
   infile datalines dsd truncover;
   input ID:$2. Sex:$1. Age:2. Height:3.2 Weight:2.2;
 datalines;
 2, F, 13, 56.5, 84
 3, F, 13, 65.3, 98
 5, M, 14, 63.5, 102.5
run;


data want (drop=i);
  do until (last.id);
    set class subclass1 subclass2 indsname=dsn;
    by id;
    array dsn_dummys {*} $1  subclass1 subclass2 ;
    array dsn_names  {2} $20  _temporary_ ('WORK.SUBCLASS1','WORK.SUBCLASS2');
    if dsn='WORK.CLASS' then continue;
    i=whichc(dsn,of dsn_names{*});
    dsn_dummys{i}='Y';
  end;
  do i=1 to dim(dsn_dummys);
    if dsn_dummys{i}^='Y' then dsn_dummys{i}='N';
  end;
run;

Or you can put the dataset list (and corresponding dummy vars) in macrovars:

 

%let csv_list='WORK.SUBCLASS1','WORK.SUBCLASS2';
%LET var_list=subclass1 subclass2 ;
%let n_dsn=%sysfunc(countw(&var_list));

data want (drop=i);
  do until (last.id);
    set class &var_list indsname=dsn;
    by id;
    array dsn_dummys {*} $1  &var_list ;
    array dsn_names  {&n_dsn} $20  _temporary_ (&csv_list);
    if dsn='WORK.CLASS' then continue;
    i=whichc(dsn,of dsn_names{*});
    dsn_dummys{i}='Y';
  end;
  do i=1 to dim(dsn_dummys);
    if dsn_dummys{i}^='Y' then dsn_dummys{i}='N';
  end;
run;

And, as @Kurt_Bremser showed, you can use proc sql to create the macrovars.  This, slightly modified code, could prefix the code above, replacing the %LET statements:

proc sql noprint;
  select quote(catx(".",libname,memname)) into :csv_list  separated by ","
    from dictionary.tables
    where libname = "WORK" and memname^='CLASS';
  select memname into :var_list separated by ' '
    from dictionary.tables
    where libname = "WORK" and memname^='CLASS';
quit;

 

 

 

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

hackathon24-white-horiz.png

2025 SAS Hackathon: There is still time!

Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!

Register Now

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 2 replies
  • 766 views
  • 0 likes
  • 3 in conversation