BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
shasank
Quartz | Level 8

Hi Community, 

 

I am trying to solve a unique requirement and need your help if an efficient way exists to get this done. 

I have >10 datasets in my library and each have an ID. I need to create a dataset to as a summary of if an ID is present in each dataset. 

 

 data CLASS(label='Student Data');
   infile datalines dsd truncover;
   input ID:$2. Sex:$1. Age:2. Height:3.2 Weight:2.2;
 datalines;
 1, M, 14, 69, 112.5
 2, F, 13, 56.5, 84
 3, F, 13, 65.3, 98
 4, F, 14, 62.8, 102.5
 5, M, 14, 63.5, 102.5
 ;
 run;


  data SUBCLASS1(label='Student Data');
   infile datalines dsd truncover;
   input ID:$2. Sex:$1. Age:2. Height:3.2 Weight:2.2;
 datalines;
 1, M ,14, 69 ,112.5
 3, F ,13 ,65.3, 98
 5, M ,14 ,63.5, 102.5
 ;
 run;

   data SUBCLASS2(label='Student Data');
   infile datalines dsd truncover;
   input ID:$2. Sex:$1. Age:2. Height:3.2 Weight:2.2;
 datalines;
 2, F, 13, 56.5, 84
 3, F, 13, 65.3, 98
 5, M, 14, 63.5, 102.5
 ;
 run;

Based on the datasets created above. 

I need my output to look like this:

IDCLASSSUBCLASS1SUBCLASS2
1ClassSubclass1 
2Class.Subclass2
3ClassSubclass1Subclass2
4Class..
5ClassSubclass1Subclass2

 

Thank you for your time and effort. 

 

Best, 

SC.

1 ACCEPTED SOLUTION

Accepted Solutions
Kurt_Bremser
Super User

Study this code:

data CLASS(label='Student Data');
infile datalines dsd truncover;
input ID:$2. Sex:$1. Age:2. Height:3.2 Weight:2.2;
datalines;
1, M, 14, 69, 112.5
2, F, 13, 56.5, 84
3, F, 13, 65.3, 98
4, F, 14, 62.8, 102.5
5, M, 14, 63.5, 102.5
;

data SUBCLASS1(label='Student Data');
infile datalines dsd truncover;
input ID:$2. Sex:$1. Age:2. Height:3.2 Weight:2.2;
datalines;
1, M ,14, 69 ,112.5
3, F ,13 ,65.3, 98
5, M ,14 ,63.5, 102.5
;

data SUBCLASS2(label='Student Data');
infile datalines dsd truncover;
input ID:$2. Sex:$1. Age:2. Height:3.2 Weight:2.2;
datalines;
2, F, 13, 56.5, 84
3, F, 13, 65.3, 98
5, M, 14, 63.5, 102.5
;
 
proc sql noprint;
select catx(".",libname,memname) into :datasets separated by " "
from dictionary.tables
where libname = "WORK";
quit;

data all / view=all;
length inname $41 dsname dname $32;
set &datasets. indsname=inname;
dsname = scan(inname,2,".");
dname = dsname;
keep id dsname dname;
run;

proc sort
  data=all
  out=want
  nodupkey
;
by id dsname;
run;

proc report data=want;
column id dsname,dname n;
define id /group;
define dsname / "" display;
define dname / "" across;
define n / noprint;
run;

View solution in original post

2 REPLIES 2
Kurt_Bremser
Super User

Study this code:

data CLASS(label='Student Data');
infile datalines dsd truncover;
input ID:$2. Sex:$1. Age:2. Height:3.2 Weight:2.2;
datalines;
1, M, 14, 69, 112.5
2, F, 13, 56.5, 84
3, F, 13, 65.3, 98
4, F, 14, 62.8, 102.5
5, M, 14, 63.5, 102.5
;

data SUBCLASS1(label='Student Data');
infile datalines dsd truncover;
input ID:$2. Sex:$1. Age:2. Height:3.2 Weight:2.2;
datalines;
1, M ,14, 69 ,112.5
3, F ,13 ,65.3, 98
5, M ,14 ,63.5, 102.5
;

data SUBCLASS2(label='Student Data');
infile datalines dsd truncover;
input ID:$2. Sex:$1. Age:2. Height:3.2 Weight:2.2;
datalines;
2, F, 13, 56.5, 84
3, F, 13, 65.3, 98
5, M, 14, 63.5, 102.5
;
 
proc sql noprint;
select catx(".",libname,memname) into :datasets separated by " "
from dictionary.tables
where libname = "WORK";
quit;

data all / view=all;
length inname $41 dsname dname $32;
set &datasets. indsname=inname;
dsname = scan(inname,2,".");
dname = dsname;
keep id dsname dname;
run;

proc sort
  data=all
  out=want
  nodupkey
;
by id dsname;
run;

proc report data=want;
column id dsname,dname n;
define id /group;
define dsname / "" display;
define dname / "" across;
define n / noprint;
run;
mkeintz
PROC Star

Using the INDSNAME option on a set command will deal with this in a single data step.  If you have a short list of datasets then they can be literally entered into the code.

data CLASS(label='Student Data');
   infile datalines dsd truncover;
   input ID:$2. Sex:$1. Age:2. Height:3.2 Weight:2.2;
 datalines;
 1, M, 14, 69, 112.5
 2, F, 13, 56.5, 84
 3, F, 13, 65.3, 98
 4, F, 14, 62.8, 102.5
 5, M, 14, 63.5, 102.5
 run;


  data SUBCLASS1(label='Student Data');
   infile datalines dsd truncover;
   input ID:$2. Sex:$1. Age:2. Height:3.2 Weight:2.2;
 datalines;
 1, M ,14, 69 ,112.5
 3, F ,13 ,65.3, 98
 5, M ,14 ,63.5, 102.5
run;

   data SUBCLASS2(label='Student Data');
   infile datalines dsd truncover;
   input ID:$2. Sex:$1. Age:2. Height:3.2 Weight:2.2;
 datalines;
 2, F, 13, 56.5, 84
 3, F, 13, 65.3, 98
 5, M, 14, 63.5, 102.5
run;


data want (drop=i);
  do until (last.id);
    set class subclass1 subclass2 indsname=dsn;
    by id;
    array dsn_dummys {*} $1  subclass1 subclass2 ;
    array dsn_names  {2} $20  _temporary_ ('WORK.SUBCLASS1','WORK.SUBCLASS2');
    if dsn='WORK.CLASS' then continue;
    i=whichc(dsn,of dsn_names{*});
    dsn_dummys{i}='Y';
  end;
  do i=1 to dim(dsn_dummys);
    if dsn_dummys{i}^='Y' then dsn_dummys{i}='N';
  end;
run;

Or you can put the dataset list (and corresponding dummy vars) in macrovars:

 

%let csv_list='WORK.SUBCLASS1','WORK.SUBCLASS2';
%LET var_list=subclass1 subclass2 ;
%let n_dsn=%sysfunc(countw(&var_list));

data want (drop=i);
  do until (last.id);
    set class &var_list indsname=dsn;
    by id;
    array dsn_dummys {*} $1  &var_list ;
    array dsn_names  {&n_dsn} $20  _temporary_ (&csv_list);
    if dsn='WORK.CLASS' then continue;
    i=whichc(dsn,of dsn_names{*});
    dsn_dummys{i}='Y';
  end;
  do i=1 to dim(dsn_dummys);
    if dsn_dummys{i}^='Y' then dsn_dummys{i}='N';
  end;
run;

And, as @Kurt_Bremser showed, you can use proc sql to create the macrovars.  This, slightly modified code, could prefix the code above, replacing the %LET statements:

proc sql noprint;
  select quote(catx(".",libname,memname)) into :csv_list  separated by ","
    from dictionary.tables
    where libname = "WORK" and memname^='CLASS';
  select memname into :var_list separated by ' '
    from dictionary.tables
    where libname = "WORK" and memname^='CLASS';
quit;

 

 

 

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 2 replies
  • 321 views
  • 0 likes
  • 3 in conversation