08-30-2015 06:46 AM
I have several datasets, each containing a "subject id" column. One of the datasets has only one variable, "subject id", and it does not store all subjects, only those in the study, for example:
Note that subjects 3,4,7,8 are excluded.
Now I want to read the other datasets (from some library) into matching datasets in the work library, but I want to read only the rows whose "subject id" appears in the column of the subjects dataset (above).
How do I do that?
Let's assume the dataset of subjects in the study is called "subjects" and I want to read a dataset called 'safety' from library 'a' (it has a subject id column with all subjects, plus other columns).
Thank you !
08-30-2015 11:38 AM
1) Get a list of the members in the library.
2) Use it to generate code to merge the subset dataset with each member and generate a work dataset.
Let's assume that the id variable is named ID and that the dataset with the subject list is named SUBSET.
Also let's assume that each dataset is sorted (or has an index) by ID.
Here is an example using PROC CONTENTS and a data _null_ step to generate a data step merge for each dataset.
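%let libref=a ;     /* source library; 'a' is taken from the question */
%let idvar=ID ;     /* name of the subject id variable */
%let ds=SUBSET ;    /* dataset holding the subjects to keep */
/* One row per variable for every member of the library */
proc contents data=&libref.._all_ noprint out=_contents ; run;
filename code temp ;
/* Write one data step for each member that contains the id variable */
data _null_ ;
  file code ;
  set _contents ;
  where upcase(name)="%upcase(&idvar)" and memname ne "%upcase(&ds)" ;
  put 'data ' memname ';'
    / "  merge &ds(in=in1) &libref.." memname ';'
    / "  by &idvar ;"
    / '  if in1;'
    / 'run;'
  ;
run;
/* Run the generated code and echo it to the log */
%inc code / source2 ;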
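To make the result concrete, here is roughly what the generated code would look like for the 'safety' dataset in library 'a' from the question. This is only a sketch: it assumes the ID and SUBSET names used above, and it assumes the generator closes each step with a run; statement.

```sas
/* Generated step for member SAFETY (names assumed from the question) */
data SAFETY ;
  merge SUBSET(in=in1) a.SAFETY ;  /* in1 is 1 when the ID comes from SUBSET */
  by ID ;                          /* both inputs must be sorted or indexed by ID */
  if in1;                          /* keep only rows for IDs listed in SUBSET */
run;
```

This creates WORK.SAFETY with the rows of a.SAFETY whose ID is in SUBSET. Note that an ID that is in SUBSET but not in a.SAFETY would still produce one row, with the other variables missing. If that is not wanted, an in= flag on the second dataset could be tested as well.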