Hi,
I have two datasets. One has many variables, one of which is a unique ID variable (1 row=1 unique person, no dupes). The second dataset is just a list of IDs. What I want to do is create a variable in Dataset 1 for whether a person's ID also appears in Dataset 2. I know how to do this using joins, but I'm hoping there's a simpler way that I don't know about or haven't thought of.
Any help is much appreciated.
Thanks!
Hi,
/* Note assumes you have a variable FLAG in your original dataset to hold the update */
proc sql;
update table HAVE
set FLAG="Y"
where ID in (select distinct ID from DATASET2) ;
quit;
You could also do it by a normal select:
proc sql;
create table WANT as
select A.*,
case when exists(select THIS.ID from DATASET2 THIS where THIS.ID=A.ID) then "Y"
else "N" end as FLAG
from DATASET1;
quit;
A data step solution:
data A;
input ID X1 X2;
cards;
100 10 20
200 20 30
500 50 70
;
run;
data B;
input ID;
cards;
100
150
200
100
200
300
;
run;
data want;
if 0 then set B;
declare hash h(dataset:'B');
h.definekey('ID');
h.definedone();
do until(last);
set A end = last;
flag = 'N';
if h.find() = 0 then flag = 'Y';
output;
end;
stop;
run;
I think there might be a very simple solution to this using the built in features of SAS. There is a temporary variable created when you merge data sets called 'in', which keeps track of where a data record comes from. So, you could do something like this:
*sort the two data files;
proc sort data = work.first_file; by id; run;
proc sort data = work.second_file; by id; run;
*Merge the two files, keeping only the records in the first file, but create a variable that indicates if the ID was in the second file;
Data all;
merge work.first_file (in = A)
work.second_file (in = B);
*create the variable;
if B = 1 then in_second_file = 1;
*output only the original records;
if A = 1 then output;
run;
Good luck!
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.