This is what I am trying to do -
Data_set_1:
PID
012345
012356
014589
Data_set_2:
PID Type
012345 A
012345 B
012356 A
014589 B
I'm looking to create a table, using the PID as the unique value between the datasets, which returns the PID from Data_set_1, and a variable field that returns a 1 value when ANY record in Data_Set_2 for that PID as a Type value of B.
So I want my final data set to look like:
PID Var
012345 1
012356 0
014589 1
Can't think of anything simpler than:
proc sql;
select
pid,
"B" in (select type from set2 where pid = set1.pid) as var
from set1;
quit;
Here is one way. There are better ways.
With data like
data set1;
input pid;
datalines;
012345
012356
014589
;
run;
data set2;
input pid type $1.;
datalines;
012345 A
012345 B
012356 A
014589 B
;
run;
you can do
data midstep;
merge set1 (in = a) set2;
by pid;
if a and type = 'B' then myvar = 1;
else myvar = 0;
run;
proc sql;
select pid, max(myvar)
from midstep
group by pid;
quit;
to get
| pid | |
|---|---|
| 12345 | 1 |
| 12356 | 0 |
| 14589 | 1 |
Can't think of anything simpler than:
proc sql;
select
pid,
"B" in (select type from set2 where pid = set1.pid) as var
from set1;
quit;
Aha.
So
"B" in (select type from set2 where pid = set1.pid)
is returning a TRUE (1) if it sees a "B" in the list of returned set2 types for each set1 PID, otherwise it is returning a FALSE (0). Would not have ever come up with that.
data set1;
input pid;
datalines;
012345
012356
014589
;
run;
data set2;
input pid type $1.;
datalines;
012345 A
012345 B
012356 A
014589 B
;
run;
data want;
if _n_=1 then do;
declare hash h(dataset:'set2(where=(type="B"))');
h.definekey('pid');
h.definedone();
end;
set set1;
var=ifn(h.check()=0,1,0);
run;
An another one:
data want;
merge
set1
set2(where=(type = 'B'))
;
by pid;
Var = not missing(type);
drop type;
run;
April 27 – 30 | Gaylord Texan | Grapevine, Texas
Walk in ready to learn. Walk out ready to deliver. This is the data and AI conference you can't afford to miss.
Register now and lock in 2025 pricing—just $495!
Still thinking about your presentation idea? The submission deadline has been extended to Friday, Nov. 14, at 11:59 p.m. ET.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.