Supposed I have a dataset work.id_score, and I intend to search it for a specific ID. If that ID is found in the dataset, then it will return me "Y". Otherwise, it will return a "N". I have written the codes below to do so if I only have one ID to search for.
However, should I have a list of ID, such as in work.id, then how should I go about searching for them at once? (Supposed that I have to make use of the codes written for the case of 1 ID of interest since this is a simplified example) Then desired result is work.id_results, where the results (Y or N) is returned as a new column next to the ID column. Thank you!
data work.id_score;
input ID$ score;
cards;
abc 10
def 20
ghi 30
abc 15
def 25
;
run;
%let id_search = "abc";
proc sql; create table work.id_count as
select *
from work.id_score
where ID = &id_search;
quit;
proc sql; select case when count(*) > 0 then "Y" else "N" end as search_outcome
from work.id_count;
quit;
data work.id;
input ID$;
cards;
abc
def
xyz
pqr
;
run;
data work.id_results;
input ID$ search_outcome$;
cards;
abc Y
def Y
xyz N
pqr N
;
run;
With your method, you will need to create a macro variable with the list of Ids that you want to search. Then loop through the list and apply proc sql for each element. This will be doable but lengthy. Instead, you can use merge as follows.
data work.id_score;
input ID$ score;
cards;
abc 10
def 20
ghi 30
abc 15
def 25
;
run;
data work.id;
input ID$;
cards;
abc
def
xyz
pqr
;
run;
proc sort data=id_score; by id; run;
proc sort data=id; by id; run;
data search;
merge id_score(in=a) id(in=b);
by id;
if a and b then search_outcome='Y';
else search_outcome='N';
if b;
keep id search_outcome;
run;
proc sort data=search nodupkey; by id; run;
proc print data=search; run;
There are multiple ways you could do it. Have you considered using a hash table? If you load your "look up" dataset, which would be work.id_score, into a hash table, you could do a read on the hash table. If you get a zero return code, you would set the result to 'Y'. If you get a non zero return code, that would indicate that there is no match, and you would set the result to 'N'. See example code, below, and, below that, the results.
Jim
data work.id_score;
input ID$ score;
cards;
abc 10
def 20
ghi 30
abc 15
def 25
;
run;
data work.id;
input ID$;
cards;
abc
def
xyz
pqr
;
run;
DATA WORK.id_results;
DROP _:;
SET WORK.ID;
**------------------------------**;
** Define hash tables. **;
**------------------------------**;
IF _N_ = 1 THEN
DO;
DECLARE HASH Hsh_ID_Table (ORDERED: 'A', MULTIDATA: 'N', DATASET: 'work.id_score');
Hsh_ID_Table.DefineKey ('ID');
Hsh_ID_Table.DefineData ('ID');
Hsh_ID_Table.DefineDone ();
END;
**------------------------------**;
** Look up ID. **;
**------------------------------**;
_RC = Hsh_ID_Table.FIND();
IF _RC = 0 THEN
DO;
Search_Outcome = 'Y';
END;
ELSE
DO;
Search_Outcome = 'N';
END;
RUN;
@jlin4 wrote:
Supposed I have a dataset work.id_score, and I intend to search it for a specific ID. If that ID is found in the dataset, then it will return me "Y". Otherwise, it will return a "N". I have written the codes below to do so if I only have one ID to search for.
However, should I have a list of ID, such as in work.id, then how should I go about searching for them at once? (Supposed that I have to make use of the codes written for the case of 1 ID of interest since this is a simplified example) Then desired result is work.id_results, where the results (Y or N) is returned as a new column next to the ID column. Thank you!
data work.id_score; input ID$ score; cards; abc 10 def 20 ghi 30 abc 15 def 25 ; run;
%let id_search = "abc";
proc sql; create table work.id_count as select * from work.id_score where ID = &id_search; quit; proc sql; select case when count(*) > 0 then "Y" else "N" end as search_outcome from work.id_count; quit;data work.id; input ID$; cards; abc def xyz pqr ; run; data work.id_results; input ID$ search_outcome$; cards; abc Y def Y xyz N pqr N ; run;
Here is a way that creates a numeric 1/0 outcome variable. Since SAS will treat 1 as true (or yes) and 0 as false (or no) for logic the 1/0 coding is usually a better way to go then character values like 'Y' and 'N'.
proc sql; create table want as select a.id, (a.id=b.id) as outcome from work.id as a left join (select distinct id from work.id_score) as b on a.id=b.id ; quit;
You can get the count of "yes" with a sum function, the percent yes as a mean function result, any "yes" if the max is 1, all "yes" if the min is 1, all the same if the range is 0. Which could require some interesting coding to get with character values.
If you want to display Y/N Yes/No or similar than a custom format will do that just fine.
The simpler the better 😉
proc sql;
create table want as
select
id,
id in (select id from id) as outcome
from id_score;
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.
Find more tutorials on the SAS Users YouTube channel.