BookmarkSubscribeRSS Feed
jlin4
Fluorite | Level 6

Supposed I have a dataset work.id_score, and I intend to search it for a specific ID. If that ID is found in the dataset, then it will return me "Y". Otherwise, it will return a "N". I have written the codes below to do so if I only have one ID to search for.

 

However, should I have a list of ID, such as in work.id, then how should I go about searching for them at once? (Supposed that I have to make use of the codes written for the case of 1 ID of interest since this is a simplified example) Then desired result is work.id_results, where the results (Y or N) is returned as a new column next to the ID column. Thank you!

 

data work.id_score;
input ID$ score;
cards;
abc 10
def 20
ghi 30
abc 15
def 25
;
run;
%let id_search = "abc";

proc sql; create table work.id_count as select * from work.id_score where ID = &id_search; quit; proc sql; select case when count(*) > 0 then "Y" else "N" end as search_outcome from work.id_count; quit;
data work.id;
input ID$;
cards;
abc
def
xyz
pqr
;
run;

data work.id_results;
input ID$ search_outcome$;
cards;
abc Y
def Y
xyz N
pqr N
;
run;    
4 REPLIES 4
Rydhm
Obsidian | Level 7

With your method, you will need to create a macro variable with the list of Ids that you want to search. Then loop through the list and apply proc sql for each element. This will be doable but lengthy. Instead, you can use merge as follows. 

 

data work.id_score;
input ID$ score;
cards;
abc 10
def 20
ghi 30
abc 15
def 25
;
run;

data work.id;
input ID$;
cards;
abc
def
xyz
pqr
;
run;

proc sort data=id_score; by id; run;
proc sort data=id; by id; run;

data search;
merge id_score(in=a) id(in=b);
   by id;
   if a and b then search_outcome='Y';
   else search_outcome='N';
   if b;
   keep id search_outcome;
run;

proc sort data=search nodupkey; by id; run;

proc print data=search; run;
jimbarbour
Meteorite | Level 14

There are multiple ways you could do it.  Have you considered using a hash table?  If you load your "look up" dataset, which would be work.id_score, into a hash table, you could do a read on the hash table.  If you get a zero return code, you would set the result to 'Y'.  If you get a non zero return code, that would indicate that there is no match, and you would set the result to 'N'.  See example code, below, and, below that, the results.

 

Jim

 

data work.id_score;
input ID$ score;
cards;
abc 10
def 20
ghi 30
abc 15
def 25
;
run;

data work.id;
input ID$;
cards;
abc
def
xyz
pqr
;
run;

DATA	WORK.id_results;
	DROP	_:;
	SET	WORK.ID;

	**------------------------------**;
	**	Define hash tables.			**;
	**------------------------------**;
	IF	_N_							=	1	THEN
		DO;
			DECLARE	HASH	Hsh_ID_Table			(ORDERED: 'A', MULTIDATA: 'N', DATASET: 'work.id_score');
							Hsh_ID_Table.DefineKey 	('ID');
							Hsh_ID_Table.DefineData	('ID');
							Hsh_ID_Table.DefineDone	();
		END;

	**------------------------------**;
	**	Look up ID.					**;
	**------------------------------**;
	_RC								=	Hsh_ID_Table.FIND();
	IF	_RC							=	0	THEN
		DO;
			Search_Outcome			=	'Y';
		END;
	ELSE
		DO;
			Search_Outcome			=	'N';
		END;
RUN;

 

 

jimbarbour_0-1627068857735.png

 

 

ballardw
Super User

@jlin4 wrote:

Supposed I have a dataset work.id_score, and I intend to search it for a specific ID. If that ID is found in the dataset, then it will return me "Y". Otherwise, it will return a "N". I have written the codes below to do so if I only have one ID to search for.

 

However, should I have a list of ID, such as in work.id, then how should I go about searching for them at once? (Supposed that I have to make use of the codes written for the case of 1 ID of interest since this is a simplified example) Then desired result is work.id_results, where the results (Y or N) is returned as a new column next to the ID column. Thank you!

 

data work.id_score;
input ID$ score;
cards;
abc 10
def 20
ghi 30
abc 15
def 25
;
run;
%let id_search = "abc";

proc sql; create table work.id_count as select * from work.id_score where ID = &id_search; quit; proc sql; select case when count(*) > 0 then "Y" else "N" end as search_outcome from work.id_count; quit;
data work.id;
input ID$;
cards;
abc
def
xyz
pqr
;
run;

data work.id_results;
input ID$ search_outcome$;
cards;
abc Y
def Y
xyz N
pqr N
;
run;    

Here is a way that creates a numeric 1/0 outcome variable. Since SAS will treat 1 as true (or yes) and 0 as false (or no) for logic the 1/0 coding is usually a better way to go then character values like 'Y' and 'N'.

proc sql;
   create table want as 
   select a.id, (a.id=b.id) as outcome
   from work.id as a left join
        (select distinct id from work.id_score) as b
        on a.id=b.id
   ;
quit;

You can get the count of "yes" with a sum function, the percent yes as a mean function result, any "yes" if the max is 1, all "yes" if the min is 1, all the same if the range is 0. Which could require some interesting coding to get with character values.

 

If you want to display Y/N Yes/No or similar than a custom format will do that just fine.

PGStats
Opal | Level 21

The simpler the better 😉

proc sql;
create table want as 
select 
	id, 
	id in (select id from id) as outcome
from id_score;
PG

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 4 replies
  • 889 views
  • 8 likes
  • 5 in conversation