Hello all,
I would like to check if the ID numbers listed in one dataset are valid given a column of valid ID values in another dataset.
For example:
Check:
ID Name Age
A5 Liz 25
3B Bob 31
55 Bill 33
BE Ash 19
Valid: (this is an imported Excel file)
ID
A5
BE
18
3B
Desired output:
Discrepancies
ID Name Age
55 Bill 33
There is no set format for the ID (it could be a string, an integer, a mixture, etc.). Essentially, I want to cross-reference each ID in Check against the entire ID column in Valid and keep the row if it is not found in Valid.
I have tried:
data match discrepancy;
set check;
if ID in valid then output match;
else output discrepancy;
run;
which returns an error that the right hand operator is not an array name or constant value list.
Any guidance would be appreciated.
1. Sort both tables by ID; sort the lookup table of valid IDs with the NODUPKEY option so that each ID appears only once.
2. Merge the two in a data step. Something like:
data valid invalid;
  merge have(in=in_have) valid_ids(in=in_valid);
  by id;
  if not in_have then delete; /* skip IDs that exist only in the lookup table */
  if in_valid then output valid;
  else output invalid;
run;
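Step 1 could look roughly like this, run before the merge. This is only a sketch: "have" stands for the table to be checked (Check in the question), and "valid" as the name of the imported Excel table is an assumption taken from the question.

```sas
/* Sort the table to be checked by the key */
proc sort data=have;
  by id;
run;

/* nodupkey keeps one row per ID; out= writes the result to valid_ids
   and leaves the imported table itself unchanged */
proc sort data=valid out=valid_ids nodupkey;
  by id;
run;
```

ID has to have the same type (character or numeric) in both tables, otherwise the merge will fail. Also note that the merge step writes a dataset named valid, so an imported table with that name would be replaced.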
Best approach with modern SAS tools, requires no sorting:
data match discrepancy;
  set check;
  if _n_ = 1 then do;
    declare hash v (dataset:"valid");
    v.definekey("id");
    v.definedone();
  end;
  if v.check() = 0 then output match;
  else output discrepancy;
run;
Untested; for tested code, provide usable example data in data steps with datalines.
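For anyone who wants to try it, the example from the question could be supplied as data steps along these lines. This is a sketch that assumes ID is character in both tables (list input with $ gives a default length of 8); a real Excel import may produce a different type or length.

```sas
/* Table to be checked, values copied from the question */
data check;
  input ID $ Name $ Age;
  datalines;
A5 Liz 25
3B Bob 31
55 Bill 33
BE Ash 19
;

/* Lookup table of valid IDs, standing in for the imported Excel file */
data valid;
  input ID $;
  datalines;
A5
BE
18
3B
;
```

With these two tables, the only ID in check that is missing from valid is 55, so that row is the one expected in discrepancy.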