I have two data sets. One data set contains a variable called code, which is the product code. The other data set has variable acode (for eastern US stores) and bcode (for western US stores), along with the item's description. acode usually is not the same as bocde.
Below is a trivial example.
data rawcode; input $3 code; cards; 001 026 099 126 523 525
; data ab_code; input $3 acode $3 bcode $7 desc; cards; 001 001 chairs 027 026 cups 099 027 plates 126 098 pencils 399 112 books 400 113 pens 522 523 clips 525 525 pillows ;
What I want to do is to see if code in rawcode matches a value for bcode in ab_code. If so, then code will not change, but if it doesn't then I want to replace it with the acode value. The final data set would look like this:
code desc
001 chairs
026 cups
099 plates
126 pencils
523 clips
525 pillows
Rather than do a bunch of sorts and merges, or use a hash table, is there a simple way to accomplish this using POC SQL?
Thanks!
Andrew
data rawcode;
input code $3.;
cards;
001
026
099
126
523
525
;
data ab_code;
input acode :$3. bcode :$7. desc :$10.;
cards;
001 001 chairs
027 026 cups
099 027 plates
126 098 pencils
399 112 books
400 113 pens
522 523 clips
525 525 pillows
;
proc sql;
create table want as
select a.code, coalesce(b.desc, c.desc) as desc
from rawcode a left join ab_code b
on a.code=b.bcode
left join ab_code c
on a.code=c.acode;
quit;
I hope the above helps
Thanks! I didn't know about the coalesce function in SQL.
coalesce is a normal sas function, that can also be used in proc sql, but afaik only if the sql is not passed to the database.
Imho a data step is almost always easier to read than any proc sql, i would use a hash-object
data want;
if 0 then set rawcode; /* load variable(s) defined in the dataset used as source for the hash */
set ab_code;
/* defining the hash */
if _n_ = 1 then do;
declare hash h(dataset: 'rawcode');
h.defineKey('code');
h.defineDone();
end;
/* the find-method returns 0 if something is found */
if h.find(key: bcode) = 0 then output;
else if h.find(key: acode) = 0 then output;
drop acode bcode;
run;
@andreas_lds Coalesce is one of the functions also available in data bases and though SAS can push this function to the DB. Which functions can get pushed will depend on the DB and its version and is documented. As an example here for Oracle.
The data step syntax you shared would drop observations which I assume is not what the OP wants. Below how I believe it should look like.
data want;
/* defining the hash */
if _n_ = 1 then
do;
if 0 then set rawcode(keep=code);
declare hash h(dataset: 'rawcode');
h.defineKey('code');
h.defineData('code');
h.defineDone();
end;
call missing(of _all_);
set ab_code;
/* the find-method returns 0 if something is found */
if h.find(key: bcode) ne 0 then code=acode;
drop acode bcode;
run;
I like your solution. However, mea culpa, I framed the problem way too simply. Here's what I really want.
Hospitalized patients can have a diagnosis coded in their medical record in one of two ways: as an APRDRG or an MSDRG. These are stored in the tables “APR” and ‘MS”.
* Tables apr and ms;
data apr; input @1 diagnosis $21. @25 aprdrg $3.;
cards;
Heart_Failure 001
Myocardial_Infarction 002
Stroke 103
Respiratory_Failure
GI_Bleeding 116
GI_Cancer 167
GI_Obstruction 172
Influenza 298
Head_Trauma 300
;
run;
data ms; input @1 diagnosis $21. @25 msdrg $3.;
cards;
Heart_Failure 001
Myocardial_Infarction 005
Stroke 098
Respiratory_Failure 103
GI_Bleeding 112
GI_Cancer
Asthma 174
Influenza 298
Head_Trauma 300
Pelvic_Fractue 302
;
run;
The main patient data set is :
data main; input @1 ptid $3. @5 code $3. @9 drg_type $1.;
cards;
111 002 A
112 103 A
113 112 A
114 298 A
115 300 A
116 005 M
117 103 M
118 112 M
119 167 M
120 222 M
;
run;
The final table I'd like to get is:
data final; infile @1 PatientID $3. @5 drg_type $1. @7 code $3 @11 Diagnosis $21.;
cards;
111 A 002 Myocardial_Infarction
112 A 103 Stroke
113 M 112 GI_Bleeding
114 A 298 Influenza
115 A 300 Head_Trauma
116 M 005 Myocardial_Infarction
117 M 103 Respiratory_Failure
118 M 112 GI_Bleeding
119 A 167 GI_Cancer
120 222
;
Any help is greatly appreciated!
Thanks.
data rawcode; input code $3.; cards; 001 026 099 126 523 525 ; data ab_code; input acode :$3. bcode :$7. desc :$10.; cards; 001 001 chairs 027 026 cups 099 027 plates 126 098 pencils 399 112 books 400 113 pens 522 523 clips 525 525 pillows ; run; proc sql; create table want as select c.code, ab.desc from rawcode C inner join ab_code ab on (c.code =ab.acode or c.code =ab.bcode); quit;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.