BookmarkSubscribeRSS Feed
DocMartin
Quartz | Level 8

I have two data sets. One data set contains a variable called code, which is the product code. The other data set has variable acode (for eastern US stores) and bcode (for western US stores), along with the item's description. acode usually is not the same as bocde.

Below is a trivial example.

data rawcode; input $3 code;
cards;
001  
026  
099  
126  
523  
525  
; data ab_code; input $3 acode $3 bcode $7 desc; cards; 001 001 chairs 027 026 cups 099 027 plates 126 098 pencils 399 112 books 400 113 pens 522 523 clips 525 525 pillows ;

What I want to do is to see if code in rawcode matches a value for bcode in ab_code. If so, then code will not change, but if it doesn't then I want to replace it with the acode value. The final data set would look like this:

 

code  desc

001   chairs

026   cups

099   plates

126   pencils

523   clips

525   pillows

 

Rather than do a bunch of sorts and merges, or use a hash table, is there a simple way to accomplish this using POC SQL?

 

Thanks!

 

Andrew

 

 

 

 

6 REPLIES 6
novinosrin
Tourmaline | Level 20

data rawcode; 
input  code $3.;
cards;
001  
026  
099  
126  
523  
525  
;

data ab_code; 
input  acode :$3. bcode :$7. desc :$10.;
cards;
001 001  chairs
027 026  cups
099 027  plates
126 098  pencils
399  112   books
400  113   pens
522  523  clips
525  525  pillows
;

proc sql;
  create table want as
  select a.code, coalesce(b.desc, c.desc) as desc
  from rawcode a left join ab_code b
  on a.code=b.bcode
  left join ab_code c
  on a.code=c.acode;
quit;

I hope the above helps

DocMartin
Quartz | Level 8

Thanks! I didn't know about the coalesce function in SQL.

andreas_lds
Jade | Level 19

coalesce is a normal sas function, that can also be used in proc sql, but afaik only if the sql is not passed to the database.

 

Imho a data step is almost always easier to read than any proc sql, i would use a hash-object

data want;
	if 0 then set rawcode; /* load variable(s) defined in the dataset used as source for the hash */
	
	set ab_code;
	
    /* defining the hash */
	if _n_ = 1 then do;
		declare hash h(dataset: 'rawcode');
		h.defineKey('code');
		h.defineDone();		
	end;
	
    /* the find-method returns 0 if something is found */
	if h.find(key: bcode) = 0 then output; 
	else if h.find(key: acode) = 0 then output;
	
	drop acode bcode;
run;
Patrick
Opal | Level 21

@andreas_lds Coalesce is one of the functions also available in data bases and though SAS can push this function to the DB. Which functions can get pushed will depend on the DB and its version and is documented. As an example here for Oracle.

 

The data step syntax you shared would drop observations which I assume is not what the OP wants. Below how I believe it should look like.

data want;
  /* defining the hash */
  if _n_ = 1 then
    do;
      if 0 then set rawcode(keep=code);
      declare hash h(dataset: 'rawcode');
      h.defineKey('code');
      h.defineData('code');
      h.defineDone();
    end;
  call missing(of _all_);

  set ab_code;

  /* the find-method returns 0 if something is found */
  if h.find(key: bcode) ne 0 then code=acode;

  drop acode bcode;
run;
DocMartin
Quartz | Level 8

I like your solution. However, mea culpa, I framed the problem way too simply. Here's what I really want. 

Hospitalized patients can have a diagnosis coded in their medical record in one of two ways: as an APRDRG or an MSDRG.  These are stored in the tables “APR” and ‘MS”.  

* Tables apr and ms;

data apr; input @1 diagnosis $21. @25 aprdrg $3.;
cards;
Heart_Failure	        001
Myocardial_Infarction   002
Stroke	                103
Respiratory_Failure	
GI_Bleeding	            116
GI_Cancer	            167
GI_Obstruction          172
Influenza	            298
Head_Trauma	            300
;
run;


data ms; input @1 diagnosis $21. @25 msdrg $3.;
cards;
Heart_Failure           001
Myocardial_Infarction	005
Stroke	                098
Respiratory_Failure	    103
GI_Bleeding	            112
GI_Cancer	            
Asthma                  174
Influenza	            298
Head_Trauma	            300
Pelvic_Fractue          302
;
run;

The main patient data set is :

data main; input @1 ptid $3.  @5 code $3. @9 drg_type $1.;
cards;
111	002	A
112	103	A
113	112	A
114	298	A
115	300	A
116	005	M
117	103	M
118	112	M
119	167	M
120	222	M
;
run;  

The final table I'd like to get is:

data final; infile @1 PatientID $3.	@5 drg_type $1. @7 code $3 @11	Diagnosis $21.;
cards;
111	A 002  Myocardial_Infarction
112	A 103  Stroke
113	M 112  GI_Bleeding
114	A 298  Influenza
115	A 300  Head_Trauma
116	M 005  Myocardial_Infarction
117	M 103  Respiratory_Failure
118	M 112  GI_Bleeding
119	A 167  GI_Cancer
120	  222   
;	

Any help is greatly appreciated!

Thanks.

 

AndreaVianello
Obsidian | Level 7
data rawcode; 
input  code $3.;
cards;
001  
026  
099  
126  
523  
525  
;

data ab_code; 
input  acode :$3. bcode :$7. desc :$10.;
cards;
001 001  chairs
027 026  cups
099 027  plates
126 098  pencils
399  112   books
400  113   pens
522  523  clips
525  525  pillows
;
run; 
  
proc sql;
 create table want as 
  select c.code, ab.desc
   from rawcode C
   inner join ab_code ab 
   on (c.code =ab.acode  or  c.code =ab.bcode);
quit;

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 6 replies
  • 1672 views
  • 1 like
  • 5 in conversation