BookmarkSubscribeRSS Feed
CP2
Pyrite | Level 9 CP2
Pyrite | Level 9

I had to match two files based on 4 different keys from most strict to loose to add a cost variable. I did this in steps using proc sql and created 4 different tables. Then I appended the tables together with the non-matches so that I had the table I started with but with match_flag1, match_flag2, match_flag3, match_flag4.  I only want to count the first occurrence of the data that I retrieved from the other table based on the hierarchy of best match.  I can't figure out how to do this and keep the original data. I feel like Hash Add() could do this if I output to 4 different tables and add them together again but I need to keep all the data in place so that duplicated reasoning can be maintained for each row. In other words I cannot output a separate table. I just want to update the value if it is the first occurrence of the match.  For example, if match_flag3=1 for one row of the key and match_flag4=1 for a diffferent row, I only want to keep the value in the row where match_flag3 =1 for that key. I tried doing this with first. and last. logic but becuase there are 4 different types of matches I could not get the right sort and logic. 

 

The following logic almost gets me there but there could still be duplicates keys when match_flag# variables =1 so I need to still identify the first occurrence of the key  within the following logic. 

 

data test;
if _N_ = 1 then do;
declare hash m1 (dataset: "Have");
m1.defineKey("id", "dos", "diag", "provider", "lob");
m1.defineData("match_flag1");
m1.defineDone();
 
declare hash m2 (dataset: "Have");
m2.defineKey("id", "dos", "diag", "provider");
m2.defineData("match_flag2");
m2.defineDone();
 
declare hash m3 (dataset: "Have");
m3.defineKey("id", "dos", "diag", "lob");
m3.defineData("match_flag3");
m3.defineDone();
 
declare hash m4 (dataset: "Have");
m4.defineKey("id", "dos", "diag");
m4.defineData("match_flag4");
m4.defineDone();
end; 
 
set Have (drop=match_flag1 match_flag2 match_flag3 match_flag4 );
 
length match_flag1 8. match_flag2 8. match_flag3 8. match_flag4 8.;
if m1.find() = 0 then do;
value_revised =value;
match_keep=1;
end;
else if m2.find()=0 then do;
value_revised = value;
match_keep=1;
end;
else if m3.find()=0 then do;
value_revised=value;
match_keep=1;
end;
else if m4.find()=0 then do;
value_revised =value;
match_keep=1;
end;
else do;
match_keep= 0;
value_revised = 0;
end;
;
run;

 

2 REPLIES 2
ballardw
Super User

It seems sort of obvious that if you have multiple variables (???) then perhaps combine them or set the value of a single variable to 1,2,3 and 4 then sort the data by your identification variables and that new variable.

I would think first. logic would then make sense. (If this were my data knowing this was a later step I might of set that "order of priority" or whatever at each stage when combining the data sets.

 

But no example data provided so can't be sure.

mkeintz
PROC Star

I for one, need sample data in the form of a working data step, and also showing what the result should look like.  The sample data preferably would have an instance of the problem of duplicates you are trying to manage.

 

Question:  are data items in the hash objects expected to satisfy a match more than once?

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 2 replies
  • 197 views
  • 0 likes
  • 3 in conversation