BookmarkSubscribeRSS Feed
Jamie_H
Fluorite | Level 6

Morning all, 

 

I'd like to know how you would approach the following problem which I am trying to automate:

We have a lookup table in excel that has 4 variables plus a start date and an end date and finally a value to be returned.  For example...

 

Have

data have;
infile datalines dsd missover;
input county $ town $ school $ class $ start_date :ddmmyy10. end_date :ddmmyy10. Value_To_Return;
format start_date end_date date9.;
cards;
"Essex", "Braintree", "Tabor", "Blue", 12/10/2017, 31/07/2018, 100
"Essex", "Braintree", "Tabor", "Blue", 01/08/2018, 31/12/2018, 150
"Essex", "Braintree", "Tabor", "Blue", 01/01/2019, 15/04/2019, 125
"Essex", "Braintree", "Tabor", "Blue", 16/04/2019, 31/12/2099, 140
"Essex", "Braintree", "Tabor", "Red", 12/10/2017, 31/07/2018, 100
"Essex", "Braintree", "Tabor", "Red", 01/08/2018, 31/12/2018, 140
"Essex", "Braintree", "Tabor", "Red", 01/01/2019, 15/04/2019, 130
"Essex", "Braintree", "Tabor", "Red", 16/04/2019, 31/12/2099, 140
;
run;

Want

I want to get the Value_To_Return field from any dataset that has the first 4 variables to match on, plus a variable called Event_Date which should be used as the data look up for between Start_Date and End_Date. 

 

Is there a multi variable Format Lookup I could use?  

 

Obviously the example above only has 8 rows, but the reality is many thousands.  

 

Many thanks

 

 

 

 

 

8 REPLIES 8
PaigeMiller
Diamond | Level 26

I want to get the Value_To_Return field from any dataset that has the first 4 variables to match on, plus a variable called Event_Date which should be used as the data look up for between Start_Date and End_Date. 

 

I don't understand this. Explain step-by-step, with an example or two or three. Thanks.

--
Paige Miller
Jamie_H
Fluorite | Level 6

I have the dataset in the SAS code (albeit it exists in Excel at the moment)

 

I want to be able to use any other dataset I have to lookup the final value (Value_To_Return).    So if I have another dataset that has:

 

County - "Essex", Town - "Braintree", School - "Tabor", Class - "Blue", Event_Date - 01/10/2018   Then I would like the value of 150 to be returned.  

 

PaigeMiller
Diamond | Level 26

Okay, thank you, I think I understand. This seems to be an SQL join between the first table and the second table, by date range. Assuming your dates are actual SAS date values and not character strings, then this ought to work

 

proc sql;
    create table want as select h.*,l.event_date from
    have as h left join lookup as l 
    on h.county=l.county and h.town=l.town and h.school=l.school
        and h.class=l.class and l.event_date>=h.start_date and l.event_date<=h.end_date;
quit;
--
Paige Miller
Jamie_H
Fluorite | Level 6
Thank you.

Do you know if there is a way of implementing the lookup from within a Data Step rather than having the table joining code separately. Ie, from within existing code so you get the variable added to the dataset you are already working with.. as in, you call a format or function you've created to return the Value_To_Return variable. eg;

Data blah;
Set blahblah;
x = Put(record, Lookup_Format.);
or...
x = Call Get_Value_To_Return(County, Town, School, Class, Event_Date);

?
PaigeMiller
Diamond | Level 26

You could do this in a DATA step with the MERGE command.

--
Paige Miller
Jamie_H
Fluorite | Level 6

Thank you. 

 

Yes, if you can do it in SQL you can also do it as a merge, but i'm specifically after a method to do it as a lookup from within any datastep as per my request. 

 

Cheers

PaigeMiller
Diamond | Level 26
data want;
    merge have lookup;
    by county town school class;
    if start_date<=event_date<=end_date;
run;

This assumes both data sets are sorted by COUNTY TOWN SCHOOL CLASS.

--
Paige Miller
Jamie_H
Fluorite | Level 6

You're not getting what I mean by lookup within a datastep Paige...

 

 

data Lookup;
infile datalines dsd missover;
input county $ town $ school $ class $ start_date :ddmmyy10. end_date :ddmmyy10. Value_To_Return;
format start_date end_date date9.;
cards;
"Essex", "Braintree", "Tabor", "Blue", 12/10/2017, 31/07/2018, 100
"Essex", "Braintree", "Tabor", "Blue", 01/08/2018, 31/12/2018, 150
"Essex", "Braintree", "Tabor", "Blue", 01/01/2019, 15/04/2019, 125
"Essex", "Braintree", "Tabor", "Blue", 16/04/2019, 31/12/2099, 140
"Essex", "Braintree", "Tabor", "Red", 12/10/2017, 31/07/2018, 120
"Essex", "Braintree", "Tabor", "Red", 01/08/2018, 31/12/2018, 140
"Essex", "Braintree", "Tabor", "Red", 01/01/2019, 15/04/2019, 130
"Essex", "Braintree", "Tabor", "Red", 16/04/2019, 31/12/2099, 140
;
run;


Data Have;
infile datalines dsd missover;
input county $ town $ school $ class $ eventdate :ddmmyy10.;
format eventdate  date9.;
cards;
"Essex", "Braintree", "Tabor", "Blue", 15/3/2019
"Essex", "Braintree", "Tabor", "Blue", 01/10/2018
"Essex", "Braintree", "Tabor", "Red", 12/01/2019
;
run;

 

 

For this example, I would want to be working with the Have dataset and get the Value_To_Return where all 4 character variables match and also where eventdate was <= start_date and <= end_date.
The data in these instances would return 125 on row 1, 150 on row 2 and 130 on row 3.

 

I have written the Proc FCMP that looks up the character variables, but I dont know how to adapt it to include the date range lookup as well.

 

proc fcmp outlib=work.functions.samples;;
function get_score(county $, town $, school $, class $);
declare hash Lkup(dataset: "work.Lookup");
rc = lkup.definedata("Value_To_Return");
rc=lkup.definekey("county", "town", "school", "class");
 rc=lkup.definedone();
 rc=lkup.find();
 if rc eq 0 then return(Value_To_Return);
 else return(.);
 endsub;
quit;
options cmplib=work.functions;  

data want;
set have;
returned_value = get_score(county, town, school, class);
run;

Any suggestions on how to include a clause within the hash lookup to search for eventdate being between start_date and end_date?

 

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 8 replies
  • 1111 views
  • 0 likes
  • 2 in conversation