BookmarkSubscribeRSS Feed
Yakatan
Calcite | Level 5

I have a large dataset that has hundreds of months. I am trying to merge on data depending upon certain criteria, however due to the numberof lines of code i need to write i am trying to automate so it doesnt require continually updating the code.

 

I have included a sample of my data and the code below.

 

I have defined multiple macro varibles for the dates below but have just included a small sample with the % let statements below.

 

The code below only outputs the last line of the %month statement, so the dataset appears to be being overwritten by the subsequent invocation of the macro. However that isnt what i want to happen, can i write this in such a way that i only need a summary few lines of code that will work for the multiple date criteria i need?

 

%let month00 = 01jun2020;
%let month01 = 01jul2020;
%let month02 = 01aug2020;
%let date00 = 04jun2020;
%let date01 = 08jul2020;
%let date02 = 06aug2020; data trial; input id $ date1 date9.; datalines; 1 01Jun2020 2 01Jul2020 3 01aug2020 ; run; data trial2; input id $ date2 date9.; datalines; 1 14Jun2020 2 18Jul2020 3 24aug2020 ; run; %macro month(dt1,dt2,dt3); proc sql; create table test_merge as select a.*, b.* from trial as a left join trial2 as b on a.id=b.id and (a.date1 = &dt1. and b.date2 between &dt2. and &dt3.) quit; %mend month; %month ("month00."d, "date00."d, "date01."d); %month ("month01."d, "date01."d, "date02.".d); ....
14 REPLIES 14
PaigeMiller
Diamond | Level 26

Every time %month executes, it creates a data set named TEST_MERGE. So this new TEST_MERGE overwrites any earlier data sets of the same name.

 

You need to make the data set created by PROC SQL dynamic such that it changes every time you call the macro. This can be done by adding a macro variable in place of TEST_MERGE having the value you want this new data set to take.

--
Paige Miller
Yakatan
Calcite | Level 5
Spoiler
I dont want to create a new dataset everytime i execute the macro though.I want one dataset but only want the merge to work when the criteria within the respective macro is correst if that makes sense.  I am trying to avoid having to write hundreds of lines of code for the merge criteria.
PaigeMiller
Diamond | Level 26

So please explain what you do want in a lot more detail.

 

Better yet, since this should be your first step anyway, writing code without macros and without macro variables for two of your months, show me code without macros and without macro variables for two of your months that does exactly what you want for those two months.

--
Paige Miller
Yakatan
Calcite | Level 5

If i wasnt using a macro i would do the following - but i would have to write hundreds of statements within the OR logic to account for the number of months within my dataset. So my macro was an attempt to try to retain the logic of below but without iterating the individual lines of code.

 

proc sql;
create table test_merge as select 

a.*,
b.*

from trial as a 
left join trial2 as b 
on a.id=b.id 
and
((a.date1 = "month00."d,  and b.date2 between "date00."d, and "date01."d)
or
(a.date1 = "month01."d,  and b.date2 between "date01."d, and "date02."d)
or 
(a.date1 = "month02."d,  and b.date2 between "date02."d, and "date03."d)
quit;

PaigeMiller
Diamond | Level 26

This is exactly why you need to write code without macros and without macro variables, to get code that works for 2 or 3 situations. This clearly indicates your macro above, which keeps creating TEST_MERGE over and over again, isn't the right logic. And there are syntax errors which you need to fix, because if your code won't run without macros and without macro variables, it will never run with macros and with macro variables.

 

Your code seems to translate to

 

proc sql;
    create table test_merge as select a.*,b.*
    from trial as a left join trial2 as b 
         on a.id=b.id and
    ((a.date1 = "01jun2020"d,  and b.date2 between "04jun2020"d, and "08jul2020"d) or
    (a.date1 = "01jul2020"d,  and b.date2 between " 08jul2020"d, and "06aug2020"d) or 
    (a.date1 = "01aug2020"d,  and b.date2 between "06aug2020"d, and "????"d));
quit;

 

where your date03 isn't given so I replaced it with ????

 

This leads to a bunch of questions, and again this should lead to your writing correct code without macros and without macro variables. So June 4, 2020 to July 8, 2020 is matched with 01JUN2020? Is that exactly what you want, that one month in TRIAL is matched with a month plus four days in TRIAL2? Or do you want any day in Jun 2020 in TRIAL2 to be matched with 01JUN2020 in TRIAL??? Next, in data set TRIAL2 a date of 08JUL2020 will match both 01JUN2020 and 01JUL2020 in TRIAL. Is that what you want? 

 

Please, create working code that does exactly what you want without macros and without macro variables for two or three months, before trying to turn this into a macro.

 

--
Paige Miller
Yakatan
Calcite | Level 5

Apologies, as i have tried to sample down the data i have made some minor errors in the dates. To your point on what i am after, i am only wanting the data to merge when the conditions within the statements are met.

 

e.g. if date1 = 01jun2020 and date2 = 03jun2020 then i wouldnt want this to merge. I have included the amended code below;

 

%let month00 = 01jun2020;
%let month01 = 01jul2020;
%let month02 = 01aug2020;
%let date00 = 04jun2020;
%let date01 = 08jul2020;
%let date02 = 06aug2020;
%let date_end00 = 07jul2020;
%let date_end01 = 05aug2020;
%let date_end02 = 08sep2020;

proc sql;
    create table test_merge as select a.*,b.*
    from trial as a left join trial2 as b 
         on a.id=b.id and
    ((a.date1 = "01jun2020"d,  and b.date2 between "04jun2020"d, and "07jul2020"d) or
    (a.date1 = "01jul2020"d,  and b.date2 between " 08jul2020"d, and "05aug2020"d) or 
    (a.date1 = "01aug2020"d,  and b.date2 between "06aug2020"d, and "08sep2020"d));
quit;

The macro dates equivilent would be;

 

proc sql;
create table test_merge as select 

a.*,
b.*

from trial as a 
left join trial2 as b 
on a.id=b.id 
and
((a.date1 = "month00."d,  and b.date2 between "date00."d, and "date_end00."d)
or
(a.date1 = "month01."d,  and b.date2 between "date01."d, and "date_end01."d)
or 
(a.date1 = "month02."d,  and b.date2 between "date02."d, and "date_end02."d)
quit;
PaigeMiller
Diamond | Level 26
%let month00 = 01jun2020;
%let month01 = 01jul2020;
%let month02 = 01aug2020;
%let date00 = 04jun2020;
%let date01 = 08jul2020;
%let date02 = 06aug2020;
%let date_end00 = 07jul2020;
%let date_end01 = 05aug2020;
%let date_end02 = 08sep2020;

Where is this information stored? Data set? Excel files? Somewhere else? 

--
Paige Miller
Yakatan
Calcite | Level 5

These are macro variables i have created to avoid me having to hardcode hundreds of dates. I've included them as a %let just as an example.

PaigeMiller
Diamond | Level 26

@Yakatan wrote:

These are macro variables i have created to avoid me having to hardcode hundreds of dates. I've included them as a %let just as an example.


Please answer the question, where are they stored? If they're not stored somewhere, then you will have to type them in either to your macro variables or type them into code. Was that the plan to type hundreds of dates into macro variables?

--
Paige Miller
PaigeMiller
Diamond | Level 26

Now that I have a clearer view of what you are trying to do, let's do this without macros at all.

 

data dates;
   input number (month date date_end)(:date9.);
   format month date date_end date9.;
   cards;
0 01jun2020 04jun2020 07jul2020
1 01jul2020 08jul2020 05aug2020
2 01aug2020 06aug2020 07sep2020
3 01sep2020 08sep2020 30sep2020
;

data trial;
input id $ date1 date9.;
format date1 :date9.;
datalines;
1 01Jun2020
2 01Jul2020
3 01aug2020
;
run;

data trial2;
input id $ date2 date9.;
format date2 date9.;
datalines;
1 14Jun2020
2 18Jul2020
3 24aug2020
;
run;

proc sql;
    create table trial_1 as select 
        a.date1,b.month,b.date,b.date_end
        from trial as a left join dates as b
        on a.date1=b.month;
quit;

proc sql;
    create table want as select
    a.*,b.* from trial_1 as a right join trial2 as b
    on b.date2 between a.date and a.date_end;
quit;
--
Paige Miller
Yakatan
Calcite | Level 5

Thank you - Am i not going to need to hardcode the multiple dates into the dataset though? Which is what i wanted to avoid as this data is going to go back over 10 years.

PaigeMiller
Diamond | Level 26

@Yakatan wrote:

Thank you - Am i not going to need to hardcode the multiple dates into the dataset though? Which is what i wanted to avoid as this data is going to go back over 10 years.


You still haven't told me where these dates are stored. 

 

If they are stored in your head or on a piece of paper, then yes you will have to type them into an electronic form somehow.

 

If they are already stored in a SAS data set or other electronic form, then no you won't have to hardcode anything.

--
Paige Miller
Yakatan
Calcite | Level 5

They are in macro that creates these dates, but they are not in a SAS dataset. I was looking for a way where i didnt need to hardcode the dates as it will involved writing over a hundred lines of code.

PaigeMiller
Diamond | Level 26

A macro that creates the dates? Simply have that macro create a SAS data set containing the dates instead of creating macro variables containing the dates.

--
Paige Miller

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 14 replies
  • 934 views
  • 0 likes
  • 2 in conversation