Hi,
I have two datasets (cd.dataset_one and cd.dataset_two) and would like to create a table by left joining cd.dataset_one onto cd.dataset_one (the code I currently have is below). As I am matching the 'tracking_month' variable from dataset 1 and 'month_end' variable from dataset 2, would it be possible for someone to provide me with code to format the dates to recognise each other when I run the proc sql (the preference is to use the 'tracking_month' format so that dataset 2 has 'month_end' 30APR2017 converted to 201704, for example)? Thanks!
Current code:
data cd.dataset_two_update;
set cd.dataset_two;
[NOTE: Can the date format for 'month_end' variable be changed in this line of code here, or after 'run;' below so that the proc sql table below takes on the preferred date format?]
run;
proc sql;
create table FLAG_NUMBERS as
select a.tracking_month, a.ACCT_ID, a.flag
from cd.dataset_two_update as a
left join cd.dataset_one as b
on a.month_end = b.tracking_month and a.ACCOUNT_ID = b.ACCT_ID
order by a.ACCT_ID
;
quit;
cd.dataset_one (dataset 1)
cd.dataset_two (dataset 2)
The value in your dataset_two is actually a DATE value which can be determined by the DATE format. The other value in the first set is a simple numeric value and not a date.
Your tracking_month variable does not have a day of the month so you would very likely not actually match the month_end.
The first step would be to create a date value from the tracking month. That looks like you should be able to do that with
input(put(tracking_month,6 -L), yymmn6.) which will create a date that has the first day of the month assumed.
If you want the actual month and year to match between the two then you would need to get the same actual date which would look like
input(put(tracking_month,6 -L), yymmn6.) = intnx('month',month_end,0,'B')
The intnx function is used to increment date, time or datetime values. The first parameter is the interval, month in this case, second is the starting date value, 3rd is the number of intervals to increment, 0 means the same month here, and the optional last parameter is alignment with 'B' means "beginning" so the intnx function returns the first day of the month in the same month and year.
Assigned formats of numeric values never affect comparisons as the actual values are used.
The value in your dataset_two is actually a DATE value which can be determined by the DATE format. The other value in the first set is a simple numeric value and not a date.
Your tracking_month variable does not have a day of the month so you would very likely not actually match the month_end.
The first step would be to create a date value from the tracking month. That looks like you should be able to do that with
input(put(tracking_month,6 -L), yymmn6.) which will create a date that has the first day of the month assumed.
If you want the actual month and year to match between the two then you would need to get the same actual date which would look like
input(put(tracking_month,6 -L), yymmn6.) = intnx('month',month_end,0,'B')
The intnx function is used to increment date, time or datetime values. The first parameter is the interval, month in this case, second is the starting date value, 3rd is the number of intervals to increment, 0 means the same month here, and the optional last parameter is alignment with 'B' means "beginning" so the intnx function returns the first day of the month in the same month and year.
Assigned formats of numeric values never affect comparisons as the actual values are used.
[Edited]
on input(put(a.tracking_month,6. -L), yymmn6.) = intnx('month',b.month_end,0,'B')
Which begs the question: Have you tried to understand the reply?
@jeremy4 wrote:
Hi, thanks for your reply. I added in your code and edited it to reference the two datasets (adding a. and b.) but SAS provided me with the following error relating to the '6 -L' part of the code:
on input(put(a.tracking_month,6 -L), yymmn6.) = intnx('month',b.month_end,0,'B')
_
22
76
ERROR 22-322: Syntax error, expecting one of the following: a format name, ?.
ERROR 76-322: Syntax error, statement will be ignored.
Post any error message with the entire data step or procedure call. Copy from the LOG and paste into a code box opened with the forum's {I} or "running man" icon. The _ character will be under the part with the problem most of the time.
In this case, because of a distraction on my part I missed the . after 6 that indicates the format to use for put
on input(put(a.tracking_month,6. -L), yymmn6.) = intnx('month',b.month_end,0,'B')
Since I did not have any actual data set (HINT) to test code with I missed the error.
Tracking_Month does not look like a date. It looks like a 6 digit number. Sounds like you want to treat the ones and tens place as a month number and the other four places as the year number.
So either convert TRACK_MONTH to a date value that is the end of the month
on a.month_end = intnx('month',input(put(b.tracking_month,z6.),yymmn6.),0,'end')
Or convert the date MONTH_END to a number using that YYY,YMM style.
on input(put(a.month_end,yymmn6.),6.) = b.tracking_month
Example:
1016 data one(keep=month_end) two(keep=tracking_month); 1017 month_end='30APR2017'd ; 1018 format month_end date9.; 1019 tracking_month=201704; 1020 run; NOTE: The data set WORK.ONE has 1 observations and 1 variables. NOTE: The data set WORK.TWO has 1 observations and 1 variables. NOTE: DATA statement used (Total process time): real time 0.02 seconds cpu time 0.01 seconds 1021 1022 proc sql; 1023 create table test1 as select * 1024 from one a inner join two b 1025 on a.month_end = intnx('month',input(put(b.tracking_month,z6.),yymmn6.),0,'end') 1026 ; NOTE: Table WORK.TEST1 created, with 1 rows and 2 columns. 1027 create table test2 as select * 1028 from one a inner join two b 1029 on input(put(a.month_end,yymmn6.),6.) = b.tracking_month 1030 ; NOTE: Table WORK.TEST2 created, with 1 rows and 2 columns. 1031 quit; NOTE: PROCEDURE SQL used (Total process time): real time 0.03 seconds cpu time 0.03 seconds
or transform each a bit:
on put(a.month_end,yymmn6.) = put(b.tracking_month,6.)
If tracking_month is effectively a 6-digit number (e.g. 201804), I would rather deal with numbers (instead of going to characters and back)
proc sql;
select *
from one left join two
on intnx("month", month_end, 0, "beginning") =
mdy(mod(tracking_month, 100), 1, floor(tracking_month/100));
quit;
Note Since you are joining on months, doing so on beginnings or ends is equivalent. In this case, joining on beginnings is a bit simpler because the day is always 1.
Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.
Register today!Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.