Hi,
I am stuck with a problem where I have two tables, one at the months and one at the weeks. Here's the format of the tables:
Table1
Customer Date1 Sales
1 Jan2018 1110
1 Feb2018 1245
1 Mar2018 1320
1 Apr2018 1100
...
Table2
Customer Date2
1 01Jan2018
1 08Jan2018
1 15Jan2018
1 22Jan2018
1 29Jan2018
1 05Feb2018
1 12Feb2018
1 19Feb2018
1 26Feb2018
1 05Mar2018
...
I want to create a new column for sales in Table2 that will hold the disaggregated values of sales from Table1. I want to divide the sales by the number of days in that month and then assign the values to the weeks accordingly. Thus the sales in week 01Jan2018 is (1110/31)*7. The weeks that are in transition will get values from both the months. For example 29Jan2018 has 3 days in Jan2018 and 4 days in Feb2018. The sales of one day in Jan2018 is 1110/31 and the sales of one day in Feb2018 is 1245/28.
So the sales in week 29Jan2018 will be 3*(1110/31) + 4*(1245/28)
I want to do this for each distinct customer.
The resulting table should be
Result Table
Customer Date Sales
1 01Jan2018 250.6 i.e (1110/31)*7
1 08Jan2018 250.6
1 15Jan2018 250.6
1 22Jan2018 250.6
1 29Jan2018 282.27
1 05Feb2018 311.25
1 12Feb2018 311.25
1 19Feb2018 311.25
1 26Feb2018 133.39 + 170.32
...
Thanks!
Is your date1 in table1 a numeric SAS date formatted with monyy7. or is it a char value ?
Do you have a licence to SAS/ETS? Proc expand is meant for this type of conversion.
Since I am at home today and don't have access to SAS to test this out or write example code, here is the process.
As mentioned above by @novinosrin, variable DATE1 needs to be an actual SAS date for this to work, otherwise you will have to convert DATE1 to the proper SAS date value.
Something like this, assuming DATE1 is actually a SAS date representing the first day of the month.
UNTESTED CODE
data want;
set have;
last_day_of_month=intnx('month',date1,0,'e');
do day=1 to last_day_of_month;
thisday=mdy(month(date1),day,year(date1));
sales_this_day=sales/last_day_of_month;
seven_day_period = floor((thisday-'01JAN18'd)/7);
start_of_seven_day_period = intnx('week','01JAN18'd,seven_day_period,'s');
output;
end;
format start_of_seven_day_period date7.;
run;
Then run PROC SUMMARY to get the sum of SALES_THIS_DAY for each value of start_of_seven_day_period. I'll leave that as work that you can do yourself.
Here is a Data Step Approach to the solution. Requires data to be pre-sorted by Customer / grouped as shown in your sample Data Sets.
Added Customer 2 for checking the Data Step.
Using Tab1, we get the Month as Number from Date1 and compute perdaySales as Sales by the number of days of the month.
Using Tab2, we again get the Month as Number from Date2 and find the leftdays as the difference between number days of the Month and Date2.
In the final Data Step:
In the first Do-Loop, we store the perdaySales in an array (K) indexed by Month (here 1 to 4) . In the second Do-Loop, if the leftdays is 7 or more, it is a simple lookup for the month in the array to get perdaySales and multiplication by 7. If leftdays is less than 7, we take the perdaySales for the current month and multiply it by leftdays PLUS take the perdaySales for the NEXT month and multiply by (7 - leftdays).
The array can be sized using the macro variable (&size) for changing number of months in Tab1. Hence, the Data Step is dynamic.
data tab1;
input Customer Date1 :monyy7. Sales;
eom = intnx('month', date1,0,'end');
numdays = day(eom);
mon = month(date1);
perdaySales = Sales / numdays;
datalines;
1 Jan2018 1110
1 Feb2018 1245
1 Mar2018 1320
1 Apr2018 1100
2 Jan2018 1110
2 Feb2018 1245
2 Mar2018 1320
2 Apr2018 1100
;
run;
proc print data = tab1;
format date1 monyy7.;
run;
data tab2;
input Customer Date2 date10.;
leftdays = day(intnx('month',date2, 0, 'end')) -
day(intnx('day', Date2, 0, 'end')) + 1;
mon = month(date2);
datalines;
1 01Jan2018
1 08Jan2018
1 15Jan2018
1 22Jan2018
1 29Jan2018
1 05Feb2018
1 12Feb2018
1 19Feb2018
1 26Feb2018
1 05Mar2018
2 01Jan2018
2 08Jan2018
2 15Jan2018
2 22Jan2018
2 29Jan2018
2 05Feb2018
2 12Feb2018
2 19Feb2018
2 26Feb2018
2 05Mar2018
;
run;
proc print data = tab2;
format date2 date10.;
run;
%let size = 4;
data want;
array k[&size] _temporary_;
do until(last.Customer);
set tab1;
by Customer notsorted;
k[mon] = perdaySales;
end;
do until(last.Customer);
set tab2;
by Customer notsorted;
if leftdays >= 7 then weeksales = k[mon] * 7;
else weeksales = k[mon] * leftdays + k[mon+1] * (7 - leftdays);
output;
end;
keep Customer Date2 weeksales;
run;
proc print data = want;
format date2 date10.;
run;
The Ouput is shown. The first over-lapping interval, the weeksales is slightly different here and for second interval it is same.
Obs Customer Date2 weeksales 1 1 01JAN2018 250.645 2 1 08JAN2018 250.645 3 1 15JAN2018 250.645 4 1 22JAN2018 250.645 5 1 29JAN2018 285.276 6 1 05FEB2018 311.250 7 1 12FEB2018 311.250 8 1 19FEB2018 311.250 9 1 26FEB2018 303.715 10 1 05MAR2018 298.065 11 2 01JAN2018 250.645 12 2 08JAN2018 250.645 13 2 15JAN2018 250.645 14 2 22JAN2018 250.645 15 2 29JAN2018 285.276 16 2 05FEB2018 311.250 17 2 12FEB2018 311.250 18 2 19FEB2018 311.250 19 2 26FEB2018 303.715 20 2 05MAR2018 298.065
Don't miss out on SAS Innovate - Register now for the FREE Livestream!
Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.