BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
new_sas_user_4
Obsidian | Level 7

Hi, 

I want to create a data model within SAS based on percents and data I have from Prior Month.

Eg I have day percents and totals orders by day from May and I want to create a data drop model for June to estimate the orders that will close by day in June.

I know that for 31st May I have 900 orders.

I want to take 900 and split it by day (for June) based on the percents by Day(Day1-30).

So, 115 orders would close on 1Jun, 111 on 2Jun and so forth until end of June.

I want to do this for the past 10 days (from May). and then get the totals per day for June.

.Attached is the excel for explaining the calculation. A1-D32 is what I have. 

Column E and Q is my expected Output.

 

Can this be done in SAS?

 

Thank you in advance!

1 ACCEPTED SOLUTION

Accepted Solutions
andreas_lds
Jade | Level 19

Some of the values in the dataset "want" created by the following code don't match the values you have posted, unfortunately i can spend any more time to debug the code. And yes, the last data step does not read any data from work.have, all values are generated by looking up percent-value and number of orders using the formats.

 

data DayPercentFmt;
	set have(keep=Day Percent);
	
	length 
		FmtName $ 32 
		Type $ 1
	;
	retain 
		FmtName 'Day2Percent' 
		Type "i"
	;
	format Percent;
	rename 
		Day = Start
		Percent = Label
	;
run;

proc format cntlin=DayPercentFmt;
run;

proc sort data=have out=sorted(keep= Date Orders);
	by descending Date;
run;

data DayOrdersFmt;
	set Sorted(drop=Date);
	length
		FmtName $ 32
		Start 8
		Type $ 1
	;
	retain
		FmtName "Day2Orders"
		Type "i"
	;
	
	Start = _n_;
	
	rename Orders = Label;
run;

proc format cntlin=DayOrdersFmt;
run;


data want;
	length
		Date 8
		Orders1-Orders10 8
		FinishedOrders 8
		_start _end _i _z _p _o 8
	;
	
	format Date date9.;
	
	array allOrders[10] Orders1-Orders10;
	
	do Date = '01Jun2020'd to '30Jun2020'd;
		call missing(of Orders:);
		
		_start = 1;
		_end = min(10, day(Date));
		
		do _i = _start to _end;
			_z = day(Date) - _i + 1;
			_p = input(cats(_z), Day2Percent.);
			_o = input(cats(_i), Day2Orders.);
			allOrders[_i] = int(_o * _p);
		end;
		
		FinishedOrders = sum(of Orders:);
		
		output;
	end;
run;

View solution in original post

15 REPLIES 15
andreas_lds
Jade | Level 19

Start by reading the xlsx-file and as soon as all variables have the correct type and length, post the data as data step with datalines statement, so we actually have data to use. See the links below the post of @Kurt_Bremser if you need help to create the data step.

ballardw
Super User

Since your example "data" did not show any dates, I just see XXX, in the dates prior to Jun there is no way do tell "previous month" properly especially considering that May has 31 days and June has 30 there is almost always going to be some issue with this "previous month" percentage approach.

Assuming anything about non-shown values is poor modeling. And the percentages appear to be rounded, so questionable as well. July/August and December/January are the only previous month/current month where the number of days are the same.

new_sas_user_4
Obsidian | Level 7

Here are the datalines for May:

In the atttched excel, E32-P62 is my expected output. I want the totals by each day as in Column P(showing -the splits in SAS(column F-O is optional)).

I want to split the order numbers for last 10days of May based on the percents.

For May we have 31 percent values, but since June has only 30days we will do the split for each order from past 10 days in May for 30days in June.

 

data have ;
input Day Percent Date:date9. Orders;
format Percent percent10. Date date9.;

datalines;
1 .13 01May2020 56
2 .12 02May2020 21
3 .12 03May2020 26
4 0.09 04May2020 44
5 0.08 05May2020 62
6 0.07 06May2020 57
7 0.06 07May2020 92
8 0.04 08May2020 142
9 0.05 09May2020 71
10 0.03 10May2020 45
11 0.02 11May2020 68
12 0.02 12May2020 57
13 0.02 13May2020 113
14 0.01 14May2020 83
15 0.01 15May2020 117
16 0.01 16May2020 100
17 0.01 17May2020 100
18 0.01 18May2020 150
19 0.01 19May2020 200
20 0.01 20May2020 250
21 0.01 21May2020 300
22 0.01 22May2020 350
23 0.01 23May2020 100
24 0.01 24May2020 200
25 0.00 25May2020 300
26 0.00 26May2020 400
27 0.00 27May2020 500
28 0.00 28May2020 600
29 0.01 29May2020 700
30 0.01 30May2020 800
31 0.01 31May2020 900

;

 

Thank you 🙂

new_sas_user_4
Obsidian | Level 7

So, for the output, I want the first column to be dates for June. and then taking the last day for May (31May2020 900 orders), multiplying 900 with 13% (percent from day 1 ) , I get 115 for 01June2020, multiplying 900 with 12%(percent from day 2), I get 111 (for 02June2020) and continuing this way for 900 until 30June2020.
Similarly, I want to do this calculation for the Orders number from 30May2020 which is 800 and split it based on percents from Day1-30 for June 02-June30.

Continuing this way for the last 10 orders from May. 


The last column below is just the sum of the row for each date from 01June-30June.

The first column (dates) and the last column (Finished Orders) are my desired output.

                      Finished Orders
01Jun2020     115                              115
02Jun2020     111     102                            212
03Jun2020     104       98       89                          291
04Jun2020       84       92       86      76                        338
05Jun2020       74       75       81      74      64                      367
06Jun2020       64       66       65      69      61      51                    377
07Jun2020       50       57       58      56      58      49      38                  366
08Jun2020       39       45       50      49      47      46      37      25                338
09Jun2020       43       35       39      43      41      37      35      25               13              310
10Jun2020       26       38       30      33      36      33      28      23               12               45            304
11Jun2020       20       23       34      26      28      29      25      19               12               43            257
12Jun2020       16       18       20      29      22      22      21      16                 9               40            214
13Jun2020       16       14       16      17      24      17      17      14                 8               33            175
14Jun2020       13       14       12      13      14      19      13      11                 7               29            146
15Jun2020       12       12       12      10      11      11      14        9                 6               25            123
16Jun2020          9       11       10      10        9        9        9      10                 4               20            100
17Jun2020       10          8       10        9        9        7        7        6                 5               15               84
18Jun2020          6          8          7        8        7        7        5        4                 3               17               74
19Jun2020          8          5          7        6        7        6        5        3                 2               10               61
20Jun2020          5          7          5        6        5        6        4        3                 2                 8               52
21Jun2020          6          5          6        4        5        4        4        3                 2                 6               45
22Jun2020          7          5          4        5        3        4        3        3                 1                 6               42
23Jun2020          6          6          4        4        4        3        3        2                 1                 5               38
24Jun2020          9          5          5        4        3        4        2        2                 1                 5               39
25Jun2020          3          8          4        4        3        2        3        1                 1                 4               34
26Jun2020          4          3          7        4        4        2        2        2                 1                 4               32
27Jun2020          4          4          2        6        3        3        2        1                 1                 2               29
28Jun2020          4          4          3        2        5        3        2        1                 1                 3               27
29Jun2020          5          3          3        3        2        4        2        1                 1                 2               26
30Jun2020          5          5          3        3        2        1        3        1                 1                 2               26

 

andreas_lds
Jade | Level 19

Interesting approach. What would you do with the data of February when you have to create the data for March?

andreas_lds
Jade | Level 19

There is another problem with your description: all calculations seem to be wrong:

900 * 0.13 = 117, not 115

900 * 0.12 = 108, not 111

and later on for 25th June you should have 0, not 3, simply because the percent-value for day 25 is zero.

 

Please re-check the logic!

new_sas_user_4
Obsidian | Level 7

Please see below the revised numbers for the last 10 days from May split for June(using percents from May1-30):

Can this be created in SAS?

I want the first column(dates for June) and the last column (Finished Orders) which is a sum of that row as my output.

 

                      Finished Orders
01Jun2020     117                              117
02Jun2020     108     104                            212
03Jun2020     108       96       91                          295
04Jun2020       84       96       84      78                        342
05Jun2020       74       75       84      72      65                      370
06Jun2020       64       66       65      72      60      52                    379
07Jun2020       50       57       58      56      60      48      39                  368
08Jun2020       39       45       50      49      47      48      36      26                339
09Jun2020       43       35       39      43      41      37      36      24        13              311
10Jun2020       26       38       30      33      36      33      28      24        12        46            306
11Jun2020       20       23       34      26      28      29      25      19        12        42            256
12Jun2020       16       18       20      29      22      22      21      16          9        42            215
13Jun2020       16       14       16      17      24      17      17      14          8        33            175
14Jun2020       13       14       12      13      14      19      13      11          7        29            146
15Jun2020       12       12       12      10      11      11      14        9          6        25            123
16Jun2020          9       11       10      10        9        9        9      10          4        20            100
17Jun2020       10          8       10        9        9        7        7        6          5        15               84
18Jun2020          6          8          7        8        7        7        5        4          3        17               74
19Jun2020          8          5          7        6        7        6        5        3          2        10               61
20Jun2020          5          7          5        6        5        6        4        3          2          8               52
21Jun2020          6          5          6        4        5        4        4        3          2          6               45
22Jun2020          7          5          4        5        3        4        3        3          1          6               42
23Jun2020          6          6          4        4        4        3        3        2          1          5               38
24Jun2020          9          5          5        4        3        4        2        2          1          5               39
25Jun2020        -            8          4        4        3        2        3        1          1          4               31
26Jun2020        -          -            7        4        4        2        2        2          1          4               25
27Jun2020        -          -          -          6        3        3        2        1          1          2               18
28Jun2020        -          -          -         -          5        3        2        1          1          3               15
29Jun2020          5        -          -         -         -          4        2        1          1          2               15
30Jun2020          5          5        -         -         -         -          3        1          1          2               17
new_sas_user_4
Obsidian | Level 7
Do you think it could be done using two dimensional arrays?
I do understand 1-dimensional arrays but don't have knowledge about 2-D arrays!

Thanks!
andreas_lds
Jade | Level 19

@new_sas_user_4 wrote:
Do you think it could be done using two dimensional arrays?
I do understand 1-dimensional arrays but don't have knowledge about 2-D arrays!

Thanks!

An array may be useful to store the results, but arrays in sas are somewhat different from arrays in other languages, so i don't see a 2d array in the solution. I don't have the time to work on this right now, maybe later during the week; i would start by creating a format mapping day to percent, making it easier to get the percent-value while creating the wanted dataset. I don't think that it is possible to create the desired form in one step. Maybe starting with a dataset that has

  • Date
  • Day (value 1-0)
  • Orders (number of orders for that Date/Day combination)

is easier to code.

Transposing the data afterwards is no problem.

new_sas_user_4
Obsidian | Level 7

Thanks a lot!! 

 

I was starting with a structure like this:

%let start='01Jun2020'd;

data Distribution(drop=i);
format date date9. ;
do i = &start. to intnx('month',&start,0,'e');
Date=i ;
array Days{30} Day1-Day30;
output;
end;
run;

 

@andreas_lds Would really appreciate your help 🙂

andreas_lds
Jade | Level 19

Some of the values in the dataset "want" created by the following code don't match the values you have posted, unfortunately i can spend any more time to debug the code. And yes, the last data step does not read any data from work.have, all values are generated by looking up percent-value and number of orders using the formats.

 

data DayPercentFmt;
	set have(keep=Day Percent);
	
	length 
		FmtName $ 32 
		Type $ 1
	;
	retain 
		FmtName 'Day2Percent' 
		Type "i"
	;
	format Percent;
	rename 
		Day = Start
		Percent = Label
	;
run;

proc format cntlin=DayPercentFmt;
run;

proc sort data=have out=sorted(keep= Date Orders);
	by descending Date;
run;

data DayOrdersFmt;
	set Sorted(drop=Date);
	length
		FmtName $ 32
		Start 8
		Type $ 1
	;
	retain
		FmtName "Day2Orders"
		Type "i"
	;
	
	Start = _n_;
	
	rename Orders = Label;
run;

proc format cntlin=DayOrdersFmt;
run;


data want;
	length
		Date 8
		Orders1-Orders10 8
		FinishedOrders 8
		_start _end _i _z _p _o 8
	;
	
	format Date date9.;
	
	array allOrders[10] Orders1-Orders10;
	
	do Date = '01Jun2020'd to '30Jun2020'd;
		call missing(of Orders:);
		
		_start = 1;
		_end = min(10, day(Date));
		
		do _i = _start to _end;
			_z = day(Date) - _i + 1;
			_p = input(cats(_z), Day2Percent.);
			_o = input(cats(_i), Day2Orders.);
			allOrders[_i] = int(_o * _p);
		end;
		
		FinishedOrders = sum(of Orders:);
		
		output;
	end;
run;
new_sas_user_4
Obsidian | Level 7

Thank you so much @andreas_lds 

 

How does SAS refer to the percent and order data through the below statements? Is it becoz of proc format cntlin= ? I am sorry, I haven't used the cntlin yet.

_p = input(cats(_z), Day2Percent.);
_o = input(cats(_i), Day2Orders.); 

 

Thanks again 🙂

andreas_lds
Jade | Level 19

@new_sas_user_4 wrote:

Thank you so much @andreas_lds 

 

How does SAS refer to the percent and order data through the below statements? Is it becoz of proc format cntlin= ? I am sorry, I haven't used the cntlin yet.

_p = input(cats(_z), Day2Percent.);
_o = input(cats(_i), Day2Orders.); 

 

Thanks again 🙂


With cntlin you can create a format using a dataset, so have a look at the datasets named SomethingFmt: they define which day is mapped to which percent-value (and order-data). Instead of the formats using hash-objects would have been another way to process the data.

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 15 replies
  • 2642 views
  • 0 likes
  • 4 in conversation