DATA Step, Macro, Functions and more

Sum of values based on date range

Accepted Solution Solved
Reply
Frequent Contributor
Frequent Contributor
Posts: 83
Accepted Solution

Sum of values based on date range

[ Edited ]

Hello,

 

I have a table of dates [Btbl_Dates] with 365 rows and 2 columns. Column1 has date values (Jan1,2016 to Dec 31, 2016) and Column2 has either a 1 or 0.

 

I have a sql query [Qry_Btbl] that returns a few hundred rows of data and 4 columns: Field1, Field2, Date1, Date2.

 

I would like to add a calculation to [Qry_Btbl] and create a 5th column that uses Date1 and Date2 to sum the values in Column2 of [Btbl_Dates] but I'm having some difficulty doing this.

 

Thoughts? Appreciate any assistance. 


Accepted Solutions
Solution
‎10-13-2016 08:19 AM
Super User
Posts: 10,044

Re: Sum of values based on date range

If you don't have a big table , it is easy for SQL.

 


data Qry_Btbl;
infile datalines dsd truncover expandtabs;
input 
Field1:$3.
Field2:3.	
Date1:mmddyy.	
Date2:mmddyy.
;
format date1 date2 date9.;
datalines;
rhw,	553,	10/3/2016,	10/11/2016
ghj,	251,	10/3/2016,	10/11/2016
rwr,	115,	10/3/2016,	10/12/2016
zsf,	555,	10/3/2016,	10/13/2016
nbf,	513,	10/6/2016,	10/14/2016
nws,	844,	10/6/2016,	10/18/2016
cxm,	350,	10/6/2016,	10/19/2016
olp,	422,	10/6/2016,	10/17/2016
;;;;



data Btbl_Dates;
infile datalines dsd truncover expandtabs;
input 
Column1:date9.
Column2	
;

datalines;
01Oct2016,	0
02Oct2016,	0
03Oct2016,	1
04Oct2016,	1
05Oct2016,	1
06Oct2016,	1
07Oct2016,	1
08Oct2016,	0
09Oct2016,	0
10Oct2016,	0
11Oct2016,	1
12Oct2016,	1
13Oct2016,	1
14Oct2016,	1
15Oct2016,	0
16Oct2016,	0
17Oct2016,	1
18Oct2016,	1
19Oct2016,	1
20Oct2016,	1
21Oct2016,	1
22Oct2016,	0
23Oct2016,	0
24Oct2016,	1
25Oct2016,	1
26Oct2016,	1
27Oct2016,	1
28Oct2016,	1
29Oct2016,	0
30Oct2016,	0
31Oct2016,	1
;;;;



proc sql;
create table want as
 select a.*,sum(column2) as new_column
  from Qry_Btbl as a,Btbl_Dates as b
   where b.column1 between a.date1 and a.date2
    group by 1,2,3,4 ;
quit;

View solution in original post


All Replies
Super User
Posts: 11,343

Re: Sum of values based on date range

You should provide some example data from both data sets and what the final result should be. It also wouldn't hurt to show how you are joining the sets with your current code.

 

Also you may need to make sure that your dates are actual SAS date valued numerics and not character variables depending on how you are using the dates.

 

Super User
Super User
Posts: 7,988

Re: Sum of values based on date range

Please post example test data - in the form of a datastep - and what the output should look like.  It is not clear from the explanation what you are doing with "that uses Date1 and Date2 to sum the values".

Super User
Posts: 5,516

Re: Sum of values based on date range

It looks like you would need something along the lines of:

 

sum(column2) as total where (date1 <= column1 and column1 <= date2)

 

perhaps with a group by Field1 or Field2 (or both).

 

Are your dates all SAS dates (not character strings)? 

 

Does it take both Field1 and Field2 to uniquely identify an observation?

 

There are plenty of SQL programmers here that can give you the right syntax.

Frequent Contributor
Frequent Contributor
Posts: 83

Re: Sum of values based on date range

Posted in reply to Astounding
Yes, the date values are numeric. You're hitting on exactly what I'm trying to do. I'm just struggling with the syntax.
Frequent Contributor
Frequent Contributor
Posts: 83

Re: Sum of values based on date range

I have attached sample data.

 

Sheet 'Qry_Btbl' is what the original sql query result would look like.

Sheet 'Btbl_Dates' is a sample of the date values.

Sheet 'updated Qry_Btbl' is the result I would like to achieve.

 

I am trying to use the date values from 'Qry_Btbl' and find the sum of Column2 in 'Btbl_Dates'. I call this sum NewColumn in 'updated Qry_Btbl'.

 

I'm not sure how to really join the tables. 'Qry_Btbl' is more of a look up table I want to use to sum values between dates. Hopefully this isnt too confusing. 

 

Super User
Posts: 11,343

Re: Sum of values based on date range

Many users here don't want to download Excel files because of virus potential, others have such things blocked by security software. Also if you give us Excel we have to create a SAS data set and due to the non-existent constraints on Excel data cells the result we end up with may not have variables of the same type (numeric or character) and even values.

 

https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat... has instructions on how to create datastep code that can be pasted here or attached as a TEXT file that we can execute to have the example data to test code against.

Trusted Advisor
Posts: 1,022

Re: Sum of values based on date range

If you have a small table then the PROC SQL is good.  If you have a large table, then ideally you would like a 365*365 matrix (call it N_DAYS) populated with the number of weighted days from the row date (date1) to the column date (date2).  I.e. you would want:

 

data want2 (drop=date column2 from to);
  set Qry_Btbl;
  new_column=n_days{date1,date2};
run;

So the issue is how to make the array N_DAYS.  Here's how

 

%let d1=01oct2016;
%let d2=31oct2016;
%let n_cells=%eval(%eval(%sysfunc(inputn(&d2,date9.))-%sysfunc(inputn(&d1,date9.))+1)**2);

data want2 (drop=date column2 from to);
  array n_days {%sysfunc(inputn(&d1,date9.)):%sysfunc(inputn(&d2,date9.))
               ,%sysfunc(inputn(&d1,date9.)):%sysfunc(inputn(&d2,date9.))
               } _temporary_ ;

  if _n_=1 then  do until (end_of_dates);    /*  Populate the N_DAYS matrix*/
    set btbl_dates (rename=(column1=date)) end=end_of_dates;  
    do from="&d1"d to "&d2"d ;
      do to=from to "&d2"d;
        if (from <= date) and (date <= to) then n_days{from,to}=sum(n_days{from,to},column2); 
      end;
    end;
    n_days{date,date}=1;
  end;

  set Qry_Btbl;
  new_column=n_days{date1,date2};
run;

 Basically all you need to do is assign date values (without the usual 'd) to macrovars D1 and D2.

 

regards,

Mark

Frequent Contributor
Frequent Contributor
Posts: 83

Re: Sum of values based on date range

[ Edited ]

Understandable request. I thought you didnt need to download the file to view it but it would mae sense to have some data to play with.

 

Here are the datasets. Please ignore the differnet date syntax. My goal is to use Date1 and Date2 from the first dataset and calculate NewColum found in the thrir dataset by using the values in column2 of the second dataset. Sounds more confusing than it is I think.  

 


data Qry_Btbl;
infile datalines dsd truncover;
input 
Field1:$3.
Field2:3.	
Date1:mmddyy6.	
Date2:mmddyy6.
;

datalines;
rhw,	553,	10/3/2016,	10/11/2016
ghj,	251,	10/3/2016,	10/11/2016
rwr,	115,	10/3/2016,	10/12/2016
zsf,	555,	10/3/2016,	10/13/2016
nbf,	513,	10/6/2016,	10/14/2016
nws,	844,	10/6/2016,	10/18/2016
cxm,	350,	10/6/2016,	10/19/2016
olp,	422,	10/6/2016,	10/17/2016
;;;;



data Btbl_Dates;
infile datalines dsd truncover;
input 
Column1:mmddyy6.
Column2:1.	
;

datalines;
01Oct2016,	0
02Oct2016,	0
03Oct2016,	1
04Oct2016,	1
05Oct2016,	1
06Oct2016,	1
07Oct2016,	1
08Oct2016,	0
09Oct2016,	0
10Oct2016,	0
11Oct2016,	1
12Oct2016,	1
13Oct2016,	1
14Oct2016,	1
15Oct2016,	0
16Oct2016,	0
17Oct2016,	1
18Oct2016,	1
19Oct2016,	1
20Oct2016,	1
21Oct2016,	1
22Oct2016,	0
23Oct2016,	0
24Oct2016,	1
25Oct2016,	1
26Oct2016,	1
27Oct2016,	1
28Oct2016,	1
29Oct2016,	0
30Oct2016,	0
31Oct2016,	1
;;;;





data updated_Qry_Btbl;
infile datalines dsd truncover;
input 
Field1:$3.
Field2:3.	
Date1:mmddyy6.	
Date2:mmddyy6.
NewColumn:2.
;

datalines;
rhw,	553,	10/3/2016,	10/11/2016, 6
ghj,	251,	10/3/2016,	10/11/2016, 6
rwr,	115,	10/3/2016,	10/12/2016, 7
zsf,	555,	10/3/2016,	10/13/2016, 8
nbf,	513,	10/6/2016,	10/14/2016, 6
nws,	844,	10/6/2016,	10/18/2016, 8
cxm,	350,	10/6/2016,	10/19/2016, 9
olp,	422,	10/6/2016,	10/17/2016, 7
;;;;

 

Solution
‎10-13-2016 08:19 AM
Super User
Posts: 10,044

Re: Sum of values based on date range

If you don't have a big table , it is easy for SQL.

 


data Qry_Btbl;
infile datalines dsd truncover expandtabs;
input 
Field1:$3.
Field2:3.	
Date1:mmddyy.	
Date2:mmddyy.
;
format date1 date2 date9.;
datalines;
rhw,	553,	10/3/2016,	10/11/2016
ghj,	251,	10/3/2016,	10/11/2016
rwr,	115,	10/3/2016,	10/12/2016
zsf,	555,	10/3/2016,	10/13/2016
nbf,	513,	10/6/2016,	10/14/2016
nws,	844,	10/6/2016,	10/18/2016
cxm,	350,	10/6/2016,	10/19/2016
olp,	422,	10/6/2016,	10/17/2016
;;;;



data Btbl_Dates;
infile datalines dsd truncover expandtabs;
input 
Column1:date9.
Column2	
;

datalines;
01Oct2016,	0
02Oct2016,	0
03Oct2016,	1
04Oct2016,	1
05Oct2016,	1
06Oct2016,	1
07Oct2016,	1
08Oct2016,	0
09Oct2016,	0
10Oct2016,	0
11Oct2016,	1
12Oct2016,	1
13Oct2016,	1
14Oct2016,	1
15Oct2016,	0
16Oct2016,	0
17Oct2016,	1
18Oct2016,	1
19Oct2016,	1
20Oct2016,	1
21Oct2016,	1
22Oct2016,	0
23Oct2016,	0
24Oct2016,	1
25Oct2016,	1
26Oct2016,	1
27Oct2016,	1
28Oct2016,	1
29Oct2016,	0
30Oct2016,	0
31Oct2016,	1
;;;;



proc sql;
create table want as
 select a.*,sum(column2) as new_column
  from Qry_Btbl as a,Btbl_Dates as b
   where b.column1 between a.date1 and a.date2
    group by 1,2,3,4 ;
quit;
Frequent Contributor
Frequent Contributor
Posts: 83

Re: Sum of values based on date range

Verified. Ksharp to the rescue again.

Thanks for the help. This was giving me such a hard time.
☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 10 replies
  • 543 views
  • 0 likes
  • 6 in conversation