- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hello,
I have a table of dates [Btbl_Dates] with 365 rows and 2 columns. Column1 has date values (Jan1,2016 to Dec 31, 2016) and Column2 has either a 1 or 0.
I have a sql query [Qry_Btbl] that returns a few hundred rows of data and 4 columns: Field1, Field2, Date1, Date2.
I would like to add a calculation to [Qry_Btbl] and create a 5th column that uses Date1 and Date2 to sum the values in Column2 of [Btbl_Dates] but I'm having some difficulty doing this.
Thoughts? Appreciate any assistance.
Accepted Solutions
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
If you don't have a big table , it is easy for SQL.
data Qry_Btbl;
infile datalines dsd truncover expandtabs;
input
Field1:$3.
Field2:3.
Date1:mmddyy.
Date2:mmddyy.
;
format date1 date2 date9.;
datalines;
rhw, 553, 10/3/2016, 10/11/2016
ghj, 251, 10/3/2016, 10/11/2016
rwr, 115, 10/3/2016, 10/12/2016
zsf, 555, 10/3/2016, 10/13/2016
nbf, 513, 10/6/2016, 10/14/2016
nws, 844, 10/6/2016, 10/18/2016
cxm, 350, 10/6/2016, 10/19/2016
olp, 422, 10/6/2016, 10/17/2016
;;;;
data Btbl_Dates;
infile datalines dsd truncover expandtabs;
input
Column1:date9.
Column2
;
datalines;
01Oct2016, 0
02Oct2016, 0
03Oct2016, 1
04Oct2016, 1
05Oct2016, 1
06Oct2016, 1
07Oct2016, 1
08Oct2016, 0
09Oct2016, 0
10Oct2016, 0
11Oct2016, 1
12Oct2016, 1
13Oct2016, 1
14Oct2016, 1
15Oct2016, 0
16Oct2016, 0
17Oct2016, 1
18Oct2016, 1
19Oct2016, 1
20Oct2016, 1
21Oct2016, 1
22Oct2016, 0
23Oct2016, 0
24Oct2016, 1
25Oct2016, 1
26Oct2016, 1
27Oct2016, 1
28Oct2016, 1
29Oct2016, 0
30Oct2016, 0
31Oct2016, 1
;;;;
proc sql;
create table want as
select a.*,sum(column2) as new_column
from Qry_Btbl as a,Btbl_Dates as b
where b.column1 between a.date1 and a.date2
group by 1,2,3,4 ;
quit;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
You should provide some example data from both data sets and what the final result should be. It also wouldn't hurt to show how you are joining the sets with your current code.
Also you may need to make sure that your dates are actual SAS date valued numerics and not character variables depending on how you are using the dates.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Please post example test data - in the form of a datastep - and what the output should look like. It is not clear from the explanation what you are doing with "that uses Date1 and Date2 to sum the values".
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
It looks like you would need something along the lines of:
sum(column2) as total where (date1 <= column1 and column1 <= date2)
perhaps with a group by Field1 or Field2 (or both).
Are your dates all SAS dates (not character strings)?
Does it take both Field1 and Field2 to uniquely identify an observation?
There are plenty of SQL programmers here that can give you the right syntax.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
I have attached sample data.
Sheet 'Qry_Btbl' is what the original sql query result would look like.
Sheet 'Btbl_Dates' is a sample of the date values.
Sheet 'updated Qry_Btbl' is the result I would like to achieve.
I am trying to use the date values from 'Qry_Btbl' and find the sum of Column2 in 'Btbl_Dates'. I call this sum NewColumn in 'updated Qry_Btbl'.
I'm not sure how to really join the tables. 'Qry_Btbl' is more of a look up table I want to use to sum values between dates. Hopefully this isnt too confusing.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Many users here don't want to download Excel files because of virus potential, others have such things blocked by security software. Also if you give us Excel we have to create a SAS data set and due to the non-existent constraints on Excel data cells the result we end up with may not have variables of the same type (numeric or character) and even values.
https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat... has instructions on how to create datastep code that can be pasted here or attached as a TEXT file that we can execute to have the example data to test code against.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
If you have a small table then the PROC SQL is good. If you have a large table, then ideally you would like a 365*365 matrix (call it N_DAYS) populated with the number of weighted days from the row date (date1) to the column date (date2). I.e. you would want:
data want2 (drop=date column2 from to);
set Qry_Btbl;
new_column=n_days{date1,date2};
run;
So the issue is how to make the array N_DAYS. Here's how
%let d1=01oct2016;
%let d2=31oct2016;
%let n_cells=%eval(%eval(%sysfunc(inputn(&d2,date9.))-%sysfunc(inputn(&d1,date9.))+1)**2);
data want2 (drop=date column2 from to);
array n_days {%sysfunc(inputn(&d1,date9.)):%sysfunc(inputn(&d2,date9.))
,%sysfunc(inputn(&d1,date9.)):%sysfunc(inputn(&d2,date9.))
} _temporary_ ;
if _n_=1 then do until (end_of_dates); /* Populate the N_DAYS matrix*/
set btbl_dates (rename=(column1=date)) end=end_of_dates;
do from="&d1"d to "&d2"d ;
do to=from to "&d2"d;
if (from <= date) and (date <= to) then n_days{from,to}=sum(n_days{from,to},column2);
end;
end;
n_days{date,date}=1;
end;
set Qry_Btbl;
new_column=n_days{date1,date2};
run;
Basically all you need to do is assign date values (without the usual 'd) to macrovars D1 and D2.
regards,
Mark
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set
Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets
--------------------------
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Understandable request. I thought you didnt need to download the file to view it but it would mae sense to have some data to play with.
Here are the datasets. Please ignore the differnet date syntax. My goal is to use Date1 and Date2 from the first dataset and calculate NewColum found in the thrir dataset by using the values in column2 of the second dataset. Sounds more confusing than it is I think.
data Qry_Btbl;
infile datalines dsd truncover;
input
Field1:$3.
Field2:3.
Date1:mmddyy6.
Date2:mmddyy6.
;
datalines;
rhw, 553, 10/3/2016, 10/11/2016
ghj, 251, 10/3/2016, 10/11/2016
rwr, 115, 10/3/2016, 10/12/2016
zsf, 555, 10/3/2016, 10/13/2016
nbf, 513, 10/6/2016, 10/14/2016
nws, 844, 10/6/2016, 10/18/2016
cxm, 350, 10/6/2016, 10/19/2016
olp, 422, 10/6/2016, 10/17/2016
;;;;
data Btbl_Dates;
infile datalines dsd truncover;
input
Column1:mmddyy6.
Column2:1.
;
datalines;
01Oct2016, 0
02Oct2016, 0
03Oct2016, 1
04Oct2016, 1
05Oct2016, 1
06Oct2016, 1
07Oct2016, 1
08Oct2016, 0
09Oct2016, 0
10Oct2016, 0
11Oct2016, 1
12Oct2016, 1
13Oct2016, 1
14Oct2016, 1
15Oct2016, 0
16Oct2016, 0
17Oct2016, 1
18Oct2016, 1
19Oct2016, 1
20Oct2016, 1
21Oct2016, 1
22Oct2016, 0
23Oct2016, 0
24Oct2016, 1
25Oct2016, 1
26Oct2016, 1
27Oct2016, 1
28Oct2016, 1
29Oct2016, 0
30Oct2016, 0
31Oct2016, 1
;;;;
data updated_Qry_Btbl;
infile datalines dsd truncover;
input
Field1:$3.
Field2:3.
Date1:mmddyy6.
Date2:mmddyy6.
NewColumn:2.
;
datalines;
rhw, 553, 10/3/2016, 10/11/2016, 6
ghj, 251, 10/3/2016, 10/11/2016, 6
rwr, 115, 10/3/2016, 10/12/2016, 7
zsf, 555, 10/3/2016, 10/13/2016, 8
nbf, 513, 10/6/2016, 10/14/2016, 6
nws, 844, 10/6/2016, 10/18/2016, 8
cxm, 350, 10/6/2016, 10/19/2016, 9
olp, 422, 10/6/2016, 10/17/2016, 7
;;;;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
If you don't have a big table , it is easy for SQL.
data Qry_Btbl;
infile datalines dsd truncover expandtabs;
input
Field1:$3.
Field2:3.
Date1:mmddyy.
Date2:mmddyy.
;
format date1 date2 date9.;
datalines;
rhw, 553, 10/3/2016, 10/11/2016
ghj, 251, 10/3/2016, 10/11/2016
rwr, 115, 10/3/2016, 10/12/2016
zsf, 555, 10/3/2016, 10/13/2016
nbf, 513, 10/6/2016, 10/14/2016
nws, 844, 10/6/2016, 10/18/2016
cxm, 350, 10/6/2016, 10/19/2016
olp, 422, 10/6/2016, 10/17/2016
;;;;
data Btbl_Dates;
infile datalines dsd truncover expandtabs;
input
Column1:date9.
Column2
;
datalines;
01Oct2016, 0
02Oct2016, 0
03Oct2016, 1
04Oct2016, 1
05Oct2016, 1
06Oct2016, 1
07Oct2016, 1
08Oct2016, 0
09Oct2016, 0
10Oct2016, 0
11Oct2016, 1
12Oct2016, 1
13Oct2016, 1
14Oct2016, 1
15Oct2016, 0
16Oct2016, 0
17Oct2016, 1
18Oct2016, 1
19Oct2016, 1
20Oct2016, 1
21Oct2016, 1
22Oct2016, 0
23Oct2016, 0
24Oct2016, 1
25Oct2016, 1
26Oct2016, 1
27Oct2016, 1
28Oct2016, 1
29Oct2016, 0
30Oct2016, 0
31Oct2016, 1
;;;;
proc sql;
create table want as
select a.*,sum(column2) as new_column
from Qry_Btbl as a,Btbl_Dates as b
where b.column1 between a.date1 and a.date2
group by 1,2,3,4 ;
quit;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Thanks for the help. This was giving me such a hard time.