data have;
infile datalines dsd truncover;
input ID $13. Transacion_date date9. Amount 8. Replicate $3. month $3;
datalines4;
1212121,10/21/2016,5000,1,10
1212121,10/21/2016,45000,1,10
1515151,22/06/2017,50000,1,6
1212185,23/05/2017,60000,1,5
21212121,15/06/2015,70000,1,6
258523,15/08/2017,80000,1,8
1212121,10/21/2016,5000,2,10
1212121,10/21/2016,45000,2,10
1515151,22/06/2017,50000,2,6
1212185,23/05/2017,60000,2,5
21212121,15/06/2015,70000,2,6
258523,15/08/2017,80000,2,8;
run; Hello, I am trying to merge a dataset with itself in order to handle difference in dates. I have a way to do it in proc sql. However, I want to do it in a datastep to allow for the indexing that I put on the dataset to be used. I read https://communities.sas.com/t5/General-SAS-Programming/data-step-to-merge-for-a-date-range/td-p/191050 and I attempted to follow it. Long story short, am not getting results I expected. Here is what I attempted data perm.havenew;
merge perm.have (in=a drop= cash_wd cash_wdtri dtotal dtotaltri) perm.have (in=b rename=(pdate=b_pdate));
by replicate id;
if (pdate - b_pdate) ge 0 and (pdate - b_pdate) le 6 ;
run; and here is the warning message I received. OTE: MERGE statement has more than one data set with repeats of BY values. 4 The SAS System 10:10 Monday, March 12, 2018 NOTE: There were 167298200 observations read from the data set perm.have. NOTE: There were 167298200 observations read from the data set perm.have. NOTE: The data set perm.have has 167298200 observations and 9 variables. NOTE: Compressing data set perm.CASHWD2 decreased size by 26.97 percent. Compressed is 194248 pages; un-compressed would require 265975 pages. NOTE: DATA statement used (Total process time): real time 3:25.11 cpu time 3:20.16 proc sql; create table have as
select a.replicate, a.aml_party_id, a.pdate, a.mnth
from perm.cashwd1 a
left join perm.cashwd1 b
on a.id = b.id and a.replicate = b.replicate
and a.pdate between b.pdate and b.pdate + 6
group by a.replicate, a.aml_party_id, a.pdate, a.mnth; quit; Please note that the sql code does more but it also takes far more time. here is some dummy data. Also, here is the number of records, I should get CASHWDTMP created, with 512251800 rows and 8 columns
... View more