BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
KamikazeBassi
Fluorite | Level 6

Hi,

 

I am new to using macros and sas, so any help is much appreciated.  

I have two tables.  

one table lists accounts, start date and end date.

Accounts

ACCT

START_DATE

END_DATE

1234

15JAN2020

15MAY2020

6789

27MAR2020

27MAY2020

 

Another table is a typical daily table for each account with a date and balance.  There is a row for each account, on each date.

    

Data

ACCT

BAL

ID_DATE

1234

7.00

15JAN2020

1234

6.59

16JAN2020

1234

7.56

17JAN2020

6789

43.54

27MAR2020

6789

45.66

28MAR2020

 

What i am trying to do is take the accounts data and for each account loop through the data able from between the start and end date to bring back the balance. 

 

I only want to bring back the balance on a monthly basis between the start and end date from when start date starts from. so I end up something like this for example.

 

ACCT

BAL

ID_DATE

1234

7.00

15JAN2020

1234

15.00

15FEB2020

1234

32.33

15MAR2020

1234

43.24

15APR2020

1234

12.43

15MAY2020

6789

43.54

27MAR2020

6789

43.52

27MAR2020

6789

23.42

27APR2020

 

Im not sure if i am making it over complicated by using loops or can do it using sql and some built in SAS functions.

 

any help very much appreciated. 

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
novinosrin
Tourmaline | Level 20


data have1;
input ACCT	(START_DATE	END_DATE) (:date9.);
format START_DATE	END_DATE date9.;
cards;
1234	15-Jan-20	15-May-20
6789	27-Mar-20	27-May-20
;


data have2;
input ACCT	BAL	ID_DATE :date9.;
format id_date date9.; 
cards;
1234	7	15-Jan-20
1234	6.59	16-Jan-20
1234	7.56	17-Jan-20
6789	43.54	27-Mar-20
6789	45.66	28-Mar-20
;

data want ;
 if _n_=1 then do;
  dcl hash H () ;
  h.definekey  ("ID_DATE") ;
  h.definedata ("bal","ID_DATE") ;
  h.definedone () ;
 end;
 do until(last.acct);
  set have2;
  by acct;
  h.add();
 end;
 set have1;
 do until(START_DATE>END_DATE);
  if h.find(key:START_DATE) ne 0 then call missing(id_date,bal);
  output;
  START_DATE=intnx('mon',START_DATE,1,'s');
 end;
 h.clear();
 keep acct id_date bal START_DATE;
 rename START_DATE=date;
run;

 This assumes the datasets are sorted just like your example

View solution in original post

7 REPLIES 7
Kurt_Bremser
Super User

No loop or something like that needed, only a SQL join:

proc sql;
create table want as
  select b.*
  from accounts a left join data b
  on a.acct = b.acct and b.id_date between a.start_date and a.end_date
;
quit;

(untested, posted from my tablet)

KamikazeBassi
Fluorite | Level 6
Thanks for the quick reply...not sure what happened to the formatting of my tables in the question.

your solution would work, but would bring back all dates between a.start_date and a.end_data. 15JAN 16JAN 17JAN...

I would only need to bring back data for the followings months on the same date as the start date. 15JAN 15FEB 15MAR..

thanks

Kurt_Bremser
Super User

Add an additional condition to the ON clause:

on a.acct = b.acct and b.id_date between a.start_date and a.end_date and day(b.id_date) = day(a.start_date)
RichardDeVen
Barite | Level 11

SQL does not have a looping process to create multiple rows from a single row.  DATA step does.

 

Create a DATA step view that outputs rows for each mid-month point from START to END.  Then join the view with the DATA on view's dates

 

Example:

data accounts;
input acct start_date: date9. end_date: date9.;
format start_date end_date date9.;
datalines;
1234 15JAN2020 15MAY2020
6789 27MAR2020 27MAY2020
;

data dates_v / view=dates_v;
  set accounts;
  do index = 0 by 1 until (date >= end_date or index >= 1000);
    date = intnx('month', start_date, index, 'SAME');
    output;
  end;
  format date date9.;
  keep acct date;
run;

Then 

create table want as 
select 
  data.*
from
  dates_v as V
left join 
  data
on 
  data.acct = V.acct and
  data.id_date = V.date

 

novinosrin
Tourmaline | Level 20


data have1;
input ACCT	(START_DATE	END_DATE) (:date9.);
format START_DATE	END_DATE date9.;
cards;
1234	15-Jan-20	15-May-20
6789	27-Mar-20	27-May-20
;


data have2;
input ACCT	BAL	ID_DATE :date9.;
format id_date date9.; 
cards;
1234	7	15-Jan-20
1234	6.59	16-Jan-20
1234	7.56	17-Jan-20
6789	43.54	27-Mar-20
6789	45.66	28-Mar-20
;

data want ;
 if _n_=1 then do;
  dcl hash H () ;
  h.definekey  ("ID_DATE") ;
  h.definedata ("bal","ID_DATE") ;
  h.definedone () ;
 end;
 do until(last.acct);
  set have2;
  by acct;
  h.add();
 end;
 set have1;
 do until(START_DATE>END_DATE);
  if h.find(key:START_DATE) ne 0 then call missing(id_date,bal);
  output;
  START_DATE=intnx('mon',START_DATE,1,'s');
 end;
 h.clear();
 keep acct id_date bal START_DATE;
 rename START_DATE=date;
run;

 This assumes the datasets are sorted just like your example

s_lassen
Meteorite | Level 14

Something like this?

proc sql;
  create table want as select data.* 
   from data join accounts
   on data.ID_DATE between accounts.START_DATE and accounts.END_DATE
        and day(data.ID_DATE)=day(accounts.START_DATE);
quit;
KamikazeBassi
Fluorite | Level 6

Thanks for all the solutions, I am just trying them out to find which works most efficiently for my dataset.   The dataset is quite large. 

 

thanks

 

Ninder

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 7 replies
  • 2365 views
  • 0 likes
  • 5 in conversation