DATA Step, Macro, Functions and more

Insert duplicate rows based on latest months

Reply
Contributor
Posts: 23

Insert duplicate rows based on latest months

Hi ,

 

i have a issue in a dataset where i need to insert a duplicate rows on the months where the clm_key is missing betwwen the first.clm_key and last.clm_key based on the previous latest months row.

 

sample data is below.]

 

data table1;

informat Notified_date sett_date rep_mnth ddmmyy10.;

format Notified_date sett_date rep_mnth ddmmyy10.;

input clm_key$ Notified_date sett_date sett_cnt out_cnt rep_mnth rep_mnthname $ Notif_year sett_year Notif_cnt reopen_cnt reopen_sett_cnt sett_reopnen_cnt;

cards;

CLM1 28/08/2014 21/02/2015 1 0 01/02/2015 Feb 2014 2015 . . 1 . .

CLM1 28/08/2014 29/08/2014 . 1 01/11/2014 Nov 2014 2014 . . . . .

CLM1 28/08/2014 29/08/2014 . 1 01/10/2014 Oct 2014 2014 . . . . .

CLM1 28/08/2014 29/08/2014 . 1 01/10/2014 Oct 2014 2014 . . . . .

CLM1 28/08/2014 29/08/2014 . 1 01/09/2014 Sep 2014 2014 . 1 . . .

CLM1 28/08/2014 29/08/2014 . 1 01/09/2014 Sep 2014 2014 . 1 . . .

CLM1 28/08/2014 29/08/2014 . 1 01/08/2014 Aug 2014 2014 1 . 1 . .

CLM2 28/08/2014 21/02/2015 1 0 01/02/2015 Feb 2014 2015 . . 1 .

CLM2 28/08/2014 21/02/2015 1 0 01/02/2015 Feb 2014 2015 . . 1 .

CLM2 28/08/2014 21/02/2015 1 0 01/02/2015 Feb 2014 2015 . . 1 .

CLM2 28/08/2014 29/08/2014 . 1 01/10/2014 Oct 2014 2014 . . . .

CLM2 28/08/2014 29/08/2014 . 1 01/10/2014 Oct 2014 2014 . . . .

CLM2 28/08/2014 29/08/2014 . 1 01/09/2014 Sep 2014 2014 . 1 . .

CLM2 28/08/2014 29/08/2014 . 1 01/08/2014 Aug 2014 2014 1 . 1 .

;

run;

 

expected output should have 25 rows with inserted rows as below.

 

clm_keyNotified_datesett_datesett_cntout_cntrep_mnthrep_mnthnameNotif_yearsett_yearNotif_cntreopen_cntreopen_sett_cntsett_reopnen_cnt
CLM128/08/201421/02/20151001/02/2015Feb20142015..1.
CLM128/08/201429/08/2014.101/01/2015Nov20142014....
CLM128/08/201429/08/2014.101/12/2014Nov20142014....
CLM128/08/201429/08/2014.101/11/2014Nov20142014....
CLM128/08/201429/08/2014.101/01/2015Nov20142014....
CLM128/08/201429/08/2014.101/12/2014Nov20142014....
CLM128/08/201429/08/2014.101/11/2014Nov20142014....
CLM128/08/201429/08/2014.101/10/2014Oct20142014....
CLM128/08/201429/08/2014.101/10/2014Oct20142014....
CLM128/08/201429/08/2014.101/09/2014Sep20142014.1..
CLM128/08/201429/08/2014.101/09/2014Sep20142014.1..
CLM128/08/201429/08/2014.101/08/2014Aug201420141.1.
CLM228/08/201421/02/20151001/02/2015Feb20142015..1.
CLM228/08/201421/02/20151001/02/2015Feb20142015..1.
CLM228/08/201421/02/20151001/02/2015Feb20142015..1.
CLM228/08/201429/08/2014.101/01/2015Oct20142014....
CLM228/08/201429/08/2014.101/01/2015Oct20142014....
CLM228/08/201429/08/2014.101/12/2014Oct20142014....
CLM228/08/201429/08/2014.101/12/2014Oct20142014....
CLM228/08/201429/08/2014.101/11/2014Oct20142014....
CLM228/08/201429/08/2014.101/11/2014Oct20142014....
CLM228/08/201429/08/2014.101/10/2014Oct20142014....
CLM228/08/201429/08/2014.101/10/2014Oct20142014....
CLM228/08/201429/08/2014.101/09/2014Sep20142014.1..
CLM228/08/201429/08/2014.101/08/2014Aug201420141.1.
Super User
Posts: 5,256

Re: Insert duplicate rows based on latest months

If you are licencing ETS, you could probably solve this with PROC EXPAND.

Else, try to create a lookup table with all date/id combinations (select distinct id - Cartesian join with your date range), and then full/left join with your original data.

Data never sleeps
Regular Contributor
Posts: 194

Re: Insert duplicate rows based on latest months

I don't get the logic behind this. For CLM1, you fill the missing rows between February 2015 and November 2014. But why are those three lines repeated below ?
Regular Contributor
Posts: 194

Re: Insert duplicate rows based on latest months

You can adapt the following program to suit your exact needs

 

data not_quite_what_you_want(rename=(mnth=rep_mnth));
	set table1;
	by clm_key;

	drop rep_mnth;
	format mnth ddmmyy10.;

	mnth=lag(rep_mnth);

	if first.clm_key then do;
		mnth=rep_mnth;
		output;
	end;

	do while (mnth gt rep_mnth);
		mnth=intnx("month",mnth,-1);
		output;
		if mnth ne rep_mnth then output;	
	end;
run;
Ask a Question
Discussion stats
  • 3 replies
  • 144 views
  • 0 likes
  • 3 in conversation