BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Myurathan
Quartz | Level 8

Hi Experts, 

I have two tables. I will have to append table B to table A and continue to calculate the variable DPD. 

Table A

CID SIDDateFinal_flgDPD
aab20200101Y5
aab20200102Y6
aab20200103Y7
aab20200104N0
bbc20200101Y25
bbc20200102Y26
bbc20200103Y27
aad20200101N0
aad20200102N0
aad20200103Y1
aad20200104Y2

 

Table B

CID SIDDateFinal_flg
aab20200105Y
aab20200106Y
aab20200107Y
aad20200105Y
aad20200106Y
aad20200107Y
bbc20200105N
bbc20200106Y
bbc20200107Y
eef20200105N
eef20200106Y

 

Table Want

CID SIDDateFinal_flgDPD
aab20200101Y5
aab20200102Y6
aab20200103Y7
aab20200104N0
aab20200105Y1
aab20200106Y2
aab20200107Y3
aad20200101N0
aad20200102N0
aad20200103Y1
aad20200104Y2
aad20200105Y3
aad20200106Y4
aad20200107Y5
bbc20200101Y25
bbc20200102Y26
bbc20200103Y27
bbc20200105N0
bbc20200106Y1
bbc20200107Y2
eef20200105N0
eef20200106Y1

 

DPD variable has to count whenever there is Final_flag ="Y", it resets to 0 when Final_flag ="N". Following code is used to calculate DPD.

data Table_A;
set input;
by CID SID;
	if first.CID | final_flg=0 then DPD=0;
	if final_flg then DPD+1;
run;

I cannot rerun the whole table for DPD variable. Once we appended Table B, DPD needs to be updated for the data derived from table B. count should start from least DPD count available for the CID and SID. 

 

Thank you so much for your time and expertise help. 

 

Regards,

Myu

 

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
novinosrin
Tourmaline | Level 20

data a;
input (CID 	SID) ($)	Date yymmdd10.	Final_flg $	DPD ;
format date yymmdd10.;
cards;
aa	b	20200101	Y	5
aa	b	20200102	Y	6
aa	b	20200103	Y	7
aa	b	20200104	N	0
bb	c	20200101	Y	25
bb	c	20200102	Y	26
bb	c	20200103	Y	27
aa	d	20200101	N	0
aa	d	20200102	N	0
aa	d	20200103	Y	1
aa	d	20200104	Y	2
;

data b;
input (CID 	SID) ($)	Date yymmdd10.	Final_flg $;
format date yymmdd10.;
cards;
aa	b	20200105	Y
aa	b	20200106	Y
aa	b	20200107	Y
aa	d	20200105	Y
aa	d	20200106	Y
aa	d	20200107	Y
bb	c	20200105	N
bb	c	20200106	Y
bb	c	20200107	Y
ee	f	20200105	N
ee	f	20200106	Y
;
proc sort data=a out=a_sort;
by cid sid date;
run;

proc sort data=b out=b_sort;
by cid sid date;
run;
data want;
 set a_sort b_sort;
 by cid sid  ;
 if first.sid then _dpd=.;
 If dpd>. then _dpd=dpd;
 else if dpd=. then if Final_flg='N' then _dpd=0;
 else _dpd+1;
 rename _dpd=dpd;
 drop dpd;
run;
proc print noobs;run;
CID SID Date Final_flg dpd
aa b 2020-01-01 Y 5
aa b 2020-01-02 Y 6
aa b 2020-01-03 Y 7
aa b 2020-01-04 N 0
aa b 2020-01-05 Y 1
aa b 2020-01-06 Y 2
aa b 2020-01-07 Y 3
aa d 2020-01-01 N 0
aa d 2020-01-02 N 0
aa d 2020-01-03 Y 1
aa d 2020-01-04 Y 2
aa d 2020-01-05 Y 3
aa d 2020-01-06 Y 4
aa d 2020-01-07 Y 5
bb c 2020-01-01 Y 25
bb c 2020-01-02 Y 26
bb c 2020-01-03 Y 27
bb c 2020-01-05 N 0
bb c 2020-01-06 Y 1
bb c 2020-01-07 Y 2
ee f 2020-01-05 N 0
ee f 2020-01-06 Y 1

View solution in original post

3 REPLIES 3
novinosrin
Tourmaline | Level 20

data a;
input (CID 	SID) ($)	Date yymmdd10.	Final_flg $	DPD ;
format date yymmdd10.;
cards;
aa	b	20200101	Y	5
aa	b	20200102	Y	6
aa	b	20200103	Y	7
aa	b	20200104	N	0
bb	c	20200101	Y	25
bb	c	20200102	Y	26
bb	c	20200103	Y	27
aa	d	20200101	N	0
aa	d	20200102	N	0
aa	d	20200103	Y	1
aa	d	20200104	Y	2
;

data b;
input (CID 	SID) ($)	Date yymmdd10.	Final_flg $;
format date yymmdd10.;
cards;
aa	b	20200105	Y
aa	b	20200106	Y
aa	b	20200107	Y
aa	d	20200105	Y
aa	d	20200106	Y
aa	d	20200107	Y
bb	c	20200105	N
bb	c	20200106	Y
bb	c	20200107	Y
ee	f	20200105	N
ee	f	20200106	Y
;
proc sort data=a out=a_sort;
by cid sid date;
run;

proc sort data=b out=b_sort;
by cid sid date;
run;
data want;
 set a_sort b_sort;
 by cid sid  ;
 if first.sid then _dpd=.;
 If dpd>. then _dpd=dpd;
 else if dpd=. then if Final_flg='N' then _dpd=0;
 else _dpd+1;
 rename _dpd=dpd;
 drop dpd;
run;
proc print noobs;run;
CID SID Date Final_flg dpd
aa b 2020-01-01 Y 5
aa b 2020-01-02 Y 6
aa b 2020-01-03 Y 7
aa b 2020-01-04 N 0
aa b 2020-01-05 Y 1
aa b 2020-01-06 Y 2
aa b 2020-01-07 Y 3
aa d 2020-01-01 N 0
aa d 2020-01-02 N 0
aa d 2020-01-03 Y 1
aa d 2020-01-04 Y 2
aa d 2020-01-05 Y 3
aa d 2020-01-06 Y 4
aa d 2020-01-07 Y 5
bb c 2020-01-01 Y 25
bb c 2020-01-02 Y 26
bb c 2020-01-03 Y 27
bb c 2020-01-05 N 0
bb c 2020-01-06 Y 1
bb c 2020-01-07 Y 2
ee f 2020-01-05 N 0
ee f 2020-01-06 Y 1
Myurathan
Quartz | Level 8
I owe you big time for your help. Thank you so much Sir.
s_lassen
Meteorite | Level 14

You say that your TABLE_A is created by setting an input BY CID SID, but your TABLE_A is not sorted by those two variables. 

 

I assume that TABLE_A is a large master table, which has been modified in some other way after appending, and that you need to append more data and update it without recreating (e.g. sorting) the master table. In that case, indexed access may be the way to go:

data TABLE_A;
  informat CID $2. 	SID SID $1.	Date yymmdd8.	Final_flg $1.	DPD 2.;
  input CID  	SID	Date	Final_flg	DPD;
  format date yymmddn8.;
cards;
aa b 20200101 Y 5
aa b 20200102 Y 6
aa b 20200103 Y 7
aa b 20200104 N 0
bb c 20200101 Y 25
bb c 20200102 Y 26
bb c 20200103 Y 27
aa d 20200101 N 0
aa d 20200102 N 0
aa d 20200103 Y 1
aa d 20200104 Y 2
;run;

data TABLE_B;
  informat CID $2. 	SID SID $1.	Date yymmdd8.	Final_flg $1.	;
  input CID  	SID	Date	Final_flg	;
  format date yymmddn8.;
cards;
aa b 20200105 Y
aa b 20200106 Y
aa b 20200107 Y
aa d 20200105 Y
aa d 20200106 Y
aa d 20200107 Y
bb c 20200105 N
bb c 20200106 Y
bb c 20200107 Y
ee f 20200105 N
ee f 20200106 Y
;run;

proc sql;
  create unique index idx on TABLE_A(CID,SID,Date);
quit;

proc append base=TABLE_A data=TABLE_B force; run; data TABLE_A; set TABLE_A(cntllev=rec); by CID SID; if first.CID | final_flg='N' then _DPD=0; if final_flg='Y' then _DPD+1; if missing(DPD); modify TABLE_A(cntllev=rec) key=idx; DPD=_DPD; replace; run;

The TABLE_A does not have the same sort order as your WANT table, but you can still access it BY CID SID DATE, using the index:

data something;
  set TABLE_A;
  by CID SID DATE;
  /* more statements */

It is also possible to do the append and the update in one step, if the transaction table (TABLE_B) is sorted:

data TABLE_A;
  if 0 then modify TABLE_A(cntllev=rec);
  set TABLE_A(cntllev=rec) TABLE_B;
  by CID SID date;
  if first.CID | final_flg='N' then _DPD=0;
  if final_flg='Y' then _DPD+1;
  if missing(DPD);
  DPD=_DPD;
  output;  
run;

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 3 replies
  • 1282 views
  • 1 like
  • 3 in conversation