BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
altijani
Quartz | Level 8

Greetings,

 

I have a complex financial dataset that has three, or even more, robust examples. These examples are shown below in the Have and Want datasets. Please note that for the last example, ID 32145 which shifts to 32145DF once in default, the amount owed is reduced from $300 to $200 because the case was forwarded to default, where it reopens with a) a new ID that ends with DF and b) reduced the principal to the previous principal minus any collections to date. Thus, there should be data for only 32145CN and not both 32145 and 32145DF. The rationale is that it is the same ID/Customer and should not have the ID twice to double count the principal amounts.

 

Have:

IDDateStatusCollection dateAmount owedever collected
123429-Feb-16Open.82.440
123431-Mar-16Open.82.440
123430-Apr-16Open.82.440
123431-May-16Open.82.440
123430-Jun-16Open.82.440
123431-Jul-16Closed20-Jul-1682.4482.44
123431-May-18reopened20-Jul-1682.4482.44
56478929-Feb-16Open02-Feb-1610010
56478931-Mar-16Open02-Feb-1610010
56478930-Apr-16Open04-Apr-1610030
56478931-May-16Open15-May-1610040
56478930-Jun-16Open12-Jun-1610070
56478931-Jul-16Open12-Jun-1610070
56478931-May-18Open22-May-18100100
3214529-Feb-16Open.3000
3214531-Mar-16Open.30055
3214530-Apr-16Open15-Apr-16300100
3214531-May-16Open15-Apr-16300100
32145DF31-May-16default12-May-16200120
32145DF30-Jun-16default06-Jun-16200130

 

Want:

IDDateStatusCollection dateAmount owedever collectedthis month amount
123429-Feb-16Open.82.4400
123431-Mar-16Open.82.4400
123430-Apr-16Open.82.4400
123431-May-16Open.82.4400
123430-Jun-16Open.82.4400
123431-Jul-16Closed20-Jul-1682.4482.4482.44
123431-May-18reopened20-Jul-1682.4482.440
56478929-Feb-16Open02-Feb-161001010
56478931-Mar-16Open02-Feb-16100100
56478930-Apr-16Open04-Apr-161003020
56478931-May-16Open15-May-161004010
56478930-Jun-16Open12-Jun-161007030
56478931-Jul-16Open12-Jun-16100700
56478931-May-18Open22-May-1810010030
32145CN31-May-16default12-May-1630012020
32145CN30-Jun-16default06-Jun-1630013010

 

Thank you

1 ACCEPTED SOLUTION

Accepted Solutions
novinosrin
Tourmaline | Level 20

Try this:

 

data have;
input classification	ID :$10.	Date :date7.	Status :$20.	Collectiondate	:date7. Amountowed	evercollected;
format Date Collectiondate date7.;
cards;
1	1234	29-Feb-16	Open	.	82.44	0
1	1234	31-Mar-16	Open	.	82.44	0
1	1234	30-Apr-16	Open	.	82.44	0
1	1234	31-May-16	Open	.	82.44	0
1	1234	30-Jun-16	Open	.	82.44	0
1	1234	31-Jul-16	Closed	20-Jul-16	82.44	82.44
1	1234	31-May-18	reopened	20-Jul-16	82.44	82.44
2	564789	29-Feb-16	Open	02-Feb-16	100	10
2	564789	31-Mar-16	Open	02-Feb-16	100	10
2	564789	30-Apr-16	Open	04-Apr-16	100	30
2	564789	31-May-16	Open	15-May-16	100	40
2	564789	30-Jun-16	Open	12-Jun-16	100	70
2	564789	31-Jul-16	Open	12-Jun-16	100	70
2	564789	31-May-18	Open	22-May-18	100	100
3	32145	29-Feb-16	Open	.	300	0
3	32145	31-Mar-16	Open	.	300	55
3	32145	30-Apr-16	Open	15-Apr-16	300	100
3	32145	31-May-16	Open	15-Apr-16	300	100
3	32145DF	31-May-16	default	12-May-16	200	120
3	32145DF	30-Jun-16	default	06-Jun-16	200	130
4	91254DF	31-Mar-16	default	02-Mar-16	520	0
4	91254DF	30-Apr-16	default	06-Apr-16	520	50
4	91254DF	31-May-16	default	17-May-16	520	120
5	4105	31-May-16	Open	.	1250	0
;
run;
data _have;
set have;
_id=compress(id,'DF');
run;
data want;
call missing(_f1,_f2);
	do until(last._id);
		set _have;
		by  classification	_ID;
		if anyalpha(id)>0 then _f1=1;
		if anyalpha(id)=0 then _f2=1;
	end;
	call missing(_k1);
	do until(last._id);
		set _have;
		by  classification	_ID;
		thismonthamount=dif(evercollected);
		if first._id then thismonthamount=evercollected;
		if _f1 and _f2 then do;
			if status ne 'default' then _k1=Amountowed;
			else if status='default' then do; Amountowed=_k1;output;end;
		end;
		else output;
	end;
drop _:;
run;

View solution in original post

9 REPLIES 9
novinosrin
Tourmaline | Level 20

Not sure I understood you req well, but anyway:

 

data have;
input ID $	Date :date7.	Status :$10.	Collectiondate :date7.	Amountowed	evercollected;
format date Collectiondate date9.;
cards;
1234	29-Feb-16	Open	.	82.44	0
1234	31-Mar-16	Open	.	82.44	0
1234	30-Apr-16	Open	.	82.44	0
1234	31-May-16	Open	.	82.44	0
1234	30-Jun-16	Open	.	82.44	0
1234	31-Jul-16	Closed	20-Jul-16	82.44	82.44
1234	31-May-18	reopened	20-Jul-16	82.44	82.44
564789	29-Feb-16	Open	2-Feb-16	100	10
564789	31-Mar-16	Open	2-Feb-16	100	10
564789	30-Apr-16	Open	4-Apr-16	100	30
564789	31-May-16	Open	15-May-16	100	40
564789	30-Jun-16	Open	12-Jun-16	100	70
564789	31-Jul-16	Open	12-Jun-16	100	70
564789	31-May-18	Open	22-May-18	100	100
32145	29-Feb-16	Open	.	300	0
32145	31-Mar-16	Open	.	300	55
32145	30-Apr-16	Open	15-Apr-16	300	100
32145	31-May-16	Open	15-Apr-16	300	100
32145DF	31-May-16	default	12-May-16	200	120
32145DF	30-Jun-16	default	6-Jun-16	200	130
;
data _have;
set have;
_id=compress(id,'DF');
run;
data want;
call missing(_f);
do until(last._id);
set _have;
by _id notsorted;
if status='default' then _f=1;
end;
call missing(k1);
do until(last._id);
set _have;
by _id notsorted;
thismonthamount=dif(evercollected);
if first._id then thismonthamount=evercollected;
if _f then do;
if status='Open' then k1=Amountowed;
else if status='default' then do; Amountowed=k1;output;end;
end;
else  output;
end;
drop _: k1;
run;
altijani
Quartz | Level 8

Thanks novinosrin. I tried to apply your code to the data and it does not give the outcome I am looking for.

Simply, I want to have the following outputs for each of the three examples:

Example 1234: “this month amount” should show only the amount collected this month, which is $0. The 82.44 was collected back in July/2016, but the ever collect show the cumulative of all collections.

Example 564789: Same is true for this one too. Collections this month should show what was collected each month and not the cumulative.

Example 32145 and 32145DF: amount owed to be the same as the amounts before sending the ID to default (DF). So for ID 32145DF, I need it to show the original amount owed ($300) instead of the current $200. Same with the collection. It should show the total amounts collected each month including the amounts collected at the non-DF periods. In addition, I don’t want to have the ID 32145 in the data at all, because it will double-count the amount.

Hope this clarifies

novinosrin
Tourmaline | Level 20

Hi @altijani  I think I have understood much better this time. I am willing to work interactively and make any changes required regardless.

Please test and let me know for further edit/changes. Thanks

 

data have;
input ID $	Date :date7.	Status :$10.	Collectiondate :date7.	Amountowed	evercollected;
format date Collectiondate date9.;
cards;
1234	29-Feb-16	Open	.	82.44	0
1234	31-Mar-16	Open	.	82.44	0
1234	30-Apr-16	Open	.	82.44	0
1234	31-May-16	Open	.	82.44	0
1234	30-Jun-16	Open	.	82.44	0
1234	31-Jul-16	Closed	20-Jul-16	82.44	82.44
1234	31-May-18	reopened	20-Jul-16	82.44	82.44
564789	29-Feb-16	Open	2-Feb-16	100	10
564789	31-Mar-16	Open	2-Feb-16	100	10
564789	30-Apr-16	Open	4-Apr-16	100	30
564789	31-May-16	Open	15-May-16	100	40
564789	30-Jun-16	Open	12-Jun-16	100	70
564789	31-Jul-16	Open	12-Jun-16	100	70
564789	31-May-18	Open	22-May-18	100	100
32145	29-Feb-16	Open	.	300	0
32145	31-Mar-16	Open	.	300	55
32145	30-Apr-16	Open	15-Apr-16	300	100
32145	31-May-16	Open	15-Apr-16	300	100
32145DF	31-May-16	default	12-May-16	200	120
32145DF	30-Jun-16	default	6-Jun-16	200	130
;

data _have;
set have;
_id=compress(id,'DF');
run;

data want;
call missing(_f);
do until(last._id);
	set _have;
	by _id notsorted;
	if status='default' then _f=1;
end;
call missing(_k1);
do until(last._id);
	set _have;
	by _id notsorted;
	_temp=dif(evercollected);
	if missing(Collectiondate) then thismonthamount=0;
	else if month(Collectiondate)=month(date) and year(date)=year(Collectiondate) then thismonthamount=_temp;
	else thismonthamount=0;
	if first._id then thismonthamount=evercollected;
	if _f then do;
	if status='Open' then _k1=Amountowed;
	else if status='default' then do; Amountowed=_k1;output;end;
	end;
	else  output;
end;
drop _: ;
run;
altijani
Quartz | Level 8

Thanks again @novinosrin

Still not getting the results I want. Let us use a different example. I have the following dataset:

idowedpaid
1234$100$50
1234$100$100
65478$88$28
65478DF$60$-  

 

And I want the following dataset:

idowedpaid
1234$100$50
1234$100$100
65478$88$28
65478DF$88 $28

 

It is important to have the amounts owed and paid for the last ID with the DF equals the amounts owed and paid from the previous ID without the DF.

Hope this clarifies

novinosrin
Tourmaline | Level 20

I think we are missing something that either your sample is perhaps not clear representative of your real?

 

data have;
input id $	owed : comma10.	paid : comma10.;
cards;
1234	$100	$50
1234	$100	$100
65478	$88	$28
65478DF	$60	.  
;

data _have;
set have;
_id=compress(id,'DF');
run;

data want;
set _have;
by _id notsorted;
retain _owed _paid;
if first._id then call missing(_owed,_paid);
if index(id,'DF')=0 then do;
_owed=owed;
_paid=paid;
end;
else if index(id,'DF')>0 then do;
owed=_owed;
paid=_paid;
end;
drop _:;
run;
altijani
Quartz | Level 8

thanks for the code @novinosrin          

However, I still have a problem with the actual data, which is much more obs than this mock data. The problem is that there is a significant number of observations that have both owed and paid equaling missing.

Any idea of what might be the reason for that?

Thanks,

Altijani

          

novinosrin
Tourmaline | Level 20

@altijani  Thank you for the message. I am afraid, we can only help each other if you could make up a great mock p data that represents your real. Change the names and values with an alternative equivalent so that it is easy for us to test. I would request you to spend some time in getting the same right that hopefully is very "alike" your real.

 

 

altijani
Quartz | Level 8

Sure @. Let us try to give it another try with all possible scenarios.

 

Here is what I have:

Have      
calssificationIDDateStatusCollection dateAmount owedever collected
1123429-Feb-16Open.82.440
1123431-Mar-16Open.82.440
1123430-Apr-16Open.82.440
1123431-May-16Open.82.440
1123430-Jun-16Open.82.440
1123431-Jul-16Closed20-Jul-1682.4482.44
1123431-May-18reopened20-Jul-1682.4482.44
256478929-Feb-16Open02-Feb-1610010
256478931-Mar-16Open02-Feb-1610010
256478930-Apr-16Open04-Apr-1610030
256478931-May-16Open15-May-1610040
256478930-Jun-16Open12-Jun-1610070
256478931-Jul-16Open12-Jun-1610070
256478931-May-18Open22-May-18100100
33214529-Feb-16Open.3000
33214531-Mar-16Open.30055
33214530-Apr-16Open15-Apr-16300100
33214531-May-16Open15-Apr-16300100
332145DF31-May-16default12-May-16200120
332145DF30-Jun-16default06-Jun-16200130
491254DF31-Mar-16default02-Mar-165200
491254DF30-Apr-16default06-Apr-1652050
491254DF31-May-16default17-May-16520120
5410531-May-16Open.12500

Please note that

Classification 4 starts with a DF ID, and does not have a non-DF ID

Classification 5: is a new ID with no collection

 

Her is what I want:

calssificationIDDateStatusCollection dateAmount owedever collectedthis month amount
1123429-Feb-16Open.82.4400
1123431-Mar-16Open.82.4400
1123430-Apr-16Open.82.4400
1123431-May-16Open.82.4400
1123430-Jun-16Open.82.4400
1123431-Jul-16Closed20-Jul-1682.4482.4482.44
1123431-May-18reopened20-Jul-1682.4482.440
256478929-Feb-16Open02-Feb-161001010
256478931-Mar-16Open02-Feb-16100100
256478930-Apr-16Open04-Apr-161003020
256478931-May-16Open15-May-161004010
256478930-Jun-16Open12-Jun-161007030
256478931-Jul-16Open12-Jun-16100700
256478931-May-18Open22-May-1810010030
332145DF31-May-16default12-May-1630012020
332145DF30-Jun-16default06-Jun-1630013010
491254DF31-Mar-16default.52000
491254DF30-Apr-16default06-Apr-165205050
491254DF31-May-16default17-May-1652012070
5410531-May-16Open20-May-16125000

 

Hope this clarifies

 

novinosrin
Tourmaline | Level 20

Try this:

 

data have;
input classification	ID :$10.	Date :date7.	Status :$20.	Collectiondate	:date7. Amountowed	evercollected;
format Date Collectiondate date7.;
cards;
1	1234	29-Feb-16	Open	.	82.44	0
1	1234	31-Mar-16	Open	.	82.44	0
1	1234	30-Apr-16	Open	.	82.44	0
1	1234	31-May-16	Open	.	82.44	0
1	1234	30-Jun-16	Open	.	82.44	0
1	1234	31-Jul-16	Closed	20-Jul-16	82.44	82.44
1	1234	31-May-18	reopened	20-Jul-16	82.44	82.44
2	564789	29-Feb-16	Open	02-Feb-16	100	10
2	564789	31-Mar-16	Open	02-Feb-16	100	10
2	564789	30-Apr-16	Open	04-Apr-16	100	30
2	564789	31-May-16	Open	15-May-16	100	40
2	564789	30-Jun-16	Open	12-Jun-16	100	70
2	564789	31-Jul-16	Open	12-Jun-16	100	70
2	564789	31-May-18	Open	22-May-18	100	100
3	32145	29-Feb-16	Open	.	300	0
3	32145	31-Mar-16	Open	.	300	55
3	32145	30-Apr-16	Open	15-Apr-16	300	100
3	32145	31-May-16	Open	15-Apr-16	300	100
3	32145DF	31-May-16	default	12-May-16	200	120
3	32145DF	30-Jun-16	default	06-Jun-16	200	130
4	91254DF	31-Mar-16	default	02-Mar-16	520	0
4	91254DF	30-Apr-16	default	06-Apr-16	520	50
4	91254DF	31-May-16	default	17-May-16	520	120
5	4105	31-May-16	Open	.	1250	0
;
run;
data _have;
set have;
_id=compress(id,'DF');
run;
data want;
call missing(_f1,_f2);
	do until(last._id);
		set _have;
		by  classification	_ID;
		if anyalpha(id)>0 then _f1=1;
		if anyalpha(id)=0 then _f2=1;
	end;
	call missing(_k1);
	do until(last._id);
		set _have;
		by  classification	_ID;
		thismonthamount=dif(evercollected);
		if first._id then thismonthamount=evercollected;
		if _f1 and _f2 then do;
			if status ne 'default' then _k1=Amountowed;
			else if status='default' then do; Amountowed=_k1;output;end;
		end;
		else output;
	end;
drop _:;
run;

SAS Innovate 2025: Register Now

Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 9 replies
  • 1479 views
  • 0 likes
  • 2 in conversation