BookmarkSubscribeRSS Feed
Krueger
Pyrite | Level 9

Hi All,

 

I've been lurking around the forums trying to learn more about SAS. I use SQL 90% of the time and am still learning with SAS but I've run into a slight road block that I'm not sure how exactly to handle in SAS. 

 

Below is some example data. 

 

To summarize what I'm trying to accomplish...

 

1. We only want CODE 36 results to appear while ultimately appending the other code data onto it.

2. There are several scenarios that will change the calculation they are as follows:

   a. CUSTOMER_KEY must = CUSTOMER_KEY

   b. iteration begins on CODE 36

   c. iteration looks at next row (label = n) to determine how to calculate.

 

Code Logic BreakDown

1. If first.CODE = 36 and second.CODE = 36 THEN we do the following:

    DATE_PAID = .

    AMOUNT_PAID = .

    AMOUNT_OUTSTANDING = CHARGE (+ previous AMOUNT_OUTSTANDING)

    AMOUNT_WAIVED = .

 

2. If First.CODE = 36 AND Second.CODE IN (18, 19) THEN do the following:

    DATE_PAID = last iteration of 18/19

    AMOUNT_PAID = Total of all iterations that went from 18/19

    AMOUNT_OUTSTANDING = Prior amount outstanding - amount_paid

    AMOUNT_WAIVED = .

    Basically this will continue going through the list totaling the above until the next code is not in (18/19). Example if CUTSOMER_KEY 93198003 AND N = 7 had a CODE of 11 then we would only total N4-N6 ($65.10) for the AMOUNT_PAID.

 

3. If First.CODE = 36 AND Second.CODE IN (15, 31) THEN do the following:

    DATE_PAID = last iteration of 15/31 code

    AMOUNT_PAID = .

    AMOUNT_OUTSTANDING = prior amount_outstanding - amount_waived (this is typically going to zero out)

    AMOUNT_WAIVED = CHARGE amount where code 15/31 exists.

   On another special note, for Item 2; if the iteration runs into code 15 or 31 then we would need to stop the count at this code as well  and ensure we track this on the prior 36 as AMOUNT_WAIVED.

 

Apologies for the wall of text but hopefully this all makes sense. If ya'll have any questions please ask.

    


DATA HAVE;
INPUT CUSTOMER_KEY CODE DATE DATETIME22.3 CHARGE N;
CARDS;

93198003 36 01JAN2018:01:34:15.337 -50 1
93198003 36 01FEB2018:01:43:17.693 -50 2
93198003 36 01MAR2018:01:42:10.140 -50 3
93198003 18 01MAY2018:01:50:57.747 15.10 4
93198003 18 04JUN2018:02:05:09.193 25 5
93198003 18 02JUL2018:02:03:36.927 25 6
93198003 18 01AUG2018:02:01:23.547 29.62 7
93198003 18 03SEP2018:02:18:07.553 8.48 8
93198003 18 01OCT2018:02:21:52.833 6.17 9
93198003 18 01NOV2018:02:07:13.793 4.88 10
93198003 18 03DEC2018:02:06:23.333 8.22 11
93198003 18 31DEC2018:02:03:39.460 6.94 12
93244967 36 28FEB2018:01:40:55.513 -50 1
93244967 15 02MAR2018:15:11:39.870 50 2
93246973 36 05FEB2018:01:44:38.197 -50 1
93246973 36 05MAR2018:01:42:02.130 -50 2
93246973 36 04APR2018:01:40:05.990 -50 3
93246973 36 03MAY2018:01:41:39.387 -50 4
;RUN;

DATA HAVE;
SET HAVE;
FORMAT DATE DATETIME22.3;
RUN;

DATA WANT;
INPUT CUSTOMER_KEY CODE DATE DATETIME22.3 CHARGE DATE_PAID DATETIME22.3 AMOUNT_PAID AMOUNT_OUTSTANDING AMOUNT_WAIVED;
DATALINES;

93198003 36 01JAN2018:01:34:15.337 -50                        0      -50    0 
93198003 36 01FEB2018:01:43:17.693 -50                        0      -100   0
93198003 36 01MAR2018:01:42:10.140 -50 31DEC2018:02:03:39.460 129.41 -20.59 0
93244967 36 28FEB2018:01:40:55.513 -50 02MAR2018:15:11:39.870 0      0      50
93246973 36 05FEB2018:01:44:38.197 -50                        0      -50    0
93246973 36 05MAR2018:01:42:02.130 -50                        0 	 -100 	0
93246973 36 04APR2018:01:40:05.990 -50  					  0 	 -150 	0
93246973 36 03MAY2018:01:41:39.387 -50  					  0 	 -200 	0
;RUN;

DATA WANT;
SET WANT;
FORMAT DATE DATETIME22.3;
FORMAT DATE_PAID DATETIME22.3;
RUN;

 

5 REPLIES 5
ballardw
Super User

So what happens if you have a mix of 18/19 and 15/31 for the same customer? Such as

 

DATA HAVE;
INPUT CUSTOMER_KEY CODE DATE DATETIME22.3 CHARGE N;
CARDS;

93198003 36 01JAN2018:01:34:15.337 -50 1
93198003 36 01FEB2018:01:43:17.693 -50 2
93198003 36 01MAR2018:01:42:10.140 -50 3
93198003 18 01MAY2018:01:50:57.747 15.10 4
93198003 15 04JUN2018:02:05:09.193 25 5
93198003 18 02JUL2018:02:03:36.927 25 6
93198003 31 01AUG2018:02:01:23.547 29.62 7
93198003 18 03SEP2018:02:18:07.553 8.48 8

 

Krueger
Pyrite | Level 9

If there's a Mix of 18/19 (Paid) and 15/31 (Waive) then it would default to Waive. I think there's only like 2 instances of this however out of the 100's of thousand records so if I can at least get it close I can manually correct a few records.

mkeintz
PROC Star

Questions:

 

  1. Do all instances of code=36 occur at the beginning of a customer id?
  2. Are all instances of code 18/19 for a given customer id in a single contiguous sequence?  i.e., there is no code 36 falling between two groups of 18/19?
--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
Krueger
Pyrite | Level 9

Code 36 is always going to appear first, but it may also appear multiple times for this same customer on different dates.

 

It may go 36, 36, 36 (3 different dates) then move into, 36, 18, 36, 36, 15. 36 Basically means they were charged a late fee, after said late fee they will either Pay, Waive or it will remain outstanding (as they will get another 36 or late fee). 

 

Basically, I am trying to calculate the late fees (amount is almost always $50) and determine what our balances are outstanding, paid and waived. In the example above, the first 3 late fees (-$150), then have a 4th late fee (-$200), Paid (We'll just say $50) so we are now at (-$150), 2 more late fees (-$250) a Waive (which will zero out the LAST late fee) so (-$200).

 

So our outstanding balance in this example is (-$200), with $50 paid and $50 waived.

 

I have written some of this in SQL but my big roadblock is I can't do a cumulative sum(?) if that's the right terminology to get the Paid amounts to calculate. I THINK I could likely break it down and start doing LEAD() functions pull the next number and SUM those then delete that row and rinse and repeat but was hopeful there might be a more simple approach with SAS and I could learn something with SAS as well. 

Krueger
Pyrite | Level 9

I've been experimenting around learning about cumulative sums. This is where I'm at thus far in case it inspires any other ideas from someone else. It doesn't work but I'm messing around with it still.

 


DATA WANT;
	SET HAVE;
	BY CUSTOMER_KEY N;
	IF FIRST.N THEN 
		DO;
			AMOUNT_PAID = 0;
			AMOUNT_OUTSTANDING = CHARGE;
			AMOUNT_WAIVED = 0;
		END;
	IF CODE= 36 THEN 
		DO;
			AMOUNT_PAID = AMOUNT_PAID;
			AMOUNT_OUTSTANDING = AMOUNT_OUTSTANDING + CHARGE;
			AMOUNT_WAIVED = AMOUNT_WAIVED;
		END;
	IF CODE IN (18, 19) THEN 
		DO;
			AMOUNT_PAID = AMOUNT_PAID + CHARGE;
			AMOUNT_OUTSTANDING = AMOUNT_OUTSTANDING;
			AMOUNT_WAIVED = AMOUNT_WAIVED;
		END;
	IF CODE IN (15, 31) THEN 
		DO;
			AMOUNT_PAID = AMOUNT_PAID;
			AMOUNT_OUTSTANDING = AMOUNT_OUTSTANDING;
			AMOUNT_WAIVED = AMOUNT_WAIVED + CHARGE;
		END;
	ELSE output;
	DROP CODE;
RUN;

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 5 replies
  • 831 views
  • 0 likes
  • 3 in conversation