Solved: Conditionally calculate variables by ID conditional on how many times ...

uwabe3 · Posted 02-21-2022 09:51 PM

I am not sure how exactly to explain this.

I have a dataset of loans with ids, monthly indicators of time, and monthly indicators of how much is owed.

I want to count how many times a given loan occurs in the dataset for the previous 6 months. I also want to calculate the difference between the current observation's amount owed and the mean amount owed over those six months.

Here's what the data looks like. I want to calculate pmts and omb_n_change.

loan_id	calendar_time	omb_n	pmts	omb_n_change
1	96	165836	0	0
1	97	165646	1	165646 - 165836
1	98	165646	2	165646 - average of (165646,165836)
1	99	165646	3	165646 - average of (165646,165646,165836)
1	110	161957	0	0
1	111	161748	1	161748 - 161957
1	112	161537	2	161537 - average of (161748,161957)
1	113	161537	3	161537 - average of (161537,161748,161957)
1	114	161537	4	161537 - average of (161537,161537,161748,161957)
1	115	161326	5	161326 - average of (161537,161537,161537,161748,161957)
1	116	161326	6	161326 - average of (161326,161537,161537,161537,161748,161957)
1	117	161113	6	161113 - average of (161326,161326,161537,161537,161537,161748)

Loan_id is an id for each loan, calendar_time indicates what month that payment was made, and omb_n indicates how much was still owed that month. So basically I want to count how many times loan_id "1" occurred in my data over calendar_time - 6 as well as the mean value of omb_n over that time period (then take the difference between omb_n and that mean).

I tried using a hash iterator but am a rookie with those so I got lost before I could really get anything coherent.

Ksharp · Posted 02-22-2022 07:02 AM

Assuming your table is not big. Otherwise, you need Hash Table.

data have;
input loan_id calendar_time omb_n;
cards;
1 96 165836
1 97 165646
1 98 165646
1 99 165646
1 110 161957
1 111 161748
1 112 161537
1 113 161537
1 114 161537
1 115 161326
1 116 161326
1 117 161113
;

proc sql;
create table want as
select *,
(select count(*) from have where loan_id=a.loan_id and calendar_time between a.calendar_time-1 and a.calendar_time-6) as pmts,
(select mean(omb_n) from have where loan_id=a.loan_id and calendar_time between a.calendar_time-1 and a.calendar_time-6) as average,
 omb_n-calculated average as  omb_n_change
 from have as a
;
quit;

View solution in original post

KachiM · Posted 02-21-2022 11:41 PM

Your calendar_time 117 is more than 6 months from 110. Your illustration for 117 violates 6-month interval.

uwabe3 · Posted 02-21-2022 11:43 PM

Alright well I guess that would be a different number then.

andreas_lds · Posted 02-22-2022 02:17 AM

Title, description and table don't match. I don't see "calculate ... on how many times that ID is present in data". Please explain. And when you do that, post data in usable form and show exactly, what you expect as result.

Ksharp · Posted 02-22-2022 07:02 AM

Assuming your table is not big. Otherwise, you need Hash Table.

data have;
input loan_id calendar_time omb_n;
cards;
1 96 165836
1 97 165646
1 98 165646
1 99 165646
1 110 161957
1 111 161748
1 112 161537
1 113 161537
1 114 161537
1 115 161326
1 116 161326
1 117 161113
;

proc sql;
create table want as
select *,
(select count(*) from have where loan_id=a.loan_id and calendar_time between a.calendar_time-1 and a.calendar_time-6) as pmts,
(select mean(omb_n) from have where loan_id=a.loan_id and calendar_time between a.calendar_time-1 and a.calendar_time-6) as average,
 omb_n-calculated average as  omb_n_change
 from have as a
;
quit;

uwabe3 · Posted 02-22-2022 10:27 AM

This is perfect, thanks!

Conditionally calculate variables by ID conditional on how many times that ID is present in data

Re: Conditionally calculate variables by ID conditional on how many times that ID is present in data

Re: Conditionally calculate variables by ID conditional on how many times that ID is present in data

Re: Conditionally calculate variables by ID conditional on how many times that ID is present in data

Re: Conditionally calculate variables by ID conditional on how many times that ID is present in data

Re: Conditionally calculate variables by ID conditional on how many times that ID is present in data

Re: Conditionally calculate variables by ID conditional on how many times that ID is present in data

Registration is open

Conditionally calculate variables by ID conditional on how many times that ID is present in data

Re: Conditionally calculate variables by ID conditional on how many times that ID is present in data

Re: Conditionally calculate variables by ID conditional on how many times that ID is present in data

Re: Conditionally calculate variables by ID conditional on how many times that ID is present in data

Re: Conditionally calculate variables by ID conditional on how many times that ID is present in data

Re: Conditionally calculate variables by ID conditional on how many times that ID is present in data

Re: Conditionally calculate variables by ID conditional on how many times that ID is present in data

Registration is open

SAS Training: Just a Click Away