## Conditionally calculate variables by ID conditional on how many times that ID is present in data

I am not sure how exactly to explain this.

I have a dataset of loans with ids, monthly indicators of time, and monthly indicators of how much is owed.

I want to count how many times a given loan occurs in the dataset for the previous 6 months. I also want to calculate the difference between the current observation's amount owed and the mean amount owed over those six months.

Here's what the data looks like. I want to calculate pmts and omb_n_change.

 loan_id calendar_time omb_n pmts omb_n_change 1 96 165836 0 0 1 97 165646 1 165646 - 165836 1 98 165646 2 165646 - average of (165646,165836) 1 99 165646 3 165646 - average of (165646,165646,165836) 1 110 161957 0 0 1 111 161748 1 161748 - 161957 1 112 161537 2 161537 - average of (161748,161957) 1 113 161537 3 161537 - average of (161537,161748,161957) 1 114 161537 4 161537 - average of (161537,161537,161748,161957) 1 115 161326 5 161326 - average of (161537,161537,161537,161748,161957) 1 116 161326 6 161326 - average of (161326,161537,161537,161537,161748,161957) 1 117 161113 6 161113 - average of (161326,161326,161537,161537,161537,161748)

Loan_id is an id for each loan, calendar_time indicates what month that payment was made, and omb_n indicates how much was still owed that month. So basically I want to count how many times loan_id "1" occurred in my data over calendar_time - 6 as well as the mean value of omb_n over that time period (then take the difference between omb_n and that mean).

I tried using a hash iterator but am a rookie with those so I got lost before I could really get anything coherent.

1 ACCEPTED SOLUTION

Accepted Solutions  Ksharp
Super User

## Re: Conditionally calculate variables by ID conditional on how many times that ID is present in data

Assuming your table is not big. Otherwise, you need Hash Table.

``````data have;
input loan_id calendar_time omb_n;
cards;
1 96 165836
1 97 165646
1 98 165646
1 99 165646
1 110 161957
1 111 161748
1 112 161537
1 113 161537
1 114 161537
1 115 161326
1 116 161326
1 117 161113
;

proc sql;
create table want as
select *,
(select count(*) from have where loan_id=a.loan_id and calendar_time between a.calendar_time-1 and a.calendar_time-6) as pmts,
(select mean(omb_n) from have where loan_id=a.loan_id and calendar_time between a.calendar_time-1 and a.calendar_time-6) as average,
omb_n-calculated average as  omb_n_change
from have as a
;
quit;``````
5 REPLIES 5

## Re: Conditionally calculate variables by ID conditional on how many times that ID is present in data

Your calendar_time 117 is more than 6 months from 110. Your illustration for 117 violates 6-month interval.

## Re: Conditionally calculate variables by ID conditional on how many times that ID is present in data

Alright well I guess that would be a different number then.

## Re: Conditionally calculate variables by ID conditional on how many times that ID is present in data

Title, description and table don't match. I don't see "calculate ... on how many times that ID is present in data". Please explain. And when you do that, post data in usable form and show exactly, what you expect as result.  Ksharp
Super User

## Re: Conditionally calculate variables by ID conditional on how many times that ID is present in data

Assuming your table is not big. Otherwise, you need Hash Table.

``````data have;
input loan_id calendar_time omb_n;
cards;
1 96 165836
1 97 165646
1 98 165646
1 99 165646
1 110 161957
1 111 161748
1 112 161537
1 113 161537
1 114 161537
1 115 161326
1 116 161326
1 117 161113
;

proc sql;
create table want as
select *,
(select count(*) from have where loan_id=a.loan_id and calendar_time between a.calendar_time-1 and a.calendar_time-6) as pmts,
(select mean(omb_n) from have where loan_id=a.loan_id and calendar_time between a.calendar_time-1 and a.calendar_time-6) as average,
omb_n-calculated average as  omb_n_change
from have as a
;
quit;``````

## Re: Conditionally calculate variables by ID conditional on how many times that ID is present in data

This is perfect, thanks!

Discussion stats
• 5 replies
• 295 views
• 1 like
• 4 in conversation