Hi....I have a dataset that contains a list of unique records for customers who are approved for specific products during a specific time period. The Approval dataset looks like this:
Customer_ID | Product_ID | Effective_Date | Expiration_Date |
1232 | 1127654 | 20160101 | 20160430 |
1232 | 1200782 | 20160201 | 20160525 |
1345 | 1097862 | 20151223 | 20160325 |
1345 | 1097862 | 20160401 | 20160428 |
1345 | 1097862 | 20160515 | 20160615 |
1345 | 1163452 | 20160201 | 20160630 |
I want to be able to check another dataset (Claims_Data) that contains all the claims submitted for these same customers and obtain the following:
Customer_ID | Product_ID | Count_Approved | Count_Not_Approved | No_Claims |
1232 | 1127654 | 3 | 1 | |
1232 | 1200782 | 4 | ||
1345 | 1097862 | 5 | 2 | |
1345 | 1163452 | 1 |
Count_Approved is the number of claims where the claim_date from the Claims_Data dataset is between the Effective_Date and Expiration Date for the respective customer for the each Product_ID.
Count_Not_Approved is the number of claims submitted where the claim_date is outside the range for the Effective_Date and Expiration Date for the respective customer for the each Product_ID.
As it is possible that customers may be approved for products and yet they may not have submitted any claims for the product. In this case, No_Claims=1 would be entered to keep track. Any suggestions. Thanks,
Data step solution:
data approval;
input
customer_id
product_id
effective_date:yymmdd8.
expiration_date:yymmdd8.
;
format
effective_date
expiration_date
yymmddn8.
;
cards;
1232 1127654 20160101 20160430
1232 1200782 20160201 20160525
1345 1097862 20151223 20160325
1345 1097862 20160401 20160428
1345 1097862 20160515 20160615
1345 1163452 20160201 20160630
;
run;
data claims_data;
input
customer_id
product_id
claim_date:yymmdd8.
;
format claim_date yymmddn8.;
cards;
1232 1127654 20160320
1232 1127654 20160517
;
run;
data want (
keep=customer_id product_id count_approved count_not_approved no_claims
);
merge
approval (in=a)
claims_data (in=b)
;
by customer_id product_id;
if a;
if not b
then do;
count_approved = 0;
count_not_approved = 0;
no_claims = 1;
output;
end;
else do;
if first.product_id
then do;
count_approved = 0;
count_not_approved = 0;
no_claims = .;
end;
if effective_date <= claim_date <= expiration_date
then count_approved + 1;
else count_not_approved + 1;
if last.product_id then output;
end;
run;
You can expand the test data in claims_data to verify more test cases, but I think it solves your requirements.
With your data, it might be necessary to sort the datasets first by customer_id product_id.
Data step solution:
data approval;
input
customer_id
product_id
effective_date:yymmdd8.
expiration_date:yymmdd8.
;
format
effective_date
expiration_date
yymmddn8.
;
cards;
1232 1127654 20160101 20160430
1232 1200782 20160201 20160525
1345 1097862 20151223 20160325
1345 1097862 20160401 20160428
1345 1097862 20160515 20160615
1345 1163452 20160201 20160630
;
run;
data claims_data;
input
customer_id
product_id
claim_date:yymmdd8.
;
format claim_date yymmddn8.;
cards;
1232 1127654 20160320
1232 1127654 20160517
;
run;
data want (
keep=customer_id product_id count_approved count_not_approved no_claims
);
merge
approval (in=a)
claims_data (in=b)
;
by customer_id product_id;
if a;
if not b
then do;
count_approved = 0;
count_not_approved = 0;
no_claims = 1;
output;
end;
else do;
if first.product_id
then do;
count_approved = 0;
count_not_approved = 0;
no_claims = .;
end;
if effective_date <= claim_date <= expiration_date
then count_approved + 1;
else count_not_approved + 1;
if last.product_id then output;
end;
run;
You can expand the test data in claims_data to verify more test cases, but I think it solves your requirements.
With your data, it might be necessary to sort the datasets first by customer_id product_id.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.