Hi,
I have a dataset containing one record for each claim. it contains claim number , clm created date and clm closed date. I want to know number of pending claims at each month end, point in time.
Cl no. Clmcreatedt Clmclosedt
101 30APR2017 01JUL2017
102 02JAN2017 08JAN2017
103 25MAR2017 07MAY2017
104 27AUG2016 29SEP2018
105 30JUN2018 .
106 30APR2018 01JUL2018
107 06OCT2015 .
108 14FEB2017 30DEC2017
109 16FEB2017 .
110 01MAY2018 27JUN2018
if i do it for single month manually the code would be ,
data want ;
set have (where=(clm_created_dt <= '30APR2018'd and (clm_closed_dt = . or clm_closed_dt > '30APR2018'd ))) ;
run ;
No. of records will be pending count for April 2018.
However, i don't want to write this for each month from 2015 to 2018, so i have put all month end dates (using intnx and call symput. - code not shown here) into macro variables: lastDay1 to lastDay48. 1 for 31JAN2015 and 48 for 31DEC2018.
How can i subset in the above step automatically and give one dataset for each month. the record number of each dataset is pending claim count for respective month.
Thank you,
So your data looks like this
data have;
input Clno (Clmcreatedt Clmclosedt)(:date9.);
format Clmcreatedt Clmclosedt date9.;
datalines;
101 30APR2017 01JUL2017
102 02JAN2017 08JAN2017
103 25MAR2017 07MAY2017
104 27AUG2016 29SEP2018
105 30JUN2018 .
106 30APR2018 01JUL2018
107 06OCT2015 .
108 14FEB2017 30DEC2017
109 16FEB2017 .
110 01MAY2018 27JUN2018
;
Why do you want to split into separate data sets?
Thanks for replying, even if it doesn't require splitting its fine, I am interested in counts. Could be any method.
Something like this?
data have;
input Clno (Clmcreatedt Clmclosedt)(:date9.);
format Clmcreatedt Clmclosedt date9.;
datalines;
101 30APR2017 01JUL2017
102 02JAN2017 08JAN2017
103 25MAR2017 07MAY2017
104 27AUG2016 29SEP2018
105 30JUN2018 .
106 30APR2018 01JUL2018
107 06OCT2015 .
108 14FEB2017 30DEC2017
109 16FEB2017 .
110 01MAY2018 27JUN2018
;
data temp;
set have;
Clmclosedt_end=intnx('month',Clmclosedt, 0, 'end');
format Clmclosedt_end date9.;
run;
proc freq data=temp;
tables Clmclosedt_end / out=want nocum;
run;
Thanks, what i am looking for is how many claims are pending (sill open at that time, either closing date is missing or closing date is after the respective month,
So, for FEB 2017 the count would be 2, for APR 2017 count would be 1, etc.. so the frequency is taking open and closed date into consideration, it only shifts time at which we are counting (point in time).
E.g. For FEB2017, I want all claims opened anytime before FEB2017 (not necessarily in FEB2017), and either not closed or closing date is after FEB2017. That will give me pending claim count point in time for FEB 2017.
Thank you for your time.
You can do something like this
data have;
input Clno (Clmcreatedt Clmclosedt)(:date9.);
format Clmcreatedt Clmclosedt date9.;
datalines;
101 30APR2017 01JUL2017
102 02JAN2017 08JAN2017
103 25MAR2017 07MAY2017
104 27AUG2016 29SEP2018
105 30JUN2018 .
106 30APR2018 01JUL2018
107 06OCT2015 .
108 14FEB2017 30DEC2017
109 16FEB2017 .
110 01MAY2018 27JUN2018
;
proc sql;
create table want as
select *
,count(distinct Clno) as count
from have
where Clmclosedt=. or Clmcreatedt<=Clmclosedt
group by year(Clmcreatedt), month(Clmcreatedt);
quit;
and edit the where clause to consider only year 2015 through 2018. Then use a simple RETAIN Statement in a data step to give you the accumulated count.
@vpgodbole wrote:
Thanks, what i am looking for is how many claims are pending (sill open at that time, either closing date is missing or closing date is after the respective month,
So, for FEB 2017 the count would be 2, for APR 2017 count would be 1, etc.. so the frequency is taking open and closed date into consideration, it only shifts time at which we are counting (point in time).
E.g. For FEB2017, I want all claims opened anytime before FEB2017 (not necessarily in FEB2017), and either not closed or closing date is after FEB2017. That will give me pending claim count point in time for FEB 2017.
Thank you for your time.
For this you need to explicitly tell us what the end date rule to consider would be. Otherwise we don't have any idea when to stop. Either a specific date, a relative date base on the creation date (do you allow "pending" to run for 5 years? 10 Years? 25 years?) or perhaps program run date.
Also since you show
104 27AUG2016 29SEP2018
107 06OCT2015 .
108 14FEB2017 30DEC2017
109 16FEB2017 .
that is 4 claims pending at the end of Feb 2017.
Perhaps:
data have; input Clno (Clmcreatedt Clmclosedt)(:date9.); format Clmcreatedt Clmclosedt date9.; datalines; 101 30APR2017 01JUL2017 102 02JAN2017 08JAN2017 103 25MAR2017 07MAY2017 104 27AUG2016 29SEP2018 105 30JUN2018 . 106 30APR2018 01JUL2018 107 06OCT2015 . 108 14FEB2017 30DEC2017 109 16FEB2017 . 110 01MAY2018 27JUN2018 ; run; data temp; set have; EndMonth = coalesce(intnx('month',Clmclosedt,0,'E'),intnx('month',today(),0,'E')); format endmonth PendMonth date9.; PendMonth = intnx('month',Clmcreatedt,0,'E'); do while (pendmonth lt endmonth); Pending =1; output; pendmonth= intnx('month',pendmonth,1,'E'); end; run; proc freq data=temp; tables pendmonth; run;
The start year month would be JAN 2015 and End month and year would be DEC 2018 or could be current month and year.
data have;
input Clno (Clmcreatedt Clmclosedt)(:date9.);
format Clmcreatedt Clmclosedt date9.;
datalines;
101 30APR2017 01JUL2017
102 02JAN2017 08JAN2017
103 25MAR2017 07MAY2017
104 27AUG2016 29SEP2018
105 30JUN2018 .
106 30APR2018 01JUL2018
107 06OCT2015 .
108 14FEB2017 30DEC2017
109 16FEB2017 .
110 01MAY2018 27JUN2018
;
data temp;
set have;
Clmmonth_end=intnx('month',Clmcreatedt, 0, 'end');
format Clmmonth_end date9.;
keep Clmmonth_end ;
run;
proc sql;
create table want as
select *
from temp a, have b
group by Clmmonth_end
having (Clmcreatedt <= Clmmonth_end and (Clmclosedt = . or Clmclosedt > Clmmonth_end))=1
order by Clmmonth_end,clno;
quit;
If the above is ok, and you want just the counts, let me know
Yes, this is very helpful, next I can get counts by simply running freq on Clmmonth_end?
Yep whichever approach you want, or do you want me to modify the above? or you wanna take a stab at it yourself? feel free!
just the counts:
data have;
input Clno (Clmcreatedt Clmclosedt)(:date9.);
format Clmcreatedt Clmclosedt date9.;
datalines;
101 30APR2017 01JUL2017
102 02JAN2017 08JAN2017
103 25MAR2017 07MAY2017
104 27AUG2016 29SEP2018
105 30JUN2018 .
106 30APR2018 01JUL2018
107 06OCT2015 .
108 14FEB2017 30DEC2017
109 16FEB2017 .
110 01MAY2018 27JUN2018
;
data temp;
set have;
Clmmonth_end=intnx('month',Clmcreatedt, 0, 'end');
format Clmmonth_end Clmmonth_begin date9.;
keep Clmmonth_end ;
run;
proc sql;
create table want as
select Clmmonth_end format=monyy.,sum(Clmcreatedt <= Clmmonth_end and (Clmclosedt = . or Clmclosedt > Clmmonth_end)) as pending_claims
from temp a, have b
group by Clmmonth_end
order by Clmmonth_end;
quit;
Sorry everyone for the confusion..
If we can further develop:-
I want counts till current month and year, Your code is taking the last End date into consideration which is Jun2018.
What if last dt is May2018 and I want counts for June and July to show 0? or in the above case July should be zero.
Months to consider for this query can be JAN2015 to current month year or DEC2018.
@vpgodbole wrote:
Sorry everyone for the confusion..
If we can further develop:-
I want counts till current month and year, Your code is taking the last End date into consideration which is Jun2018.
What if last dt is May2018 and I want counts for June and July to show 0? or in the above case July should be zero.
Months to consider for this query can be JAN2015 to current month year or DEC2018.
If this is reply to my example please note that it was prefaced with a question directly related to this: what is the end.
The COALESCE statement was to select one of two conditions since you indicated you have blank data and the order would be important. So the coalesce picks the first of the dates: the end date if present or todays date (since lacking any explicit prior requirement). You could replace the second part of the endmonth assignment with a specific date:
EndMonth = coalesce(intnx('month',Clmclosedt,0,'E'),'31DEC2018'd));
for example
Pick one: current month year (already provided) or Dec 2018 ^^^
You may have to describe in more detail what "Months to consider for this query can be JAN2015 " might mean as do you only want to count "pending" starting at Jan2015 if the creation date was prior to that?
Or only consider creation dates of Jan2015 or later.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.