If string is matched and number of obs. per BY group > 1 then delete


Hi everybody

 

I hope you can help me. I have a dataset that describes the case flow per ID/case. From it I calculate the number of days from the start of a case step to its end, to measure how well the caseworker is performing.

You can follow the flow of a case through the variables SAGSTRIN, QUEUE and NEW_QUEUE, in that order. An "ending" case step is one where the case has been put to rest, so to speak. I am only interested in the last step (NEW_QUEUE), and in a previous data step I used the following subsetting IF statement:

 

if index(trim(upcase(NEW_QUEUE)),"XREF") GE 1 AND index(trim(upcase(SAGSTRIN)),"XREF") NE 1;

 

where XREF is how an "ending case step" is spelled.

That data step removes every step except those with XREF in the final NEW_QUEUE.

There can be multiple XREF steps per ID, and I am not interested in all of them. Say the caseworker is awaiting information and the case is put in an XREF step until that information arrives. That step shouldn't be included when determining the number of days a caseworker has spent on a case.

Looking through the data I have seen a systematic pattern in how it is structured, so I can set up some rules for which XREF steps to exclude per case/ID. This is what I would like some help with.

 

I never want a case removed completely. For example, if a case has 2 XREF steps with "Dødvurd Mgl Opl Mod" (see the rules below), only one of them should be removed, so that there is always at least one step left per case.

 

Below I have numbered the rows per ID and added a sum variable holding the total per ID:

 

options obs=250;

* Create a sum variable showing the number of cases per ID;
* Step 1: running row counter within each ID;
data case1;
  set &navn._død_m_XREF_u_vudr_dag_X2;
  by ID;
  ID_count+1;
  if first.ID then ID_count=1;
run;

* Step 2: sort so the highest counter (the per-ID total) comes first;
proc sort data=case1;
  by ID descending ID_count;
run;

* Step 3: carry the per-ID total onto every row of the ID;
data case2;
  set case1;
  by ID;
  retain ID_count_sum;
  if first.ID then ID_count_sum=ID_count;
  output;
run;

proc sort data=case2 out=outfil.sas7bdat;by ID ID_count;run;
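A shorter way to get the same per-ID total might be PROC SQL, which remerges a summary statistic back onto every row. This is only a sketch: `case2_alt` is a hypothetical output name, and it assumes `case1` from above already exists.

```sas
/* Sketch: attach the number of rows per ID to every row in one step.    */
/* case2_alt is a hypothetical name; case1 is the dataset created above. */
proc sql;
  create table case2_alt as
  select *,
         count(*) as ID_count_sum   /* rows per ID, repeated on each row */
  from case1
  group by ID
  order by ID;
quit;
```

SAS writes a note to the log that the query required remerging summary statistics, which is expected here. The row order within each ID is not guaranteed by this query, so a further sort would be needed if that order matters.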

 

 

Then I would like the following rules to apply:

 

If the step before NEW_QUEUE contains "Dødvurd Mgl Opl Mod" and the number of XREF steps for the ID is > 1, then the XREF step is removed.

         Then I rerun the code above to recount per ID and apply the next rule:

If the step before NEW_QUEUE contains "Dødvurd Mgl Opl Ej Mod" and the number of XREF steps for the ID is > 1, then the XREF step is removed.

         Then I rerun the code above to recount per ID and apply the next rule:

If the step before NEW_QUEUE contains "Dødafv oplysninger" and the number of XREF steps for the ID is > 1, then the XREF step is removed.

 

 

I have tried the following code, but found that if a case has 2 instances of "Dødvurd Mgl Opl Mod", both are removed. The sum is of course 2 on both rows, so both of them get deleted.

 

data case3;
  set case2;
  by ID ID_count;
  if first.id and prxmatch("m/Dødvurd Mgl Opl Modt/oi",QUEUE) and ID_count_sum>1 then delete;
run;
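One possible way to keep at least one row per ID is to track how many rows are still left for the ID and only delete while more than one remains. The following is a minimal sketch for the first rule only, not a tested solution. It assumes `case2` is sorted by ID and carries `ID_count_sum` on every row, as built above; `remaining` and `case3_alt` are hypothetical names introduced for illustration.

```sas
/* Sketch: delete matching XREF steps, but never the last remaining row of an ID. */
/* Assumes case2 is sorted by ID and ID_count_sum holds the number of rows per ID. */
data case3_alt;                      /* hypothetical output name */
  set case2;
  by ID;
  retain remaining;                  /* hypothetical helper: rows still kept for this ID */
  if first.ID then remaining = ID_count_sum;
  if remaining > 1 and prxmatch("/Dødvurd Mgl Opl Mod/i", QUEUE) then do;
    remaining = remaining - 1;       /* one row fewer is left for this ID */
    delete;                          /* drop this row and move on to the next one */
  end;
  drop remaining;
run;
```

With two matching rows in an ID, the first one read is deleted and the second is kept because `remaining` has dropped to 1. Which matching row survives therefore depends on the sort order within the ID. The per-ID total would have to be recounted before applying the next rule in the same way.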

 

I hope you can help me with the coding; maybe it can be done more simply.

Thanks in advance.

 


Re: If string is matched and number of obs. per BY group > 1 then delete

It's always good to know the overall requirement when trying to answer a query.
But there's also a danger of getting "drowned" in all this information.
Would it be possible for you to narrow your problem down to one step, with sample input data for that step and the desired output?
Data never sleeps