Help using Base SAS procedures

Hunting Down Problem Cases in a Large Data set

Reply
Occasional Contributor
Posts: 10

Hunting Down Problem Cases in a Large Data set

Hi all--

I am working with a very large data set and I need to identify some problem cases. I have to find the cases which are being dropped where the MAX

of CLOSINGS_ID for the CASE_ID group is BLANK/NULL and an earlier CLOSE date (a calculated variable) has been picked up.

The cases I’m trying to hunt down look like this:

CASE_ID

CLOSING_DT_1

CLOSING_DT_2

CLOSINGS_ID

CLOSE

111

.

7-Dec-11

1

7-Dec-11

111

7-Oct-11

.

2

7-Oct-11

111

.

3

.

111

.

.

4

.

I am using these two programs below. One with the HAVING clause and then one without the HAVING clause and then trying all sorts of different ways to combine them together to find with cases which are dropped when the have HAVING clause is used. I can’t figure it out.  I just want a list of all these problem cases. Is there a way to do this is Base SAS as well as using Proc SQL as well?

Any help is greatly appreciated!!

Proc sql;

create table With_HAVING as

select distinct

CASE_ ID,

CLOSING_DT_1,

CLOSING_DT_2

CLOSINGS_ID,

case when CLOSING_DT_2 gt .then CLOSING_DT_2

      when CLOSING_DT_1 gt . then CLOSING_DT_1  end as CLOSE

from DATA_HAVE

group by CASE_ ID, CLOSINGS_ID

having CLOSINGS_ID =max(CLOSINGS_ID)

;quit;

Proc sql;

create table With_NO_HAVING as

select distinct

CASE_ ID,

CLOSING_DT_1,

CLOSING_DT_2

CLOSINGS_ID,

case when CLOSING_DT_2 gt .then CLOSING_DT_2

      when CLOSING_DT_1 gt . then CLOSING_DT_1  end as CLOSE

from DATA_HAVE

;quit;

Super User
Posts: 9,671

Re: Hunting Down Problem Cases in a Large Data set

Maybe you could explain it more detail.

What is BLANK/NULL when you refer to MAX .

And what output would you like to see and why you have to use DATA STEP ?

Ksharp

Occasional Contributor
Posts: 10

Re: Hunting Down Problem Cases in a Large Data set

Thanks KSharp.  By Blank/Null  I am on referring to all the cases where CLOSE is . When I take  the highest (MAX) CLOSINGS_ID. 

CASE_IDCLOSING_DT_1CLOSING_DT_2CLOSINGS_IDCLOSE
111.#####17-Dec-11
1117-Oct-11.27-Oct-11
111. 3.
111..4.

So, when I take the MAX  CLOSINGS_ID and group  by CASE_ID I'm actually dropping a close date when I should pick it up.

CASE_IDCLOSING_DT_1CLOSING_DT_2CLOSINGS_IDCLOSE
111..4.

The output I need is just a data set.

is the above helpful?

Thanks. 

Super User
Posts: 5,079

Re: Hunting Down Problem Cases in a Large Data set

Jake,

This is actually something that a DATA step does pretty easily.  Subject to my understanding the problem correctly, here is one attempt that you can tweak as you see fit.  The drawback:  it relies on the incoming data being sorted by CASE_ID.

data problems;

   set have;

   by case_id;

   if first.case_id then do;

      earlier_dt1 = .;

      earlier_dt2 = .;

   end;

   retain earlier_dt1 earlier_dt2;

   if closing_dt1 > . then earlier_dt1 = closing_dt1;

   if closing_dt2 > . then earlier_dt2 = closing_dt2;

   if last.case_id and (closing_dt1 = closing_dt2 = .) and (earlier_dt1 > . or earlier_dt2 > .);

run;

The logic of which cases are selected may be exactly what you want, or may be very close.  But you can test it and see how close it comes.

Good luck.

Super User
Posts: 9,671

Re: Hunting Down Problem Cases in a Large Data set

It is easy for SQL. If I understand your question.

data have;
infile cards expandtabs truncover;
input CASE_ID     CLOSING_DT_1 : date9.     CLOSING_DT_2 : date9.     CLOSINGS_ID     ;
format  CLOSING_DT_1 CLOSING_DT_2  date11.;
cards;
111     .     7-Dec-11     1     
111     7-Oct-11     .     2
111     .     .     3     
111     .     .     4
;
run;
proc sql;
create table close as
 select *,coalesce(CLOSING_DT_1,CLOSING_DT_2) as CLOSE format date11.
  from have;

create table dropped as
 select * 
  from close 
   group by CASE_ID
    having CLOSINGS_ID=max(CLOSINGS_ID) and CLOSE is missing;

create table pick_up as
 select * from close
  except
 select * from dropped ;

quit;


Ksharp

Ask a Question
Discussion stats
  • 4 replies
  • 221 views
  • 0 likes
  • 3 in conversation