Solved: Re: Find Records

tmm · Posted 07-16-2012 10:33 AM

I touched on this earlier but think I got off on the wrong path. I have 125K plus records that vary in the place of service. An example is:

id plc of service dos disch

111 hospital 11dec2010 15dec2010

111 hospice 15dec2010 17dec2010

111 rehab 17dec2010 30dec2010

111 hospital 12nov2011 15nov2011

222 hospital 01jan2010 10jan2010

333 hospital 01feb2011 07feb2011

444 hospital 01jun2010 07jun2010

444 rehab 07jun2010 15jun2010

444 nursing 15jun2010 17jun2010

444 hospital 17aug2011 30aug2011

I need to say that if anything other than hospital shows up, then look at the disch and dos. If the dos is the same as the hospital disch date then all those records go into one file where. So my outcome I want to be:

Hospital table:

id plc of service dos disch

111 hospital 12nov2011 15nov2011

222 hospital 01jan2010 10jan2010

333 hospital 01feb2011 07feb2011

444 hospital 17aug2011 30aug2011

Other table:

id plc of service dos disch

111 hospital 11dec2010 15dec2010

111 hospice 15dec2010 17dec2010

111 rehab 17dec2010 30dec2010

444 hospital 01jun2010 07jun2010

444 rehab 07jun2010 15jun2010

444 nursing 15jun2010 17jun2010

I started my code and know I need a subquery but not sure I am doing the subquery right.

PROC SQL;

CREATE TABLE ReAdm.Claims_inpt2 AS

(SELECT DISTINCT

A.*

(SELECT * B.ADMIT_DT FROM ReAdm.Claims_inpt

WHERE A.CLM_PL_OF_SRVC_DESC <> 'INPATIENT HOSPITAL' AND

B.ADMIT_DT = A.ADMIT_DT AND B.MBR_SYS_ID = A.MBR_SYS_ID)

AS OTHERADM)

FROM ReAdm.Claims_inpt

WHERE A.CLM_PL_OF_SRVC_DESC = 'INPATIENT HOSPITAL');

RUN;

The above should give me something like:

id plc of service dos disch otheradm

111 hospital 11dec2010 15dec2010 15dec2010

111 hospice 15dec2010 17dec2010

111 hospital 12nov2011 15nov2011

222 hospital 01jan2010 10jan2010

333 hospital 01feb2011 07feb2011

444 hospital 01jun2010 07jun2010 07jun2010

444 nursing 07jun2010 15jun2010

444 hospital 17aug2011 30aug2011

Then from there I can take and do another query to isolate only the hospital and only the otheradm.

Reeza · Posted 07-17-2012 12:16 PM

The code I provided earlier works, with a fix for the episode part. Just make sure your new_id is formatted correctly.

The point is to create a unique id per 'episode' based on the id and then a counter, 1 for first episode. But because you'll have multiple episodes up to the 100 per patients you need to add zeros so things will sort properly.

data have;

informat dos disch date9.;

format dos disch date9.;

input id $ plc $ dos disch ;

cards;

111 hospital 15jun2010 18jun2010

111 hospital 11dec2010 15dec2010

111 hospice 15dec2010 17dec2010

111 rehab 15jan2011 31jan2011

111 hospital 12nov2011 15nov2011

222 hospital 01jan2010 10jan2010

333 hospital 01feb2011 07feb2011

444 hospital 01jun2010 07jun2010

444 rehab 07jun2010 15jun2010

444 nursing 15jun2010 17jun2010

444 hospital 17aug2011 30aug2011

555 rehab 15aug2011 29aug2011

666 hospital 01jan2010 02jan2010

777 skilled 06aug2011 15aug2011

888 hospice 10aug2011 30aug2011

999 hospital 01jan2010 15jan2010

999 hospital 20feb2010 25feb2010

999 rehab 27feb2010 28feb2010

999 skilled 03mar2010 07mar2010

001 hospital 20aug2010 25aug2010

001 hospital 15sep2011 15sep2011

002 rehab 05sep2011 07sep2011

002 hospital 08sep2011 15sep2011

002 hospice 15sep2011 17sep2011

003 hospital 01jan2010 15jan2010

004 rehab 15jan2010 17jan2010

005 hospital 01jan2011 15jan2011

005 rehab 27jan2011 30jan2011

;

run;

proc sort data=have;

by id dos disch;

run;

data episodes;

set have;

by id;

retain episode;

*keep track of previous discharge date;

format prev_disch date9.;

prev_disch=lag(disch);

*time between dos and discharge;

if not first.id then time_between=dos-prev_disch;

*If first id then set episode to 1, else if dos ne disch then increase by 1;

if first.id then episode=1;

else if time_between >30 then episode+1;

*Create id to group episodese for the query later on;

*Make sure this fits your actual data, ie you may have up to 100 episodes;

new_id=trim(id)||put(episode, z2.);

run;

*Only Hospitals;

proc sql;

create table hospital as

select *

from episodes

group by new_id

having min(plc)='hospital' and max(plc)='hospital';

quit;

*All the others;

proc sql;

create table others as

select *

from episodes

group by new_id

having min(plc)ne 'hospital' or max(plc)ne 'hospital';

quit;

View solution in original post

Reeza · Posted 07-16-2012 02:14 PM

I still think the method I proposed is easier to understand.

data have;

informat dos disch date9.;

format dos disch date9.;

input id plc $ dos disch ;

cards;

111 hospital 11dec2010 15dec2010

111 hospice 15dec2010 17dec2010

111 rehab 17dec2010 30dec2010

111 hospital 12nov2011 15nov2011

222 hospital 01jan2010 10jan2010

333 hospital 01feb2011 07feb2011

444 hospital 01jun2010 07jun2010

444 rehab 07jun2010 15jun2010

444 nursing 15jun2010 17jun2010

444 hospital 17aug2011 30aug2011

;

run;

*Isolate into episodes;

data episodes;

set have;

by id;

retain episode;

*keep track of previous discharge date;

format prev_disch date9.;

prev_disch=lag(disch);

*If first id then set episode to 1, else if dos ne disch then increase by 1;

if first.id then episode=1;

else if dos ne prev_disch then episode+1;

*Create id to group episodese for the query later on;

new_id=put(id, z3.)||put(episode, z2.);

run;

*Only Hospitals;

proc sql;

create table hospital as

select *

from episodes

group by new_id

having min(plc)='hospital' and max(plc)='hospital';

quit;

*All the others;

proc sql;

create table others as

select *

from episodes

group by new_id

having min(plc)ne 'hospital' or max(plc)ne 'hospital';

quit;

tmm · Posted 07-16-2012 03:00 PM

Ok. Now if there is a situation of the below which I just found in fishing thru the 125k plus records

111 hospital 11dec2010 15dec2010

111 hospice 15dec2010 17dec2010

111 rehab 20dec2010 30dec2010 changed this date

111 hospital 12nov2011 15nov2011

222 hospital 01jan2010 10jan2010

333 hospital 01feb2011 07feb2011

444 hospital 01jun2010 07jun2010

444 rehab 07jun2010 15jun2010

444 nursing 19jun2010 25jun2010 changed this date

444 hospital 17aug2011 30aug2011

would I put:

*Isolate into episodes;

data episodes;

set have;

by id;

retain episode;

*keep track of previous discharge date;

format prev_disch date9.;

prev_disch=lag(disch);

*If first id then set episode to 1, else if dos ne disch then increase by 1;

if first.id then episode=1;

else if dos ne prev_disch then episode+30; a +30 on this???

*Create id to group episodese for the query later on;

new_id=put(id, z3.)||put(episode, z2.);

run;

Reeza · Posted 07-16-2012 03:06 PM

No. Your question said:

If the dos is the same as the hospital disch date then all those records go into one file

Which was different from your previous requirement of 30 days.

Are you saying it is now 30 days again?

Assuming so, change the condition that is checked, not the calculation.

The condition is dos ne prev_disch so you'll need to see if the difference between these dates is less than 30 days.

(dos - prev_disch) <= 30 (Make sure its less than or equal to depending on what you need.

data episodes;

set have;

by id;

retain episode;

*keep track of previous discharge date;

format prev_disch date9.;

prev_disch=lag(disch);

*Calculate the time from previous stay, if this is not a new id;

if not first.id then time_from_previous_stay=dos-prev_disch;

*If first id then set episode to 1, else if dos ne disch then increase by 1;

if first.id then episode=1;

else if time_from_previous_stay <30 then episode+1; *Should be less than or equal to???

*Create id to group episodese for the query later on;

new_id=put(id, z3.)||put(episode, z2.);

run;

tmm · Posted 07-16-2012 03:29 PM

Not sure. The table is auto constructed to have what it has. Not my table it is on a server. It has 125k records and unless I go thru line by line I really do not know all the scenarios. I am up to row 1015 and am writing down all the scenarios as I see them where it is not a 100% hospital. I have to log all of them because I explained to my manager that what I have seen so far is not a simple case of what I described above. I hate projects like these. I am thinking instead of manually going thru all these 125k records that maybe I can create a query that says give me distinct ID and their admit disch date without place of service so I can see all the possible scenarios. I alerted my boss to the scenarios so far and she seems to think if we isolate all 100% hospital and then the others are within a 30day that should get us all that we want. Again, did I say I hate projects like these? Nothing in any of our claims systems is ever cut and dry. There are always a set of scenarios but then you find oops, now I have 12 monkey wrenches thrown into it all.

tmm · Posted 07-16-2012 03:30 PM

I will try the above scenarios when my other SAS query is done which is actually extracting some huge data. Will let you know how this works out. Thanks.

tmm · Posted 07-16-2012 05:26 PM

I think I did something wrong. I have this:

data have;

informat dos disch date8.;

format dos disch date8.;

input id plc $ dos disch ;

cards;

111 hospital 11dec2010 15dec2010

111 hospital 15jun2010 18jun2010

111 hospice 15dec2010 17dec2010

111 rehab 15jan2011 31jan2011

111 hospital 12nov2011 15nov2011

222 hospital 01jan2010 10jan2010

333 hospital 01feb2011 07feb2011

444 hospital 01jun2010 07jun2010

444 rehab 07jun2010 15jun2010

444 nursing 15jun2010 17jun2010

444 hospital 17aug2011 30aug2011

;

run;

*Isolate into episodes;

data episodes;

set have;

by id;

retain episode;

*keep track of previous discharge date;

format prev_disch date8.;

prev_disch=lag(disch);

*Calculate the time from previous stay, if this is not a new id;

if not first.id then time_from_previous_stay=dos-prev_disch;

*If first id then set episode to 1, else if dos ne disch then increase by 1;

if first.id then episode=1;

else if time_from_previous_stay between >=0 and <=30 then episode+1;

*Create id to group episode for the query later on;

new_id=put(id, z3.)||put(episode, z2.);

run;

*Only Hospitals;

proc sql;

create table hospital as

select *

from episodes

group by new_id

having min(plc)='hospital' and max(plc)='hospital';

quit;

*All the others;

proc sql;

create table others as

select *

from episodes

group by new_id

having min(plc)ne 'hospital' or max(plc)ne 'hospital';

quit;

I get this back on each table:

4 back in the hospital and 7 in the other but there should be 5 based on the 15jun2010 I put in. I am finding more and more oddities in this table of 125k. It is not strictly first service is hospital and then other things follow. I have some where the first service is nursing or rehab and they never even go to the hospital. It is just getting more and more complex the more I dig into this table.

tmm · Posted 07-16-2012 05:30 PM

I found this scenario in the table and searching for more.

111 hospital 11dec2010 15dec2010

111 hospital 15jun2010 18jun2010

111 hospice 15dec2010 17dec2010

111 rehab 15jan2011 31jan2011

111 hospital 12nov2011 15nov2011

222 hospital 01jan2010 10jan2010

333 hospital 01feb2011 07feb2011

444 hospital 01jun2010 07jun2010

444 rehab 07jun2010 15jun2010

444 nursing 15jun2010 17jun2010

444 hospital 17aug2011 30aug2011

555 rehab 15aug2011 29aug2011

;

Reeza · Posted 07-16-2012 05:41 PM

So what's the problem?

According to your criteria there isn't an issue. You need to explicitly state your criteria to avoid these issues when someone else needs to repeat the analysis

ie.

Select all data from period Jan 2009 to February 2012.

Find patients first hospital visit.

Separate out patients who were in hospital

Separate out patients who went to other facilities.

*Because you've selected using a date criteria there will be patients whose first visit is outside of the time period;

*Also if its health care perhaps benefits weren't applicable until after hospital stay or was covered under anotehr policy;

Also there's a bug in your code:

time_from_previous_stay between >=0 and <=30

Check if that line is working the way you expect it to.

tmm · Posted 07-16-2012 05:47 PM

Sorry. The problem is this is not my table. It is a table on our claims system. I have to go thru each row of over 125K records to find the varying scenarios I am running across and the problem now is that now that I am having to look at each row in this huge table I might as well click edit in SAS and manually put in skillled nursing or rehab. If I am going thru each record I might as well do that anyway. I was trying to avoid that by coding this but it seems I cannot because I have to view each ID to get all the scenarios logged to get help on the SAS forum. The PM's have no criteria other than they want hospital separate from others. The problem is when summing it all up the numbers do not match and I told my boss and she is like well you will have to figure out what hospital goes with rehab, skilled nursing, hospice, etc. You will have to review the dates of the records for each member ID and I am like there has to be an easier way than this. This is just plain ridiculous..

tmm · Posted 07-16-2012 05:53 PM

Sorry this has so many twists and turns. Nature of the data we store. I just wished I was well versed in SAS so I did not have to ask for help and frustrate everyone. If I had someone internally to go to I would this way I could share the REAL information but cannot with outside world as it would be a violation of PHI and HIPAA and I could lose my job and get fined by the govt. So I am doing the best I can with providing the items I am finding. I am fairly to SAS and do mostly proc sql stuff. and then case stuff.

data have;

informat dos disch date8.;

format dos disch date8.;

input id plc $ dos disch ;

cards;

111 hospital 15jun2010 18jun2010

111 hospital 11dec2010 15dec2010

111 hospice 15dec2010 17dec2010

111 rehab 15jan2011 31jan2011

111 hospital 12nov2011 15nov2011

222 hospital 01jan2010 10jan2010

333 hospital 01feb2011 07feb2011

444 hospital 01jun2010 07jun2010

444 rehab 07jun2010 15jun2010

444 nursing 15jun2010 17jun2010

444 hospital 17aug2011 30aug2011

555 rehab 15aug2011 29aug2011

666 hospital 01jan2010 02jan2010

777 skilled 06aug2011 15aug2011

888 hospice 10aug2011 30aug2011

999 hospital 01jan2010 15jan2010

999 hospital 20feb2010 25feb2010

999 rehab 27feb2010 28feb2010

999 skilled 03mar2010 07mar2010

;

run;

tmm · Posted 07-16-2012 05:54 PM

*Isolate into episodes;

data episodes;

set have;

by id;

retain episode;

*keep track of previous discharge date;

format prev_disch date8.;

prev_disch=lag(disch);

*Calculate the time from previous stay, if this is not a new id;

if first.id then episode =1;

if not first.id then time_from_previous_stay=dos-prev_disch;

if time_from_previous_stay =>1 or time_from_previous_stay <=30 then episode+1;

*Create id to group episode for the query later on;

new_id=put(id, z3.)||put(episode, z2.);

run;

*Only Hospitals;

proc sql;

create table hospital as

select *

from episodes

group by new_id

having min(plc)='hospital' and max(plc)='hospital';

quit;

*All the others;

proc sql;

create table others as

select *

from episodes

group by new_id

having min(plc)ne 'hospital' or max(plc)ne 'hospital';

quit;

Maybe the time issue of between does not work so I changed it to the above. The query runs but still not right output and I am lost. This project is more than I can handle.

tmm · Posted 07-16-2012 06:07 PM

I have an inpatient table and in that table are various dates of admission and discharges. A little over 125k rows in all. Member can have more than 1 admission to everything from hospital, nursing, rehab, hospice. Any hospital only admission where the member clearly only has 1 admission in the entire table goes into the HOSPITAL TABLE. If the member had more than 1 admission but they were never sent to nursing, rehab or hospice, that goes in the HOSPITAL TABLE. The tricky part for the other table. If a member has an admission to a hospital and then is discharged and then admitted to a nursing, rehab or hospice the same date of the discharge, that goes into the OTHER TABLE. If the member has a hospital admission and then is discharged and then admitted to a nursing, rehab or hospice within 30 days that goes into the OTHER TABLE. (I am making the call on 30days to make this easier). If the member was only admitted to rehab, nursing or hospice and never had a hospital stay, that goes into the OTHER TABLE.

I hope this makes sense. The problem is I am doing this readmission project and it was all fine and done and then they threw in place of service. They said separate hospital from all others. I did that and the odds ratio was so off. The readmits were right but the admissions were too high because those that were admitted to hospital and readmitted to hospice, nursing or rehab their original admissions are counted in the first query I did because they have place of service as hospital. I think I have confused everyone haven't I?

tmm · Posted 07-17-2012 10:18 AM

Ok Maybe this will help. I have a table of 125k+ rows all with the ID, dos, disch date and place of service. There are various options within this table.

I went thru all 125k+ rows to find all the options.

data have;

informat dos disch date8.;

format dos disch date8.;

input id plc $ dos disch ;

cards;

111 hospital 15jun2010 18jun2010

111 hospital 11dec2010 15dec2010

111 hospice 15dec2010 17dec2010

111 rehab 15jan2011 31jan2011

111 hospital 12nov2011 15nov2011

222 hospital 01jan2010 10jan2010

333 hospital 01feb2011 07feb2011

444 hospital 01jun2010 07jun2010

444 rehab 07jun2010 15jun2010

444 nursing 15jun2010 17jun2010

444 hospital 17aug2011 30aug2011

555 rehab 15aug2011 29aug2011

666 hospital 01jan2010 02jan2010

777 skilled 06aug2011 15aug2011

888 hospice 10aug2011 30aug2011

999 hospital 01jan2010 15jan2010

999 hospital 20feb2010 25feb2010

999 rehab 27feb2010 28feb2010

999 skilled 03mar2010 07mar2010

001 hospital 20aug2010 25aug2010

001 hospital 15sep2011 15sep2011

002 rehab 05sep2011 07sep2011

002 hospital 08sep2011 15sep2011

002 hospice 15sep2011 17sep2011

003 hospital 01jan2010 15jan2010

004 rehab 15jan2010 17jan2010

005 hospital 01jan2011 15jan2011

005 rehab 27jan2011 30jan2011

;

run;

I need to have hospital only items that are never associated with rehab, hospice, skilled put into one table and the rest in the other table. So the outcome of this above should be like this: using up to +30 between the disch and the adm. So in the case of ID 111 if they are discharged from the hospital on 15dec2010 and then admitted to rehab on 15jan2011, the entire record fo 111 during the date periods would be in the other table.

HOSPITAL Table:

111 hospital 15jun2010 18jun2010

111 hospital 12nov2011 15nov2011

222 hospital 01jan2010 10jan2010

333 hospital 01feb2011 07feb2011

444 hospital 17aug2011 30aug2011

666 hospital 01jan2010 02jan2010

999 hospital 01jan2010 15jan2010

001 hospital 20aug2010 25aug2010

001 hospital 15sep2011 15sep2011

003 hospital 01jan2010 15jan2010

OTHER Table:

data have;

informat dos disch date8.;

format dos disch date8.;

input id plc $ dos disch ;

cards;

111 hospital 11dec2010 15dec2010

111 hospice 15dec2010 17dec2010

111 rehab 15jan2011 31jan2011

444 hospital 01jun2010 07jun2010

444 rehab 07jun2010 15jun2010

444 nursing 15jun2010 17jun2010

555 rehab 15aug2011 29aug2011

777 skilled 06aug2011 15aug2011

888 hospice 10aug2011 30aug2011

999 hospital 20feb2010 25feb2010

999 rehab 27feb2010 28feb2010

999 skilled 03mar2010 07mar2010

002 rehab 05sep2011 07sep2011

002 hospital 08sep2011 15sep2011

002 hospice 15sep2011 17sep2011

004 rehab 15jan2010 17jan2010

005 hospital 01jan2011 15jan2011

005 rehab 27jan2011 30jan2011

Reeza · Posted 07-17-2012 12:16 PM

The code I provided earlier works, with a fix for the episode part. Just make sure your new_id is formatted correctly.

The point is to create a unique id per 'episode' based on the id and then a counter, 1 for first episode. But because you'll have multiple episodes up to the 100 per patients you need to add zeros so things will sort properly.

data have;

informat dos disch date9.;

format dos disch date9.;

input id $ plc $ dos disch ;

cards;

111 hospital 15jun2010 18jun2010

111 hospital 11dec2010 15dec2010

111 hospice 15dec2010 17dec2010

111 rehab 15jan2011 31jan2011

111 hospital 12nov2011 15nov2011

222 hospital 01jan2010 10jan2010

333 hospital 01feb2011 07feb2011

444 hospital 01jun2010 07jun2010

444 rehab 07jun2010 15jun2010

444 nursing 15jun2010 17jun2010

444 hospital 17aug2011 30aug2011

555 rehab 15aug2011 29aug2011

666 hospital 01jan2010 02jan2010

777 skilled 06aug2011 15aug2011

888 hospice 10aug2011 30aug2011

999 hospital 01jan2010 15jan2010

999 hospital 20feb2010 25feb2010

999 rehab 27feb2010 28feb2010

999 skilled 03mar2010 07mar2010

001 hospital 20aug2010 25aug2010

001 hospital 15sep2011 15sep2011

002 rehab 05sep2011 07sep2011

002 hospital 08sep2011 15sep2011

002 hospice 15sep2011 17sep2011

003 hospital 01jan2010 15jan2010

004 rehab 15jan2010 17jan2010

005 hospital 01jan2011 15jan2011

005 rehab 27jan2011 30jan2011

;

run;

proc sort data=have;

by id dos disch;

run;

data episodes;

set have;

by id;

retain episode;

*keep track of previous discharge date;

format prev_disch date9.;

prev_disch=lag(disch);

*time between dos and discharge;

if not first.id then time_between=dos-prev_disch;

*If first id then set episode to 1, else if dos ne disch then increase by 1;

if first.id then episode=1;

else if time_between >30 then episode+1;

*Create id to group episodese for the query later on;

*Make sure this fits your actual data, ie you may have up to 100 episodes;

new_id=trim(id)||put(episode, z2.);

run;

*Only Hospitals;

proc sql;

create table hospital as

select *

from episodes

group by new_id

having min(plc)='hospital' and max(plc)='hospital';

quit;

*All the others;

proc sql;

create table others as

select *

from episodes

group by new_id

having min(plc)ne 'hospital' or max(plc)ne 'hospital';

quit;

SAS Innovate 2025: Save the Date

SAS Training: Just a Click Away