BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
JJ_211
Obsidian | Level 7

Hi!

 

I have a dataset with multiple dates of admission per person. I want to split them into admissions and readmissions (defined as another admission within 90 days of discharge date of the previous admission). I used the lag function to define those then I created a count (M) for each admission and readmission, and it looks like this: 

 

Data have: 

ID    admission date      readmission date    count (M)

1      1/1/2000                .                               1

1       .                            1/2/2000                  1

1      1/1/2010                 .                              2

1       .                           20/1/2010                 2 

1       .                           1/2/2010                   2

 

I want to collapse these observations so it looks like this: 

 

Data want:

ID    admission date      readmission date    count (M)

1      1/1/2000                1/2/2000                  1

1      1/1/2010                20/1/2010                2

 

I tried doing: data want; update have(obs=0) have; by id M; run; 

 

That works well, except that it takes the last date per M, so it looks like this: 

 

ID    admission date      readmission date    count (M)

1      1/1/2000                1/2/2000                  1

1      1/1/2010                1/2/2010                  2

 

Note the date bolded in this above dataset is the latest of the two dates that followed the admission date on 1/1/2010, and I want it to take the earliest (want 20/1/2010 and NOT 1/2/2010). Is there a way to do that? 

 

Many thanks in advance! 

Janet

1 ACCEPTED SOLUTION

Accepted Solutions
Ksharp
Super User
proc sql;
select id,count,min( admission_date ) as admission_date,
min( readmission_date ) as readmission_date
from have
group by id,count;
quit;

View solution in original post

9 REPLIES 9
Kurt_Bremser
Super User

See this:

data have;
infile datalines missover;
input ID (admission_date readmission_date) (:ddmmyy10.) count;
format admission_date readmission_date yymmdd10.;
datalines;
1      1/1/2000                .                               1
1       .                            1/2/2000                  1
1      1/1/2010                 .                              2
1       .                           20/1/2010                 2 
1       .                           1/2/2010                   2
;

data want;
set have;
by id;
retain _ad _re;
if first.id
then do;
  _ad = .;
  _re = .;
end;
_ad = coalesce(_ad,admission_date);
_re = coalesce(_re,readmission_date);
if _ad and _re
then do;
  admission_date = _ad;
  readmission_date = _re;
  output;
  _ad = .;
  _re = .;
end;
run;
JJ_211
Obsidian | Level 7

Thank you! 

 

Do I need to do the first part with the infile and datalines for the rest to work? 

 

My dataset contains some millions of lines 🙂 

 

I tried the second part of the code (from data want) and it for some reason deletes some admission dates, and still takes the latest of the readmission dates.

Janet

Kurt_Bremser
Super User

The DATALINES step is just there to create some example data to work with. By using a data step with DATALINES, it is extremely easy for everybody else to recreate the dataset as is, so that we are all talking about the same thing. All that is needed is copy/paste and submit.

Please supply example data in the same way and include the cases that don't come out as intended; with the data you posted, my code works, and I can only test code against usable data.

 

Ksharp
Super User
proc sql;
select id,count,min( admission_date ) as admission_date,
min( readmission_date ) as readmission_date
from have
group by id,count;
quit;
JJ_211
Obsidian | Level 7

Hi Ksharp. Many thanks for the sql code! Works as it should (Note to any future user of the code: I added after proc sql; create table want as...), as without this my SAS ran forever and would end up not responding. 

 

Ksharp, a follow up question: If I need further variables to be kept from the original dataset (have), where do I specify that within this code? 

I tried adding some variables in "select" but end up with repeated lines (e.g. I also want to keep the discharge date for each admission date, and some other variables). The keep statement doesn't seem to love sql either. 

 

I appreciate your help! 

Janet

Ksharp
Super User
Better post some sample data to explain your question.
If the variables you want added were also GROUP variables ,need include them in GROUP BY clause.

proc sql;
select id,count, xx,
min( admission_date ) as admission_date,
min( readmission_date ) as readmission_date
from have
group by id,count , xx ;
quit;
JJ_211
Obsidian | Level 7

Here a sample: 

 

HAVE data:

 

Id   count     sex   admission    discharge    readmission

1      1          M         1/1/03         15/1/03           . 

1      1          M          .                   .                   15/3/03

.

.

.

 

After suggested proc sql, I want to also include the discharge date and sex (examples, there are many other variables like death date etc). If I include sex and discharge date in the GROUP BY clause, I end up getting the same as data HAVE. 

 

WANT data: 

Id   count     sex   admission    discharge    readmission

1      1          M         1/1/03         15/1/03       15/3/03

.

.

.

 

THANKS 🙂

Kurt_Bremser
Super User

Use a summary function for discharge also:

data have;
input Id $ count sex $ (admission discharge readmission) (:ddmmyy10.);
format admission discharge readmission ddmmyy10.;
datalines;
1      1          M         1/1/03         15/1/03           . 
1      1          M          .                   .                   15/3/03
;

proc sql;
create table want as
  select
    id,
    count,
    sex,
    min(admission) as admission format=ddmmyy10.,
    min(discharge) as discharge format=ddmmyy10.,
    min(readmission) as readmission format=ddmmyy10.
  from have
  group by id, count, sex
;
quit;
JJ_211
Obsidian | Level 7

of course! That does work 🙂 Many, many thanks!

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 9 replies
  • 806 views
  • 2 likes
  • 3 in conversation