BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
yvonne1
Calcite | Level 5

Hi,

 

I would like some help in selecting the last date of the first series of consecutive observations (Sas enterprise 6.1). Currently my (large) dataset is comprised of dates (numeric) per ID. Example:

OBSIDDate
1120150316
2120150317
3120150318
4120150326
5120150327
6120150328
7120150329
8120150330
9120150331
10220150601
11220150602
12220150603
13220150624
14220150625
15220150907
16220150908
17320150127
18320150128
19320150129
20320150130
21320150131

 

And would like my output to look like:

OBSIDDate
1120150318
2220150603
3320150131

(dates for obs 3, 12 and 21 )

I know how to select the first obs per ID and last obs per ID but not this..

Hopefully someone can help me! Thanks

1 ACCEPTED SOLUTION

Accepted Solutions
RW9
Diamond | Level 26 RW9
Diamond | Level 26

Ah, ok, then maybe something like the below (note how I have the test data, please use this method in future).  What this does is assign a period to each block, assuming the dates are sequential, then simply takes min/max of each period to give the output of id period and start/stop dates:

data have;
  input n id date;
  format date date9.;
  informat date yymmdd8.;
datalines;
1	1	20150316
2	1	20150317
3	1	20150318
4	1	20150326
5	1	20150327
6	1	20150328
7	1	20150329
8	1	20150330
9	1	20150331
10	2	20150601
11	2	20150602
12	2	20150603
13	2	20150624
14	2	20150625
15	2	20150907
16	2	20150908
;
run;
data inter;
  set have;
  by id;
  retain period lstdate;
  if first.id then do;
    period=1;
    lstdate=date;
  end;
  else if date ne lstdate+1 then period=period+1;
  lstdate=date;
run;
proc sql;
  create table WANT as
  select  ID,
          PERIOD,
          min(DATE) as FIRST_DATE format=date9.,
          max(DATE) as LAST_DATE format=date9.
  from    INTER
  group by ID,PERIOD;
quit;

View solution in original post

5 REPLIES 5
RW9
Diamond | Level 26 RW9
Diamond | Level 26

You say you want to pull out the last date in a group of data, however you have not shown any grouping information, so this is not possible.  Why should:

1

20150318

 This data item be the one pulled out from the data?  I can see no logical reason for it from what you have posted, so its not possible to provide logic to cover the scenario.

yvonne1
Calcite | Level 5

Hi

 

Thanks for the quick reply, as you might see there are consecutive dates and non consecutive data (dates following each other) within each ID. They are hospital admission dates where subsequent dates equal one admission and when there is more than 1 day in between a new hospital admission is registered per person.

 

 

So for person 1 (ID=1) the first admission is from 16 to 18th of march (obs 1 2 and 3), the second 26 to 31st of march (obs 4 to 9)

 

There is no variable indication these seperate admissions.So I would like to first identify the first admission (series of dates) and then the last day of this first admission.

 

Is this helpful?

RW9
Diamond | Level 26 RW9
Diamond | Level 26

Ah, ok, then maybe something like the below (note how I have the test data, please use this method in future).  What this does is assign a period to each block, assuming the dates are sequential, then simply takes min/max of each period to give the output of id period and start/stop dates:

data have;
  input n id date;
  format date date9.;
  informat date yymmdd8.;
datalines;
1	1	20150316
2	1	20150317
3	1	20150318
4	1	20150326
5	1	20150327
6	1	20150328
7	1	20150329
8	1	20150330
9	1	20150331
10	2	20150601
11	2	20150602
12	2	20150603
13	2	20150624
14	2	20150625
15	2	20150907
16	2	20150908
;
run;
data inter;
  set have;
  by id;
  retain period lstdate;
  if first.id then do;
    period=1;
    lstdate=date;
  end;
  else if date ne lstdate+1 then period=period+1;
  lstdate=date;
run;
proc sql;
  create table WANT as
  select  ID,
          PERIOD,
          min(DATE) as FIRST_DATE format=date9.,
          max(DATE) as LAST_DATE format=date9.
  from    INTER
  group by ID,PERIOD;
quit;
yvonne1
Calcite | Level 5

Thanks so much RW9! Exactly what I meant

ShiroAmada
Lapis Lazuli | Level 10
proc sort data=sample;
by id date;
run;

data sample;
set sample;
format lag_admit_dt date9.;
by id;
lag_admit_dt=lag(date);
diff=intck('days',lag_admit_dt,date);
if first.id then do;
lag_admit_dt=admit_dt;
diff=1;
tag='';
end;

if diff>1 then tag='*';
retain tag;
run;

data First_Discharge_dt (keep=id date);
set sample (where = (tag is missing) ) ;
by id;
if last.id;
run;

NOTE: I added ID=4 for patient with 1 day admission date only.

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 5 replies
  • 1105 views
  • 0 likes
  • 3 in conversation