BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
lillymaginta
Lapis Lazuli | Level 10
data test;
input id$2. start;
attrib start format =date9. informat=date9.;

datalines;
1 01JAN2015 
1 18FEB2015 
2 01jan2015 
2 01apr2015 
3 01JAN2015 
3 01JAN2015 
;
run;

 I want to retain the second date for each id: output data:

 

1 18FEB2015

2 01apr2015

3 01JAN2015 

1 ACCEPTED SOLUTION

Accepted Solutions
novinosrin
Tourmaline | Level 20

Are you sure of the results for 

1 01JAN2015

1 18FEB2015

 

 

data test;
input id$2. start;
attrib start format =date9. informat=date9.;

datalines;
1 01JAN2015 
1 18FEB2015 
2 01jan2015 
2 01apr2015 
3 01JAN2015 
3 01JAN2015 
;
run;

data want;
 do _n_=1 by 1 until(last.id);
  set test;
  by id;
  if _n_=2 then output;
 end;
run;

View solution in original post

4 REPLIES 4
novinosrin
Tourmaline | Level 20

Are you sure of the results for 

1 01JAN2015

1 18FEB2015

 

 

data test;
input id$2. start;
attrib start format =date9. informat=date9.;

datalines;
1 01JAN2015 
1 18FEB2015 
2 01jan2015 
2 01apr2015 
3 01JAN2015 
3 01JAN2015 
;
run;

data want;
 do _n_=1 by 1 until(last.id);
  set test;
  by id;
  if _n_=2 then output;
 end;
run;
novinosrin
Tourmaline | Level 20

To be safe, in an event of having only 1 record per id

 

data test;
input id$2. start;
attrib start format =date9. informat=date9.;

datalines;
1 01JAN2015 
1 18FEB2015 
2 01jan2015 
2 01apr2015 
3 01JAN2015 
3 01JAN2015 
;
run;

data want;
 do _n_=1 by 1 until(last.id);
  set test;
  by id;
  if first.id and last.id then output;
  else if _n_=2 then output;
 end;
run;
mkeintz
PROC Star

You've already chosen @novinosrin's solution.  As a slight conceptual change (from explicit loop to the hidden implicit loop), consider this analog:

 

data want;
  set have;
  by id;
  if first.id=0  and lag(first.id)=1;
run;

However, this approach is not as neatly extensible to higher order dates.  Here's what it would look like for say, the 4th date of each id:

 

data want;
  set have;
  by id;
  if max(first.id,lag(first.id),lag2(first.id))=0 and lag3(first.id)=1;
run;

 

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
ballardw
Super User

@lillymaginta wrote:
data test;
input id$2. start;
attrib start format =date9. informat=date9.;

datalines;
1 01JAN2015 
1 18FEB2015 
2 01jan2015 
2 01apr2015 
3 01JAN2015 
3 01JAN2015 
;
run;

 I want to retain the second date for each id: output data:

 

1 18FEB2015

2 01apr2015

3 01JAN2015 


Are you absolutely positive that there are never more than 2 records for each id?

 

Are there other variables that have to be brought along? If not perhaps:

proc summary data=have nway;
   class id;
   var  start;
   output out=want (drop=_:) max=;
run;

which may also handle the issue of more than 2 records assuming the actual latest date value is desired and not order position.

 

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 4 replies
  • 1068 views
  • 1 like
  • 4 in conversation