BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
NewUsrStat
Pyrite | Level 9

Hi guys, 

I have the following data set: 

 

  ID             Start                 Stop        

 001       01JAN2013     31JAN2013             

 001       01FEB2013     31DEC2013           

 002       01MAR2013     31DC2013              

 003       01JAN2013     31DEC2013             

 

I need the following output:

   ID             Start                 Stop              Start_flag         End_flag

 001       01JAN2013     31JAN2013              1                        2

 001       01FEB2013     31DEC2013             2                        3

 002       01MAR2013     31DC2013              1                        2

 003       01JAN2013     31DEC2013             1                        2

 

In other words I need to add a flag for the start and end with the exception that for consecutive periods the end flag of the previous period will become the start flag of the subsequent period and the remaining end flag will be increased by 1. 

 

Can anyone help me please? 

 

Thnk you in advance

1 ACCEPTED SOLUTION

Accepted Solutions
mkeintz
PROC Star
data want;
  set have;
  by id;
  if first.id then start_flag=1;
  else start_flag+1;
  end_flag=start_flag+1;
run;
--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

View solution in original post

3 REPLIES 3
mkeintz
PROC Star
data want;
  set have;
  by id;
  if first.id then start_flag=1;
  else start_flag+1;
  end_flag=start_flag+1;
run;
--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
ballardw
Super User

You  need to provide a somewhat better example as I am not sure from your description what the result for the 3rd row of the modified ID 001 data below. Or Explicitly state that there will not be gaps (or overlaps) in the dates of the sequences.

 

ID             Start                 Stop        
 001       01JAN2013     31JAN2013             
 001       01FEB2013     31DEC2013           
 001       01MAY2014     30JUN2014
 002       01MAR2013     31DC2013              
 003       01JAN2013     31DEC2013    

What I understand to this point:

 /* assumes data is sorted by ID and start as implied*/

 data want;
   set have;
   by id;
   retain start_flag;
   ls = lag(stop);
   Difdate = start- ls;
   if first.id then do;
      start_flag=1;
      end_flag=2;
   end;
   else if difdate=1 then do;
      start_flag+1;
      end_flag = start_flag+1;
   end;
   Else do;
     /* this would be for gaps in the sequence*/
   end;
   drop ls difdate;
run;

When you have By group processing in a data step SAS creates automatic variables that indicate whether a particular observation is the First or Last of the by group. These are 1/0 (True/False) variables that are accessed with the First. and Last. construct as shown. This allows setting things at breaks in the data such as your Id changes.

The LAG function pulls a value from a preceding observation (if done properly, best not in IF type constructs). Retain keeps the value of a variable across data step iterations and is often used for a accumulator or sequencing variables like your Start_flag.

 

 

A style point for naming variables: If you have variables named Start and Stop and then Start_flag, it would make a lot more sense to have a Stop_flag then an End_flag for the variable name. Then the connection when you read the code a year later makes more sense.

PaigeMiller
Diamond | Level 26

From your explanation and example, the start and stop dates are not used in determining the flags. Is that right?

--
Paige Miller

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 3 replies
  • 694 views
  • 1 like
  • 4 in conversation