BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
Chaupak
Obsidian | Level 7

Hi, I have a variable with a series 1 and 0 in dataset, most of the time they are like "111011011" for 9 observations for example. Most importantly I want to change the first 0 to 1 and keep the second 0. There are more specific rules and I will display it via an example:

data have;
    input ID time var;
    cards;
1 1 1
1 2 1
1 3 0
1 4 1
1 5 1
1 6 1
1 7 0
1 8 1
2 1 1
2 2 1
2 3 0
2 4 0
2 5 1
2 6 1
2 7 0
2 8 1
;
run;

In this example, I have two IDs with different pattern of the variable "var" over time. I wanted to replace the first 0 by 1 only when this 0 is followed by 1 (e.g., var = 0 for ID 1 at time 3). I do not replace 0 for ID 2 at time 3 because it is also 0 for time 4. The 0 at time 7 for ID 2 will also not replaced by 1 although the pattern is like "101", because it is not the first 0 for this ID. 

 

The final dataset I want is:

data want;
    input ID time var;
    cards;
1 1 1
1 2 1
1 3 1
1 4 1
1 5 1
1 6 1
1 7 0
1 8 1
2 1 1
2 2 1
2 3 0
2 4 0
2 5 1
2 6 1
2 7 0
2 8 1
;
run;

Is there any way to accomplish that? Thanks!

1 ACCEPTED SOLUTION

Accepted Solutions
Tom
Super User Tom
Super User

Looking ahead is difficult.  Here is a method using a second SET statement that re-reads the data starting from the second observation.  Tack on an extra empty observation to have the number of observations match.

 

You will also need to keep track of whether or not you have found a zero to be able to detect the first one.

data want;
  set have ;
  by id;
  set have(keep=var rename=(var=next) firstobs=2) have(obs=1 drop=_all_);
  retain found;
  if last.id then call missing(next);
  if first.id then found=0;
  if var=0 and not found then do;
    found=1;
    if next=1 then var=1;
  end;
run;

Result

Obs    ID    time    var    next    found

  1     1      1      1       1       0
  2     1      2      1       0       0
  3     1      3      1       1       1
  4     1      4      1       1       1
  5     1      5      1       1       1
  6     1      6      1       0       1
  7     1      7      0       1       1
  8     1      8      1       .       1
  9     2      1      1       1       0
 10     2      2      1       0       0
 11     2      3      0       0       1
 12     2      4      0       1       1
 13     2      5      1       1       1
 14     2      6      1       0       1
 15     2      7      0       1       1
 16     2      8      1       .       1

View solution in original post

3 REPLIES 3
Tom
Super User Tom
Super User

Looking ahead is difficult.  Here is a method using a second SET statement that re-reads the data starting from the second observation.  Tack on an extra empty observation to have the number of observations match.

 

You will also need to keep track of whether or not you have found a zero to be able to detect the first one.

data want;
  set have ;
  by id;
  set have(keep=var rename=(var=next) firstobs=2) have(obs=1 drop=_all_);
  retain found;
  if last.id then call missing(next);
  if first.id then found=0;
  if var=0 and not found then do;
    found=1;
    if next=1 then var=1;
  end;
run;

Result

Obs    ID    time    var    next    found

  1     1      1      1       1       0
  2     1      2      1       0       0
  3     1      3      1       1       1
  4     1      4      1       1       1
  5     1      5      1       1       1
  6     1      6      1       0       1
  7     1      7      0       1       1
  8     1      8      1       .       1
  9     2      1      1       1       0
 10     2      2      1       0       0
 11     2      3      0       0       1
 12     2      4      0       1       1
 13     2      5      1       1       1
 14     2      6      1       0       1
 15     2      7      0       1       1
 16     2      8      1       .       1
FreelanceReinh
Jade | Level 19

Hi @Chaupak,

 

If dataset HAVE is sorted by ID TIME and variable VAR contains only 0s and 1s, you can also use the implied look-ahead enabled by a BY statement:

data want(drop=f);
set have;
by id var notsorted;
if first.id then f=.;
if ~f & var=0 & ~last.id then do;
  var=last.var;
  f+1;
end;
run;
Ksharp
Super User
data have;
    input ID time var;
    cards;
1 1 1
1 2 1
1 3 0
1 4 1
1 5 1
1 6 1
1 7 0
1 8 1
2 1 1
2 2 1
2 3 0
2 4 0
2 5 1
2 6 1
2 7 0
2 8 1
;
run;

data want;
 merge have have(keep=id var rename=(id=_id var=_var) firstobs=2);
 retain found .;
 if id ne lag(id) then found=.;
 if not found and var=0 and id=_id and _var=1 and (id ne lag(id) or (id=lag(id) and lag(var)=1)) then do;found=1;var=1;end;
 if var=0 then found=1;
drop found _:;
run;

hackathon24-white-horiz.png

2025 SAS Hackathon: There is still time!

Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!

Register Now

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 3 replies
  • 1115 views
  • 0 likes
  • 4 in conversation