☑ This topic is solved.
HEB1
Calcite | Level 5

Hi,

My data looks like this:

ID  year  post
1   1990  0
1   1991  0
1   1992  0
1   1993  1
1   1994  1
2   2000  0
2   2001  0
2   2002  1
2   2002  1
2   2003  1

I want to add a variable so that my data looks like this:

ID  year  post  b_f
1   1990  0     -3
1   1991  0     -2
1   1992  0     -1
1   1993  1      1
1   1994  1      2
2   2000  0     -2
2   2001  0     -1
2   2002  1      1
2   2002  1      2
2   2003  1      3

Many thanks

Accepted Solution

Tom
Super User

What do you want to do when some value of ID does not have any YEAR with POST=1? How do you want to number those observations?

If you want them numbered so that they count up to -1, then you could do:

data have;
  input ID $ YEAR POST ;
cards;
A 1990 0
A 1991 0
A 1992 0
A 1993 1
A 1994 1
B 2000 0
B 2001 0
B 2002 1
B 2002 1
B 2003 1
C 1990 0
C 1991 0
C 1992 0
;

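data want;
  /* Pass 1: read one ID group and remember the first YEAR with POST=1 */
  do _n_=1 by 1 until(last.id);
    set have;
    by id;
    if post=1 and missing(event_year) then event_year=year;
  end;
  /* No POST=1 in this group: use the year after the last one read, so the group ends at -1 */
  if missing(event_year) then event_year=year+1;
  /* Pass 2: re-read the same rows and compute the offset from the event year */
  do _n_=1 to _n_;
    set have;
    b_f = year-event_year ;
    output;
  end;
run;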

Results:

                             event_
Obs    ID    YEAR    POST     year     b_f

  1    A     1990      0      1993      -3
  2    A     1991      0      1993      -2
  3    A     1992      0      1993      -1
  4    A     1993      1      1993       0
  5    A     1994      1      1993       1
  6    B     2000      0      2002      -2
  7    B     2001      0      2002      -1
  8    B     2002      1      2002       0
  9    B     2002      1      2002       0
 10    B     2003      1      2002       1
 11    C     1990      0      1993      -3
 12    C     1991      0      1993      -2
 13    C     1992      0      1993      -1

I calculated an "event" year instead of just numbering the rows because that handles the case where some year is missing in the middle of a group. With your data it also assigns the same B_F value to the two records that are both for the year 2002. If you don't want that, change the logic to use the row counter variable instead:

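data want;
  /* Pass 1: read one ID group and remember the row number of the first POST=1 */
  do _n_=1 by 1 until(last.id);
    set have;
    by id;
    if post=1 and missing(event_row) then event_row=_n_;
  end;
  /* No POST=1 in this group: place the event one row past the end, so the group ends at -1 */
  if missing(event_row) then event_row=_n_+1;
  /* Pass 2: re-read the same rows and compute the offset from the event row */
  do _n_=1 to _n_;
    set have;
    b_f = _n_-event_row ;
    output;
  end;
run;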
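
To see how the two versions above differ, here is a minimal sketch that computes both offsets side by side. The data set GAP, the ID value D, and the variable names B_F_YEAR and B_F_ROW are made up for illustration and are not from the original data. Every ID is assumed to have at least one POST=1 row.

```sas
/* Hypothetical test data: ID D has no record for 1991 */
data gap;
  input ID $ YEAR POST;
cards;
D 1990 0
D 1992 0
D 1993 1
D 1994 1
;

data compare;
  /* Pass 1: find the first POST=1 year and row within the ID */
  do _n_=1 by 1 until(last.id);
    set gap;
    by id;
    if post=1 and missing(event_year) then do;
      event_year=year;
      event_row=_n_;
    end;
  end;
  /* Pass 2: re-read the same rows and compute both offsets */
  do _n_=1 to _n_;
    set gap;
    b_f_year = year-event_year;  /* year-based: a gap year leaves a gap */
    b_f_row  = _n_-event_row;    /* row-based: always consecutive */
    output;
  end;
run;

proc print data=compare;
run;
```

For ID D, the year-based offset comes out as -3, -1, 0, 1 because 1991 is absent, while the row-based offset is -2, -1, 0, 1.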

 

PS: Learn to use the Insert Code and Insert SAS Code icons in the forum editor to insert your text blocks and SAS code blocks. That will prevent the forum from trying to flow the text into paragraphs.


6 REPLIES
PeterClemmensen
Tourmaline | Level 20

Try this

 

data have;
input ID year post;
datalines;
1 1990 0 
1 1991 0 
1 1992 0 
1 1993 1 
1 1994 1 
2 2000 0 
2 2001 0 
2 2002 1 
2 2002 1 
2 2003 1 
;

data want(keep = ID year post b_f);

   flag = 0;

   do _N_ = 0 by 1 until (last.ID);
      set have;
      by ID;
      if post = 1 and flag = 0 then do;
         nn = _N_; flag = 1;
      end;
   end;

   do b_f = -nn by 1 until (last.ID);
      set have;
      by ID;
      if b_f = 0 then b_f + 1;
      output;
   end;
run;

 

Result:

 

ID  year  post  b_f
1   1990  0     -3
1   1991  0     -2
1   1992  0     -1
1   1993  1      1
1   1994  1      2
2   2000  0     -2
2   2001  0     -1
2   2002  1      1
2   2002  1      2
2   2003  1      3

 

HEB1
Calcite | Level 5

Thanks,

but I get this 

"ERROR: Invalid DO loop control information, either the INITIAL or TO expression is missing or the
BY expression is missing, zero, or invalid."...

Any idea why? 

Tom
Super User

@HEB1 wrote:

Thanks,

but I get this 

"ERROR: Invalid DO loop control information, either the INITIAL or TO expression is missing or the
BY expression is missing, zero, or invalid."...

Any idea why? 


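That probably indicates a BY group in which no observation with POST=1 was detected. In that case NN is never assigned, so the starting value -NN of the second DO loop is missing.

You need to explain how you want those cases numbered.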

 

Also why did you skip zero in your numbering?
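
One way to check for such BY groups before running the step is a quick query like the sketch below. It assumes the input data set is named HAVE and that POST is numeric. N_OBS is just an illustrative column name.

```sas
/* List every ID that has no observation with POST=1 */
proc sql;
  select ID, count(*) as n_obs
  from have
  group by ID
  having max(POST) ne 1;
quit;
```

Any ID listed by this query has no POST=1 row and would leave NN missing in the step above.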

HEB1
Calcite | Level 5

My ID variable is a string, with letters. Is that a problem?

I do not want to skip zero. I will write it again.

My data is:

ID  YEAR  POST
A   1990  0
A   1991  0
A   1992  0
A   1993  1
A   1994  1
B   2000  0
B   2001  0
B   2002  1
B   2002  1
B   2003  1

And I want zero on the first POST=1 row of each ID:

ID  YEAR  POST  B_F
A   1990  0     -3
A   1991  0     -2
A   1992  0     -1
A   1993  1      0
A   1994  1      1
B   2000  0     -2
B   2001  0     -1
B   2002  1      0
B   2002  1      1
B   2003  1      2

 

 

HEB1
Calcite | Level 5

Thank you! It is great! I don't have any missing years, so I used the last option. It is awesome.
