Hi
My data looks like this:
ID year post
1 1990 0
1 1991 0
1 1992 0
1 1993 1
1 1994 1
2 2000 0
2 2001 0
2 2002 1
2 2002 1
2 2003 1
I want to add a variable b_f so that my data looks like this:
ID year post b_f
1 1990 0 -3
1 1991 0 -2
1 1992 0 -1
1 1993 1 1
1 1994 1 2
2 2000 0 -2
2 2001 0 -1
2 2002 1 1
2 2002 1 2
2 2003 1 3
Many thanks
What do you want to do when some value of ID does not have any YEAR with POST=1 ? How do you want to number those observations?
If you want them to number up to -1 then you could do:
data have;
input ID $ YEAR POST ;
cards;
A 1990 0
A 1991 0
A 1992 0
A 1993 1
A 1994 1
B 2000 0
B 2001 0
B 2002 1
B 2002 1
B 2003 1
C 1990 0
C 1991 0
C 1992 0
;
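data want;
  /* Pass 1: read one ID group and record the year of its first POST=1 row. */
  do _n_=1 by 1 until(last.id);
    set have;
    by id;
    if post=1 and missing(event_year) then event_year=year;
  end;
  /* No POST=1 in this group: place the event one year after the last row. */
  if missing(event_year) then event_year=year+1;
  /* Pass 2: re-read the same _n_ rows and compute the offset from the event year. */
  do _n_=1 to _n_;
    set have;
    b_f = year-event_year ;
    output;
  end;
run;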
Results:
Obs  ID  YEAR  POST  event_year  b_f
  1  A   1990     0        1993   -3
  2  A   1991     0        1993   -2
  3  A   1992     0        1993   -1
  4  A   1993     1        1993    0
  5  A   1994     1        1993    1
  6  B   2000     0        2002   -2
  7  B   2001     0        2002   -1
  8  B   2002     1        2002    0
  9  B   2002     1        2002    0
 10  B   2003     1        2002    1
 11  C   1990     0        1993   -3
 12  C   1991     0        1993   -2
 13  C   1992     0        1993   -1
I calculated an "event" year instead of just numbering the rows because that handles the case where a year is missing in the middle of a group. With your data it also assigns the same B_F value to the two records that are both for the year 2002. If you don't want that, change the logic to use the row counter variable instead.
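data want;
  /* Pass 1: read one ID group and record the row number of its first POST=1 row. */
  do _n_=1 by 1 until(last.id);
    set have;
    by id;
    if post=1 and missing(event_row) then event_row=_n_;
  end;
  /* No POST=1 in this group: place the event one row after the last row. */
  if missing(event_row) then event_row=_n_+1;
  /* Pass 2: re-read the same _n_ rows and compute the offset from the event row. */
  do _n_=1 to _n_;
    set have;
    b_f = _n_-event_row ;
    output;
  end;
run;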
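Both versions use BY ID, which requires HAVE to already be ordered by ID, and the row-counter version also depends on the rows being in YEAR order within each ID. A minimal sketch of the sort step, assuming the data set and variable names used in this thread:

```sas
/* Order rows by ID, then YEAR, before running either DATA WANT step. */
proc sort data=have;
  by id year;
run;
```

If your data is already in this order, the step is harmless and can be skipped.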
PS Learn to use the Insert Code and Insert SAS Code icons on the forum editor to insert your text blocks and SAS code blocks. That will prevent the forum from trying to flow the text into paragraphs.
Try this
data have;
input ID year post;
datalines;
1 1990 0
1 1991 0
1 1992 0
1 1993 1
1 1994 1
2 2000 0
2 2001 0
2 2002 1
2 2002 1
2 2003 1
;
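data want(keep = ID year post b_f);
   flag = 0;
   /* Pass 1: nn = number of rows before the first post = 1 row in this ID group. */
   do _N_ = 0 by 1 until (last.ID);
      set have;
      by ID;
      if post = 1 and flag = 0 then do;
         nn = _N_; flag = 1;
      end;
   end;
   /* Pass 2: count up from -nn, skipping zero. nn stays missing if the group has no post = 1. */
   do b_f = -nn by 1 until (last.ID);
      set have;
      by ID;
      if b_f = 0 then b_f + 1;
      output;
   end;
run;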
Result:
ID  year  post  b_f
 1  1990     0   -3
 1  1991     0   -2
 1  1992     0   -1
 1  1993     1    1
 1  1994     1    2
 2  2000     0   -2
 2  2001     0   -1
 2  2002     1    1
 2  2002     1    2
 2  2003     1    3
Thanks,
but I get this
"ERROR: Invalid DO loop control information, either the INITIAL or TO expression is missing or the
BY expression is missing, zero, or invalid."...
Any idea why?
@HEB1 wrote:
Thanks,
but I get this
"ERROR: Invalid DO loop control information, either the INITIAL or TO expression is missing or the
BY expression is missing, zero, or invalid."...Any idea why?
That probably indicates a by group where there are no 1's detected.
You need to explain how you want those cases numbered.
Also why did you skip zero in your numbering?
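One way to check for that situation is to list the IDs that never have POST=1. This is only a sketch, assuming the data set is named HAVE, the variables are ID and POST as in the examples in this thread, and POST only takes the values 0 and 1:

```sas
proc sql;
  /* IDs whose largest POST value is not 1, i.e. groups with no POST=1 row. */
  select id
    from have
    group by id
    having max(post) ne 1;
quit;
```

If the query returns no rows, every ID has at least one POST=1 row and the cause of the error lies elsewhere.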
My ID variable is a string, with letters. Is that a problem?
I do not want to skip zero. I will write again.
My data is
ID YEAR POST
A 1990 0
A 1991 0
A 1992 0
A 1993 1
A 1994 1
B 2000 0
B 2001 0
B 2002 1
B 2002 1
B 2003 1
And I want zero on the first post row of each ID:
ID YEAR POST B_F
A 1990 0 -3
A 1991 0 -2
A 1992 0 -1
A 1993 1 0
A 1994 1 1
B 2000 0 -2
B 2001 0 -1
B 2002 1 0
B 2002 1 1
B 2003 1 2
Thank you! It is great! I don't have any missing years, so I used the last option. It is awesome.