Hello people,
I have a dataset like the one below.
x y
1 0
1 23
1 56
1 0
1 0
1 0
I want to replace the zeros with '.', but there is one complication.
The last three observations of y are zeros. Only the last two of them should be replaced with '.'; the first of those three zeros and the zero in the first observation should stay as zeros. That is, the output should look like this:
x y
1 0
1 23
1 56
1 0
1 .
1 .
Thanks in advance.
Does this question have anything to do with the value of X? Please explain. Or give a bigger example.
What if the last two values are not zero? What happens then? Please explain. Or give an example.
Hello paigemiller and collinelliot,
There is more data, and the last observations in a group can be zeros or nonzero values.
Example:
x y
1 0
1 311
1 65
1 102
1 217
1 174
1 18
1 0
1 0
1 0
2 0
2 12
2 65
2 77
2 92
2 0
2 208
2 226
2 19
2 0 ...etc
For x = 1 there are three consecutive zeros at the end, and only the last two of them need to be replaced with '.'. That is, the values 174 18 0 0 0 become 174 18 0 . .
Keep a zero if the preceding value is nonzero. Replace a zero with '.' if the preceding value is zero.
If a zero comes first, keep it as zero. Zeros between nonzero values should also stay as zeros.
Output:
x y
1 0
1 311
1 65
1 102
1 217
1 174
1 18
1 0
1 .
1 .
2 0
2 12
2 65
2 77
2 92
2 0
2 208
2 226
2 19
2 0
I hope that is clear.
I think this works for the more extended sample you provided. If there are more nuances to the real data, this might need to be modified or scrapped altogether in favor of an alternative approach.
data have;
input x y;
datalines;
1 0
1 311
1 65
1 102
1 217
1 174
1 18
1 0
1 0
1 0
2 0
2 12
2 65
2 77
2 92
2 0
2 208
2 226
2 19
2 0
;
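data want;
  set have;
  by x;
  /* Call LAG on every observation so its queue always holds the previous row's y. */
  prev_y = lag(y);
  y2 = y;
  /* Set a zero to missing only when the previous row in the same x group is also zero. */
  if not first.x and y = 0 and prev_y = 0 then y2 = .;
  drop prev_y;
run;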
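One thing worth keeping in mind with LAG in a step like this: it returns the value stored the last time that particular LAG call executed, which is not necessarily the previous observation when the call sits inside a conditional. A minimal sketch of the difference, using a made-up four-row dataset (demo, prev_cond and prev_always are illustrative names, not from the thread):

```sas
data demo;
  input y;
  /* Conditional call: this LAG queue is updated only on rows where y = 0. */
  if y = 0 then prev_cond = lag(y);
  /* Unconditional call: this LAG queue is updated on every row. */
  prev_always = lag(y);
  datalines;
5
0
7
0
;

proc print data=demo;
run;
```

On the fourth row, prev_always is 7 (the previous row), while prev_cond is 0, the value of y on the second row, which was the last time the conditional call ran.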
I think this has been granted "solution" status prematurely.
It does not do what I believe the OP wants when (1) there are consecutive zeros in the middle of an X series, or (2) a given X has only two observations, both equal to zero. See the data sample below:
1 0
1 311
1 65
1 102
1 217
1 0
1 0
1 0
1 174
1 18
1 0
1 0
1 0
2 0
2 12
2 65
2 77
2 92
2 0
2 208
2 226
2 19
2 0
3 0
3 0
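If the intent is that only the run of zeros at the very end of each x group is affected, one way to sketch that in a single DATA step is to read each group twice: first to find the position of its last nonzero y, then to write the rows out. This assumes the rows above have been read into a dataset named have with variables x and y, grouped in ascending x order; n_obs, i and last_nonzero are helper names introduced here, not from the thread.

```sas
data want;
  /* Pass 1: read one x group and remember the position of its last nonzero y. */
  do n_obs = 1 by 1 until (last.x);
    set have;
    by x;
    if y ne 0 then last_nonzero = n_obs;
  end;
  /* Pass 2: re-read the same rows. The first zero after the last nonzero value */
  /* is kept; any later zero in the trailing run is set to missing.            */
  do i = 1 to n_obs;
    set have;
    if y = 0 and i > coalesce(last_nonzero, 0) + 1 then y = .;
    output;
  end;
  drop n_obs i last_nonzero;
run;
```

Under this reading, the three zeros in the middle of x = 1 stay as they are, the final 0 0 0 becomes 0 . ., and the two-row group x = 3 becomes 0 followed by a missing value. Whether that last case matches what is wanted is not settled in the thread.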
A solution for the specific case you presented is pretty easy, but what other cases might exist? I assume this needs to be done separately for each x value, for one. Also, within an x value, do nonzero numbers appear in more than one grouping surrounded by zeros? That is, could you have 0, 0, 1, 2, 3, 0, 0, 0, 0, 1, 2, 3, 0...?
I think the easiest way to do this would be to reverse the order.
data ordered;
  set have;
  /* Remember each row's position so the original order can be restored later. */
  original_order = _n_;
run;
proc sort data=ordered;
  by descending original_order;
run;
Then it becomes easier to detect which observations need to change, since you can start at the beginning of each X value.
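data want;
  set ordered;
  by x y notsorted;
  /* In reversed order, each x group begins with what were its final rows. */
  if first.x then do;
    if y=0 then change='Y';
    else change = 'N';
  end;
  retain change;
  /* Within that leading run of zeros, blank every zero except the last of the run. */
  if change='Y' then do;
    if last.y then change='N';
    else y=.;
  end;
  drop change;
run;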
And finally, put it all back in the original order:
proc sort data=want;
  by original_order;
run;
Thanks, everyone. Each reply helped me a lot.