Dear SAS Users,
Here is a puzzle.
I have a data set as follows:
symbol | date | X | Y | Z |
ABC | 1/2/2012 | 5 | . | . |
ABC | 1/3/2012 | 7 | . | . |
ABC | 1/4/2012 | 2 | . | . |
ABC | 1/5/2012 | . | 20 | . |
ABC | 1/6/2012 | . | . | . |
ABC | 1/9/2012 | . | . | . |
ABC | 1/10/2012 | 5 | . | . |
ABC | 1/11/2012 | 11 | . | . |
ABC | 1/12/2012 | 5 | 2 | |
ABC | 1/13/2012 | 8 | . | . |
DEF | 1/2/2012 | 5 | . | . |
DEF | 1/3/2012 | 7 | 15 | . |
DEF | 1/4/2012 | 8 | . | . |
DEF | 1/5/2012 | 6 | . | . |
DEF | 1/6/2012 | . | . | 3 |
DEF | 1/9/2012 | 9 | . | . |
DEF | 1/10/2012 | 5 | . | . |
DEF | 1/11/2012 | . | . | 4 |
DEF | 1/12/2012 | . | . | . |
DEF | 1/13/2012 | 9 | . | . |
I want to produce a new variable newX such that
1. if Y^=. and X^=. then NewX=X+Y
2. if Z^=. and X^=. then NewX=X*Z
3. if Y^=. and X=. then the value of Y will be added to the next nonmissing X
4. if Z^=. and X=. then the value of Z will be multiplied by the next nonmissing X
The output will be as follows:
symbol | date | X | Y | Z | NewX |
ABC | 1/2/2012 | 5 | . | . | 5 |
ABC | 1/3/2012 | 7 | . | . | 7 |
ABC | 1/4/2012 | 2 | . | . | 2 |
ABC | 1/5/2012 | . | 20 | . | . |
ABC | 1/6/2012 | . | . | . | . |
ABC | 1/9/2012 | . | . | . | . |
ABC | 1/10/2012 | 5 | . | . | 25 |
ABC | 1/11/2012 | 11 | . | . | 11 |
ABC | 1/12/2012 | 5 | 2 | 10 | |
ABC | 1/13/2012 | 8 | . | . | 8 |
DEF | 1/2/2012 | 5 | . | . | 5 |
DEF | 1/3/2012 | 7 | 15 | . | 22 |
DEF | 1/4/2012 | 8 | . | . | 8 |
DEF | 1/5/2012 | 6 | . | . | 6 |
DEF | 1/6/2012 | . | . | 3 | . |
DEF | 1/9/2012 | 9 | . | . | 27 |
DEF | 1/10/2012 | 5 | . | . | 5 |
DEF | 1/11/2012 | . | . | 4 | . |
DEF | 1/12/2012 | . | . | . | . |
DEF | 1/13/2012 | 9 | . | . | 36 |
How can this be done? Thanks for any help in advance.
INPUT DATA
symbol date X Y Z
ABC 1/2/2012 5 . .
ABC 1/3/2012 7 . .
ABC 1/4/2012 2 . .
ABC 1/5/2012 . 20 .
ABC 1/6/2012 . . .
ABC 1/9/2012 . . .
ABC 1/10/2012 5 . .
ABC 1/11/2012 11 . .
ABC 1/12/2012 5 2
ABC 1/13/2012 8 . .
DEF 1/2/2012 5 . .
DEF 1/3/2012 7 15 .
DEF 1/4/2012 8 . .
DEF 1/5/2012 6 . .
DEF 1/6/2012 . . 3
DEF 1/9/2012 9 . .
DEF 1/10/2012 5 . .
DEF 1/11/2012 . . 4
DEF 1/12/2012 . . .
DEF 1/13/2012 9 . .
OUTPUT DATA
symbol date X Y Z NewX
ABC 1/2/2012 5 . . 5
ABC 1/3/2012 7 . . 7
ABC 1/4/2012 2 . . 2
ABC 1/5/2012 . 20 . .
ABC 1/6/2012 . . . .
ABC 1/9/2012 . . . .
ABC 1/10/2012 5 . . 25
ABC 1/11/2012 11 . . 11
ABC 1/12/2012 5 2 10
ABC 1/13/2012 8 . . 8
DEF 1/2/2012 5 . . 5
DEF 1/3/2012 7 15 . 22
DEF 1/4/2012 8 . . 8
DEF 1/5/2012 6 . . 6
DEF 1/6/2012 . . 3 .
DEF 1/9/2012 9 . . 27
DEF 1/10/2012 5 . . 5
DEF 1/11/2012 . . 4 .
DEF 1/12/2012 . . . .
DEF 1/13/2012 9 . . 36
Just like Tom has pointed out, your rules are ambiguous. I will just give an example based on your data as is and there are implied rules such as:
1. y and z are mutually exclusive.
or
1. z rules triumph over y rules.
data have;
infile cards;
input symbol $ date :mmddyy10. X Y Z;
format date mmddyy10.;
cards;
ABC 1/2/2012 5 . .
ABC 1/3/2012 7 . .
ABC 1/4/2012 2 . .
ABC 1/5/2012 . 20 .
ABC 1/6/2012 . . .
ABC 1/9/2012 . . .
ABC 1/10/2012 5 . .
ABC 1/11/2012 11 . .
ABC 1/12/2012 5 2
ABC 1/13/2012 8 . .
DEF 1/2/2012 5 . .
DEF 1/3/2012 7 15 .
DEF 1/4/2012 8 . .
DEF 1/5/2012 6 . .
DEF 1/6/2012 . . 3
DEF 1/9/2012 9 . .
DEF 1/10/2012 5 . .
DEF 1/11/2012 . . 4
DEF 1/12/2012 . . .
DEF 1/13/2012 9 . .
;
data want (drop=_:);
retain _ry _rz;
set have;
_ry=ifn(missing(y),_ry,y);
_rz=ifn(missing(z),_rz,z);
if not missing(x) then
do;
if missing(_rz) then
do;
newx=sum(x,_ry);
call missing(_ry);
end;
else if not missing(_rz) then
do;
newx=x*_rz;
call missing (_rz);
end;
end;
run;
proc print;run;
Kindly Regards,
Haikuo
Those conditions overlap. You need to define what order you want the operations to happen.
If x=1 ,y=2 and z=4 then the first two rules imply that NewX is either 3 or 4 depending on which of the two rules to use.
Just like Tom has pointed out, your rules are ambiguous. I will just give an example based on your data as is and there are implied rules such as:
1. y and z are mutually exclusive.
or
1. z rules triumph over y rules.
data have;
infile cards;
input symbol $ date :mmddyy10. X Y Z;
format date mmddyy10.;
cards;
ABC 1/2/2012 5 . .
ABC 1/3/2012 7 . .
ABC 1/4/2012 2 . .
ABC 1/5/2012 . 20 .
ABC 1/6/2012 . . .
ABC 1/9/2012 . . .
ABC 1/10/2012 5 . .
ABC 1/11/2012 11 . .
ABC 1/12/2012 5 2
ABC 1/13/2012 8 . .
DEF 1/2/2012 5 . .
DEF 1/3/2012 7 15 .
DEF 1/4/2012 8 . .
DEF 1/5/2012 6 . .
DEF 1/6/2012 . . 3
DEF 1/9/2012 9 . .
DEF 1/10/2012 5 . .
DEF 1/11/2012 . . 4
DEF 1/12/2012 . . .
DEF 1/13/2012 9 . .
;
data want (drop=_:);
retain _ry _rz;
set have;
_ry=ifn(missing(y),_ry,y);
_rz=ifn(missing(z),_rz,z);
if not missing(x) then
do;
if missing(_rz) then
do;
newx=sum(x,_ry);
call missing(_ry);
end;
else if not missing(_rz) then
do;
newx=x*_rz;
call missing (_rz);
end;
end;
run;
proc print;run;
Kindly Regards,
Haikuo
Sorry for the confusion,
I thought it was clear from the example but apparently not. Y and Z are not nonmissing at the same time. In other words, if y is nonmissing, then Z is missing and vice versa.
So either we have the case
x=1, y=2 and z=.
or
x=1, y=. and z=4
bülent
If so, my code seems working for you. Have you tried on your real data?
Haikuo
Hi Haikuo,
My question was not about exactly what I needed for my data but I got the algorithm from your answer. After some slight modification of your code, it worked.
Thanks a lot.
bülent
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.