Dear SAS Users,
Here is a puzzle.
I have a data set as follows:
symbol | date | X | Y | Z |
ABC | 1/2/2012 | 5 | . | . |
ABC | 1/3/2012 | 7 | . | . |
ABC | 1/4/2012 | 2 | . | . |
ABC | 1/5/2012 | . | 20 | . |
ABC | 1/6/2012 | . | . | . |
ABC | 1/9/2012 | . | . | . |
ABC | 1/10/2012 | 5 | . | . |
ABC | 1/11/2012 | 11 | . | . |
ABC | 1/12/2012 | 5 | 2 | |
ABC | 1/13/2012 | 8 | . | . |
DEF | 1/2/2012 | 5 | . | . |
DEF | 1/3/2012 | 7 | 15 | . |
DEF | 1/4/2012 | 8 | . | . |
DEF | 1/5/2012 | 6 | . | . |
DEF | 1/6/2012 | . | . | 3 |
DEF | 1/9/2012 | 9 | . | . |
DEF | 1/10/2012 | 5 | . | . |
DEF | 1/11/2012 | . | . | 4 |
DEF | 1/12/2012 | . | . | . |
DEF | 1/13/2012 | 9 | . | . |
I want to produce a new variable newX such that
1. if Y^=. and X^=. then NewX=X+Y
2. if Z^=. and X^=. then NewX=X*Z
3. if Y^=. and X=. then the value of Y will be added to the next nonmissing X
4. if Z^=. and X=. then the value of Z will be multiplied by the next nonmissing X
The output will be as follows:
symbol | date | X | Y | Z | NewX |
ABC | 1/2/2012 | 5 | . | . | 5 |
ABC | 1/3/2012 | 7 | . | . | 7 |
ABC | 1/4/2012 | 2 | . | . | 2 |
ABC | 1/5/2012 | . | 20 | . | . |
ABC | 1/6/2012 | . | . | . | . |
ABC | 1/9/2012 | . | . | . | . |
ABC | 1/10/2012 | 5 | . | . | 25 |
ABC | 1/11/2012 | 11 | . | . | 11 |
ABC | 1/12/2012 | 5 | 2 | 10 | |
ABC | 1/13/2012 | 8 | . | . | 8 |
DEF | 1/2/2012 | 5 | . | . | 5 |
DEF | 1/3/2012 | 7 | 15 | . | 22 |
DEF | 1/4/2012 | 8 | . | . | 8 |
DEF | 1/5/2012 | 6 | . | . | 6 |
DEF | 1/6/2012 | . | . | 3 | . |
DEF | 1/9/2012 | 9 | . | . | 27 |
DEF | 1/10/2012 | 5 | . | . | 5 |
DEF | 1/11/2012 | . | . | 4 | . |
DEF | 1/12/2012 | . | . | . | . |
DEF | 1/13/2012 | 9 | . | . | 36 |
How can this be done? Thanks for any help in advance.
INPUT DATA
symbol date X Y Z
ABC 1/2/2012 5 . .
ABC 1/3/2012 7 . .
ABC 1/4/2012 2 . .
ABC 1/5/2012 . 20 .
ABC 1/6/2012 . . .
ABC 1/9/2012 . . .
ABC 1/10/2012 5 . .
ABC 1/11/2012 11 . .
ABC 1/12/2012 5 2
ABC 1/13/2012 8 . .
DEF 1/2/2012 5 . .
DEF 1/3/2012 7 15 .
DEF 1/4/2012 8 . .
DEF 1/5/2012 6 . .
DEF 1/6/2012 . . 3
DEF 1/9/2012 9 . .
DEF 1/10/2012 5 . .
DEF 1/11/2012 . . 4
DEF 1/12/2012 . . .
DEF 1/13/2012 9 . .
OUTPUT DATA
symbol date X Y Z NewX
ABC 1/2/2012 5 . . 5
ABC 1/3/2012 7 . . 7
ABC 1/4/2012 2 . . 2
ABC 1/5/2012 . 20 . .
ABC 1/6/2012 . . . .
ABC 1/9/2012 . . . .
ABC 1/10/2012 5 . . 25
ABC 1/11/2012 11 . . 11
ABC 1/12/2012 5 2 10
ABC 1/13/2012 8 . . 8
DEF 1/2/2012 5 . . 5
DEF 1/3/2012 7 15 . 22
DEF 1/4/2012 8 . . 8
DEF 1/5/2012 6 . . 6
DEF 1/6/2012 . . 3 .
DEF 1/9/2012 9 . . 27
DEF 1/10/2012 5 . . 5
DEF 1/11/2012 . . 4 .
DEF 1/12/2012 . . . .
DEF 1/13/2012 9 . . 36
Just like Tom has pointed out, your rules are ambiguous. I will just give an example based on your data as is and there are implied rules such as:
1. y and z are mutually exclusive.
or
1. z rules triumph over y rules.
data have;
infile cards;
input symbol $ date :mmddyy10. X Y Z;
format date mmddyy10.;
cards;
ABC 1/2/2012 5 . .
ABC 1/3/2012 7 . .
ABC 1/4/2012 2 . .
ABC 1/5/2012 . 20 .
ABC 1/6/2012 . . .
ABC 1/9/2012 . . .
ABC 1/10/2012 5 . .
ABC 1/11/2012 11 . .
ABC 1/12/2012 5 2
ABC 1/13/2012 8 . .
DEF 1/2/2012 5 . .
DEF 1/3/2012 7 15 .
DEF 1/4/2012 8 . .
DEF 1/5/2012 6 . .
DEF 1/6/2012 . . 3
DEF 1/9/2012 9 . .
DEF 1/10/2012 5 . .
DEF 1/11/2012 . . 4
DEF 1/12/2012 . . .
DEF 1/13/2012 9 . .
;
data want (drop=_:);
retain _ry _rz;
set have;
_ry=ifn(missing(y),_ry,y);
_rz=ifn(missing(z),_rz,z);
if not missing(x) then
do;
if missing(_rz) then
do;
newx=sum(x,_ry);
call missing(_ry);
end;
else if not missing(_rz) then
do;
newx=x*_rz;
call missing (_rz);
end;
end;
run;
proc print;run;
Kindly Regards,
Haikuo
Those conditions overlap. You need to define what order you want the operations to happen.
If x=1 ,y=2 and z=4 then the first two rules imply that NewX is either 3 or 4 depending on which of the two rules to use.
Just like Tom has pointed out, your rules are ambiguous. I will just give an example based on your data as is and there are implied rules such as:
1. y and z are mutually exclusive.
or
1. z rules triumph over y rules.
data have;
infile cards;
input symbol $ date :mmddyy10. X Y Z;
format date mmddyy10.;
cards;
ABC 1/2/2012 5 . .
ABC 1/3/2012 7 . .
ABC 1/4/2012 2 . .
ABC 1/5/2012 . 20 .
ABC 1/6/2012 . . .
ABC 1/9/2012 . . .
ABC 1/10/2012 5 . .
ABC 1/11/2012 11 . .
ABC 1/12/2012 5 2
ABC 1/13/2012 8 . .
DEF 1/2/2012 5 . .
DEF 1/3/2012 7 15 .
DEF 1/4/2012 8 . .
DEF 1/5/2012 6 . .
DEF 1/6/2012 . . 3
DEF 1/9/2012 9 . .
DEF 1/10/2012 5 . .
DEF 1/11/2012 . . 4
DEF 1/12/2012 . . .
DEF 1/13/2012 9 . .
;
data want (drop=_:);
retain _ry _rz;
set have;
_ry=ifn(missing(y),_ry,y);
_rz=ifn(missing(z),_rz,z);
if not missing(x) then
do;
if missing(_rz) then
do;
newx=sum(x,_ry);
call missing(_ry);
end;
else if not missing(_rz) then
do;
newx=x*_rz;
call missing (_rz);
end;
end;
run;
proc print;run;
Kindly Regards,
Haikuo
Sorry for the confusion,
I thought it was clear from the example but apparently not. Y and Z are not nonmissing at the same time. In other words, if y is nonmissing, then Z is missing and vice versa.
So either we have the case
x=1, y=2 and z=.
or
x=1, y=. and z=4
bülent
If so, my code seems working for you. Have you tried on your real data?
Haikuo
Hi Haikuo,
My question was not about exactly what I needed for my data but I got the algorithm from your answer. After some slight modification of your code, it worked.
Thanks a lot.
bülent
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.