So here's what I have:
ID X Y Z
123 8
123 7
456 9
456 2
And what I want is this:
ID X Y Z
123 7 8
456 9 2
Help? I know retain is necessary here, right? I've been working at it all day, and I tend to run into problems when I encounter a [blank] or missing value.
Look at the val_ variables instead, you can rename them if it makes you happy at the end.
Btw,
I know how to restructure. What I don't know is how to get to this interim step :
ID X Y Z
123 8
123 7 8
456 9
456 9 2
Can you have two values in a column, ie the X=3 and ID=123 below? if so what would you want to happen?
ID X Y Z
123 3 8
123 7
456 9
456 2
Otherwise, just summarize the data using a proc means or sql step to get the max/min/avg or whatever works for what you need.
proc means data=have;
class id;
var x y x;
output out=want max=;
run;
What I'm trying to do is to combine all the information from X, Y, and Z, into a single observation that will be kept (restructing to person level). In the original dataset, there is only one value across x, y, and z for each observation.
or:
proc means data=have nway ;
class id;
var x y z;
output out=want(drop=_:) max=;
run;
proc print data=want;run;
If it is assumed that the problem is as specified then the following will collapse the data.
data have;
infile datalines missover;
input id x y z;
datalines;
123 8
123 . 7
456 . 9
456 . . 2
run;
* for this specific example;
proc summary data=have nway;
class id;
var x y z;
output out=want(drop=_type_ _freq_)
sum=;
run;
proc print data=want;
run;
This solution will not work if there are other conditions such as more than one row with a non-missing value.
data have;
input ID X Y Z ;
cards;
123 . 8 .
123 7 . .
456 . 9 .
456 . . 2
;
options missing=' ';
data want(drop=_:);
retain _x _y _z;
set have;
by id;
if first.id then do;
_x=x; _y=y;_z=z; end;
else do;
x=ifn(x=.,_x,x);
y=ifn(y=.,_y,y);
z=ifn(z=.,_z,z);
end;
if last.id;
run;
proc print;run;
Obs ID X Y Z
1 123 7 8
2 456 9 2
This is really close to what I need. but here's the snag I ran into when I tried your code (basically the value for y isn't coming through in the observation I want to keep (droprec = 0):
have | ||||
SAMPLEID | x | y | z | droprec |
123 | 4 | 1 | ||
123 | 4 | 1 | ||
123 | 1 | |||
123 | 3 | 0 | ||
got | ||||
SAMPLEID | x | y | z | droprec |
123 | 4 | 1 | ||
123 | 4 | 4 | 1 | |
123 | 4 | 1 | ||
123 | 4 | 3 | 0 |
want | ||||
SAMPLEID | x | y | z | droprec |
123 | 4 | 1 | ||
123 | 4 | 4 | 1 | |
123 | 4 | 1 | ||
123 | 4 | 4 | 3 | 0 |
suggestions?
Thanks again for all the prompt replies. Much appreciated folks
Old school, but seems to be what you're after:
data have;
input ID X Y Z ;
cards;
123 . 8 .
123 7 . .
456 . 9 .
456 . . 2
124 . 4 .
124 4 . .
124 . . .
124 . . 3
;
data want;
set have;
by id notsorted;
retain val_x val_y val_z;
if first.id then call missing(val_x, val_y, val_z);
if x ne . then val_x=x;
if y ne . then val_y=y;
if z ne . then val_z=z;
run;
I got the following which is the same as the have dataset:
x | y | z | droprec |
4 | . | . | 1 |
. | 4 | . | 1 |
. | . | . | 1 |
. | . | 3 | 0 |
Look at the val_ variables instead, you can rename them if it makes you happy at the end.
oh wow. I can't believe I missed that. hahaha. thanks so much! this is it!
I really think UPDATE is the way to go, you just have to pre-prepare an empty model
option obs=0 ;
data final_shape ;
set this that the other ; *list all transaction tables;
run ;
option obs= max ;
data view/view=view ;
set this that the other ;
by ID notsorted ;
run ;
data final_data ;
update final_shape view ;
by id notsorted ;
run ;
****** beware I haven't tested this ;
Check again. The results you are describing would represent a bug in the software. UPDATE has been around for decades and it is not likely to have that sort of a bug in it!
Sorry. I don't think my first post work.
try this one:
data have;
input ID X y Z droprec ;
cards;
123 4 . . 1
123 . 4 . 1
123 . . . 1
123 . . 3 0
;
options missing='.';
data want(drop= _:);
retain x y z;
set have;
by id;
if first.id then do;
_x=x; _y=y;_z=z; end;
else do;
x=ifn(x=.,_x,x);
y=ifn(y=.,_y,y);
z=ifn(z=.,_z,z);
end;
if last.id;
run;
proc print;run;
Obs _x _y _z ID
1 4 4 3 123
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.