So here's what I have:
ID X Y Z
123 8
123 7
456 9
456 2
And what I want is this:
ID X Y Z
123 7 8
456 9 2
Help? I know retain is necessary here, right? I've been working at it all day, and I tend to run into problems when I encounter a [blank] or missing value.
Look at the val_ variables instead, you can rename them if it makes you happy at the end.
Btw,
I know how to restructure. What I don't know is how to get to this interim step :
ID X Y Z
123 8
123 7 8
456 9
456 9 2
Can you have two values in a column, ie the X=3 and ID=123 below? if so what would you want to happen?
ID X Y Z
123 3 8
123 7
456 9
456 2
Otherwise, just summarize the data using a proc means or sql step to get the max/min/avg or whatever works for what you need.
proc means data=have;
class id;
var x y x;
output out=want max=;
run;
What I'm trying to do is to combine all the information from X, Y, and Z, into a single observation that will be kept (restructing to person level). In the original dataset, there is only one value across x, y, and z for each observation.
or:
proc means data=have nway ;
class id;
var x y z;
output out=want(drop=_:) max=;
run;
proc print data=want;run;
If it is assumed that the problem is as specified then the following will collapse the data.
data have;
infile datalines missover;
input id x y z;
datalines;
123 8
123 . 7
456 . 9
456 . . 2
run;
* for this specific example;
proc summary data=have nway;
class id;
var x y z;
output out=want(drop=_type_ _freq_)
sum=;
run;
proc print data=want;
run;
This solution will not work if there are other conditions such as more than one row with a non-missing value.
data have;
input ID X Y Z ;
cards;
123 . 8 .
123 7 . .
456 . 9 .
456 . . 2
;
options missing=' ';
data want(drop=_:);
retain _x _y _z;
set have;
by id;
if first.id then do;
_x=x; _y=y;_z=z; end;
else do;
x=ifn(x=.,_x,x);
y=ifn(y=.,_y,y);
z=ifn(z=.,_z,z);
end;
if last.id;
run;
proc print;run;
Obs ID X Y Z
1 123 7 8
2 456 9 2
This is really close to what I need. but here's the snag I ran into when I tried your code (basically the value for y isn't coming through in the observation I want to keep (droprec = 0):
have | ||||
SAMPLEID | x | y | z | droprec |
123 | 4 | 1 | ||
123 | 4 | 1 | ||
123 | 1 | |||
123 | 3 | 0 | ||
got | ||||
SAMPLEID | x | y | z | droprec |
123 | 4 | 1 | ||
123 | 4 | 4 | 1 | |
123 | 4 | 1 | ||
123 | 4 | 3 | 0 |
want | ||||
SAMPLEID | x | y | z | droprec |
123 | 4 | 1 | ||
123 | 4 | 4 | 1 | |
123 | 4 | 1 | ||
123 | 4 | 4 | 3 | 0 |
suggestions?
Thanks again for all the prompt replies. Much appreciated folks
Old school, but seems to be what you're after:
data have;
input ID X Y Z ;
cards;
123 . 8 .
123 7 . .
456 . 9 .
456 . . 2
124 . 4 .
124 4 . .
124 . . .
124 . . 3
;
data want;
set have;
by id notsorted;
retain val_x val_y val_z;
if first.id then call missing(val_x, val_y, val_z);
if x ne . then val_x=x;
if y ne . then val_y=y;
if z ne . then val_z=z;
run;
I got the following which is the same as the have dataset:
x | y | z | droprec |
4 | . | . | 1 |
. | 4 | . | 1 |
. | . | . | 1 |
. | . | 3 | 0 |
Look at the val_ variables instead, you can rename them if it makes you happy at the end.
oh wow. I can't believe I missed that. hahaha. thanks so much! this is it!
I really think UPDATE is the way to go, you just have to pre-prepare an empty model
option obs=0 ;
data final_shape ;
set this that the other ; *list all transaction tables;
run ;
option obs= max ;
data view/view=view ;
set this that the other ;
by ID notsorted ;
run ;
data final_data ;
update final_shape view ;
by id notsorted ;
run ;
****** beware I haven't tested this ;
Check again. The results you are describing would represent a bug in the software. UPDATE has been around for decades and it is not likely to have that sort of a bug in it!
Sorry. I don't think my first post work.
try this one:
data have;
input ID X y Z droprec ;
cards;
123 4 . . 1
123 . 4 . 1
123 . . . 1
123 . . 3 0
;
options missing='.';
data want(drop= _:);
retain x y z;
set have;
by id;
if first.id then do;
_x=x; _y=y;_z=z; end;
else do;
x=ifn(x=.,_x,x);
y=ifn(y=.,_y,y);
z=ifn(z=.,_z,z);
end;
if last.id;
run;
proc print;run;
Obs _x _y _z ID
1 4 4 3 123
Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.
Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.