I have two tables: "bweight_old" and "bweight_new". The only difference between the two is that in "bweight_new", I set outliers to null. How can I create table "bweight_want" that lists only the variables that changed, specifying the PATIENT_ID, the variable name of the outlier, and the value of the outlier in "bweight_old".
data bweight_old;
format patient_id;
set sashelp.bweight;
patient_id+1;
run;
data bweight_new;
set bweight_old;
if patient_id=25 then weight=. and momwtgain=.;
if patient_id=2020 then cigsperday=.;
if patient_id=2029 then momwtgain=.;
run;
data bweight_want;
format patient_id variable originalvalue;
length variable $20.;
input patient_id variable $ originalvalue ;
cards;
25 weight 4050
25 momwtgain -20
2020 cigsperday 20
2090 momwtgain -2
;
run;
This is what I started with, but I'm not sure where to go from here.
proc compare base = bweight_old compare = bweight_new out = bweight_want
outnoequal outbase outcomp noprint;
id patient_id;
run;
You can try the data step as well to know the difference
proc sort data=bweight_old;
by _all_;
run;
proc sort data=bweight_new;
by _all_;
run;
data bweight_want;
merge bweight_old(in=a) bweight_new(in=b);
by _all_;
if a and not b;
run;
alternatively with proc compare also we can identify the variables that differ between the datasets. You can output the want dataset from proc compare and check for the variables with values less than zero which also include missing values then we get the differing records and which ever the variables that have missing values or values below zero could be considered as differing.
proc compare base = bweight_old compare = bweight_new out=want listall;
id patient_id;
run;
data want2;
set want;
array vars(*) weight -- momedlevel;
do i = 1 to dim(vars);
if vars(i)<0 then output;
end;
run;
Thanks, Jag. I'm trying to find a way to present the results of proc compare statement you provided in a single table. The array works well, but since I have hundreds of variables, I only want to see the variables that changed.
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
ANOVA, or Analysis Of Variance, is used to compare the averages or means of two or more populations to better understand how they differ. Watch this tutorial for more.
Find more tutorials on the SAS Users YouTube channel.