Hello All,
Hope all is well. This is my first post since I've reviewed several others but unfortunately was not able to find the right solution to fit my circumstances. I have a data set that requires me to provide: (1) one row per patient id, (2) collapsed (corrected) values from subsequent rows into the earliest patient record. Please see below.
EXAMPLE
Datum id DayFromTxBase CorrectionBas Var1 Var2 Var3 Var4 Var5
1/27/2016 22:06 Z029 -2 ABC 1 2 3 UVW
6/23/2017 15:36 Z029 -2 1 DEF . 2 5 XYZ
9/27/2019 10:37 Z029 -2 1 GHI 3 . 4 XYZ
DESIRED RESULT
Datum id DayFromTxBase CorrectionBas Var1 Var2 Var3 Var4 Var5
1/27/2016 22:06 Z029 -2 GHI 3 2 4 XYZ
I've inherited some code from a colleague that runs an import macro which (1) checks the data set to see if there are corrections, and (2) if so, then make an original and a corrections data set, and (3) update the original values with corrected values. Please see below.
%macro import2b(site=want);
proc import out=want
datafile="\\.."
dbms=xlsx replace ;
sheet= " ";
getnames=yes;
run;
%symdel correct;
/* checks the dataset if there are any corrections*/
proc sql;
create table want as
select max(CORRECTIONBAS) as max_c,CORRECTIONBAS,*
from want ;
QUIT;
data want ;
%let correct=0;
set want;
call symput('correct',max_c);
run;
/* if there are corrections then make a original dataset and a corrections dataset*/
%if &correct=1 %then
%do;
DATA ORIGINAL CORRECTION;
SET want;
if CORRECTIONBAS=. then
output ORIGINAL;
if CORRECTIONBAS=1 then
output CORRECTION;
RUN;
proc sort data=original;
by ID DayFromTxBase;
run;
proc sort data=correction;
by ID DayFromTxBase;
run;
/* Update the original values (in original data)with corrected values (in corrected dataset)*/
DATA want;
UPDATE ORIGINAL CORRECTION;
BY ID DayFromTxBase;
RUN;
proc datasets;
delete ORIGINAL CORRECTION;
run;
%end;
%mend;
import2b (site =want );
The long story short here is that this current program works for some patient ids and does not work (returns more than record) for other patient ids. Also, we've also advised to remove the DayFromTxBase column altogether as this data has been deemed to be unreliable, and thus we will likely devise a solution that sorts the tables by Datum. We've been wracking our brains on trying to figure this out, so hopefully there's someone out there who can help resolve this issue. Thank you.
Your logic is a little bit incomplete. It looks like what you want is:
The value from the first row for Datum, id, DayFromTxBase, CorrectionBas
The last non-missing value for Var1 through Var5
Is this correct?
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9.
Early bird rate extended! Save $200 when you sign up by March 31.
Need to connect to databases in SAS Viya? SAS’ David Ghan shows you two methods – via SAS/ACCESS LIBNAME and SAS Data Connector SASLIBS – in this video.
Find more tutorials on the SAS Users YouTube channel.