Hi all,
I am working on conversion of a table with many missing values that I need to group to one.
each id has for each variable either missing or a filled in value. so only 2 options, not more.
table looks like this
have
ID FLG1 FLG2 FLG3 FLG4 FLG5 FLG6 FLG7 FLG8 FLG9 1 1 X 1 X 2 X 2 X 2 X 2 SEG1 3 SEG4 3 X
want
ID FLG1 FLG2 FLG3 FLG4 FLG5 FLG6 FLG7 FLG8 FLG9 1 X X 2 X X X SEG1 3 X SEG4
I started with using a dummy value to copy the value to other rows using retain,
but it seems too tedious, since I have to do it for each variable (even tough I could put in a macro).
%macro row_copy(data_in,data_out,var1,var2); %sort(&data_in,&var1 descending &var2); data &data_out; set &data_in; by &var1; retain dummyvar; if first.&var1 then dummyvar = &var2; if missing(&var2) then &var2 = dummyvar; drop dummyvar; run; %mend row_copy;
than I would repeat the macro for each value,
for instance if I would to do it for this table
%row_copy(have,want1,ID,flg1); %row_copy(want2,want1,ID,flg2);
etc...and repeat for all flags, this gives a table still with duplicate rows by ID, but no more missing values.
than I can just end with a group by to get the final result.
but I feel like there is an easier way that I am missing (in one step and not 10)
Ok, how about the old double transpose method?
data fake;
infile cards truncover dlm=',' dsd;
input id (flg1-flg4)($) ;
cards;
1,X
1, ,X
1, , ,SEG1
2, ,X
2, , , ,FROG
;
proc transpose data=fake out=fake2;
by id;
var flg1-flg4;
run;
data want_transposed;
set fake2;
final=coalescec(of col:);
run;
proc transpose data=want_transposed out=want;
by id;
var final;
run;
You would make your programming a huge amount simpler if instead of X you use the number (not character) 1, and if instead of SEG1 use the number (not character) 1, and if instead of SEG4, you use the number (not character) 4.
Then, the coding is very simple
proc summary data=have nway;
class id;
var flg1-flg9;
output out=want max=;
run;
So, convert your X, SEG1 and SEG4 to numeric values.
thx for the suggestion, but unfortunately, half of them are X, the others not the same string wise. you could have values like PINK, unknown, blue, outdated,...
I would have to make a huge referential table, and also the values tend to change as well (I need to run this routine daily), so I would need to change everything when a new value of a certain flag is released.
Also the names aren't really flg1 etc, but more like f.e. segment, target, scope,...so i'd need to convert that as well.
Ok, how about the old double transpose method?
data fake;
infile cards truncover dlm=',' dsd;
input id (flg1-flg4)($) ;
cards;
1,X
1, ,X
1, , ,SEG1
2, ,X
2, , , ,FROG
;
proc transpose data=fake out=fake2;
by id;
var flg1-flg4;
run;
data want_transposed;
set fake2;
final=coalescec(of col:);
run;
proc transpose data=want_transposed out=want;
by id;
var final;
run;
Did I miss something ?
data fake;
infile cards truncover dlm=',' dsd;
input id (flg1-flg4)($) ;
cards;
1,X
1, ,X
1, , ,SEG1
2, ,X
2, , , ,FROG
;
data want;
update fake(obs=0) fake;
by id;
run;
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.