I have transposed data that I'd like to recode into 5 separate columns. The issue is that the data within the columns are unordered. So I do end up with binary values 0/1, but they correspond to the wrong newly created variable. I would like for the field in the first columns (Col1-Col5) to correctly agree with the newly created variables (DX thru UN).
ID Col1 Col2 Col3 Col4 Col5
1 MH CARE SP CARE PR CARE
2 UN CARE DX CARE PR CARE UN CARE MH CARE
3 MH CARE UN CARE
4 . . . . .
5 UN CARE
want; set have;
array sc{5}$ col1-col5;
array pc{5} DX SP PR MH UN ;
do i = 1 to 5;
if sc(i)='DX CARE' then pc(i)=1; else pc(i)=0;
if sc(i)='SP CARE' then pc(i)=1; else pc(i)=0;
if sc(i)='PR CARE' then pc(i)=1; else pc(i)=0;
if sc(i)='MH CARE' then pc(i)=1; else pc(i)=0;
if sc(i)='UN CARE' then pc(i)=1; else pc(i)=0;
end;
drop i;
run;
My results which aren't correct:
ID DX SP PR MH UN
1 1 1 1 0 0
2 1 1 1 1 1
3 1 1 0 0 0
4 . . . . .
5 1 0 0 0 0
I'd like the output below:
ID DX SP PR MH UN
1 0 1 1 1 0
2 1 1 1 1 1
3 0 0 0 1 1
4 . . . . .
5 0 0 0 0 1
Another approach plus an actual data set to work with:
data have; infile datalines dlm='|' truncover; informat id best5. Col1 Col2 Col3 Col4 Col5 $10.; input ID Col1 Col2 Col3 Col4 Col5; datalines; 1|MH CARE|SP CARE|PR CARE 2|UN CARE|DX CARE|PR CARE|UN CARE|MH CARE 3|MH CARE|UN CARE 4 5|UN CARE ; run; data want; set have; array sc{5}$ col1-col5; array pc{5} DX SP PR MH UN ; array t{5} $10 _temporary_ ('DX CARE','SP CARE','PR CARE','MH CARE','UN CARE'); do i= 1 to dim(pc); pc[i]=0; end; do i= 1 to dim(sc); if not missing(sc[i]) then pc[whichc(sc[i],of t(*))]=1; end; keep id dx--un; run;
Think about yoru ELSE conditions.
If it's not found it's resetting it to 0, even if you've previously set it to 1...
I think this is what you want instead:
if sc(i) in ('DX CARE', 'SP CARE', 'PR CARE', 'MH CARE', 'UN CARE') then pc(i)=1;
else pc(i)=0;
@jsandsk wrote:
I have transposed data that I'd like to recode into 5 separate columns. The issue is that the data within the columns are unordered. So I do end up with binary values 0/1, but they correspond to the wrong newly created variable. I would like for the field in the first columns (Col1-Col5) to correctly agree with the newly created variables (DX thru UN).
ID Col1 Col2 Col3 Col4 Col5
1 MH CARE SP CARE PR CARE
2 UN CARE DX CARE PR CARE UN CARE MH CARE
3 MH CARE UN CARE
4 . . . . .
5 UN CARE
want; set have;
array sc{5}$ col1-col5;
array pc{5} DX SP PR MH UN ;
do i = 1 to 5;
if sc(i)='DX CARE' then pc(i)=1; else pc(i)=0;
if sc(i)='SP CARE' then pc(i)=1; else pc(i)=0;
if sc(i)='PR CARE' then pc(i)=1; else pc(i)=0;
if sc(i)='MH CARE' then pc(i)=1; else pc(i)=0;
if sc(i)='UN CARE' then pc(i)=1; else pc(i)=0;
end;
drop i;
run;
My results which aren't correct:
ID DX SP PR MH UN
1 1 1 1 0 0
2 1 1 1 1 1
3 1 1 0 0 0
4 . . . . .
5 1 0 0 0 0
I'd like the output below:
ID DX SP PR MH UN
1 0 1 1 1 0
2 1 1 1 1 1
3 0 0 0 1 1
4 . . . . .
5 0 0 0 0 1
Post your exact code and log, and if you can provide some sample data as text that would help to test it. See instructions below, only need a few rows.
Unsure what you mean because I did transpose the data. Each record contains a different amount of the values, some don't contain any. So I still need to get the correct corresponding values for the new variables.
Another approach plus an actual data set to work with:
data have; infile datalines dlm='|' truncover; informat id best5. Col1 Col2 Col3 Col4 Col5 $10.; input ID Col1 Col2 Col3 Col4 Col5; datalines; 1|MH CARE|SP CARE|PR CARE 2|UN CARE|DX CARE|PR CARE|UN CARE|MH CARE 3|MH CARE|UN CARE 4 5|UN CARE ; run; data want; set have; array sc{5}$ col1-col5; array pc{5} DX SP PR MH UN ; array t{5} $10 _temporary_ ('DX CARE','SP CARE','PR CARE','MH CARE','UN CARE'); do i= 1 to dim(pc); pc[i]=0; end; do i= 1 to dim(sc); if not missing(sc[i]) then pc[whichc(sc[i],of t(*))]=1; end; keep id dx--un; run;
something like this?
data want;
set have;
array sc{5}$ col1-col5;
array pc{5} DX SP PR MH UN ;
do i = 1 to 5;
if vname(pc(i)) in sc then pc(i)=1; else pc(i)=0;
end;
drop i;
run;
what exactly does 'vname' reference? The name of the new variable?
yes The name of the new variable in your second array
I was feeling lazy to test but here you go. I tested with your sample:
data have;
infile datalines truncover;
input ID Col1 & $10. Col2 & $10. Col3 & $10. Col4 & $10. Col5 & $10. ;
datalines;
1 MH CARE SP CARE PR CARE
2 UN CARE DX CARE PR CARE UN CARE MH CARE
3 MH CARE UN CARE
4 . . . . .
5 UN CARE
;
data want;
set have;
array sc{5}$ col1-col5;
array pc{5} DX SP PR MH UN ;
do i = 1 to 5;
if catx(' ',vname(pc(i)),'CARE') in sc then pc(i)=1; else pc(i)=0;
end;
drop i;
run;
This works, doesn't matter what the values are or how many. It removes duplicates as well.
data have;
infile datalines dlm='|' truncover;
informat id best5. Col1 Col2 Col3 Col4 Col5 $10.;
input ID Col1 Col2 Col3 Col4 Col5;
datalines;
1|MH CARE|SP CARE|PR CARE
2|UN CARE|DX CARE|PR CARE|UN CARE|MH CARE
3|MH CARE|UN CARE
4
5|UN CARE
;
run;
proc transpose data=have out=_temp;
by ID;
var col1-col5;
run;
data _temp2;
set _temp;
Value = 1;
Prefix = scan(col1, 1);
run;
*remove duplicates;
proc sort data=_temp2 nodupkey;
by id prefix;
run;
proc transpose data=_temp2 out=want;
by ID;
ID PREFIX;
Var Value;
run;
Why does record 2, get SP=1 when there's no SP CARE?
Or should Column 4 be SP CARE?
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.