Hello all,
I'm pulling table from SQL, say I have a column named "illness":
ID | illness ----|--------- 1 | fever,headache 2 | headache 3 | cough,fever 4 | cough,fever,headache
In total there are, 4 possible categories (fever, headache, cough, other), how can I go about separating it into 4 columns with 0 or 1 values
ID | fever | headache | cough | other ----|------|----------|-------|--------- 1 | 1 | 1 | 0 | 0 2 | 0 | 1 | 0 | 0 3 | 1 | 0 | 1 | 0 4 | 1 | 1 | 1 | 0
Thank you
Do you need the result as dataset or report?
Here is one way to create a dataset:
data have;
length id 8 illness $ 100;
input id illness;
datalines;
1 fever,headache
2 headache
3 cough,fever
4 cough,fever,headache
;
run;
data want;
set have;
length fever headache cough other 8;
array ill[1:4] fever headache cough other;
retain illnessList "fever headache cough other";
do i = 1 to dim(ill);
ill[i] = 0;
end;
do i = 1 to countw(illness, ',');
p = findw(illnessList, scan(illness, i, ','), ' ', 'itse');
ill[p] = ill[p] + 1;
end;
drop i p illness illnessList;
run;
proc print;run;
Do you need the result as dataset or report?
Here is one way to create a dataset:
data have;
length id 8 illness $ 100;
input id illness;
datalines;
1 fever,headache
2 headache
3 cough,fever
4 cough,fever,headache
;
run;
data want;
set have;
length fever headache cough other 8;
array ill[1:4] fever headache cough other;
retain illnessList "fever headache cough other";
do i = 1 to dim(ill);
ill[i] = 0;
end;
do i = 1 to countw(illness, ',');
p = findw(illnessList, scan(illness, i, ','), ' ', 'itse');
ill[p] = ill[p] + 1;
end;
drop i p illness illnessList;
run;
proc print;run;
Hi thank you this is really helpful.
Actually in my data, all the illnesses are coded as numbers. E.g cough=1, fever=2, headache=3, other=11.
So with findw() function, will it pick up 1, even though it's actually 11?
Like: if person 5 has 'other'. Will the findw() function read it as 'cough' and 'other'?
@JackyK wrote:
Actually in my data, all the illnesses are coded as numbers. E.g cough=1, fever=2, headache=3, other=11.So with findw() function, will it pick up 1, even though it's actually 11?
Like: if person 5 has 'other'. Will the findw() function read it as 'cough' and 'other'?
That sounds like your actual source data looks quite different from what you've posted.
Can you please provide a SAS data step which creates sample source data as close to what you really have? And then show us the desired result using this sample data.
** let 1=cough, 2=fever, 3=headache, 11=other
** patient #3 has headache and fever;
data have;
input ID illness$;
datalines;
1 1,2
2 2
3 3,11
4 1,2,3,11
;
run;
My follow up question was would the countw() function count patient #3 as cough,headache,other instead of just headache,other.
Because there is a '1' in '11'. But I just tried the code and it works perfectly!
Thank you
One way to go:
data have;
infile datalines dlm='|' truncover;
input id $ illness $30.;
datalines;
1|fever,headache
2|headache
3|cough,fever
4|cough,fever,headache
4|rash
;
proc sql;
create table want as
select
id,
find(illness,'fever')>0 as fever,
find(illness,'headache')>0 as headache,
find(illness,'cough')>0 as cough,
max(calculated fever, calculated headache, calculated cough)=0 as other
from have
;
quit;
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Need to connect to databases in SAS Viya? SAS’ David Ghan shows you two methods – via SAS/ACCESS LIBNAME and SAS Data Connector SASLIBS – in this video.
Find more tutorials on the SAS Users YouTube channel.