Hi All,
I have data as below.
Subject Disease
X1 Fever^Acidity^Vomiting
X2 Chills^Nausea
I need output as below.
Subject Disease1 Disease2 Disease3
X1 Fever Acidity Vomiting
X2 Chills Nausea
We may have more data in Disease column sepearated by ^.I am looking for macro which can count number of ^ in string and create new variables in dataset for each of disease .
To start with, i have used countc function for occurence of ^ but after that i am not getting any way to move ahead.
Can anyone please suggest way forward.
Regards,
Rajesh
Well, two ways - simplest I find:
data inter;
set have;
do i=1 to countw(disease,"^");
d=scan(disease,i,"^");
output;
end;
run;
proc transpose data=inter out=want prefix=disease;
by subject;
var d;
run;
You can also find max(count of ^) then use arrays, but thats more code.
Well, two ways - simplest I find:
data inter;
set have;
do i=1 to countw(disease,"^");
d=scan(disease,i,"^");
output;
end;
run;
proc transpose data=inter out=want prefix=disease;
by subject;
var d;
run;
You can also find max(count of ^) then use arrays, but thats more code.
The "easy" approach might be to just set a large number of variables assuming you'll never exceed that number.
data have;
infile datalines missover;
informat subject $5. disease $100.;
input Subject Disease ;
datalines;
X1 Fever^Acidity^Vomiting
X2 Chills^Nausea
X3 Fever^Acidity^SomethingMuchlonger
;
run;
data want;
set have;
array d $30 disease1-disease25;
do i=1 to (countw(disease,'^'));
d[i] = scan(disease,i,'^');
end;
drop i;
run;
A possibly more critical issue is actually how long to make your individul "disease" variables (which look more like symptoms). The array I used set them to 30 characters.
Execise for the interested reader is to capture the maximum number actually used and then remove unused (hint: retain max i, call symputx and a drop statement in another datastep.
Hi, here's another idea. I made the maximum number of diseses five (D1-D5) and the maimum lenght of any disease twenty. You could make those values anythimg you like.
data x;
infile datalines dsd dlm='^' missover;
input @;
substr(_infile_,find(_infile_,' '),1) = '^';
input id :$5. (d1-d5) (:$20.);
count = countc(_infile_,'^');
datalines;
X1 Fever^Acidity^Vomiting
X2 Chills^Nausea
X3 Bad Vibes^Bummed^Just Plain Tired
X1234 Upset Stomach
;
DATA SET: x
id d1 d2 d3 d4 d5 count
X1 Fever Acidity Vomiting 3
X2 Chills Nausea 2
X3 Bad Vibes Bummed Just Plain Tired 3
X1234 Upset Stomach 1
April 27 – 30 | Gaylord Texan | Grapevine, Texas
Walk in ready to learn. Walk out ready to deliver. This is the data and AI conference you can't afford to miss.
Register now and lock in 2025 pricing—just $495!
Still thinking about your presentation idea? The submission deadline has been extended to Friday, Nov. 14, at 11:59 p.m. ET.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.