Hi All,
I have data as below.
Subject Disease
X1 Fever^Acidity^Vomiting
X2 Chills^Nausea
I need output as below.
Subject Disease1 Disease2 Disease3
X1 Fever Acidity Vomiting
X2 Chills Nausea
We may have more data in Disease column sepearated by ^.I am looking for macro which can count number of ^ in string and create new variables in dataset for each of disease .
To start with, i have used countc function for occurence of ^ but after that i am not getting any way to move ahead.
Can anyone please suggest way forward.
Regards,
Rajesh
Well, two ways - simplest I find:
data inter; set have; do i=1 to countw(disease,"^"); d=scan(disease,i,"^"); output; end; run; proc transpose data=inter out=want prefix=disease; by subject; var d; run;
You can also find max(count of ^) then use arrays, but thats more code.
Well, two ways - simplest I find:
data inter; set have; do i=1 to countw(disease,"^"); d=scan(disease,i,"^"); output; end; run; proc transpose data=inter out=want prefix=disease; by subject; var d; run;
You can also find max(count of ^) then use arrays, but thats more code.
The "easy" approach might be to just set a large number of variables assuming you'll never exceed that number.
data have; infile datalines missover; informat subject $5. disease $100.; input Subject Disease ; datalines; X1 Fever^Acidity^Vomiting X2 Chills^Nausea X3 Fever^Acidity^SomethingMuchlonger ; run; data want; set have; array d $30 disease1-disease25; do i=1 to (countw(disease,'^')); d[i] = scan(disease,i,'^'); end; drop i; run;
A possibly more critical issue is actually how long to make your individul "disease" variables (which look more like symptoms). The array I used set them to 30 characters.
Execise for the interested reader is to capture the maximum number actually used and then remove unused (hint: retain max i, call symputx and a drop statement in another datastep.
Hi, here's another idea. I made the maximum number of diseses five (D1-D5) and the maimum lenght of any disease twenty. You could make those values anythimg you like.
data x;
infile datalines dsd dlm='^' missover;
input @;
substr(_infile_,find(_infile_,' '),1) = '^';
input id :$5. (d1-d5) (:$20.);
count = countc(_infile_,'^');
datalines;
X1 Fever^Acidity^Vomiting
X2 Chills^Nausea
X3 Bad Vibes^Bummed^Just Plain Tired
X1234 Upset Stomach
;
DATA SET: x
id d1 d2 d3 d4 d5 count
X1 Fever Acidity Vomiting 3
X2 Chills Nausea 2
X3 Bad Vibes Bummed Just Plain Tired 3
X1234 Upset Stomach 1
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.