Hi All,
I have data as below.
Subject Disease
X1 Fever^Acidity^Vomiting
X2 Chills^Nausea
I need output as below.
Subject Disease1 Disease2 Disease3
X1 Fever Acidity Vomiting
X2 Chills Nausea
We may have more data in Disease column sepearated by ^.I am looking for macro which can count number of ^ in string and create new variables in dataset for each of disease .
To start with, i have used countc function for occurence of ^ but after that i am not getting any way to move ahead.
Can anyone please suggest way forward.
Regards,
Rajesh
Well, two ways - simplest I find:
data inter; set have; do i=1 to countw(disease,"^"); d=scan(disease,i,"^"); output; end; run; proc transpose data=inter out=want prefix=disease; by subject; var d; run;
You can also find max(count of ^) then use arrays, but thats more code.
Well, two ways - simplest I find:
data inter; set have; do i=1 to countw(disease,"^"); d=scan(disease,i,"^"); output; end; run; proc transpose data=inter out=want prefix=disease; by subject; var d; run;
You can also find max(count of ^) then use arrays, but thats more code.
The "easy" approach might be to just set a large number of variables assuming you'll never exceed that number.
data have; infile datalines missover; informat subject $5. disease $100.; input Subject Disease ; datalines; X1 Fever^Acidity^Vomiting X2 Chills^Nausea X3 Fever^Acidity^SomethingMuchlonger ; run; data want; set have; array d $30 disease1-disease25; do i=1 to (countw(disease,'^')); d[i] = scan(disease,i,'^'); end; drop i; run;
A possibly more critical issue is actually how long to make your individul "disease" variables (which look more like symptoms). The array I used set them to 30 characters.
Execise for the interested reader is to capture the maximum number actually used and then remove unused (hint: retain max i, call symputx and a drop statement in another datastep.
Hi, here's another idea. I made the maximum number of diseses five (D1-D5) and the maimum lenght of any disease twenty. You could make those values anythimg you like.
data x;
infile datalines dsd dlm='^' missover;
input @;
substr(_infile_,find(_infile_,' '),1) = '^';
input id :$5. (d1-d5) (:$20.);
count = countc(_infile_,'^');
datalines;
X1 Fever^Acidity^Vomiting
X2 Chills^Nausea
X3 Bad Vibes^Bummed^Just Plain Tired
X1234 Upset Stomach
;
DATA SET: x
id d1 d2 d3 d4 d5 count
X1 Fever Acidity Vomiting 3
X2 Chills Nausea 2
X3 Bad Vibes Bummed Just Plain Tired 3
X1234 Upset Stomach 1
Don't miss out on SAS Innovate - Register now for the FREE Livestream!
Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.