Solved
Contributor
Posts: 63

# Create variables as per separators in string

Hi All,

I have data as below.

Subject           Disease

X1                  Fever^Acidity^Vomiting

X2                  Chills^Nausea

I need output as below.

Subject    Disease1     Disease2      Disease3

X1            Fever           Acidity          Vomiting

X2            Chills            Nausea

We may have more data in Disease column sepearated by ^.I am looking for macro which can count number of ^ in string and create new variables in dataset for each of disease .

To start with, i have used countc function for occurence of ^ but after that i am not getting any way to move ahead.

Can anyone please suggest way forward.

Regards,

Rajesh

Accepted Solutions
Solution
‎08-31-2017 10:18 AM
Super User
Posts: 9,599

## Re: Create variables as per separators in string

Well, two ways - simplest I find:

```data inter;
set have;
do i=1 to countw(disease,"^");
d=scan(disease,i,"^");
output;
end;
run;
proc transpose data=inter out=want prefix=disease;
by subject;
var d;
run;```

You can also find max(count of ^) then use arrays, but thats more code.

All Replies
Solution
‎08-31-2017 10:18 AM
Super User
Posts: 9,599

## Re: Create variables as per separators in string

Well, two ways - simplest I find:

```data inter;
set have;
do i=1 to countw(disease,"^");
d=scan(disease,i,"^");
output;
end;
run;
proc transpose data=inter out=want prefix=disease;
by subject;
var d;
run;```

You can also find max(count of ^) then use arrays, but thats more code.

Super User
Posts: 13,521

## Re: Create variables as per separators in string

The "easy" approach might be to just set a large number of variables assuming you'll never exceed that number.

```data have;
infile datalines missover;
informat subject \$5. disease \$100.;
input Subject           Disease ;
datalines;
X1 Fever^Acidity^Vomiting
X2 Chills^Nausea
X3 Fever^Acidity^SomethingMuchlonger
;
run;

data want;
set have;
array d \$30 disease1-disease25;
do i=1 to (countw(disease,'^'));
d[i] = scan(disease,i,'^');
end;
drop i;
run;
```

A possibly more critical issue is actually how long to make your individul "disease" variables (which look more like symptoms). The array I used set them to 30 characters.

Execise for the interested reader is to capture the maximum number actually used and then remove unused (hint: retain max i, call symputx and a drop statement in another datastep.

Valued Guide
Posts: 765

## Re: Create variables as per separators in string

Hi, here's another idea. I made the maximum number of diseses five (D1-D5) and the maimum lenght of any disease twenty. You could make those values anythimg you like.

data x;
infile datalines dsd dlm='^' missover;
input @;
substr(_infile_,find(_infile_,' '),1) = '^';
input id :\$5. (d1-d5) (:\$20.);
count = countc(_infile_,'^');
datalines;
X1 Fever^Acidity^Vomiting
X2 Chills^Nausea
X1234 Upset Stomach
;

``````DATA SET: x

id       d1                 d2       d3                  d4    d5    count

X1       Fever            Acidity    Vomiting                          3
X2       Chills           Nausea                                       2
X3       Bad Vibes        Bummed     Just Plain Tired                  3
X1234    Upset Stomach                                                 1
``````
☑ This topic is solved.