BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
vgrywacheski
Fluorite | Level 6

Hello,

 

I have a string variable called "illness" which contains a varying number of conditions separated by commas:

 

EX what I have:

Illness: "Heart Attack, Dementia, Cancer"

 

I need to create a new variable for each condition with each condition remapped to a new classification.

 

EX what I want:

Ilness1: "I20-I25 Ischaemic heart diseases" (i.e. Heart Attack)

Illness2: "F00-F09 Organic disorders"  (i.e. Dementia)

Illness3: "C00-D48 Neoplasms"  (i.e. Cancer)

 

I was thinking I could do a simple if then statement like the one below:


data x;
set x;
If illness contains "Heart Attack" then illness1="I20-I25 Ischaemic heart diseases";

If illness contains "Dementia" then illness2="F00-F09 Organic disorders";

If illness contains "Cancer" then illness3="C00-D48 Neoplasms";
.....

run;

 

However, this would be very time consuming seeing that the variable contains a varying number of conditions... Any suggestions for a more efficient way to achieve this?

 

Thanks for your insight!

 

 

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
Patrick
Opal | Level 21

@vgrywacheski

And to give you some possible sample code for what @Reeza describes.

data have;
  id=1;
  have_str='Heart  Attack, Dementia, Cancer, something else'; 
run;

proc format;
  value $StandIll
    HEART ATTACK    ='I20-I25 Ischaemic heart diseases" (i.e. Heart Attack)'
    DEMENTIA        ='F00-F09 Organic disorders"  (i.e. Dementia)'
    CANCER          ='C00-D48 Neoplasms"  (i.e. Cancer)'
    ;
quit;

data want(drop=_:);
  set have;
  _cnt=countc(have_str,',');
  do _i=0 to _cnt;
    term=scan(have_str,_i+1,',');
    term=upcase(compbl(strip(term)));
    want_str=put(term,$StandIll.);
    found_flg = (want_str ne term);
    output;
  end;
run;

N.B: If you're after actually working sample code then the best way to motivate forum members to provide you with such code is to provide sample data via a working SAS data step creating such data. It's also highly appreciated if you describe what you've already tried and got stuck with. And of course as you've done it: Describe the desired result.

View solution in original post

5 REPLIES 5
Reeza
Super User

You can use the SCAN() function to separate words if they're comma delimited as in your example.

 

You can also output them out to a 'long format' data set and then do a merge or lookup to get the full codes. 

Use COUNTC to count the number of commas and see how many words you need to parse out.

Patrick
Opal | Level 21

@vgrywacheski

And to give you some possible sample code for what @Reeza describes.

data have;
  id=1;
  have_str='Heart  Attack, Dementia, Cancer, something else'; 
run;

proc format;
  value $StandIll
    HEART ATTACK    ='I20-I25 Ischaemic heart diseases" (i.e. Heart Attack)'
    DEMENTIA        ='F00-F09 Organic disorders"  (i.e. Dementia)'
    CANCER          ='C00-D48 Neoplasms"  (i.e. Cancer)'
    ;
quit;

data want(drop=_:);
  set have;
  _cnt=countc(have_str,',');
  do _i=0 to _cnt;
    term=scan(have_str,_i+1,',');
    term=upcase(compbl(strip(term)));
    want_str=put(term,$StandIll.);
    found_flg = (want_str ne term);
    output;
  end;
run;

N.B: If you're after actually working sample code then the best way to motivate forum members to provide you with such code is to provide sample data via a working SAS data step creating such data. It's also highly appreciated if you describe what you've already tried and got stuck with. And of course as you've done it: Describe the desired result.

vgrywacheski
Fluorite | Level 6

Thank you so much Patrick! This is exactly what I needed 🙂

PGStats
Opal | Level 21

You could use an indexed dataset with key= access

 

 

 

 

data have;
patient = 1;
Illnesses = "Heart Attack, Dementia, Cancer";
output;
patient = 2;
Illnesses = "Heart Attack, Tumor, Booboo";
output;
run;

data d2(index = (illness/unique));
infile datalines dsd;
length label illness $32;
input label &  Illness &;
illness = upcase(illness);
datalines;
I20-I25 Ischaemic heart diseases, Heart Attack
F00-F09 Organic disorders, Dementia
C00-D48 Neoplasms, Cancer
C00-D48 Neoplasms, Tumor
;

data want;
set have;
length illness $32;
array a $32 illness1 - illness10;
do i = 1 to countw(Illnesses, ",");
    illness = upcase(scan(Illnesses, i, ",", "R"));
    set d2 key=illness;
    if _iorc_ = 0 then a{i} = label;
    else a{i} = "*** Unknown ***";
    end;
keep patient illness1-illness10;
run;

 

 

PG
ShiroAmada
Lapis Lazuli | Level 10

%let illness1=%str('HEART ATTACK','CARDIAC');

%let illness2=%str('DEMENTIA','AMNESIA');

%let illness3=%str('CANCER','CIRRHOSIS');

 

data  WANT;

  set HAVE;

  length ILLNESS1 ILLNESS2 ILLNESS3 $30;

  ILLNESS1=ifc(upcase(ILLNESS) in (&illness1),"I20-I25 Ischaemic heart diseases","");

  ILLNESS2=ifc(upcase(ILLNESS) in (&illness2),"F00-F09 Organic disorders","");

  ILLNESS3=ifc(upcase(ILLNESS) in (&illness3),"C00-D48 Neoplasms","");

run;

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 5 replies
  • 1550 views
  • 2 likes
  • 5 in conversation