BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
vgrywacheski
Fluorite | Level 6

Hello,

 

I have a string variable called "illness" which contains a varying number of conditions separated by commas:

 

EX what I have:

Illness: "Heart Attack, Dementia, Cancer"

 

I need to create a new variable for each condition with each condition remapped to a new classification.

 

EX what I want:

Ilness1: "I20-I25 Ischaemic heart diseases" (i.e. Heart Attack)

Illness2: "F00-F09 Organic disorders"  (i.e. Dementia)

Illness3: "C00-D48 Neoplasms"  (i.e. Cancer)

 

I was thinking I could do a simple if then statement like the one below:


data x;
set x;
If illness contains "Heart Attack" then illness1="I20-I25 Ischaemic heart diseases";

If illness contains "Dementia" then illness2="F00-F09 Organic disorders";

If illness contains "Cancer" then illness3="C00-D48 Neoplasms";
.....

run;

 

However, this would be very time consuming seeing that the variable contains a varying number of conditions... Any suggestions for a more efficient way to achieve this?

 

Thanks for your insight!

 

 

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
Patrick
Opal | Level 21

@vgrywacheski

And to give you some possible sample code for what @Reeza describes.

data have;
  id=1;
  have_str='Heart  Attack, Dementia, Cancer, something else'; 
run;

proc format;
  value $StandIll
    HEART ATTACK    ='I20-I25 Ischaemic heart diseases" (i.e. Heart Attack)'
    DEMENTIA        ='F00-F09 Organic disorders"  (i.e. Dementia)'
    CANCER          ='C00-D48 Neoplasms"  (i.e. Cancer)'
    ;
quit;

data want(drop=_:);
  set have;
  _cnt=countc(have_str,',');
  do _i=0 to _cnt;
    term=scan(have_str,_i+1,',');
    term=upcase(compbl(strip(term)));
    want_str=put(term,$StandIll.);
    found_flg = (want_str ne term);
    output;
  end;
run;

N.B: If you're after actually working sample code then the best way to motivate forum members to provide you with such code is to provide sample data via a working SAS data step creating such data. It's also highly appreciated if you describe what you've already tried and got stuck with. And of course as you've done it: Describe the desired result.

View solution in original post

5 REPLIES 5
Reeza
Super User

You can use the SCAN() function to separate words if they're comma delimited as in your example.

 

You can also output them out to a 'long format' data set and then do a merge or lookup to get the full codes. 

Use COUNTC to count the number of commas and see how many words you need to parse out.

Patrick
Opal | Level 21

@vgrywacheski

And to give you some possible sample code for what @Reeza describes.

data have;
  id=1;
  have_str='Heart  Attack, Dementia, Cancer, something else'; 
run;

proc format;
  value $StandIll
    HEART ATTACK    ='I20-I25 Ischaemic heart diseases" (i.e. Heart Attack)'
    DEMENTIA        ='F00-F09 Organic disorders"  (i.e. Dementia)'
    CANCER          ='C00-D48 Neoplasms"  (i.e. Cancer)'
    ;
quit;

data want(drop=_:);
  set have;
  _cnt=countc(have_str,',');
  do _i=0 to _cnt;
    term=scan(have_str,_i+1,',');
    term=upcase(compbl(strip(term)));
    want_str=put(term,$StandIll.);
    found_flg = (want_str ne term);
    output;
  end;
run;

N.B: If you're after actually working sample code then the best way to motivate forum members to provide you with such code is to provide sample data via a working SAS data step creating such data. It's also highly appreciated if you describe what you've already tried and got stuck with. And of course as you've done it: Describe the desired result.

vgrywacheski
Fluorite | Level 6

Thank you so much Patrick! This is exactly what I needed 🙂

PGStats
Opal | Level 21

You could use an indexed dataset with key= access

 

 

 

 

data have;
patient = 1;
Illnesses = "Heart Attack, Dementia, Cancer";
output;
patient = 2;
Illnesses = "Heart Attack, Tumor, Booboo";
output;
run;

data d2(index = (illness/unique));
infile datalines dsd;
length label illness $32;
input label &  Illness &;
illness = upcase(illness);
datalines;
I20-I25 Ischaemic heart diseases, Heart Attack
F00-F09 Organic disorders, Dementia
C00-D48 Neoplasms, Cancer
C00-D48 Neoplasms, Tumor
;

data want;
set have;
length illness $32;
array a $32 illness1 - illness10;
do i = 1 to countw(Illnesses, ",");
    illness = upcase(scan(Illnesses, i, ",", "R"));
    set d2 key=illness;
    if _iorc_ = 0 then a{i} = label;
    else a{i} = "*** Unknown ***";
    end;
keep patient illness1-illness10;
run;

 

 

PG
ShiroAmada
Lapis Lazuli | Level 10

%let illness1=%str('HEART ATTACK','CARDIAC');

%let illness2=%str('DEMENTIA','AMNESIA');

%let illness3=%str('CANCER','CIRRHOSIS');

 

data  WANT;

  set HAVE;

  length ILLNESS1 ILLNESS2 ILLNESS3 $30;

  ILLNESS1=ifc(upcase(ILLNESS) in (&illness1),"I20-I25 Ischaemic heart diseases","");

  ILLNESS2=ifc(upcase(ILLNESS) in (&illness2),"F00-F09 Organic disorders","");

  ILLNESS3=ifc(upcase(ILLNESS) in (&illness3),"C00-D48 Neoplasms","");

run;

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 5 replies
  • 597 views
  • 2 likes
  • 5 in conversation