BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
vgrywacheski
Fluorite | Level 6

Hello,

 

I have a string variable called "illness" which contains a varying number of conditions separated by commas:

 

EX what I have:

Illness: "Heart Attack, Dementia, Cancer"

 

I need to create a new variable for each condition with each condition remapped to a new classification.

 

EX what I want:

Ilness1: "I20-I25 Ischaemic heart diseases" (i.e. Heart Attack)

Illness2: "F00-F09 Organic disorders"  (i.e. Dementia)

Illness3: "C00-D48 Neoplasms"  (i.e. Cancer)

 

I was thinking I could do a simple if then statement like the one below:


data x;
set x;
If illness contains "Heart Attack" then illness1="I20-I25 Ischaemic heart diseases";

If illness contains "Dementia" then illness2="F00-F09 Organic disorders";

If illness contains "Cancer" then illness3="C00-D48 Neoplasms";
.....

run;

 

However, this would be very time consuming seeing that the variable contains a varying number of conditions... Any suggestions for a more efficient way to achieve this?

 

Thanks for your insight!

 

 

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
Patrick
Opal | Level 21

@vgrywacheski

And to give you some possible sample code for what @Reeza describes.

data have;
  id=1;
  have_str='Heart  Attack, Dementia, Cancer, something else'; 
run;

proc format;
  value $StandIll
    HEART ATTACK    ='I20-I25 Ischaemic heart diseases" (i.e. Heart Attack)'
    DEMENTIA        ='F00-F09 Organic disorders"  (i.e. Dementia)'
    CANCER          ='C00-D48 Neoplasms"  (i.e. Cancer)'
    ;
quit;

data want(drop=_:);
  set have;
  _cnt=countc(have_str,',');
  do _i=0 to _cnt;
    term=scan(have_str,_i+1,',');
    term=upcase(compbl(strip(term)));
    want_str=put(term,$StandIll.);
    found_flg = (want_str ne term);
    output;
  end;
run;

N.B: If you're after actually working sample code then the best way to motivate forum members to provide you with such code is to provide sample data via a working SAS data step creating such data. It's also highly appreciated if you describe what you've already tried and got stuck with. And of course as you've done it: Describe the desired result.

View solution in original post

5 REPLIES 5
Reeza
Super User

You can use the SCAN() function to separate words if they're comma delimited as in your example.

 

You can also output them out to a 'long format' data set and then do a merge or lookup to get the full codes. 

Use COUNTC to count the number of commas and see how many words you need to parse out.

Patrick
Opal | Level 21

@vgrywacheski

And to give you some possible sample code for what @Reeza describes.

data have;
  id=1;
  have_str='Heart  Attack, Dementia, Cancer, something else'; 
run;

proc format;
  value $StandIll
    HEART ATTACK    ='I20-I25 Ischaemic heart diseases" (i.e. Heart Attack)'
    DEMENTIA        ='F00-F09 Organic disorders"  (i.e. Dementia)'
    CANCER          ='C00-D48 Neoplasms"  (i.e. Cancer)'
    ;
quit;

data want(drop=_:);
  set have;
  _cnt=countc(have_str,',');
  do _i=0 to _cnt;
    term=scan(have_str,_i+1,',');
    term=upcase(compbl(strip(term)));
    want_str=put(term,$StandIll.);
    found_flg = (want_str ne term);
    output;
  end;
run;

N.B: If you're after actually working sample code then the best way to motivate forum members to provide you with such code is to provide sample data via a working SAS data step creating such data. It's also highly appreciated if you describe what you've already tried and got stuck with. And of course as you've done it: Describe the desired result.

vgrywacheski
Fluorite | Level 6

Thank you so much Patrick! This is exactly what I needed 🙂

PGStats
Opal | Level 21

You could use an indexed dataset with key= access

 

 

 

 

data have;
patient = 1;
Illnesses = "Heart Attack, Dementia, Cancer";
output;
patient = 2;
Illnesses = "Heart Attack, Tumor, Booboo";
output;
run;

data d2(index = (illness/unique));
infile datalines dsd;
length label illness $32;
input label &  Illness &;
illness = upcase(illness);
datalines;
I20-I25 Ischaemic heart diseases, Heart Attack
F00-F09 Organic disorders, Dementia
C00-D48 Neoplasms, Cancer
C00-D48 Neoplasms, Tumor
;

data want;
set have;
length illness $32;
array a $32 illness1 - illness10;
do i = 1 to countw(Illnesses, ",");
    illness = upcase(scan(Illnesses, i, ",", "R"));
    set d2 key=illness;
    if _iorc_ = 0 then a{i} = label;
    else a{i} = "*** Unknown ***";
    end;
keep patient illness1-illness10;
run;

 

 

PG
ShiroAmada
Lapis Lazuli | Level 10

%let illness1=%str('HEART ATTACK','CARDIAC');

%let illness2=%str('DEMENTIA','AMNESIA');

%let illness3=%str('CANCER','CIRRHOSIS');

 

data  WANT;

  set HAVE;

  length ILLNESS1 ILLNESS2 ILLNESS3 $30;

  ILLNESS1=ifc(upcase(ILLNESS) in (&illness1),"I20-I25 Ischaemic heart diseases","");

  ILLNESS2=ifc(upcase(ILLNESS) in (&illness2),"F00-F09 Organic disorders","");

  ILLNESS3=ifc(upcase(ILLNESS) in (&illness3),"C00-D48 Neoplasms","");

run;

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 5 replies
  • 586 views
  • 2 likes
  • 5 in conversation