Hello,
I have a wide column in my dataset that contains a lot of information I don't need, except for one word. I would like to create another column that is a subset of the wide column and keeps only the few words I'm interested in, such as "bladder". See the example below.
column1
Small hiatal hernia. 8mm low-attenuation lesion in the left lobe of
the liver anteriorly (series 2, image 16) is unchanged and most likely
represents a benign cyst or hemangioma. Stable left renal cyst.
Atherosclerotic vascular calcifications including coronary artery
calcifications. Foley catheter in the bladder
Newcolumn
bladder
Many thanks!
Assuming that each set of words under column1 is a unique observation and that you have a list of words that you're searching for:
Set up a temporary array that holds the words.
Set up a second array that flags the presence of the words.
Use a do loop to process the list and check for the presence of each word using the INDEX or FIND function.
Thanks, Reeza, for responding to my question. I'm not good with arrays. Can you please explain in more detail how to write this? Thanks.
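data want;
set have;
/* a character array needs $ and a length; initial values go in parentheses */
array words(2) $ 10 _temporary_ ('bladder', 'cancer');
/* flag variables, one per search word */
array search(2) bladder cancer;
do i=1 to dim(words);
/* 1 if the word is found, else 0; 'i' ignores case, 't' trims trailing blanks */
search(i) = (find(string, words(i), 'it') > 0);
end;
drop i;
run;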
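If the goal is a new column holding the matched words themselves (as in the Newcolumn example in the question) rather than 0/1 flags, the same loop can be adapted. This is only a sketch: it assumes the text is in a character variable called string, as in the step above, and the name newcolumn and its length of 200 are arbitrary choices for illustration.

```sas
data want;
  set have;
  length newcolumn $ 200;   /* hypothetical name and length */
  array words(2) $ 10 _temporary_ ('bladder', 'cancer');
  do i = 1 to dim(words);
    /* findw looks for whole words; 'i' ignores case;        */
    /* blank . , ; ( ) are treated as word delimiters        */
    if findw(string, strip(words(i)), ' .,;()', 'i') > 0 then
      newcolumn = catx(' ', newcolumn, words(i));
  end;
  drop i;
run;
```

FINDW is used here instead of FIND so that a search word is not matched inside a longer word. If more than one word is found, they are appended to newcolumn separated by a blank.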
data have;
a='Small hiatal hernia. 8mm low-attenuation lesion in the left lobe of
the liver anteriorly (series 2, image 16) is unchanged and most likely
represents a benign cyst or hemangioma. Stable left renal cyst.
Atherosclerotic vascular calcifications including coronary artery
calcifications. Foley catheter in the bladder';
b=prxchange('s/.*(\bbladder\b).*/$1/i',-1,a);
run;
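One thing to watch with the PRXCHANGE approach: when the pattern does not match, PRXCHANGE returns the input string unchanged, so observations without the word would end up carrying the full text. A possible variation, sketched under the assumption that the text is in variable a as above and using a hypothetical output variable named newcolumn, is to test for the word with PRXMATCH first:

```sas
data want;
  set have;
  length newcolumn $ 20;   /* hypothetical name and length */
  /* prxmatch returns the position of the match, or 0 if there is none */
  if prxmatch('/\bbladder\b/i', a) > 0 then newcolumn = 'bladder';
run;
```

With this version newcolumn stays blank for observations that do not mention the word.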
Here's an example of how to solve this problem with a number of approaches, including one from our friend Ksharp:
http://support.sas.com/resources/papers/proceedings14/1717-2014.pdf
The original thread is titled `Scanning an Observation for a Word within a Variable`
Tom
Thank you for mentioning me. My real name is Xia Ke Shan. Ksharp is my nickname because I love CS, and a member of the 3D team named Ksharp (I stole the name from him) is someone I really respect. His sharpshooter skill is what I really want.
Just out of interest, what is the reasoning behind this? If I were looking at this kind of information, I would want to have it coded by a professional using standards such as MedDRA, rather than trying to find words in a string. Even accounting for spelling variations might get tricky, let alone medical terminology.