Hello,
I've a wide column in my dataset that contain more information that I don't need except one word of it and would like to create another column that's a subset of the wide column and keep only a few words that I'm interested like bladder. see below example.
column1
Small hiatal hernia. 8mm low-attenuation lesion in the left lobe of
the liver anteriorly (series 2, image 16) is unchanged and most likely
represents a benign cyst or hemangioma. Stable left renal cyst.
Atherosclerotic vascular calcifications including coronary artery
calcifications. Foley catheter in the bladder
Newcolumn
bladder
Many thanks!
Assuming that each set of words under column1 is a unique observation and that you have a list of words that you're searching for
Set up a temporary array that holds the words
Set up a second array that flags the presence of the words
Use a do loop to process the list and check for the presence of the words using the index or find function.
Thanks Reeza for responding my question. I'm not good on array queries. Can you please explain more how to query this? Thanks
data want;
set have;
array words(2) _temporary_ {'bladder', 'cancer'};
array search(2) bladder cancer {2*0};
do i=1 to dim(words);
if find(string, words(i))>0 then search(i)=1;
end;
run;
data have;
a='Small hiatal hernia. 8mm low-attenuation lesion in the left lobe of
the liver anteriorly (series 2, image 16) is unchanged and most likely
represents a benign cyst or hemangioma. Stable left renal cyst.
Atherosclerotic vascular calcifications including coronary artery
calcifications. Foley catheter in the bladder';
b=prxchange('s/.*(\bbladder\b).*/$1/i',-1,a);
run;
Here's an example of how to solve this problem with a number of approaches, including one from our friend
http://support.sas.com/resources/papers/proceedings14/1717-2014.pdf
The original thread is titled `Scanning an Observation for a Word within a Variable`
Tom
Thank you to mention me . My real name is Xia Ke Shan . Ksharp is my nickname because I love CS and A member from 3D team which name is Ksharp (I steal it from him) is who I really respected, His sharpshooter skill is really what I want .
Just out of interest, what is the reasoning behind this. If I was looking at this kind of information I would want to have it coded by a professional using standards such as MedDRA, rather than trying to find words in a string. Even accounting for spellings might get tricky, let alone medical terminolgy of things.
Don't miss out on SAS Innovate - Register now for the FREE Livestream!
Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.