Hi,
I am having difficulty parsing a character string.
My data set looks like this:
orange apple juice and cola
coffee candy and paper
banana sugar and pen and book
I just want to split each string on the word 'and'. I have tried the scan function in a do loop, like product = scan(name,i,'and').
The delimiter is 'and', not a blank, but the result is not what I wanted. Is there a better solution?
Thank you!
Well, here's another way of doing it, which doesn't require scan at all, relying on the wonderful internal machinery of the infile statement:
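data products;
  infile cards dsd dlm=' ';   /* blank-delimited; dsd turns consecutive blanks into missing values */
  length product $ 30;
  input product @@;           /* @@ holds the line, so each token becomes its own observation */
  if product ne 'and' and not missing(product);   /* drop the separator word and the empty tokens */
  keep product;
cards;
orange apple juice and cola
coffee candy and paper
banana sugar and pen and book
;
run;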
You could stuff around for ages playing with buffer pointers, but if your requirements are as simple as you present, this may be all you need. It also gets around the problematic cANDy, because only a token that is exactly 'and' is dropped!
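The reason scan() does not work here is that it treats every character of its third argument as a separate single-character delimiter, i.e. you could look at it like:
scan(name,i,"a" or "n" or "d")
It does not split on whole words. Now whilst @LaurieF's strategy works if you can put the data in datalines, if your data already exists in a data set you can use findw (or index) to locate the word and split around it. Note that index(name,"and") would also hit the "and" inside "candy", so findw is the safer choice. This splits at the first 'and' only:
data want;
  set have;
  length first second $200;
  pos = findw(name, 'and', ' ');   /* whole-word match, 0 if not found */
  if pos > 0 then do;
    first  = substrn(name, 1, pos - 1);        /* text before 'and' */
    second = left(substrn(name, pos + 3));     /* text after 'and' */
  end;
  else first = name;
  drop pos;
run;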
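The split above cuts each name at a single 'and', while rows such as "banana sugar and pen and book" contain more than one. Below is a minimal sketch of one way to loop over every whole-word 'and' with findw. It assumes a data set called have with a character variable name of at most 200 characters (the sample have step is only there to make the example self-contained), and the variables rest, pos and want_all are names I made up for illustration.

```sas
/* Sample input; skip this step if your data set already exists */
data have;
  infile datalines truncover;
  input name $char80.;
  datalines;
orange apple juice and cola
coffee candy and paper
banana sugar and pen and book
;
run;

data want_all;
  set have;
  length product rest $200;          /* assumption: name fits in 200 characters */
  rest = name;
  pos = findw(rest, 'and', ' ');     /* position of the whole word 'and', 0 if absent */
  do while (pos > 0);
    product = strip(substrn(rest, 1, pos - 1));   /* text before this 'and' */
    if not missing(product) then output;
    rest = left(substrn(rest, pos + 3));          /* continue with the text after it */
    pos = findw(rest, 'and', ' ');
  end;
  product = strip(rest);             /* whatever remains after the last 'and' */
  if not missing(product) then output;
  keep name product;
run;
```

With the three sample rows this should give one observation per piece: "orange apple juice", "cola", "coffee candy", "paper", "banana sugar", "pen" and "book". findw is case-sensitive by default, so a capitalised "And" would not be treated as a separator unless you add the 'i' modifier.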
Thank you!
Sorry for not making my problem clear: my data set already exists in SAS. The method you provided works.
LaurieF's strategy is also a good solution!
In which case, to make sure it always works, in all instances, convert it into a flat file! So many ways to skin a cat:
data all_products;
infile cards;
length all_products $ 80;
input all_products $char80.;
all_products = strip(compbl(all_products));
cards;
orange apple juice and cola
coffee candy and paper
banana sugar and pen and book
;
run;
filename prodfile temp;
data _null_;
set all_products;
file prodfile;
put all_products;
run;
data products;
infile prodfile dsd dlm=' ';
length product $ 30;
input product @@;
if product ne 'and' and not missing(product);
keep product;
run;
filename prodfile clear;