DATA Step, Macro, Functions and more

parsing the characters

Contributor
Posts: 44

Hi,

   I am having difficulty parsing a string of characters.

   My data set looks like this:

   

    orange apple juice and cola

    coffee   candy and paper

    banana sugar and pen and book

 

I just want to split each string on the word 'and'. I have tried the scan function in a do loop, like product = scan(name,i,'and').

The delimiter is 'and', not a blank, but the result is not what I wanted. Is there a better way to do this?

 

Thank you!


Accepted Solutions
Solution
‎02-03-2017 02:15 AM
Super Contributor
Posts: 251

Re: parsing the characters

Well, here's another way of doing it, which doesn't require scan at all, relying on the wonderful internal machinery of the infile statement:

data products;
infile cards dsd dlm=' ';
length product $ 30;
input product @@;
if product ne 'and' and not missing(product);
keep product;
cards;
 orange apple juice and cola
    coffee   candy and paper
    banana sugar and pen and book
;
run;

You could stuff around for ages playing with buffer pointers, but if your requirements are as simple as you present, this may be all you need. It also gets around the problematic cANDy!


All Replies
Super User
Posts: 7,413

Re: parsing the characters

The reason why scan() does not work in this instance is that it uses single characters as delimiters, i.e. you could look at it like:

scan(name,i,"a" or "n" or "d")
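
To see this for yourself, here is a minimal sketch using one of the sample strings from the question. The variable names `i` and `piece` are just placeholders picked for the illustration.

```sas
data _null_;
  name = 'coffee candy and paper';
  /* 'and' is read as three separate delimiters (a, n, d), not as a word */
  do i = 1 to countw(name, 'and');
    piece = scan(name, i, 'and');
    /* write each fragment to the log */
    put i= piece=;
  end;
run;
```

The first fragment written to the log ends inside "candy", because the letter "a" already counts as a delimiter.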

It does not split on words.  Now, whilst @LaurieF's strategy may work if you can put the data in datalines, if your data already exists in a data set then you can use index or findw to split it, like:

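data want;
  set have;
  length first second $200;
  /* search for " and " with surrounding blanks so that "candy" is not matched */
  pos=index(name," and ");
  if pos>0 then do;
    first=substr(name,1,pos-1);   /* text before the word "and" */
    second=substr(name,pos+5);    /* text after the word "and" */
  end;
  else first=name;                /* no "and" found: keep the whole string */
  drop pos;
run;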
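
A string such as "banana sugar and pen and book" contains "and" more than once, and a single index() call only handles the first one. A rough sketch of one way around that, assuming the data set is called `have` with a variable `name` as above: swap the word " and " for a single character with tranwrd(), then let scan() split on that character. The names `tmp`, `product` and `i`, and the choice of '|' as the stand-in delimiter, are my own assumptions; pick a character that never occurs in your data.

```sas
/* sample data matching the question; skip this step if HAVE already exists */
data have;
  infile datalines truncover;
  input name $char80.;
  datalines;
orange apple juice and cola
coffee   candy and paper
banana sugar and pen and book
;
run;

data want;
  set have;
  length product $80 tmp $200;
  /* collapse repeated blanks, then turn the word " and " into one delimiter */
  tmp = tranwrd(compbl(name), ' and ', '|');
  /* scan() now has a genuine single-character delimiter to work with */
  do i = 1 to countw(tmp, '|');
    product = strip(scan(tmp, i, '|'));
    output;   /* one observation per piece */
  end;
  drop tmp i;
run;
```

With the three sample rows this should give seven observations: "orange apple juice", "cola", "coffee candy", "paper", "banana sugar", "pen" and "book". Because the search target includes the surrounding blanks, "candy" is left alone.
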
Contributor
Posts: 44

Re: parsing the characters

Thank you!

Sorry for not stating my problem clearly: my data set already exists in SAS. The method you provided works.

LaurieF's strategy is also a good solution!

Super Contributor
Posts: 251

Re: parsing the characters

In which case, to make sure it always works, in all instances, convert it into a flat file! So many ways to skin a cat:

data all_products;
infile cards;
length all_products $ 80;
input all_products $char80.;
all_products = strip(compbl(all_products));
cards;
 orange apple juice and cola
    coffee   candy and paper
    banana sugar and pen and book
;
run;

filename prodfile temp;

data _null_;
set all_products;
file prodfile;
put all_products;
run;

data products;
infile prodfile dsd dlm=' ';
length product $ 30;
input product @@;
if product ne 'and' and not missing(product);
keep product;
run;

filename prodfile clear;

 

☑ This topic is SOLVED.
