BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
Stalk
Pyrite | Level 9

How can I extract the specific "word" all the time? Indexw is only retrieving sometimes

 
data original_data;
 infile datalines delimiter=','; 
  input txt $40. ;
    datalines;
166,181,410,333
166,358
166,181,105
358
166,181,250,105
181,
105,166
;
run;
 
data new_data;
    set original_data;
    indexw_166 = indexw(txt, '166');
    indexw_181 = indexw(txt, '181');
indexw_358 = indexw(txt, '358');
 
run;
 
proc print data=new_data;run;
1 ACCEPTED SOLUTION

Accepted Solutions
Tom
Super User Tom
Super User

It would help if you told it what delimiter(s) to use.

data original_data;
  input txt $40. ;
  indexw_166 = indexw(txt, '166',', ');
  indexw_181 = indexw(txt, '181',', ');
  indexw_358 = indexw(txt, '358',', ');
datalines;
166,181,410,333
166,358
166,181,105
358
166,181,250,105
181,
105,166
;

Results

                          indexw_    indexw_    indexw_
Obs    txt                  166        181        358

 1     166,181,410,333       1          5          0
 2     166,358               1          0          5
 3     166,181,105           1          5          0
 4     358                   0          0          1
 5     166,181,250,105       1          5          0
 6     181,                  0          1          0
 7     105,166               5          0          0

 And if the goal is to "extract" then perhaps you might want to use the FINDW() function instead.  With the E modifier it will return the word number instead of the first character position.  You can then use that with SCAN() to "extract" the word.

data original_data;
  input txt $40. ;
  indexw_166 = indexw(txt, '166',', ');
  indexw_181 = indexw(txt, '181',', ');
  indexw_358 = indexw(txt, '358',', ');
  loc_166=findw(txt,'166',',','er');
  value_166 = scan(txt,loc_166,',');
datalines;
166,181,410,333
166,358
166,181,105
358
166,181,250,105
181,
105,166
;

Result

                          indexw_    indexw_    indexw_               value_
Obs    txt                  166        181        358      loc_166     166

 1     166,181,410,333       1          5          0          1        166
 2     166,358               1          0          5          1        166
 3     166,181,105           1          5          0          1        166
 4     358                   0          0          1          0
 5     166,181,250,105       1          5          0          1        166
 6     181,                  0          1          0          0
 7     105,166               5          0          0          0

 

View solution in original post

2 REPLIES 2
Tom
Super User Tom
Super User

It would help if you told it what delimiter(s) to use.

data original_data;
  input txt $40. ;
  indexw_166 = indexw(txt, '166',', ');
  indexw_181 = indexw(txt, '181',', ');
  indexw_358 = indexw(txt, '358',', ');
datalines;
166,181,410,333
166,358
166,181,105
358
166,181,250,105
181,
105,166
;

Results

                          indexw_    indexw_    indexw_
Obs    txt                  166        181        358

 1     166,181,410,333       1          5          0
 2     166,358               1          0          5
 3     166,181,105           1          5          0
 4     358                   0          0          1
 5     166,181,250,105       1          5          0
 6     181,                  0          1          0
 7     105,166               5          0          0

 And if the goal is to "extract" then perhaps you might want to use the FINDW() function instead.  With the E modifier it will return the word number instead of the first character position.  You can then use that with SCAN() to "extract" the word.

data original_data;
  input txt $40. ;
  indexw_166 = indexw(txt, '166',', ');
  indexw_181 = indexw(txt, '181',', ');
  indexw_358 = indexw(txt, '358',', ');
  loc_166=findw(txt,'166',',','er');
  value_166 = scan(txt,loc_166,',');
datalines;
166,181,410,333
166,358
166,181,105
358
166,181,250,105
181,
105,166
;

Result

                          indexw_    indexw_    indexw_               value_
Obs    txt                  166        181        358      loc_166     166

 1     166,181,410,333       1          5          0          1        166
 2     166,358               1          0          5          1        166
 3     166,181,105           1          5          0          1        166
 4     358                   0          0          1          0
 5     166,181,250,105       1          5          0          1        166
 6     181,                  0          1          0          0
 7     105,166               5          0          0          0

 

Stalk
Pyrite | Level 9
Thank you for the quick solution

hackathon24-white-horiz.png

2025 SAS Hackathon: There is still time!

Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!

Register Now

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 2 replies
  • 1569 views
  • 1 like
  • 2 in conversation