BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
Stalk
Pyrite | Level 9

How can I extract the specific "word" all the time? Indexw is only retrieving sometimes

 
data original_data;
 infile datalines delimiter=','; 
  input txt $40. ;
    datalines;
166,181,410,333
166,358
166,181,105
358
166,181,250,105
181,
105,166
;
run;
 
data new_data;
    set original_data;
    indexw_166 = indexw(txt, '166');
    indexw_181 = indexw(txt, '181');
indexw_358 = indexw(txt, '358');
 
run;
 
proc print data=new_data;run;
1 ACCEPTED SOLUTION

Accepted Solutions
Tom
Super User Tom
Super User

It would help if you told it what delimiter(s) to use.

data original_data;
  input txt $40. ;
  indexw_166 = indexw(txt, '166',', ');
  indexw_181 = indexw(txt, '181',', ');
  indexw_358 = indexw(txt, '358',', ');
datalines;
166,181,410,333
166,358
166,181,105
358
166,181,250,105
181,
105,166
;

Results

                          indexw_    indexw_    indexw_
Obs    txt                  166        181        358

 1     166,181,410,333       1          5          0
 2     166,358               1          0          5
 3     166,181,105           1          5          0
 4     358                   0          0          1
 5     166,181,250,105       1          5          0
 6     181,                  0          1          0
 7     105,166               5          0          0

 And if the goal is to "extract" then perhaps you might want to use the FINDW() function instead.  With the E modifier it will return the word number instead of the first character position.  You can then use that with SCAN() to "extract" the word.

data original_data;
  input txt $40. ;
  indexw_166 = indexw(txt, '166',', ');
  indexw_181 = indexw(txt, '181',', ');
  indexw_358 = indexw(txt, '358',', ');
  loc_166=findw(txt,'166',',','er');
  value_166 = scan(txt,loc_166,',');
datalines;
166,181,410,333
166,358
166,181,105
358
166,181,250,105
181,
105,166
;

Result

                          indexw_    indexw_    indexw_               value_
Obs    txt                  166        181        358      loc_166     166

 1     166,181,410,333       1          5          0          1        166
 2     166,358               1          0          5          1        166
 3     166,181,105           1          5          0          1        166
 4     358                   0          0          1          0
 5     166,181,250,105       1          5          0          1        166
 6     181,                  0          1          0          0
 7     105,166               5          0          0          0

 

View solution in original post

2 REPLIES 2
Tom
Super User Tom
Super User

It would help if you told it what delimiter(s) to use.

data original_data;
  input txt $40. ;
  indexw_166 = indexw(txt, '166',', ');
  indexw_181 = indexw(txt, '181',', ');
  indexw_358 = indexw(txt, '358',', ');
datalines;
166,181,410,333
166,358
166,181,105
358
166,181,250,105
181,
105,166
;

Results

                          indexw_    indexw_    indexw_
Obs    txt                  166        181        358

 1     166,181,410,333       1          5          0
 2     166,358               1          0          5
 3     166,181,105           1          5          0
 4     358                   0          0          1
 5     166,181,250,105       1          5          0
 6     181,                  0          1          0
 7     105,166               5          0          0

 And if the goal is to "extract" then perhaps you might want to use the FINDW() function instead.  With the E modifier it will return the word number instead of the first character position.  You can then use that with SCAN() to "extract" the word.

data original_data;
  input txt $40. ;
  indexw_166 = indexw(txt, '166',', ');
  indexw_181 = indexw(txt, '181',', ');
  indexw_358 = indexw(txt, '358',', ');
  loc_166=findw(txt,'166',',','er');
  value_166 = scan(txt,loc_166,',');
datalines;
166,181,410,333
166,358
166,181,105
358
166,181,250,105
181,
105,166
;

Result

                          indexw_    indexw_    indexw_               value_
Obs    txt                  166        181        358      loc_166     166

 1     166,181,410,333       1          5          0          1        166
 2     166,358               1          0          5          1        166
 3     166,181,105           1          5          0          1        166
 4     358                   0          0          1          0
 5     166,181,250,105       1          5          0          1        166
 6     181,                  0          1          0          0
 7     105,166               5          0          0          0

 

Stalk
Pyrite | Level 9
Thank you for the quick solution

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 2 replies
  • 1713 views
  • 1 like
  • 2 in conversation