BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
Stalk
Pyrite | Level 9

How can I extract the specific "word" all the time? Indexw is only retrieving sometimes

 
data original_data;
 infile datalines delimiter=','; 
  input txt $40. ;
    datalines;
166,181,410,333
166,358
166,181,105
358
166,181,250,105
181,
105,166
;
run;
 
data new_data;
    set original_data;
    indexw_166 = indexw(txt, '166');
    indexw_181 = indexw(txt, '181');
indexw_358 = indexw(txt, '358');
 
run;
 
proc print data=new_data;run;
1 ACCEPTED SOLUTION

Accepted Solutions
Tom
Super User Tom
Super User

It would help if you told it what delimiter(s) to use.

data original_data;
  input txt $40. ;
  indexw_166 = indexw(txt, '166',', ');
  indexw_181 = indexw(txt, '181',', ');
  indexw_358 = indexw(txt, '358',', ');
datalines;
166,181,410,333
166,358
166,181,105
358
166,181,250,105
181,
105,166
;

Results

                          indexw_    indexw_    indexw_
Obs    txt                  166        181        358

 1     166,181,410,333       1          5          0
 2     166,358               1          0          5
 3     166,181,105           1          5          0
 4     358                   0          0          1
 5     166,181,250,105       1          5          0
 6     181,                  0          1          0
 7     105,166               5          0          0

 And if the goal is to "extract" then perhaps you might want to use the FINDW() function instead.  With the E modifier it will return the word number instead of the first character position.  You can then use that with SCAN() to "extract" the word.

data original_data;
  input txt $40. ;
  indexw_166 = indexw(txt, '166',', ');
  indexw_181 = indexw(txt, '181',', ');
  indexw_358 = indexw(txt, '358',', ');
  loc_166=findw(txt,'166',',','er');
  value_166 = scan(txt,loc_166,',');
datalines;
166,181,410,333
166,358
166,181,105
358
166,181,250,105
181,
105,166
;

Result

                          indexw_    indexw_    indexw_               value_
Obs    txt                  166        181        358      loc_166     166

 1     166,181,410,333       1          5          0          1        166
 2     166,358               1          0          5          1        166
 3     166,181,105           1          5          0          1        166
 4     358                   0          0          1          0
 5     166,181,250,105       1          5          0          1        166
 6     181,                  0          1          0          0
 7     105,166               5          0          0          0

 

View solution in original post

2 REPLIES 2
Tom
Super User Tom
Super User

It would help if you told it what delimiter(s) to use.

data original_data;
  input txt $40. ;
  indexw_166 = indexw(txt, '166',', ');
  indexw_181 = indexw(txt, '181',', ');
  indexw_358 = indexw(txt, '358',', ');
datalines;
166,181,410,333
166,358
166,181,105
358
166,181,250,105
181,
105,166
;

Results

                          indexw_    indexw_    indexw_
Obs    txt                  166        181        358

 1     166,181,410,333       1          5          0
 2     166,358               1          0          5
 3     166,181,105           1          5          0
 4     358                   0          0          1
 5     166,181,250,105       1          5          0
 6     181,                  0          1          0
 7     105,166               5          0          0

 And if the goal is to "extract" then perhaps you might want to use the FINDW() function instead.  With the E modifier it will return the word number instead of the first character position.  You can then use that with SCAN() to "extract" the word.

data original_data;
  input txt $40. ;
  indexw_166 = indexw(txt, '166',', ');
  indexw_181 = indexw(txt, '181',', ');
  indexw_358 = indexw(txt, '358',', ');
  loc_166=findw(txt,'166',',','er');
  value_166 = scan(txt,loc_166,',');
datalines;
166,181,410,333
166,358
166,181,105
358
166,181,250,105
181,
105,166
;

Result

                          indexw_    indexw_    indexw_               value_
Obs    txt                  166        181        358      loc_166     166

 1     166,181,410,333       1          5          0          1        166
 2     166,358               1          0          5          1        166
 3     166,181,105           1          5          0          1        166
 4     358                   0          0          1          0
 5     166,181,250,105       1          5          0          1        166
 6     181,                  0          1          0          0
 7     105,166               5          0          0          0

 

Stalk
Pyrite | Level 9
Thank you for the quick solution

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 2 replies
  • 262 views
  • 1 like
  • 2 in conversation