keen_sas
Quartz | Level 8

Hi All,

 

data new;
length have $1000. want $100.;
have ="Sports data is sufficient";
want ="sports";
output ;
have="Basketball Players Data is listed on the Olympic data";
want= "Players Olympic";
output ;

have ="Soccer data and swimming data are not found in the Olympic data starting in the next month";
want="soccer swimming Olympic";
output ;
run ;

 

In the text above, DATA is the keyword; it appears one or more times in each string (the HAVE variable). I need to identify the word before each occurrence of DATA and replace it with another word. In the first example, SPORTS is the word that precedes the keyword DATA. In the second string, the keyword DATA appears twice, and the words before it are Players and Olympic. Is there a method to identify the word before the keyword DATA in the examples above? After identifying these words, I will replace them with other words.

PGStats
Opal | Level 21

What 'other words' ?

PG
Jagadishkatam
Amethyst | Level 16

Please try the below code

 

data have;
input text&$200.; 
string=tranwrd(lowcase(text),'data','#');
cards;
Sports data is sufficient
Basketball Players Data is listed on the Olympic data
Soccer data and swimming data are not found in the Olympic data starting in the next month
;

proc sort data=have;
by string;
run;

data want(where=(count(string,'#')=i));
length new $100.;
set have;
by string;
do i = 1 to count(string,'#');
new=catx(' ', new, scan(scan(string,i,'#'),-1,' '));
output;
end;
run;
Thanks,
Jag
novinosrin
Tourmaline | Level 20

FWIW, My share of fun


data new;
length have $1000. want $100.;
have ="Sports data is sufficient";
want ="sports";
output ;
have="Basketball Players data is listed on the Olympic data";
want= "Players Olympic";
output ;

have ="Soccer data and swimming data are not found in the Olympic data starting in the next month";
want="soccer swimming Olympic";
output ;
run ;


/*new_want is the output variable*/
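data want;
set new;
_have=have; /* working copy; matched keywords are overwritten below */
length new_want $100;
/* findw with 'e' returns the word number, 'i' ignores case; loop until no match (p=0) */
do p=findw(_have,'data',' ','ei') by 0 while(p);
 new_want=catx(' ',new_want,scan(_have,p-1,' ','i')); /* append the word before DATA */
 call scan(_have, p, _p, _l,' ','i'); /* position and length of the matched keyword */
 substr(_have,_p,_l)='09'x; /* overwrite it so the next findw moves on */
 p=findw(_have,'data',' ','ei');
end;
drop _: p;
run;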
hashman
Ammonite | Level 13

@keen_sas:

Just scroll the text one word at a time, and if it is "data", append the prior word to the variable you want:

data have ;                                                                                
  infile cards truncover ;                                                                 
  input have $ 1-100 ;                                                                     
  cards ;                                                                                  
Sports data is sufficient                                                                  
Basketball Players Data is listed on the Olympic data                                      
Soccer data and swimming data are not found in the Olympic data starting in the next month 
Data is the first word here - we need WANT blank in this case                              
;                                                                                          
run ;                                                                                      
                                                                                           
data want (drop = _:) ;                                                                    
  set have ;                                                                               
  length want _v $ 100 ;                                                                   
  do _x = 1 to countw (have) ;                                                             
    _w = scan (have, _x) ;                                                                 
    if lowcase (_w) = "data" then want = catx (" ", want, _v) ;                            
    _v = _w ;                                                                              
  end ;                                                                                    
run ;                                                                                      
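
If the replacement differs from word to word, the same one-word-at-a-time loop can probably be extended to rebuild the string. The following is only a sketch under assumptions: the replacement words in the SELECT block (Athletics, Athletes, Games) and the names want_replaced and replaced are made up for illustration, and the HAVE data set is the one created above.

```sas
/* Sketch: rebuild each string, swapping the word before DATA.      */
/* The mapping in the SELECT block is hypothetical, not from the    */
/* original question.                                               */
data want_replaced (drop = _:) ;
  set have ;
  length replaced $ 200 _prev _w $ 100 ;
  do _x = 1 to countw (have, " ") ;
    _w = scan (have, _x, " ") ;
    if _x > 1 then do ;
      /* the previous word is emitted one step late, so it can      */
      /* still be swapped when the current word turns out to be DATA */
      if lowcase (_w) = "data" then do ;
        select (lowcase (_prev)) ;
          when ("sports")  _prev = "Athletics" ;
          when ("players") _prev = "Athletes" ;
          when ("olympic") _prev = "Games" ;
          otherwise ; /* no mapping: keep the original word */
        end ;
      end ;
      replaced = catx (" ", replaced, _prev) ;
    end ;
    _prev = _w ;
  end ;
  /* emit the last word of the string */
  replaced = catx (" ", replaced, _prev) ;
run ;
```

Note that this version splits on blanks only and collapses repeated blanks into one. A string that starts with "data" is left unchanged because there is no prior word to swap.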

Kind regards

Paul D.  

PGStats
Opal | Level 21

If the replacement word is always the same, then it is simple to do this with a regular expression substitution operation:

 

data new;
length have $1000. want $100.;
have ="Sports data is sufficient";
want ="sports";
output ;
have="Basketball Players Data is listed on the Olympic data";
want= "Players Olympic";
output ;
have ="Soccer data and swimming data are not found in the Olympic data starting in the next month";
want="soccer swimming Olympic";
output ;
run ;

data want;
set new;
length replaced $1000;
replaced = prxChange ("s/(\b\w+\b)(\s+data\b)/Other word\2/io",-1,have);
run;

proc print data=want noobs; var have replaced; run;
have 	replaced
Sports data is sufficient 	Other word data is sufficient
Basketball Players Data is listed on the Olympic data 	Basketball Other word Data is listed on the Other word data
Soccer data and swimming data are not found in the Olympic data starting in the next month 	Other word data and Other word data are not found in the Other word data starting in the next month
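
If the goal is only to list the words before DATA (as in the WANT variable) rather than to substitute them, a similar pattern can be walked with CALL PRXNEXT. This is a minimal sketch, assuming the NEW data set above; the names want_words and found are made up for illustration.

```sas
/* Sketch: collect the word before each DATA into one variable */
data want_words;
  set new;
  length found $100;
  retain _rx;
  /* compile once: a word, whitespace, then the whole word "data" (any case) */
  if _n_ = 1 then _rx = prxparse('/(\w+)\s+data\b/i');
  _start = 1;
  _stop = length(have);
  /* find the first match, then keep scanning from where the last match ended */
  call prxnext(_rx, _start, _stop, have, _pos, _len);
  do while (_pos > 0);
    /* capture buffer 1 holds the word that precedes DATA */
    found = catx(' ', found, prxposn(_rx, 1, have));
    call prxnext(_rx, _start, _stop, have, _pos, _len);
  end;
  drop _:;
run;
```

The words keep the case they have in the text, so the first row should give "Sports" rather than "sports"; apply lowcase() if the WANT spelling matters.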

 

PG
hashman
Ammonite | Level 13

@PGStats: What I've been waiting for. 
