Hello, experts,
I have a sample dataset 'Have.' I would like to remove all the names shown in the dataset 'Maste_Name.' The final result is shown in the dataset 'Want.' Please help me to approach it. Thank you.
data Master_Name;
infile datalines dsd;
input Name : $50. ;
datalines;
uairway,
uapnea,
ugastro,
ugerd,
uendo,
uallergy
;
run;
data Have;
infile datalines delimiter='/';
input Disease : $300. ;
datalines;
UGASTRO, UGERD, UALLERGY /
UAIRWAY, CONGENITAL ATRESIA OF ESOPHA, UGASTRO /
ECZEMA
;
run;
data Want;
infile datalines delimiter='/';Re
input Disease : $300. ;
datalines;
/
CONGENITAL ATRESIA OF ESOPHA /
ECZEMA
;
run;
data Master_Name;
infile datalines dsd;
input Name : $50. ;
datalines;
uairway,
uapnea,
ugastro,
ugerd,
uendo,
uallergy
;
run;
data Have;
infile datalines delimiter='/';
input Disease : $300. ;
datalines;
UGASTRO, UGERD, UALLERGY /
UAIRWAY, CONGENITAL ATRESIA OF ESOPHA, UGASTRO /
ECZEMA
;
run;
options noquotelenmax;
proc sql noprint;
select name into : names separated by '|' from master_name;
quit;
%put &names;
data want;
set have;
want=tranwrd(prxchange("s/\b&names\b/ /i",-1,disease),' ,',' ');
run;
Have you tried using the Tranwrd function? If the replacement value is '', then that should remove the text that you don't want. You might want to run a COMPBL thereafter to compress any multiple spaces.
Jim
For example, the following:
data Master_Name;
infile datalines dsd;
input Name : $50. ;
datalines;
uairway,
uapnea,
ugastro,
ugerd,
uendo,
uallergy
;
run;
data Have;
infile datalines delimiter='/';
input Disease : $300. ;
datalines;
UGASTRO, UGERD, UALLERGY /
UAIRWAY, CONGENITAL ATRESIA OF ESOPHA, UGASTRO /
ECZEMA
;
run;
DATA _NULL_;
CALL SYMPUTX('Names_Cnt', Names_Cnt, 'G');
STOP;
SET Master_Name NOBS = Names_Cnt;
RUN;
DATA Want;
DROP Name;
DROP _:;
ARRAY Names [&Names_Cnt] $50 _TEMPORARY_;
SET Have;
IF _N_ = 1 THEN
DO;
DO WHILE (NOT End_Of_Master);
SET Master_Name
END = End_of_Master
;
_i + 1;
Names[_i] = Name;
END;
END;
DO _i = 1 TO &Names_Cnt;
Disease = COMPBL(TRANWRD(UPCASE(Disease), CATS(UPCASE(Names[_i]), ','), ''));
IF SUBSTR(Disease, LENGTH(TRIM(Disease)), 1) = ',' THEN
Disease = SUBSTR(Disease, 1, LENGTH(STRIP(Disease)) - 1);
Disease = COMPBL(TRANWRD(UPCASE(Disease), STRIP(UPCASE(Names[_i])), ''));
IF SUBSTR(Disease, LENGTH(TRIM(Disease)), 1) = ',' THEN
Disease = SUBSTR(Disease, 1, LENGTH(STRIP(Disease)) - 1);
END;
IF NOT INDEX(Disease, '/') THEN
Disease = CAT(STRIP(Disease), ' /');
RUN;
Which yields:
Jim
data Master_Name;
infile datalines dsd;
input Name : $50. ;
datalines;
uairway,
uapnea,
ugastro,
ugerd,
uendo,
uallergy
;
run;
data Have;
infile datalines delimiter='/';
input Disease : $300. ;
datalines;
UGASTRO, UGERD, UALLERGY /
UAIRWAY, CONGENITAL ATRESIA OF ESOPHA, UGASTRO /
ECZEMA
;
run;
options noquotelenmax;
proc sql noprint;
select name into : names separated by '|' from master_name;
quit;
%put &names;
data want;
set have;
want=tranwrd(prxchange("s/\b&names\b/ /i",-1,disease),' ,',' ');
run;
Wow, this is SUPER!!
What is the 's' in front of '/' used for? And what is '-1' for?
April 27 – 30 | Gaylord Texan | Grapevine, Texas
Walk in ready to learn. Walk out ready to deliver. This is the data and AI conference you can't afford to miss.
Register now and save with the early bird rate—just $795!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.