Hello, experts,
I have a sample dataset 'Have.' I would like to remove all the names shown in the dataset 'Maste_Name.' The final result is shown in the dataset 'Want.' Please help me to approach it. Thank you.
data Master_Name;
infile datalines dsd;
input Name : $50. ;
datalines;
uairway,
uapnea,
ugastro,
ugerd,
uendo,
uallergy
;
run;
data Have;
infile datalines delimiter='/';
input Disease : $300. ;
datalines;
UGASTRO, UGERD, UALLERGY /
UAIRWAY, CONGENITAL ATRESIA OF ESOPHA, UGASTRO /
ECZEMA
;
run;
data Want;
infile datalines delimiter='/';Re
input Disease : $300. ;
datalines;
/
CONGENITAL ATRESIA OF ESOPHA /
ECZEMA
;
run;
data Master_Name;
infile datalines dsd;
input Name : $50. ;
datalines;
uairway,
uapnea,
ugastro,
ugerd,
uendo,
uallergy
;
run;
data Have;
infile datalines delimiter='/';
input Disease : $300. ;
datalines;
UGASTRO, UGERD, UALLERGY /
UAIRWAY, CONGENITAL ATRESIA OF ESOPHA, UGASTRO /
ECZEMA
;
run;
options noquotelenmax;
proc sql noprint;
select name into : names separated by '|' from master_name;
quit;
%put &names;
data want;
set have;
want=tranwrd(prxchange("s/\b&names\b/ /i",-1,disease),' ,',' ');
run;
Have you tried using the Tranwrd function? If the replacement value is '', then that should remove the text that you don't want. You might want to run a COMPBL thereafter to compress any multiple spaces.
Jim
For example, the following:
data Master_Name;
infile datalines dsd;
input Name : $50. ;
datalines;
uairway,
uapnea,
ugastro,
ugerd,
uendo,
uallergy
;
run;
data Have;
infile datalines delimiter='/';
input Disease : $300. ;
datalines;
UGASTRO, UGERD, UALLERGY /
UAIRWAY, CONGENITAL ATRESIA OF ESOPHA, UGASTRO /
ECZEMA
;
run;
DATA _NULL_;
CALL SYMPUTX('Names_Cnt', Names_Cnt, 'G');
STOP;
SET Master_Name NOBS = Names_Cnt;
RUN;
DATA Want;
DROP Name;
DROP _:;
ARRAY Names [&Names_Cnt] $50 _TEMPORARY_;
SET Have;
IF _N_ = 1 THEN
DO;
DO WHILE (NOT End_Of_Master);
SET Master_Name
END = End_of_Master
;
_i + 1;
Names[_i] = Name;
END;
END;
DO _i = 1 TO &Names_Cnt;
Disease = COMPBL(TRANWRD(UPCASE(Disease), CATS(UPCASE(Names[_i]), ','), ''));
IF SUBSTR(Disease, LENGTH(TRIM(Disease)), 1) = ',' THEN
Disease = SUBSTR(Disease, 1, LENGTH(STRIP(Disease)) - 1);
Disease = COMPBL(TRANWRD(UPCASE(Disease), STRIP(UPCASE(Names[_i])), ''));
IF SUBSTR(Disease, LENGTH(TRIM(Disease)), 1) = ',' THEN
Disease = SUBSTR(Disease, 1, LENGTH(STRIP(Disease)) - 1);
END;
IF NOT INDEX(Disease, '/') THEN
Disease = CAT(STRIP(Disease), ' /');
RUN;
Which yields:
Jim
data Master_Name;
infile datalines dsd;
input Name : $50. ;
datalines;
uairway,
uapnea,
ugastro,
ugerd,
uendo,
uallergy
;
run;
data Have;
infile datalines delimiter='/';
input Disease : $300. ;
datalines;
UGASTRO, UGERD, UALLERGY /
UAIRWAY, CONGENITAL ATRESIA OF ESOPHA, UGASTRO /
ECZEMA
;
run;
options noquotelenmax;
proc sql noprint;
select name into : names separated by '|' from master_name;
quit;
%put &names;
data want;
set have;
want=tranwrd(prxchange("s/\b&names\b/ /i",-1,disease),' ,',' ');
run;
Wow, this is SUPER!!
What is the 's' in front of '/' used for? And what is '-1' for?
April 27 – 30 | Gaylord Texan | Grapevine, Texas
Walk in ready to learn. Walk out ready to deliver. This is the data and AI conference you can't afford to miss.
Register now and lock in 2025 pricing—just $495!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.