Hello Experts,
I would like to remove all the strings which contains 'UGESTAGE' in the sample dataset 'Have.' I am looking for the result like the dataset 'Want.' Please help me to approach it, thanks.
data Have; infile datalines delimiter='#'; input Comments : $200. ; datalines; UGESTAGE=34 5/7 WKS # CHOLESTASIS, UGESTAGE = 35WKS, ANEMIA# UGESTAGE=33.4 WKS , ROP # ASD, UGESTAGE= 24WKS ; run; data Want; infile datalines delimiter='/'; input Disease : $300. ; datalines; / / CHOLESTASIS, ANEMIA / ROP / /ASD / ; run;
No, this isn't a replacement; this is a concatenation. The CATX function is concatenating the strings together separated by a comma. With CATX, the first argument specifies the separator. The slash is your end of record delimiter, but a comma separates each "token" in your records.
The program first determines how many string tokens there are by counting the comma separators (and adding one since there is no comma after the final string token).
_Strings = COUNT(Comments, ',') + 1;
Then the program goes through the tokens, one at a time. Each token will be concatenated into the output record unless the token contains 'UGESTAGE' in which case it will be skipped.
DO _i = 1 TO _Strings;
IF NOT INDEX(SCAN(Comments, _i), 'UGESTAGE') THEN
DO;
Output_Comment = CATX(', ', STRIP(Output_Comment), STRIP(SCAN(Comments, _i, ',')));
END;
END;
Jim
The SCAN function should come in handy in this case; code example, below.
Do you really want to keep the record if all strings are removed? I deleted it, but the code can be easily modified to output the record even if all of the contents have been deleted.
I didn't include it in the below SAS code, but of course if your data is anything but upper case, you would need to use lowcase() or upcase() to make sure that the comparison is case insensitive.
data Have;
infile datalines delimiter='#';
input Comments : $200. ;
datalines;
UGESTAGE=34 5/7 WKS #
CHOLESTASIS, UGESTAGE = 35WKS, ANEMIA#
UGESTAGE=33.4 WKS , ROP #
ASD, UGESTAGE= 24WKS
;
run;
Data Want(RENAME=(Output_Comment=Comments));
DROP _:;
DROP Comments;
LENGTH Output_Comment $256;
SET Have;
_Strings = COUNT(Comments, ',') + 1;
DO _i = 1 TO _Strings;
IF NOT INDEX(SCAN(Comments, _i), 'UGESTAGE') THEN
DO;
Output_Comment = CATX(', ', STRIP(Output_Comment), STRIP(SCAN(Comments, _i, ',')));
END;
END;
IF LENGTHN(Output_Comment) = 0 THEN
DELETE;
ELSE
Output_Comment = CAT(STRIP(Output_Comment), ' /');
run;
Results:
Jim
Yes, it was deleted. See my comments in my first response.
So you really need it? No problem. Just comment out the DELETE logic like the below. See results below that. How does that look?
Data Want(RENAME=(Output_Comment=Comments));
DROP _:;
DROP Comments;
LENGTH Output_Comment $256;
SET Have;
_Strings = COUNT(Comments, ',') + 1;
DO _i = 1 TO _Strings;
IF NOT INDEX(SCAN(Comments, _i), 'UGESTAGE') THEN
DO;
Output_Comment = CATX(', ', STRIP(Output_Comment), STRIP(SCAN(Comments, _i, ',')));
END;
END;
/* IF LENGTHN(Output_Comment) = 0 THEN*/
/* DELETE;*/
/* ELSE*/
Output_Comment = CAT(STRIP(Output_Comment), ' /');
run;
Results:
Jim
Hello,
In your codes below, is the STRIP(Output_Comment) used to replace the ' STRIP(SCAN(Comments, _i, ','))'?
CATX(', ', STRIP(Output_Comment), STRIP(SCAN(Comments, _i, ',')));
No, this isn't a replacement; this is a concatenation. The CATX function is concatenating the strings together separated by a comma. With CATX, the first argument specifies the separator. The slash is your end of record delimiter, but a comma separates each "token" in your records.
The program first determines how many string tokens there are by counting the comma separators (and adding one since there is no comma after the final string token).
_Strings = COUNT(Comments, ',') + 1;
Then the program goes through the tokens, one at a time. Each token will be concatenated into the output record unless the token contains 'UGESTAGE' in which case it will be skipped.
DO _i = 1 TO _Strings;
IF NOT INDEX(SCAN(Comments, _i), 'UGESTAGE') THEN
DO;
Output_Comment = CATX(', ', STRIP(Output_Comment), STRIP(SCAN(Comments, _i, ',')));
END;
END;
Jim
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.