Dear all,
How can I remove observations that contain a specific character?
For example, remove the observations whose DX contains E:
ID DX Diag_Sequence
1 96 1
1 E96 1
1 87 2
1 E87 2
1 V87 3
Best,
Liang
if find(dx,'E','i')>0 then delete;
It depends on how complicated it gets.
If it's a single value you can chain logic together, such as
if dx=:'E' and dx ne 'E23' then delete;
But if the logic gets too complex you need a different approach. Do you have those lists of what to include/exclude in a data set or are they in your head? If you make them into a data set an entirely different approach is warranted, which may be easier overall.
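A minimal sketch of the data-set-driven approach mentioned above, assuming the codes to exclude are stored in a lookup table. The names have, exclude_codes, and want are hypothetical, and the sample rows are taken from the original question:

```sas
/* Sample data from the question; DX is read as a character variable. */
data have;
    input ID DX $ Diag_Sequence;
    datalines;
1 96 1
1 E96 1
1 87 2
1 E87 2
1 V87 3
;
run;

/* Hypothetical lookup table holding the codes to exclude. */
data exclude_codes;
    input DX $;
    datalines;
E96
E87
;
run;

/* Keep only the rows whose DX is not listed in the lookup table. */
proc sql;
    create table want as
    select *
    from have
    where DX not in (select DX from exclude_codes);
quit;
```

With the list in a table, changing what gets excluded means editing data rather than rewriting the IF logic.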
Hi, Reeza. Thanks for your comment.
Here is a screenshot of part of my dataset. The variable Diagnosis_Code may contain values with an E, for example, E9505.
For an E9505 observation, if its Diagnosis_Sequence is the same as that of the previous observation, I want to delete it. But if its Diagnosis_Sequence differs from the previous one, I want to keep it. How can I code this?
Best,
Liang
@Jackwangliang that's not at all like your initial question.
Please post your data as text (I won't type it out from a screenshot), along with the output you expect. Since this is so different from your original question, I would suggest you start a new thread.
proc sql;
delete from table_name
where DX contains 'E';
quit;
if dx =: 'E' then delete; *removes all dx that start with E;
if find(dx, 'E', 'i') > 0 then delete; *removes all dx that contain an E;
Use find to search for an E. There's a possibility I have the parameters backwards in the FIND() function, so you should verify that.
Or use =: which checks if the first characters are the same.
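To check both statements against the data from the question, here is a small sketch. The data set names have, no_leading_e, and no_e_anywhere are hypothetical. As far as I know, the documented argument order is FIND(string, substring, modifiers), so dx goes first:

```sas
/* Sample data from the question; DX is read as a character variable. */
data have;
    input ID DX $ Diag_Sequence;
    datalines;
1 96 1
1 E96 1
1 87 2
1 E87 2
1 V87 3
;
run;

/* Drop rows whose DX starts with E (=: compares only the leading characters). */
data no_leading_e;
    set have;
    if dx =: 'E' then delete;
run;

/* Drop rows whose DX contains E anywhere; the 'i' modifier ignores case. */
data no_e_anywhere;
    set have;
    if find(dx, 'E', 'i') > 0 then delete;
run;
```

On this sample both steps keep the same three rows (96, 87, V87), because every E is in the first position. They would differ for a value such as 9E6, which only the FIND version removes.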
Thank you all for your great replies!
What if I want to remove an E observation only when its Diag_Sequence is the same as that of the previous observation, and keep an E observation that has a different Diag_Sequence number? For example,
ID DX Diag_Sequence
1 96 1
1 E96 1 (remove)
1 87 2
1 E87 2 (remove)
1 V87 3
1 E23 4 (do not remove)
1 12 5
1 15 6
1 E15 6 (remove)
I am a new SAS user, thank you very much for helping.
Liang
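One possible sketch for this follow-up, assuming the rows are already sorted by ID and in the order shown, and that "previous observation" means the row directly above within the same ID. The names have2, want, and prev_seq are hypothetical:

```sas
/* Sample data from the follow-up question, without the annotations. */
data have2;
    input ID DX $ Diag_Sequence;
    datalines;
1 96 1
1 E96 1
1 87 2
1 E87 2
1 V87 3
1 E23 4
1 12 5
1 15 6
1 E15 6
;
run;

data want;
    set have2;
    by ID;
    /* Call LAG on every row, outside any IF, so its queue stays in step. */
    prev_seq = lag(Diag_Sequence);
    /* Do not compare against the last row of a different ID. */
    if first.ID then prev_seq = .;
    /* Delete an E code only when it repeats the previous row's sequence. */
    if dx =: 'E' and Diag_Sequence = prev_seq then delete;
    drop prev_seq;
run;
```

On the sample this removes E96, E87, and E15 and keeps E23, since its Diag_Sequence of 4 differs from the 3 on the row above it.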