Dear all,
How can I remove observations that contain a specific character?
For example, remove observations that contain E:
ID DX Diag_Sequence
1 96 1
1 E96 1
1 87 2
1 E87 2
1 V87 3
Best,
Liang
if find(dx,'E','i')>0 then delete;
It depends on how complicated it gets.
If it's a single value you can chain logic together, such as
if dx=:'E' and dx ne 'E23' then delete;
But if the logic gets too complex you need a different approach. Do you have those lists of what to include/exclude in a data set or are they in your head? If you make them into a data set an entirely different approach is warranted, which may be easier overall.
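As a rough sketch of the data set approach mentioned above: the exceptions go into their own table and the filter is driven from it. The data set names have, keep_codes and want are made up for illustration, and the only exception listed is the E23 from the example above.

```sas
/* Sample data in the shape of the original post (hypothetical name: have) */
data have;
    input ID DX $ Diag_Sequence;
    datalines;
1 96 1
1 E96 1
1 87 2
1 E23 2
1 V87 3
;
run;

/* Hypothetical list of E codes that must be kept */
data keep_codes;
    input DX $;
    datalines;
E23
;
run;

/* Keep every row that does not start with E, plus any E code on the keep list */
proc sql;
    create table want as
    select *
    from have
    where DX not like 'E%'
       or DX in (select DX from keep_codes);
quit;
```

With this layout, changing the rules means editing the rows of keep_codes rather than rewriting the IF logic.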
Hi, Reeza. Thanks for your comment.
Here is a screenshot of part of my dataset. Variable Diagnosis_Code may contain observations with E, for example, E9505.
For this E9505, if its Diagnosis_Sequence is the same as the previous observation's, then delete this E9505 observation. But if its Diagnosis_Sequence is not the same as the previous one, keep it. How can I code this?
Best,
Liang
@Jackwangliang that's not at all like your initial question.
Please post your data as text, along with what you would expect as output; I won't type out your data. Since this is so different from your original question, I would suggest you start a new thread.
proc sql;
delete from table_name
where DX contains 'E';
quit;
if dx =: 'E' then delete; *removes all dx that start with E;
if find(dx, 'E', 'i')>0 then delete; *removes all dx that contain an E, ignoring case;
Use FIND to search for an E anywhere in the value. There's a possibility I have the parameters backwards in the FIND() function, so you should verify that.
Or use =:, which checks whether the first characters are the same.
Thank you all for your great replies!
What if I want to remove an E code only when its Diag_Sequence is the same as the previous observation's, but keep an E code whose Diag_Sequence is different? For example,
ID DX Diag_Sequence
1 96 1
1 E96 1 (remove)
1 87 2
1 E87 2 (remove)
1 V87 3
1 E23 4 (do not remove)
1 12 5
1 15 6
1 E15 6 (remove)
I am a new SAS user, thank you very much for helping.
Liang
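One possible way to approach this, offered as a sketch rather than a tested answer: treat each run of equal Diag_Sequence values within an ID as a BY group, and delete an E code only when it is not the first row of its group. It assumes the data are already in the order shown, that "contains E" here means "starts with E", and the data set names have and want are invented for the example.

```sas
/* Sample data copied from the post above (hypothetical name: have) */
data have;
    input ID DX $ Diag_Sequence;
    datalines;
1 96 1
1 E96 1
1 87 2
1 E87 2
1 V87 3
1 E23 4
1 12 5
1 15 6
1 E15 6
;
run;

data want;
    set have;
    /* NOTSORTED groups consecutive rows with the same ID and Diag_Sequence */
    by ID Diag_Sequence notsorted;
    /* Drop an E code only if an earlier row shares its Diag_Sequence */
    if DX =: 'E' and not first.Diag_Sequence then delete;
run;
```

On the sample rows this should drop E96, E87 and E15 and keep E23, because E23 is the first row with Diag_Sequence 4.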