Hello,
Can anyone provide a code that can help me remove multiple occurences from a certain field, if a condition is met.
To be more clear, here is an exampl of my data set:
Member_ID Effective_date End _date Reason
1234 01JUL2019 31AUG2019 Backed out
1234 01SEP2019 31OCT2019 Re-enrolled
1234 01SEP2019 31OCT2019 Continuous enrollment
2345 01JUL2019 31JUL2019 Cancelled membership
3456 01SEP2019 30SEP2019 Cancelled membership
4567 01JUL2019 31JU2019 Backed out
In the above sample data,
I want to count number of occurences for member_ID and if there occurs more than 1 occurence of member_ID, I want to compare the effective_dates for that ID.
Condition: If the effective_date is repeated for each member_ID in the occurences, then remove the duplicates.
Final output that I expect:
Member_ID Effective_date End _date Reason
1234 01JUL2019 31AUG2019 Backed out
1234 01SEP2019 31OCT2019 Continuous enrollment
2345 01JUL2019 31JUL2019 Cancelled membership
3456 01SEP2019 30SEP2019 Cancelled membership
4567 01JUL2019 31JU2019 Backed out
Thank you
Update: Include additional data variables for clarity.
I want to be able to choose which occurence to keep if there are multiple occurences. In the above example, I chose to keep the record that has the reason as "Continuous Enrollment".
Hope this helps in helping me
Proc SQL perhaps?
proc sql;
create table want as
select *
from have
group by Member_ID,Effective_date
having count(*)=1 or count(*)>1 and Reason='Continuous enrollment';
quit;
HI @sas1011 Can you please post your sample as a table format as one instead of one column followed by another? Thank you!
Sorry. I edited it now
Here are instructions on how to provide sample data as a data step:
https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat...
data have;
input Member_ID Effective_date :date9. End_Date :date9.;
format Effective_date date9. End_Date date9.;
datalines;
1234 01JUL2019 31AUG2019
1234 01SEP2019 31OCT2019
1234 01SEP2019 31OCT2019
2345 01JUL2019 31JUL2019
3456 01SEP2019 30SEP2019
4567 01JUL2019 31JUL2019
;
proc sort data=have out=want nodupkey dupout = want_dup;
by Member_ID Effective_date;
run;
Thank you for the response - Please see comment by @mkeintz : My situation is more realted towards the 2nd statement.
What if:
The "proc sort nodupkey" strategy works only if neither of the 2 conditions above occur.
Yes - i want to compare only the member_id and the effective_date. There are other variables that vary in my dataset but if the member_id and the dates are repeated, then I want only one occurence of that record.
Thank you for pointing that out.
In that case, I think @maguiremq 's solution should do. Am i right?
@sas1011 wrote:
Yes - i want to compare only the member_id and the effective_date. There are other variables that vary in my dataset but if the member_id and the dates are repeated, then I want only one occurence of that record.
Thank you for pointing that out.
We're going to need more data, it seems.
I edited my question to include additional info.
Pardon me for not being clear - I've not posted much in this community and am still learning.
Thank you
Proc SQL perhaps?
proc sql;
create table want as
select *
from have
group by Member_ID,Effective_date
having count(*)=1 or count(*)>1 and Reason='Continuous enrollment';
quit;
This did it. Thank you 😄
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.