Hello,
Can anyone provide a code that can help me remove multiple occurences from a certain field, if a condition is met.
To be more clear, here is an exampl of my data set:
Member_ID Effective_date End _date Reason
1234 01JUL2019 31AUG2019 Backed out
1234 01SEP2019 31OCT2019 Re-enrolled
1234 01SEP2019 31OCT2019 Continuous enrollment
2345 01JUL2019 31JUL2019 Cancelled membership
3456 01SEP2019 30SEP2019 Cancelled membership
4567 01JUL2019 31JU2019 Backed out
In the above sample data,
I want to count number of occurences for member_ID and if there occurs more than 1 occurence of member_ID, I want to compare the effective_dates for that ID.
Condition: If the effective_date is repeated for each member_ID in the occurences, then remove the duplicates.
Final output that I expect:
Member_ID Effective_date End _date Reason
1234 01JUL2019 31AUG2019 Backed out
1234 01SEP2019 31OCT2019 Continuous enrollment
2345 01JUL2019 31JUL2019 Cancelled membership
3456 01SEP2019 30SEP2019 Cancelled membership
4567 01JUL2019 31JU2019 Backed out
Thank you
Update: Include additional data variables for clarity.
I want to be able to choose which occurence to keep if there are multiple occurences. In the above example, I chose to keep the record that has the reason as "Continuous Enrollment".
Hope this helps in helping me
Proc SQL perhaps?
proc sql;
create table want as
select *
from have
group by Member_ID,Effective_date
having count(*)=1 or count(*)>1 and Reason='Continuous enrollment';
quit;
HI @sas1011 Can you please post your sample as a table format as one instead of one column followed by another? Thank you!
Sorry. I edited it now
Here are instructions on how to provide sample data as a data step:
https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat...
data have;
input Member_ID Effective_date :date9. End_Date :date9.;
format Effective_date date9. End_Date date9.;
datalines;
1234 01JUL2019 31AUG2019
1234 01SEP2019 31OCT2019
1234 01SEP2019 31OCT2019
2345 01JUL2019 31JUL2019
3456 01SEP2019 30SEP2019
4567 01JUL2019 31JUL2019
;
proc sort data=have out=want nodupkey dupout = want_dup;
by Member_ID Effective_date;
run;
Thank you for the response - Please see comment by @mkeintz : My situation is more realted towards the 2nd statement.
What if:
The "proc sort nodupkey" strategy works only if neither of the 2 conditions above occur.
Yes - i want to compare only the member_id and the effective_date. There are other variables that vary in my dataset but if the member_id and the dates are repeated, then I want only one occurence of that record.
Thank you for pointing that out.
In that case, I think @maguiremq 's solution should do. Am i right?
@sas1011 wrote:
Yes - i want to compare only the member_id and the effective_date. There are other variables that vary in my dataset but if the member_id and the dates are repeated, then I want only one occurence of that record.
Thank you for pointing that out.
We're going to need more data, it seems.
I edited my question to include additional info.
Pardon me for not being clear - I've not posted much in this community and am still learning.
Thank you
Proc SQL perhaps?
proc sql;
create table want as
select *
from have
group by Member_ID,Effective_date
having count(*)=1 or count(*)>1 and Reason='Continuous enrollment';
quit;
This did it. Thank you 😄
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.