I have a data set that monitors a member ID and their enrollment dates. There are scenarios where a member can have enrollment in two different C_ID's. I'm trying to summarize this scenario and capture the MEMID's only that fall into this issue. I've tried a few SQL statements with count(*) as rcrds and grouping by MEMID & C_ID's but for some reason I can't figure this out.
proc sql;
create table TEST (MEMID num, CLNDR_CY_MO_NUM num, C_ID char(3));
insert into test
values(123, 201912, 'H41')
values(123, 202001, 'H41')
values(123, 202002, 'H41')
values(123, 202003, 'H41')
values(123, 202004, 'H41')
values(123, 202005, 'H41')
values(123, 202006, 'H41')
values(123, 202007, 'H05')
values(123, 202007, 'H41')
values(124, 201912, 'H51')
values(124, 202001, 'H51')
values(124, 202002, 'H51')
values(124, 202003, 'H51')
values(124, 202004, 'H51')
;
run;
/* I only want to see MEMID's who have two c_ID's for the same CLNDR_CY_MO_NUM */
proc sql;
create table want (MEMID num);
insert into want
values(123);
run;
Logic I need is where C_ID not = C_ID but CLNDR_CY_MO_NUM = CLND_CY_MO_NUM by MEMID
I want to be able to summarize / identify all MEMID's that fall into this scenario
proc sort data=inputdata;
by memid cid;
run;
data want;
set inputdata;
by memid cid;
if first.memid and last.memid then delete;
run;
proc sql;
select memid , count(memid) as count
from inputdata
group by memid
having count > 1;
quit;
proc sort data=inputdata nodupkey out=subset;
by memid cid;
run;
data want;
set subset;
by memid cid;
if first.memid and last.memid then delete;
run;
proc sql;
select memid , count(memid) as count
from (select distinct memid,cid from inputdata)
group by memid
having count > 1;
quit;
@smantha Thank you for this, it did produce results I was in favor of. I actually found my issue when trying the code out again.
proc sql;
create table b_want as
select memid,
clndr_cy_mo_num,
c_id
from TEST
group by 1,2
having count (clndr_cy_mo_num) >1;
quit;
Show the expected output for the given example data.
It helps to only include variables that are needed for the problem.
And it is best to provide data as data step code to recreate your data so we could test actual code.
As it is I cannot tell the types of any of the variables much less if your dates are actually SAS date values which is likely an important part of this.
Instructions here: https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat... will show how to turn an existing SAS data set into data step code that can be pasted into a forum code box using the </> icon or attached as text to show exactly what you have and that we can test code against.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.