Hello,
I have dataset list below.
data have; infile datalines dsd; format ID 2. first_name $10. last_name $10. birthdate mmddyy10. enroll $1.; input ID : 2. first_name : $10. last_name : $10. birthdate : mmddyy10. enroll : $1.; datalines; 1, John, Doe, 01/01/2020, Y, 12, John, Doe, 01/01/2020, N, 2, Joe, Smith, 12/01/2019, Y, 22, Joe, Smith, 12/01/2019, Y, 27, Joe, Smith, 12/01/2019, Y, 32, Kevin, Mao, 02/06/2000, N, 36, Kevin, Yao, 02/06/2000, Y, 24, Beky, Lee, 08/23/2005, N, 41, Beky, Lee, 08/21/2005, N, 48, Beky, Lee, 08/23/2005, N, 49, Beky, Lee, 08/23/2005, Y, 15, Ashley, Taylor, 09/18/2000, N, 54, Ashley, Taylor, 09/18/2000, N, 65, Ashley, Taylor, 09/18/2000, N, ; run;
I would like to list all the unmatching names/dob/enroll as the result shown below. Please advice how to approach it, thanks.
data want; infile datalines dsd; format ID 2. first_name $10. last_name $10. birthdate mmddyy10. enroll $1.; input ID : 2. first_name : $10. last_name : $10. birthdate : mmddyy10. enroll : $1.; datalines; 1, John, Doe, 01/01/2020, Y, 12, John, Doe, 01/01/2020, N, 27, Joe, Smith, 12/01/2019, Y, 32, Kevin, Mao, 02/06/2000, N, 36, Kevin, Yao, 02/06/2000, Y, 24, Beky, Lee, 08/23/2005, N, 41, Beky, Lee, 08/21/2005, N, 48, Beky, Lee, 08/23/2005, N, 49, Beky, Lee, 08/23/2005, Y, ; run;
Hi @ybz12003 Assuming I understand your requirement, the following should do?-
data have;
infile datalines dsd;
format ID 2. first_name $10. last_name $10. birthdate mmddyy10. enroll $1.;
input ID : 2. first_name : $10. last_name : $10. birthdate : mmddyy10. enroll : $1.;
datalines;
1, John, Doe, 01/01/2020, Y,
12, John, Doe, 01/01/2020, N,
2, Joe, Smith, 12/01/2019, Y,
22, Joe, Smith, 12/01/2019, Y,
27, Joe, Smith, 12/01/2019, Y,
32, Kevin, Mao, 02/06/2000, N,
36, Kevin, Yao, 02/06/2000, Y,
24, Beky, Lee, 08/23/2005, N,
41, Beky, Lee, 08/21/2005, N,
48, Beky, Lee, 08/23/2005, N,
49, Beky, Lee, 08/23/2005, Y,
15, Ashley, Taylor, 09/18/2000, N,
54, Ashley, Taylor, 09/18/2000, N,
65, Ashley, Taylor, 09/18/2000, N,
;
run;
proc sql;
create table want as
select *
from have
group by first_name,last_name,birthdate,enroll
having count(*)=1;
quit;
I would like to list all the unmatching names/dob/enroll as the result shown below.
This is a really brief description, could you please elaborate and explain the logic you want to follow?
Also:
Why is Joe Smith in the output data set?
Why is this in the output data set?
48, Beky, Lee, 08/23/2005, N,
Try this, I don't have sas now to try the code
PROC SQL;
CREATE TABLE want AS
SELECT ID, firstname, lastname, birthdate,enroll,COUNT(firstname, lastname, birthdate,enroll) AS nbr_doublon
FROM have
GROUP BY firstname, lastname, birthdate,enroll
HAVING nbr_doublon =1;
QUIT;
ALTER TABLE want
DROP COLUMN nbr_doublon;
Hi @ybz12003 Assuming I understand your requirement, the following should do?-
data have;
infile datalines dsd;
format ID 2. first_name $10. last_name $10. birthdate mmddyy10. enroll $1.;
input ID : 2. first_name : $10. last_name : $10. birthdate : mmddyy10. enroll : $1.;
datalines;
1, John, Doe, 01/01/2020, Y,
12, John, Doe, 01/01/2020, N,
2, Joe, Smith, 12/01/2019, Y,
22, Joe, Smith, 12/01/2019, Y,
27, Joe, Smith, 12/01/2019, Y,
32, Kevin, Mao, 02/06/2000, N,
36, Kevin, Yao, 02/06/2000, Y,
24, Beky, Lee, 08/23/2005, N,
41, Beky, Lee, 08/21/2005, N,
48, Beky, Lee, 08/23/2005, N,
49, Beky, Lee, 08/23/2005, Y,
15, Ashley, Taylor, 09/18/2000, N,
54, Ashley, Taylor, 09/18/2000, N,
65, Ashley, Taylor, 09/18/2000, N,
;
run;
proc sql;
create table want as
select *
from have
group by first_name,last_name,birthdate,enroll
having count(*)=1;
quit;
Although the code is working, the 'Becky' only have two. My final 'want' dataset have four 'Becky's. I would like to have all of the IDs for 'Becky.'
proc sql;
create table want1 as
select *
from have
group by first_name,last_name,birthdate,enroll
having count(*)=1 ;
quit;
proc sql;
create table want2 as
select *
from have
group by first_name,last_name,birthdate,enroll
having count(*)>1 ;
quit;
proc sql;
create table want as
select *
from want1 t1
left join want2 t2
on (t1.first_name=t2.first_name and t1.last_name=t2.last_name and t1.birthdate=t2.birthdate);
quit;
proc delete data=want1 want2 (gennum=all);
run;
Hello, @ybz12003, you've provided extremely brief information about the logic you are using, and we can't really understand the logic you want to use. Please take some time and describe in words, and in detail, what your logic is that takes you from the input data to the output data. These terribly brief statements of yours simply are not moving us forward.
How do you know
32, Kevin, Mao, 02/06/2000, N, 36, Kevin, Yao, 02/06/2000, Y,
is the same person , but have different last name ?
Assuming the first name is the same for the same person .
data have; infile datalines dsd; format ID 2. first_name $10. last_name $10. birthdate mmddyy10. enroll $1.; input ID : 2. first_name : $10. last_name : $10. birthdate : mmddyy10. enroll : $1.; datalines; 1, John, Doe, 01/01/2020, Y, 12, John, Doe, 01/01/2020, N, 2, Joe, Smith, 12/01/2019, Y, 22, Joe, Smith, 12/01/2019, Y, 27, Joe, Smith, 12/01/2019, Y, 32, Kevin, Mao, 02/06/2000, N, 36, Kevin, Yao, 02/06/2000, Y, 24, Beky, Lee, 08/23/2005, N, 41, Beky, Lee, 08/21/2005, N, 48, Beky, Lee, 08/23/2005, N, 49, Beky, Lee, 08/23/2005, Y, 15, Ashley, Taylor, 09/18/2000, N, 54, Ashley, Taylor, 09/18/2000, N, 65, Ashley, Taylor, 09/18/2000, N, ; run; proc sql; create table want as select * from have group by first_name having count(distinct catx('|', last_name,birthdate,enroll )) ne 1; quit;
Sorry for the confusion, I am looking for the IDs whose any first_name/last_name/dob/enroll is different. Therefore, when 'Joe Smith' and 'Ashley Taylor' have the same name, DOB and enroll, I don't need them. All the rest are the IDs I am looking for.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.