BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
ybz12003
Rhodochrosite | Level 12

Hello,

I have dataset list below.

data have;
	infile datalines dsd;
	format ID 2. first_name $10. last_name $10. birthdate mmddyy10. enroll $1.;
 	input ID : 2. first_name : $10. last_name : $10. birthdate : mmddyy10. enroll : $1.;
datalines;
1,  John, Doe,   01/01/2020, Y,
12,  John, Doe,   01/01/2020, N,
2,  Joe,  Smith, 12/01/2019, Y,
22,  Joe,  Smith, 12/01/2019, Y,
27,  Joe,  Smith, 12/01/2019, Y,
32,  Kevin,  Mao, 02/06/2000, N,
36,  Kevin,  Yao, 02/06/2000, Y,
24,  Beky,   Lee, 08/23/2005, N,
41,  Beky,   Lee, 08/21/2005, N,
48,  Beky,   Lee, 08/23/2005, N,
49,  Beky,   Lee, 08/23/2005, Y,
15,	Ashley, Taylor, 09/18/2000, N,
54,	Ashley, Taylor, 09/18/2000, N,
65,	Ashley, Taylor, 09/18/2000, N,
;
run;

I would like to list all the unmatching names/dob/enroll as the result shown below.   Please advice how to approach it, thanks.

 

data want;
	infile datalines dsd;
	format ID 2. first_name $10. last_name $10. birthdate mmddyy10. enroll $1.;
 	input ID : 2. first_name : $10. last_name : $10. birthdate : mmddyy10. enroll : $1.;
datalines;
1,  John, Doe,   01/01/2020, Y,
12,  John, Doe,   01/01/2020, N,
27,  Joe,  Smith, 12/01/2019, Y,
32,  Kevin,  Mao, 02/06/2000, N,
36,  Kevin,  Yao, 02/06/2000, Y,
24,  Beky,   Lee, 08/23/2005, N,
41,  Beky,   Lee, 08/21/2005, N,
48,  Beky,   Lee, 08/23/2005, N,
49,  Beky,   Lee, 08/23/2005, Y,
;
run;
1 ACCEPTED SOLUTION

Accepted Solutions
novinosrin
Tourmaline | Level 20

Hi @ybz12003  Assuming I understand your requirement, the following should do?-


data have;
	infile datalines dsd;
	format ID 2. first_name $10. last_name $10. birthdate mmddyy10. enroll $1.;
 	input ID : 2. first_name : $10. last_name : $10. birthdate : mmddyy10. enroll : $1.;
datalines;
1,  John, Doe,   01/01/2020, Y,
12,  John, Doe,   01/01/2020, N,
2,  Joe,  Smith, 12/01/2019, Y,
22,  Joe,  Smith, 12/01/2019, Y,
27,  Joe,  Smith, 12/01/2019, Y,
32,  Kevin,  Mao, 02/06/2000, N,
36,  Kevin,  Yao, 02/06/2000, Y,
24,  Beky,   Lee, 08/23/2005, N,
41,  Beky,   Lee, 08/21/2005, N,
48,  Beky,   Lee, 08/23/2005, N,
49,  Beky,   Lee, 08/23/2005, Y,
15,	Ashley, Taylor, 09/18/2000, N,
54,	Ashley, Taylor, 09/18/2000, N,
65,	Ashley, Taylor, 09/18/2000, N,
;
run;


proc sql;
create table want as
select *
from have
group by first_name,last_name,birthdate,enroll
having count(*)=1;
quit;

View solution in original post

13 REPLIES 13
PaigeMiller
Diamond | Level 26

I would like to list all the unmatching names/dob/enroll as the result shown below.  

This is a really brief description, could you please elaborate and explain the logic you want to follow?

 

Also:

 

Why is Joe Smith in the output data set?

 

Why is this in the output data set? 

48,  Beky,   Lee, 08/23/2005, N,

 

--
Paige Miller
mazouz
Calcite | Level 5
PTOC SQL;
CREATE TABLE want AS
SELECT COUNT(*) AS nbr_doublon, firstname, lastname, birthdate,enroll
FROM have
GROUP BY firstname, lastname, birthdate,enroll
HAVING COUNT(*) =1;
QUIT;

ALTER TABLE want
DROP COLUMN nbr_doublon;
ybz12003
Rhodochrosite | Level 12
Your result dataset is missing ID which I would like to include the final dataset.
ybz12003
Rhodochrosite | Level 12
Sorry, my mistake. The '27, Joe, Smith, 12/01/2019, Y,' should not be in the final dataset.
mazouz
Calcite | Level 5

Try this, I don't have sas now to try the code

 

PROC SQL;
CREATE TABLE want AS
SELECT ID, firstname, lastname, birthdate,enroll,COUNT(firstname, lastname, birthdate,enroll) AS nbr_doublon
FROM have
GROUP BY firstname, lastname, birthdate,enroll
HAVING nbr_doublon =1;
QUIT;

ALTER TABLE want
DROP COLUMN nbr_doublon;

ybz12003
Rhodochrosite | Level 12
Not working, see below.

WARNING: Function COUNT requires at most 3 argument(s). The extra one(s) will be ignored.
ERROR: Function COUNT requires a character expression as argument 3.
novinosrin
Tourmaline | Level 20

Hi @ybz12003  Assuming I understand your requirement, the following should do?-


data have;
	infile datalines dsd;
	format ID 2. first_name $10. last_name $10. birthdate mmddyy10. enroll $1.;
 	input ID : 2. first_name : $10. last_name : $10. birthdate : mmddyy10. enroll : $1.;
datalines;
1,  John, Doe,   01/01/2020, Y,
12,  John, Doe,   01/01/2020, N,
2,  Joe,  Smith, 12/01/2019, Y,
22,  Joe,  Smith, 12/01/2019, Y,
27,  Joe,  Smith, 12/01/2019, Y,
32,  Kevin,  Mao, 02/06/2000, N,
36,  Kevin,  Yao, 02/06/2000, Y,
24,  Beky,   Lee, 08/23/2005, N,
41,  Beky,   Lee, 08/21/2005, N,
48,  Beky,   Lee, 08/23/2005, N,
49,  Beky,   Lee, 08/23/2005, Y,
15,	Ashley, Taylor, 09/18/2000, N,
54,	Ashley, Taylor, 09/18/2000, N,
65,	Ashley, Taylor, 09/18/2000, N,
;
run;


proc sql;
create table want as
select *
from have
group by first_name,last_name,birthdate,enroll
having count(*)=1;
quit;
ybz12003
Rhodochrosite | Level 12

Although the code is working, the 'Becky' only have two.   My final 'want' dataset have four 'Becky's.  I would like to have all of the IDs for 'Becky.'

mazouz
Calcite | Level 5

proc sql;
create table want1 as
select *
from have
group by first_name,last_name,birthdate,enroll
having count(*)=1 ;
quit;

 

proc sql;
create table want2 as
select *
from have
group by first_name,last_name,birthdate,enroll
having count(*)>1 ;
quit;

 

proc sql;
create table want as
select *
from want1 t1
left join want2 t2
on (t1.first_name=t2.first_name and t1.last_name=t2.last_name and t1.birthdate=t2.birthdate);
quit;


proc delete data=want1 want2 (gennum=all);
run;

PaigeMiller
Diamond | Level 26

Hello, @ybz12003, you've provided extremely brief information about the logic you are using, and we can't really understand the logic you want to use. Please take some time and describe in words, and in detail, what your logic is that takes you from the input data to the output data. These terribly brief statements of yours simply are not moving us forward.

--
Paige Miller
Ksharp
Super User

How do you know

32,  Kevin,  Mao, 02/06/2000, N,
36,  Kevin,  Yao, 02/06/2000, Y,

is the same person , but have different last name ?

 

Assuming the first name is the same for the same person .

data have;
	infile datalines dsd;
	format ID 2. first_name $10. last_name $10. birthdate mmddyy10. enroll $1.;
 	input ID : 2. first_name : $10. last_name : $10. birthdate : mmddyy10. enroll : $1.;
datalines;
1,  John, Doe,   01/01/2020, Y,
12,  John, Doe,   01/01/2020, N,
2,  Joe,  Smith, 12/01/2019, Y,
22,  Joe,  Smith, 12/01/2019, Y,
27,  Joe,  Smith, 12/01/2019, Y,
32,  Kevin,  Mao, 02/06/2000, N,
36,  Kevin,  Yao, 02/06/2000, Y,
24,  Beky,   Lee, 08/23/2005, N,
41,  Beky,   Lee, 08/21/2005, N,
48,  Beky,   Lee, 08/23/2005, N,
49,  Beky,   Lee, 08/23/2005, Y,
15,	Ashley, Taylor, 09/18/2000, N,
54,	Ashley, Taylor, 09/18/2000, N,
65,	Ashley, Taylor, 09/18/2000, N,
;
run;
proc sql;
create table want as
select *
 from have 
  group by first_name
   having count(distinct catx('|', last_name,birthdate,enroll )) ne 1;
quit;

 

ybz12003
Rhodochrosite | Level 12

Sorry for the confusion,   I am looking for the IDs whose any first_name/last_name/dob/enroll is different.   Therefore, when 'Joe Smith' and 'Ashley Taylor' have the same name, DOB and enroll, I don't need them.   All the rest are the IDs I am looking for. 

mazouz
Calcite | Level 5
proc sql;
create table want1 as
select *
from have
group by first_name,last_name,birthdate,enroll
having count(*)=1 ;
quit;



proc sql;
create table want2 as
select *
from have
group by first_name,last_name,birthdate,enroll
having count(*)>1 ;
quit;



proc sql;
create table want as
select *
from want1 t1
left join want2 t2
on (t1.first_name=t2.first_name and t1.last_name=t2.last_name );
quit;


proc delete data=want1 want2 (gennum=all);
run;

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 13 replies
  • 1379 views
  • 0 likes
  • 5 in conversation