BookmarkSubscribeRSS Feed
ybz12003
Rhodochrosite | Level 12
proc sql;
	create table want as 
	select lab as a
	       park as b
	from a b 
	where ((a.first_name=b.first_name and a.last_name=b.last_name)+
			(a.first_name=b.first_name and a.birthdate=b.DOB)+
	         (a.last_name=b.last_name and b.DOB)) >1 ;
quit;

Hello,

 

I have a dataset A with 50 variables with 2000 rows, dataset B with 10 variables with 300 rows.  Both dataset have same name, birthdate and sex variables.  As long as these two dataset have same (name and birthdate) or same (sex and name) or same (sex and birthday) have same value, I link those data together first.  Thank you.

16 REPLIES 16
ChrisNZ
Tourmaline | Level 20

Like this?

proc sql;
  select A.*
  from A,B
  where (A.NAME=B.NAME and A.DATE = B.DATE)
     or (A.NAME=B.NAME and A.SEX  = B.SEX )
     or (A.DATE=B.DATE and A.SEX  = B.SEX );

 

ybz12003
Rhodochrosite | Level 12

I use VERY similar code, some how the result dataset only comes with dataset A variables, no dataset B variables.

Tom
Super User Tom
Super User

A dataset can only one variable of a given name.  So if you do select a.NAME,b.NAME you don't get two name variables, just one.  In particular PROC SQL will pick the first one and ignore the second one.  You need to give them unique names.

Tom
Super User Tom
Super User

SAS evaluates boolean expressions as 1 or 0.  So just sum them.

where sum( a.name=b.name, a.bday=b.bday, a.sex=b.sex )  > 1
ybz12003
Rhodochrosite | Level 12

I still only get set 1 variables.

Tom
Super User Tom
Super User
select a.name as nameA
     , b.name as nameB
...
ybz12003
Rhodochrosite | Level 12

 

 

Error: Summary functions are restricted to the SELECT and HAVING clauses only

ChrisNZ
Tourmaline | Level 20

Try using the + sign instead of the sum function.

SQL is pickier than a data step,

Otherwise, try my syntax.

Tom
Super User Tom
Super User

@ybz12003 wrote:

Error: Summary functions are restricted to the SELECT and HAVING clauses only


Don't use SQL aggregate functions. Use the SAS sum(,) function like in the code I posted.  Or just type the + operators instead.

ballardw
Super User

@ybz12003 wrote:

 

 

Error: Summary functions are restricted to the SELECT and HAVING clauses only


Since your example code at the start of this thread did not include any summary functions then perhaps it is time to show the actual code you are using when generating errors.

SASKiwi
PROC Star

I would advise joining by all three conditions first as that is the most reliable. The next most reliable would be birth date and name followed by sex and name as you have described. SQL is a good choice for this:

proc sql;
  create table want as
  select  A.name
            ,A.birthdate
            ,A.sex
            ,B.name
            ,B.birthdate
            ,B.sex
            ,'1 Name, Birthdate, Sex' as jointype length = 20
  from A
  left join B
  on A.name = B.name
  and A.birthdate = B.birthdate
  and A.sex = B.sex
  union all
  select  A.name
            ,A.birthdate
            ,A.sex
            ,B.name
            ,B.birthdate
            ,B.sex
            ,'2 Name, Birthdate' as jointype length = 20
  from A
  left join B
  on A.name = B.name
  and A.birthdate = B.birthdate
  union all
  select  A.name
            ,A.birthdate
            ,A.sex
            ,B.name
            ,B.birthdate
            ,B.sex
            ,'3 Name, Sex' as jointype length = 20
  from A
  left join B
  on A.name = B.name
  and A.sex = B.sex
  order by Name, jointype
  ;
quit;
   

You will end up with multiple join types for each name so then just choose the one with the lowest number.

ybz12003
Rhodochrosite | Level 12

sum didn't help,

ybz12003
Rhodochrosite | Level 12

I have been trying use sum or +, I still only get dataset 1 variables.  Not including dataset 2 variables.

ybz12003
Rhodochrosite | Level 12

Could someone let me know where my code went wrong, I keep getting dataset a variables.  That's it.  Where is the set b?

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!

Submit your idea!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 16 replies
  • 1523 views
  • 3 likes
  • 5 in conversation