Help using Base SAS procedures

combining datasets

Accepted Solution Solved
Reply
Frequent Contributor
Posts: 110
Accepted Solution

combining datasets

SAS community,

I would like to combine databases which includes all eligible entries from group 1 (where N ge 1 and X le 5) and group 2 (where N ge 1 and X gt 5).  At the bottom is a sample dataset and a beginning code.  The problem with my current code is that it won't include all entries for the same ID.

The final dataset should be resemble the dataset below:

data want;

ID X N

A 1 5

A 7 4

E 3 4

E 5 2

E 7 3

Thanks!

data have;

input ID $ X N;

datalines;

A 1 5

A 5 0

A 7 4

B 1 1

C 7 0

D 9 4

D 10 1

E 3 4

E 5 2

E 7 3

;;

run;

data group_1;

    set have;

if N ge 1 and X le 5;

run;

data group_2;

    set have;

if N ge 1 and X gt 5;

run;

proc sort data=group_1; by ID;

proc sort data=group_2; by ID;

quit;

data all;

merge     area_1 (in=a)

        area_2 (in=b);

by id;

if a and b;

run;


Accepted Solutions
Solution
‎09-13-2012 01:52 PM
Super User
Posts: 19,832

Re: combining datasets

Posted in reply to sophia_SAS

Can you use proc sql? (not tested)

proc sql;

     create table want as

select * from

(select * from group1 where id in (select distinct id from group2)

union

select * from group2 where id in (select distinct id from group1)) a;

quit;

View solution in original post


All Replies
Super User
Posts: 19,832

Re: combining datasets

Posted in reply to sophia_SAS

Why isn't B included?

It meets the criteria for Group 1 - N Ge 1 and X LE 5

Frequent Contributor
Posts: 110

Re: combining datasets

Because the ID has to be in both Group 1 and Group 2 in order to be in the final dataset.  B only satisfies the criteria for group 1.

Solution
‎09-13-2012 01:52 PM
Super User
Posts: 19,832

Re: combining datasets

Posted in reply to sophia_SAS

Can you use proc sql? (not tested)

proc sql;

     create table want as

select * from

(select * from group1 where id in (select distinct id from group2)

union

select * from group2 where id in (select distinct id from group1)) a;

quit;

Frequent Contributor
Posts: 110

Re: combining datasets

I think that worked!  Thanks

Super User
Posts: 5,513

Re: combining datasets

Posted in reply to sophia_SAS

Here's one way:

proc sort data=have;

by id;

run;

data group_1 group_2;

   set have;

   if N ge 1;

   if X le 5 then output group_1;

   else output group_2;

   keep id;

run;

data want;

   merge have group_1 (in=in1) group_2 (in=in2);

   by id;

   if in1 and in2;

run;

You may get messages related to a many-to-many merge, but they are harmless in this particular case.  (Most any other time, you would definitely want to investigate if you got this type of message.)  You could eliminate those messages by re-sorting group_1 and group_2 before merging:

proc sort data=group_1 NODUPKEY;

   by id;

run;

But for this particular case it won't hurt to skip this because HAVE will have at least as many observations for each ID as the other two data sets have.

Good luck.

Super User
Posts: 10,041

Re: combining datasets

Posted in reply to sophia_SAS

It is so weird.

data have;
input ID $ X N;
datalines;
A 1 5
A 5 0
A 7 4
B 1 1
C 7 0
D 9 4
D 10 1
E 3 4
E 5 2
E 7 3
;
run;
proc sql;
create table want as
 select *
  from have
   where n ge 1
    group by id
     having sum(x le 5) gt 0 and sum(X gt 5) gt 0;
quit;

Ksharp

🔒 This topic is solved and locked.

Need further help from the community? Please ask a new question.

Discussion stats
  • 6 replies
  • 228 views
  • 3 likes
  • 4 in conversation