BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
sophia_SAS
Obsidian | Level 7

SAS community,

I would like to combine databases which includes all eligible entries from group 1 (where N ge 1 and X le 5) and group 2 (where N ge 1 and X gt 5).  At the bottom is a sample dataset and a beginning code.  The problem with my current code is that it won't include all entries for the same ID.

The final dataset should be resemble the dataset below:

data want;

ID X N

A 1 5

A 7 4

E 3 4

E 5 2

E 7 3

Thanks!

data have;

input ID $ X N;

datalines;

A 1 5

A 5 0

A 7 4

B 1 1

C 7 0

D 9 4

D 10 1

E 3 4

E 5 2

E 7 3

;;

run;

data group_1;

    set have;

if N ge 1 and X le 5;

run;

data group_2;

    set have;

if N ge 1 and X gt 5;

run;

proc sort data=group_1; by ID;

proc sort data=group_2; by ID;

quit;

data all;

merge     area_1 (in=a)

        area_2 (in=b);

by id;

if a and b;

run;

1 ACCEPTED SOLUTION

Accepted Solutions
Reeza
Super User

Can you use proc sql? (not tested)

proc sql;

     create table want as

select * from

(select * from group1 where id in (select distinct id from group2)

union

select * from group2 where id in (select distinct id from group1)) a;

quit;

View solution in original post

6 REPLIES 6
Reeza
Super User

Why isn't B included?

It meets the criteria for Group 1 - N Ge 1 and X LE 5

sophia_SAS
Obsidian | Level 7

Because the ID has to be in both Group 1 and Group 2 in order to be in the final dataset.  B only satisfies the criteria for group 1.

Reeza
Super User

Can you use proc sql? (not tested)

proc sql;

     create table want as

select * from

(select * from group1 where id in (select distinct id from group2)

union

select * from group2 where id in (select distinct id from group1)) a;

quit;

sophia_SAS
Obsidian | Level 7

I think that worked!  Thanks

Astounding
PROC Star

Here's one way:

proc sort data=have;

by id;

run;

data group_1 group_2;

   set have;

   if N ge 1;

   if X le 5 then output group_1;

   else output group_2;

   keep id;

run;

data want;

   merge have group_1 (in=in1) group_2 (in=in2);

   by id;

   if in1 and in2;

run;

You may get messages related to a many-to-many merge, but they are harmless in this particular case.  (Most any other time, you would definitely want to investigate if you got this type of message.)  You could eliminate those messages by re-sorting group_1 and group_2 before merging:

proc sort data=group_1 NODUPKEY;

   by id;

run;

But for this particular case it won't hurt to skip this because HAVE will have at least as many observations for each ID as the other two data sets have.

Good luck.

Ksharp
Super User

It is so weird.

data have;
input ID $ X N;
datalines;
A 1 5
A 5 0
A 7 4
B 1 1
C 7 0
D 9 4
D 10 1
E 3 4
E 5 2
E 7 3
;
run;
proc sql;
create table want as
 select *
  from have
   where n ge 1
    group by id
     having sum(x le 5) gt 0 and sum(X gt 5) gt 0;
quit;

Ksharp

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 6 replies
  • 864 views
  • 3 likes
  • 4 in conversation