BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
stephcl
Calcite | Level 5

set1

id1          id2          id3

1              1              3

2              2              2

3              3              3

 

set2

id1          id2          id3

1              2              2

1              2              3

1              2              4

 

the resulting union should be

id1          id2          id3

1              1              3

1              2              3

2              2              2

3              3              3

 

the groups are via id1

i.e. consider the group from set1 and set2 with id1=1

then look at each observation in set2 such that set1.id3 <= set2.id3

left join all matches in set2 to the resulting dataset

I only care about the next one if any (>=) observation in set2.

i.e. set2 should be ordered by id1, id2, then id3.

The resulting set should only contain the lowest (1st of) observation (via id3) in set2 that has a matching id1 in set1.

 

As you can see

1              2              4

Isn’t in the result

Since we already had

1              2              3

From set2 and 3 is less than 4.

1 ACCEPTED SOLUTION

Accepted Solutions
PhilC
Rhodochrosite | Level 12

Forget SORT.  I think Kurt is on the right track.  Here's my go;

 

Concepts to learn, Do-Whitlock loops and Set interleaving...

 

data set1 (sortedby=id1 id2 id3 index=(id1 /unique));
  input id1 id2 id3;
datalines;
1              1              3
2              2              2
3              3              3
;
data set2(sortedby=id1 id2 id3 );
  input id1 id2 id3;
datalines;
1              2              2
1              2              3
1              2              4
;
run;
data want;
  _lowest_case=1;
  do until (last.id1);
    set set1 set2;
      by id1;
    if first.id1 then do;
      _Set1_id3=id3;
      output;
      end;
    else 
      if _Set1_id3 <=id3 and _lowest_case then do;
        output;
        _lowest_case=0; /*lowest case is first matching instance on Set2*/
      end;
  end;
  drop _:;
run;

I assume id1 on SET1  is unique, am I correct?

View solution in original post

5 REPLIES 5
PhilC
Rhodochrosite | Level 12

You may want a to use the SORT function, perhaps?  Not PROC SQL.

 

SAS Help Center: SORT Function

stephcl
Calcite | Level 5
that never occurred to me. I don't see how sort could do this type of thing. please lmk how sort would union 2 datasets and select only 1 observation from the second dataset if it exists under the conditions described above.
PhilC
Rhodochrosite | Level 12

Forget SORT.  I think Kurt is on the right track.  Here's my go;

 

Concepts to learn, Do-Whitlock loops and Set interleaving...

 

data set1 (sortedby=id1 id2 id3 index=(id1 /unique));
  input id1 id2 id3;
datalines;
1              1              3
2              2              2
3              3              3
;
data set2(sortedby=id1 id2 id3 );
  input id1 id2 id3;
datalines;
1              2              2
1              2              3
1              2              4
;
run;
data want;
  _lowest_case=1;
  do until (last.id1);
    set set1 set2;
      by id1;
    if first.id1 then do;
      _Set1_id3=id3;
      output;
      end;
    else 
      if _Set1_id3 <=id3 and _lowest_case then do;
        output;
        _lowest_case=0; /*lowest case is first matching instance on Set2*/
      end;
  end;
  drop _:;
run;

I assume id1 on SET1  is unique, am I correct?

stephcl
Calcite | Level 5
thx so much, that did the trick. time to learn more about coding i guess
Kurt_Bremser
Super User

Try this:

data set1;
input id1 id2 id3;
datalines;
1              1              3
2              2              2
3              3              3
;

data set2;
input id1 id2 id3;
datalines;
1              2              2
1              2              3
1              2              4
;

data want;
set
  set1 (in=in1)
  set2
;
by id1;
retain id3_1;
if first.id1 then id3_1 = .;
if in1
then id3_1 = id3;
else /* second dataset */ if id3 ge id3_1; /* subsetting if */
drop id3_1;
run;

Untested, posted from my tablet.

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 5 replies
  • 1018 views
  • 0 likes
  • 3 in conversation