BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
stephcl
Calcite | Level 5

set1

id1          id2          id3

1              1              3

2              2              2

3              3              3

 

set2

id1          id2          id3

1              2              2

1              2              3

1              2              4

 

the resulting union should be

id1          id2          id3

1              1              3

1              2              3

2              2              2

3              3              3

 

the groups are via id1

i.e. consider the group from set1 and set2 with id1=1

then look at each observation in set2 such that set1.id3 <= set2.id3

left join all matches in set2 to the resulting dataset

I only care about the next one if any (>=) observation in set2.

i.e. set2 should be ordered by id1, id2, then id3.

The resulting set should only contain the lowest (1st of) observation (via id3) in set2 that has a matching id1 in set1.

 

As you can see

1              2              4

Isn’t in the result

Since we already had

1              2              3

From set2 and 3 is less than 4.

1 ACCEPTED SOLUTION

Accepted Solutions
PhilC
Rhodochrosite | Level 12

Forget SORT.  I think Kurt is on the right track.  Here's my go;

 

Concepts to learn, Do-Whitlock loops and Set interleaving...

 

data set1 (sortedby=id1 id2 id3 index=(id1 /unique));
  input id1 id2 id3;
datalines;
1              1              3
2              2              2
3              3              3
;
data set2(sortedby=id1 id2 id3 );
  input id1 id2 id3;
datalines;
1              2              2
1              2              3
1              2              4
;
run;
data want;
  _lowest_case=1;
  do until (last.id1);
    set set1 set2;
      by id1;
    if first.id1 then do;
      _Set1_id3=id3;
      output;
      end;
    else 
      if _Set1_id3 <=id3 and _lowest_case then do;
        output;
        _lowest_case=0; /*lowest case is first matching instance on Set2*/
      end;
  end;
  drop _:;
run;

I assume id1 on SET1  is unique, am I correct?

View solution in original post

5 REPLIES 5
PhilC
Rhodochrosite | Level 12

You may want a to use the SORT function, perhaps?  Not PROC SQL.

 

SAS Help Center: SORT Function

stephcl
Calcite | Level 5
that never occurred to me. I don't see how sort could do this type of thing. please lmk how sort would union 2 datasets and select only 1 observation from the second dataset if it exists under the conditions described above.
PhilC
Rhodochrosite | Level 12

Forget SORT.  I think Kurt is on the right track.  Here's my go;

 

Concepts to learn, Do-Whitlock loops and Set interleaving...

 

data set1 (sortedby=id1 id2 id3 index=(id1 /unique));
  input id1 id2 id3;
datalines;
1              1              3
2              2              2
3              3              3
;
data set2(sortedby=id1 id2 id3 );
  input id1 id2 id3;
datalines;
1              2              2
1              2              3
1              2              4
;
run;
data want;
  _lowest_case=1;
  do until (last.id1);
    set set1 set2;
      by id1;
    if first.id1 then do;
      _Set1_id3=id3;
      output;
      end;
    else 
      if _Set1_id3 <=id3 and _lowest_case then do;
        output;
        _lowest_case=0; /*lowest case is first matching instance on Set2*/
      end;
  end;
  drop _:;
run;

I assume id1 on SET1  is unique, am I correct?

stephcl
Calcite | Level 5
thx so much, that did the trick. time to learn more about coding i guess
Kurt_Bremser
Super User

Try this:

data set1;
input id1 id2 id3;
datalines;
1              1              3
2              2              2
3              3              3
;

data set2;
input id1 id2 id3;
datalines;
1              2              2
1              2              3
1              2              4
;

data want;
set
  set1 (in=in1)
  set2
;
by id1;
retain id3_1;
if first.id1 then id3_1 = .;
if in1
then id3_1 = id3;
else /* second dataset */ if id3 ge id3_1; /* subsetting if */
drop id3_1;
run;

Untested, posted from my tablet.

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 5 replies
  • 1512 views
  • 0 likes
  • 3 in conversation