BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
genemroz
Quartz | Level 8

Esteemed Advisers:

I have two datasets, Observations and Stations. I want to search the variable USL in Observations for every instance of a station_id from the dataset of Stations. In this simple exemplar case, that would be observations 1,4, and 5. These three observations  dataset Observations would then be written to a third dataset called Matches. How can I accomplish this?  Thanks in advance for taking a look at this!

 

Dataset Observations
Obs USL
1 US0003,US0005,USL00V
2 USL007,USL00D
3 USL009,USL00A,USL00J,USL019
4 US0003,US0005,USL009,USL00V
5 US0003,US0004,US0005,US0007,US0008,US000A,US000D,US000K,USL00V
6 USL006,USL00B,USL00D
7 USL003,USL008,USL00G,USL00M
8 USL00A,USL00J,USL012,USV002,USV003
9 USL00B,USL00E,USL00J
10 US000J,USL00U,USL00V

 

Dataset Stations
Obs station_id
1 US0001
2 US0002
3 US0003
4 US0004
5 US0005
6 US0006
7 US0007
8 US0008
9 US0009
10 US000A

1 ACCEPTED SOLUTION

Accepted Solutions
PeterClemmensen
Tourmaline | Level 20

Try this

 

data Observations;
input obs USL :$100.;
datalines;
1  US0003,US0005,USL00V                                           
2  USL007,USL00D                                                  
3  USL009,USL00A,USL00J,USL019                                    
4  US0003,US0005,USL009,USL00V                                    
5  US0003,US0004,US0005,US0007,US0008,US000A,US000D,US000K,USL00V 
6  USL006,USL00B,USL00D                                           
7  USL003,USL008,USL00G,USL00M                                    
8  USL00A,USL00J,USL012,USV002,USV003                             
9  USL00B,USL00E,USL00J                                           
10 US000J,USL00U,USL00V                                           
;

data Stations;
input station_id $;
datalines;
US0001
US0002
US0003
US0004
US0005
US0006
US0007
US0008
US0009
US000A
;

data Matches(drop = i station_id);
   if _N_ = 1 then do;
      dcl hash h(dataset : 'Stations');
      h.definekey('station_id');
      h.definedone();
   end;

   set Observations;

   do i = 1 to countw(USL);
      station_id = scan(USL, i, ',');
      if h.check() = 0 then do;
         output;
         leave;
      end;
   end;
run;

 

Result:

 

obs USL
1   US0003,US0005,USL00V
4   US0003,US0005,USL009,USL00V
5   US0003,US0004,US0005,US0007,US0008,US000A,US000D,US000K,USL00V

View solution in original post

2 REPLIES 2
PeterClemmensen
Tourmaline | Level 20

Try this

 

data Observations;
input obs USL :$100.;
datalines;
1  US0003,US0005,USL00V                                           
2  USL007,USL00D                                                  
3  USL009,USL00A,USL00J,USL019                                    
4  US0003,US0005,USL009,USL00V                                    
5  US0003,US0004,US0005,US0007,US0008,US000A,US000D,US000K,USL00V 
6  USL006,USL00B,USL00D                                           
7  USL003,USL008,USL00G,USL00M                                    
8  USL00A,USL00J,USL012,USV002,USV003                             
9  USL00B,USL00E,USL00J                                           
10 US000J,USL00U,USL00V                                           
;

data Stations;
input station_id $;
datalines;
US0001
US0002
US0003
US0004
US0005
US0006
US0007
US0008
US0009
US000A
;

data Matches(drop = i station_id);
   if _N_ = 1 then do;
      dcl hash h(dataset : 'Stations');
      h.definekey('station_id');
      h.definedone();
   end;

   set Observations;

   do i = 1 to countw(USL);
      station_id = scan(USL, i, ',');
      if h.check() = 0 then do;
         output;
         leave;
      end;
   end;
run;

 

Result:

 

obs USL
1   US0003,US0005,USL00V
4   US0003,US0005,USL009,USL00V
5   US0003,US0004,US0005,US0007,US0008,US000A,US000D,US000K,USL00V
genemroz
Quartz | Level 8

Thanks for the prompt and helpful reply (I think...). Your solution worked on my larger datasets. But now I have to go and learn hash object code to figure out what just happened here.  But I'm truly grateful and marking this solution as accepted.

 

Regards,

 

Gene

sas-innovate-wordmark-2025-midnight.png

Register Today!

Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.


Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 2 replies
  • 654 views
  • 0 likes
  • 2 in conversation