BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
jibra
Obsidian | Level 7

Here's a sample of what I have:

 

var1 var2
AA Paul
BB Paul
XX Paul
dd Mike
pp Mike
cc Dave
ee Dave
ff Dave
gg Dave
XX Dave
hh Ken
ii Ken
jj Ken
XX Ken
kk Rick
ll Rick
mm Rick

 

 

Here's a sample of what I want:

 

var1 var2
AA Paul
BB Paul
XX Paul
cc Dave
ee Dave
ff Dave
gg Dave
XX Dave
hh Ken
ii Ken
jj Ken
XX Ken

 

1 ACCEPTED SOLUTION

Accepted Solutions
Patrick
Opal | Level 21

Or a data step hash table lookup

data want;
  if _n_=1 then
    do;
      dcl hash h1(dataset:'have(where=(var1=:"XX"))');
      h1.defineKey('var2');
      h1.defineDone();
    end;
  set have;
  if h1.check()=0;
run;

View solution in original post

12 REPLIES 12
Quentin
Super User

Please post your sample data as code (DATA step with CARDS statement).  Also please post the code you have tried.  This will help others help you.

Tom
Super User Tom
Super User

Trying to guess what you mean it sounds like you want all observations for any person that has VAR1='XX'.  (Note question are easier to read if the details are in the body of the question and not just the subject line.)

 

SQL should make that simple.

proc sql;
create table want as 
  select * from have
  where var2 in
    (select var2 from have where var1='XX')
;
quit;

 

Kurt_Bremser
Super User

Double DO method:

data want;
do until (last.var2);
  set have;
  by var2 notsorted;
  if var1 = "XX" then flag = 1;
end;
do until (last.var2);
  set have;
  by var2 notsorted;
  if flag then output;
end;
drop flag;
run;
Patrick
Opal | Level 21

Or a data step hash table lookup

data want;
  if _n_=1 then
    do;
      dcl hash h1(dataset:'have(where=(var1=:"XX"))');
      h1.defineKey('var2');
      h1.defineDone();
    end;
  set have;
  if h1.check()=0;
run;
jibra
Obsidian | Level 7
All suggestions in this thread worked, but the hash object technique seems to be the most efficient. Thank you!
jibra
Obsidian | Level 7

Most efficient of all the solutions offered.

mkeintz
PROC Star

If the dataset is not particularly big, then I think your best option is the hash object technique suggested by @Patrick. The code is simple.

 

That solution requires reading the data in two unsynchronized data streams, which could create a performance hit in the case of large datasets, due to disk activity.

 

Now if the data were sorted by VAR2, you could do:

 

data want;
  merge have (where=(var1='XX')  in=wanted) 
        have;
  by var2;
  if wanted;
run;

which synchronizes the data streams and reduces disk activity.

 

 

Your data are not sorted by VAR2, although it is grouped by VAR2.  So you can generate data stream synchronization with code such as this:

 

data want (drop=_:);
  set have (where=(var1='XX') rename=(var2=_var2));

  do until (last.var2=1 and var2=_var2);
    set have;
    by var2 notsorted;
    if var2=_var2 then output; 
  end;
run;

 

 

The above assumes no more than one var1='XX' for any given VAR2.  If you can have two or more var1='XX' cases, then:

data want (drop=_:);
  set have (where=(var1='XX') rename=(var2=_var2));
  by _var2 notsorted;

  if last._var2 then do until (last.var2=1 and var2=_var2);
    set have;
    by var2 notsorted;
    if var2=_var2 then output; 
  end;
run;

 

And yes, the degree of synchronization achieved would depend on the distribution of the "XX" cases.

 

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
yabwon
Amethyst | Level 16

Just to have (more) complete picture.

 

The power of IN operator over a macro variable with constant list of values:

data have;
input var1 $ var2 $;
cards;
AA	Paul
BB	Paul
XX	Paul
dd	Mike
pp	Mike
cc	Dave
ee	Dave
ff	Dave
gg	Dave
XX	Dave
hh	Ken
ii	Ken
jj	Ken
XX	Ken
kk	Rick
ll	Rick
mm	Rick
;
run;
proc print data=have;
run;

proc SQL noprint;
  select 
  distinct quote(var2)
  into :myList separated by " "
  from have
  where var1 = "XX"
  ;
quit;

data want;
  set have;
  where var2 in (&myList.);
run;
proc print data=want;
run;

BTW. Just for fun, here is an article "Route Sixty-Six to SAS Programming, or 63 (+3) Syntax Snippets for a Table Look-up Task, or How to L..." that provides 66 examples of solving the table-lookup task 🙂

 

All the best

Bart

 

_______________
Polish SAS Users Group: www.polsug.com and communities.sas.com/polsug

"SAS Packages: the way to share" at SGF2020 Proceedings (the latest version), GitHub Repository, and YouTube Video.
Hands-on-Workshop: "Share your code with SAS Packages"
"My First SAS Package: A How-To" at SGF2021 Proceedings

SAS Ballot Ideas: one: SPF in SAS, two, and three
SAS Documentation



Ksharp
Super User

Or you would like to get it automatically ? and not type XX by hand ?

 

data have;
input var1 $ var2 $;
cards;
AA	Paul
BB	Paul
XX	Paul
dd	Mike
pp	Mike
cc	Dave
ee	Dave
ff	Dave
gg	Dave
XX	Dave
hh	Ken
ii	Ken
jj	Ken
XX	Ken
kk	Rick
ll	Rick
mm	Rick
;

proc sql;
create table want as
select * from have where var2 in 
(
select distinct var2 from have group by var1 having count(distinct var2)>1
)
;
quit;
yabwon
Amethyst | Level 16

I'm not sure if this is possible to have such data, but what about "Joe" (last record) ?

data have;
input var1 $ var2 $;
cards;
AA	Paul
BB	Paul
XX	Paul
dd	Mike
pp	Mike
cc	Dave
ee	Dave
ff	Dave
gg	Dave
XX	Dave
hh	Ken
ii	Ken
jj	Ken
XX	Ken
kk	Rick
ll	Rick
mm	Rick
AA	Joe
;

Bart

_______________
Polish SAS Users Group: www.polsug.com and communities.sas.com/polsug

"SAS Packages: the way to share" at SGF2020 Proceedings (the latest version), GitHub Repository, and YouTube Video.
Hands-on-Workshop: "Share your code with SAS Packages"
"My First SAS Package: A How-To" at SGF2021 Proceedings

SAS Ballot Ideas: one: SPF in SAS, two, and three
SAS Documentation



Ksharp
Super User
yabwom,
I don't know either .
It is all depend on what the OP is looking for.
The code I posted is just guessing the intention of OP .

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 12 replies
  • 1499 views
  • 6 likes
  • 8 in conversation