BookmarkSubscribeRSS Feed
yanshuai
Quartz | Level 8

Hello,

This is what I have,

DATA HAVE1;
    INPUT (NAME1) (:$8.);
    CARDS;
Chicken
Dinner
Winner
Pubg
;
run;

DATA LIST;
    INPUT (NAME2) (:$8.);
    CARDS;
en
bg
;
run;

This is what I want

DATA WANT;
    INPUT (NAME1 DUMMY) (:$8.);
    CARDS;
Chicken 1
Dinner 0
Winner 0
Pubg 1
;
run;

The basic logic is

IF HAVE1.NAME1 contains LIST.NAME2 then WANT.DUMMY = 1

 

I try to use SQL and LEFT JOIN but I am not sure how to create a new dummy variable here.

I think using the DATA step could also work out, using _null_ to define a third data or something.

But anyway, I wonder if you could please help me with this.

Thanks.

9 REPLIES 9
Shmuel
Garnet | Level 18

Next code is tested:

data _null_;
 set list end=eof;
     retain list; 
     length list $32000;
     list = catx(',',list,quote(strip(name2)));
     if eof then do;
        call symput('LIST',strip(list));
        call symput('qty',strip(put(_N_,best.)));
     end;
run;
%put &=list &=qty;

data want;
   array list{*} $ chk1-chk&qty (&list);
   set have1;
       dummy=0;
       do i=1 to dim(list);
	      if index(name1,strip(list(i))) 
		     then do; dummy=1; leave; end;
	   end;
	   keep name1 dummy;
run;

andreas_lds
Jade | Level 19

Try:

data want;
   array names[1000] $ 8 _TEMPORARY_;
   
   do _n_ = 1 by 1 until (listRead);
      set work.list end=listRead;
      names[_n_] = name2;
   end;
   
   lastElement = _n_;

   do _n_ = 1 by 1 until(done);
      set work.have1 end=done;
      
      dummy = 0;
      
      do i = 1 to lastElement until(dummy);
         dummy = find(name1, names[i], 'it') > 0;
      end;
      
      output;
   end;

   keep name1 dummy;
run;

You should adjust the array statement: the dimension of the array could to small, and if the entries in LIST are longer than eight chars, you have to increase then length, too.

PhilC
Rhodochrosite | Level 12

When PROC SQL is a better solution, IMO

PROC SQL;
   CREATE TABLE WORK.WANT AS 
   SELECT distinct HAVE1.NAME1, 
          /*LIST.NAME2, */
            MAX(find(HAVE1.NAME1,LIST.NAME2,'ti') > 0) AS DUMMY
      FROM WORK.HAVE1 HAVE1,WORK.LIST LIST
      GROUP BY HAVE1.NAME1;
QUIT;

 

PhilC
Rhodochrosite | Level 12

Using Hash objects.  This should, idealistically, be fast for large files.

 

data want (Keep=NAME1 DUMMY) ;
  if 0 then set Work.LIST;
  declare hash _list(dataset: 'Work.LIST');
  rc= _list.defineKey('name2')
    + _list.defineData('name2')
    + _list.defineDone( );
  declare hiter list('_list');

  do until (eof);
    set have1 end=eof;

    rc = list.first();       
    DUMMY=0;
    do while (rc = 0 and DUMMY=0);
       put "NOTE: " _all_;
       DUMMY = find(name1, name2, 'it') > 0;
       rc = list.next();
    end;
    output;
  end;
run;

 

yanshuai
Quartz | Level 8

 This is awesome. I love SQL!

I wonder if I could do this with some macros

%MACRO dummy(list)
DATA want;
SET have1;
IF FIND(have1.name1,'&list', 'i') ge 1 THEN dummy =1;
RUN;
%MEND;

DATA _null_;
SET list;
CALL EXECUTE('%dummy('name2');');
RUN;

This should be fast. but every time I run this macro, the outcome dummy = empty value

I don't know how to make the macro recognize and replace the '&list' with name2

 

Thanks

yanshuai
Quartz | Level 8

Hi,

I think I figure it out

I use

%unqote(%str(%')&name2%str(%') to include single quotation in the macro variable

yanshuai
Quartz | Level 8

Hi guys,

I figure it out

here is what I did

%macro list(identifier);
data have1;
   set have1;
   if find(name1,%unquote(%str(%')&identifier%str(%')),'i') ge 1 THEN dummy =1;
run;
%mend;


data _null_;
set list;
call execute('%list('!!name2!!');');
run;

I love macro and SQL!

Tom
Super User Tom
Super User

Why are you jumping through hoops to add quotes?  Macro triggers (& and %) are not evaluated inside of strings that start with single quotes.  Why do you want to use single quotes instead of double quotes?

find(name1,"&identifier",'i')

And if you need them then just pass the string in with the quotes already in it.

... find(name1,&identifier,'i') ...
%list('xyz')

Then you could use the QUOTE() function when generating the macro call in your data step.

data _null_;
  set list;
  call execute(cats('%nrstr(%list)(',quote(trim(name2),"'"),')'));
run;
yanshuai
Quartz | Level 8

Thansk.

I am not so sure about the difference between double quotes and single quotes.

But your second line of codes is a very clever way of solving using single quotes in macros.

 

Thanks for the reply

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 9 replies
  • 2327 views
  • 0 likes
  • 5 in conversation