BookmarkSubscribeRSS Feed
yanshuai
Quartz | Level 8

Hello,

This is what I have,

DATA HAVE1;
    INPUT (NAME1) (:$8.);
    CARDS;
Chicken
Dinner
Winner
Pubg
;
run;

DATA LIST;
    INPUT (NAME2) (:$8.);
    CARDS;
en
bg
;
run;

This is what I want

DATA WANT;
    INPUT (NAME1 DUMMY) (:$8.);
    CARDS;
Chicken 1
Dinner 0
Winner 0
Pubg 1
;
run;

The basic logic is

IF HAVE1.NAME1 contains LIST.NAME2 then WANT.DUMMY = 1

 

I try to use SQL and LEFT JOIN but I am not sure how to create a new dummy variable here.

I think using the DATA step could also work out, using _null_ to define a third data or something.

But anyway, I wonder if you could please help me with this.

Thanks.

9 REPLIES 9
Shmuel
Garnet | Level 18

Next code is tested:

data _null_;
 set list end=eof;
     retain list; 
     length list $32000;
     list = catx(',',list,quote(strip(name2)));
     if eof then do;
        call symput('LIST',strip(list));
        call symput('qty',strip(put(_N_,best.)));
     end;
run;
%put &=list &=qty;

data want;
   array list{*} $ chk1-chk&qty (&list);
   set have1;
       dummy=0;
       do i=1 to dim(list);
	      if index(name1,strip(list(i))) 
		     then do; dummy=1; leave; end;
	   end;
	   keep name1 dummy;
run;

andreas_lds
Jade | Level 19

Try:

data want;
   array names[1000] $ 8 _TEMPORARY_;
   
   do _n_ = 1 by 1 until (listRead);
      set work.list end=listRead;
      names[_n_] = name2;
   end;
   
   lastElement = _n_;

   do _n_ = 1 by 1 until(done);
      set work.have1 end=done;
      
      dummy = 0;
      
      do i = 1 to lastElement until(dummy);
         dummy = find(name1, names[i], 'it') > 0;
      end;
      
      output;
   end;

   keep name1 dummy;
run;

You should adjust the array statement: the dimension of the array could to small, and if the entries in LIST are longer than eight chars, you have to increase then length, too.

PhilC
Rhodochrosite | Level 12

When PROC SQL is a better solution, IMO

PROC SQL;
   CREATE TABLE WORK.WANT AS 
   SELECT distinct HAVE1.NAME1, 
          /*LIST.NAME2, */
            MAX(find(HAVE1.NAME1,LIST.NAME2,'ti') > 0) AS DUMMY
      FROM WORK.HAVE1 HAVE1,WORK.LIST LIST
      GROUP BY HAVE1.NAME1;
QUIT;

 

PhilC
Rhodochrosite | Level 12

Using Hash objects.  This should, idealistically, be fast for large files.

 

data want (Keep=NAME1 DUMMY) ;
  if 0 then set Work.LIST;
  declare hash _list(dataset: 'Work.LIST');
  rc= _list.defineKey('name2')
    + _list.defineData('name2')
    + _list.defineDone( );
  declare hiter list('_list');

  do until (eof);
    set have1 end=eof;

    rc = list.first();       
    DUMMY=0;
    do while (rc = 0 and DUMMY=0);
       put "NOTE: " _all_;
       DUMMY = find(name1, name2, 'it') > 0;
       rc = list.next();
    end;
    output;
  end;
run;

 

yanshuai
Quartz | Level 8

 This is awesome. I love SQL!

I wonder if I could do this with some macros

%MACRO dummy(list)
DATA want;
SET have1;
IF FIND(have1.name1,'&list', 'i') ge 1 THEN dummy =1;
RUN;
%MEND;

DATA _null_;
SET list;
CALL EXECUTE('%dummy('name2');');
RUN;

This should be fast. but every time I run this macro, the outcome dummy = empty value

I don't know how to make the macro recognize and replace the '&list' with name2

 

Thanks

yanshuai
Quartz | Level 8

Hi,

I think I figure it out

I use

%unqote(%str(%')&name2%str(%') to include single quotation in the macro variable

yanshuai
Quartz | Level 8

Hi guys,

I figure it out

here is what I did

%macro list(identifier);
data have1;
   set have1;
   if find(name1,%unquote(%str(%')&identifier%str(%')),'i') ge 1 THEN dummy =1;
run;
%mend;


data _null_;
set list;
call execute('%list('!!name2!!');');
run;

I love macro and SQL!

Tom
Super User Tom
Super User

Why are you jumping through hoops to add quotes?  Macro triggers (& and %) are not evaluated inside of strings that start with single quotes.  Why do you want to use single quotes instead of double quotes?

find(name1,"&identifier",'i')

And if you need them then just pass the string in with the quotes already in it.

... find(name1,&identifier,'i') ...
%list('xyz')

Then you could use the QUOTE() function when generating the macro call in your data step.

data _null_;
  set list;
  call execute(cats('%nrstr(%list)(',quote(trim(name2),"'"),')'));
run;
yanshuai
Quartz | Level 8

Thansk.

I am not so sure about the difference between double quotes and single quotes.

But your second line of codes is a very clever way of solving using single quotes in macros.

 

Thanks for the reply

Catch up on SAS Innovate 2026

Dive into keynotes, announcements and breakthroughs on demand.

Explore Now →
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 9 replies
  • 4549 views
  • 0 likes
  • 5 in conversation