BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
tri_h
Fluorite | Level 6


data ex1;
input var1 $ var2 $ var3 $;
cards;
%a b f
b %c d
%g f i
;
run;

what i want is MACRvar=%a|%c|%g, i know the transpose+sort nodupkey could achieve this, how to achieve this with a data step like, data _null_; set ex1... i might thought take use of retain, array...

thanks for your helping.

 

thanks for helping

1 ACCEPTED SOLUTION

Accepted Solutions
mkeintz
PROC Star

To avoid duplicates, compare the candidate word with the current set of concatenated values using the FINDW function, which returns a 0 if the candidate is not found as a "word" in the concatenation.  "word" here means strings separated by '|':

 

data ex1;
input var1 $ var2 $ var3 $;
cards;
%a b f
b %c d
%g f %c
run;

data _null_;
  set ex1 end=end_of_ex;
  length mv $24;
  retain mv ;
  if var1=:'%' and findw(mv,trim(var1),'|')=0 then mv=catx('|',mv,var1);
  if var2=:'%' and findw(mv,trim(var2),'|')=0 then mv=catx('|',mv,var2);
  if var3=:'%' and findw(mv,trim(var3),'|')=0 then mv=catx('|',mv,var3);
  if end_of_ex;
  call symput('mvar',mv);
run;

%put %superq(mvar);

Now it's getting busy enough that you'll want to use an array of var1 & var2 & var3, and program a loop.  Note the 2nd argument of the findw function is not var1, but trim(var1) - otherwise any trailing blanks in var1 will be included in the search for a matching string in the MV variable.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

View solution in original post

5 REPLIES 5
mkeintz
PROC Star

Declare your new variable (call it mv) as a character variable with a length you know is sufficient.  Then:

data ex1;
input var1 $ var2 $ var3 $;
cards;
%a b f
b %c d
%g f i
run;

data _null_;
  set ex1 end=end_of_ex;
  length mv $24;
  retain mv ;
  if var1=:'%' then mv=catx('|',mv,var1);
  if var2=:'%' then mv=catx('|',mv,var2);
  if var3=:'%' then mv=catx('|',mv,var3);
  if end_of_ex;
  call symput('mvar',mv);
run;

%put %superq(mvar);

I leave it to you to construct the array and corresponding loop.  Just note that the "=:" comparison operator is a convenient way to find values starting with a % sign.

 

Note I used the %superq function in the %put statement, because otherwise I would have to use the macrovar &mvar (instead of mvar in the superq function).  This would cause the macro interpreter to attempt to resolve the %a%c%g and generate messages like "WARNING: Apparent invocation of macro A not resolved."

 

There may be a macro function such that I could use

  %put %somefunction(&mvar)

without messages generated, but I am unaware of it.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
tri_h
Fluorite | Level 6
Sorry, The sample data isn't typical enough, think about:
data ex1;
input var1 $ var2 $ var3 $;
cards;
%a b f
b %c d
%g f %c
run;
there's a repeat ‘%c’, and i want to store it once...
mkeintz
PROC Star

While we're at it, what order do you want to concatenate the desired values?  Order of appearance?  Alphabetical?

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
tri_h
Fluorite | Level 6
Order of appearance would be better, i had tried an index function before the catx like:
if index(mv, arrayVARs)=0 then mv=catx('|', mv, arrayVARs), but it still add '%c' tiwce.
mkeintz
PROC Star

To avoid duplicates, compare the candidate word with the current set of concatenated values using the FINDW function, which returns a 0 if the candidate is not found as a "word" in the concatenation.  "word" here means strings separated by '|':

 

data ex1;
input var1 $ var2 $ var3 $;
cards;
%a b f
b %c d
%g f %c
run;

data _null_;
  set ex1 end=end_of_ex;
  length mv $24;
  retain mv ;
  if var1=:'%' and findw(mv,trim(var1),'|')=0 then mv=catx('|',mv,var1);
  if var2=:'%' and findw(mv,trim(var2),'|')=0 then mv=catx('|',mv,var2);
  if var3=:'%' and findw(mv,trim(var3),'|')=0 then mv=catx('|',mv,var3);
  if end_of_ex;
  call symput('mvar',mv);
run;

%put %superq(mvar);

Now it's getting busy enough that you'll want to use an array of var1 & var2 & var3, and program a loop.  Note the 2nd argument of the findw function is not var1, but trim(var1) - otherwise any trailing blanks in var1 will be included in the search for a matching string in the MV variable.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 5 replies
  • 1443 views
  • 0 likes
  • 2 in conversation