DATA Step, Macro, Functions and more

Concatenate

Reply
Frequent Contributor
Posts: 97

Concatenate

Hello

I have the following data set with binary fields C1,C2.....C10.

 

Data tbl;
input ID C1 C2 C3 C4 C5 C6 C7 C8 C9 C10;
cards;
111 1 0 0 0 0 0 0 1 1 0
222 0 0 0 0 0 0 0 0 1 1
333 0 1 0 0 0 0 0 0 0 0
444 0 0 0 0 0 0 0 0 0 0
555 0 0 0 0 0 0 1 1 0 1
;
run;

 

I want to create a new calculated variable that tell us which variables received value 1.

For example :

For ID  111 the new variable will receive value  C8,C9

For ID  222  the new variable will receive value  C9,C10

For ID  333  the new variable will receive value  C2

For ID  444  the new variable will receive value  'No_Fail' because  none of fields received value 1

For ID  555  the new variable will receive value  C7,C8,C10

 

 

Can anyone help to write the code for that?

thanks

Koein

 

Super User
Posts: 9,919

Re: Concatenate

Define an array for c1-c10.

Set a new variable of sufficient length to an empty string.

Loop through the array, and if the current member is true (1), use newvar = catx(',',newvar,vname(array{i})); to concatenate its name.

If newvar is still empty after the loop, set it to 'No_Fail'.

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
How to post code
Frequent Contributor
Posts: 97

Re: Concatenate

Posted in reply to KurtBremser

Thanks. Is there another way without working with arrays. I have never did it before

Super User
Posts: 9,919

Re: Concatenate


@Ronein wrote:

Thanks. Is there another way without working with arrays. I have never did it before


Then it's time you start learning the use of arrays. They provide a simple and elegant solution for your problem.

The whole solution takes just 7 lines of code:

  1. array definition
  2. initialization of target variable
  3. start of do loop
  4. if array member, set newvar with catx and vname
  5. end of do loop
  6. if newvar is empty, set to No_Fail
  7. drop the iteration counter so it does not mess up the output dataset

For a nice paper about arrays, see https://support.sas.com/resources/papers/97529_Using_Arrays_in_SAS_Programming.pdf

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
How to post code
Trusted Advisor
Posts: 1,831

Re: Concatenate

Using arrays saves coding.

Here is the code @KurtBremser meant.

 It can be done in the same step, as below, or in a separate step:

Data tbl;
  infile cards;
  input ID C1 C2 C3 C4 C5 C6 C7 C8 C9 C10;
  array cx c1 - c10;   /* c1-c10 is equivalent co c1 c2 ... c10 */
  length new_var $30;  /* 3 chars * 10 vars */
new_var = ' '; do i=1 to 10; /* check in a loop */ if cx(i) = 1 then new_var = catx(',' , new_var, vname(cx(i)) ); end; cards; 111 1 0 0 0 0 0 0 1 1 0 222 0 0 0 0 0 0 0 0 1 1 333 0 1 0 0 0 0 0 0 0 0 444 0 0 0 0 0 0 0 0 0 0 555 0 0 0 0 0 0 1 1 0 1 ; run;

Compare that code to the alternative, as a separate step:

data new_tbl;
  set tbl;
        length new_var $30;
        new_var = ' ';
        if c1 = 1 then new_var = 'C1';
        if c2 = 1 then new_var = catx(',' , new_var, 'C2');
        if c3 = 1 then new_var = catx(',' , new_var, 'C3');
        if c4 = 1 then new_var = catx(',' , new_var, 'C4');
        .....  /* i'm lazy to write the in between lines */
        if c10 = 1 then new_var = catx(',' , new_var, 'C10');
run;

 

 

 

Valued Guide
Posts: 560

Re: Concatenate

You can transpose the data and then find all the values.

 

Data tbl;
input ID C1 C2 C3 C4 C5 C6 C7 C8 C9 C10;
cards;
111 1 0 0 0 0 0 0 1 1 0
222 0 0 0 0 0 0 0 0 1 1
333 0 1 0 0 0 0 0 0 0 0
444 0 0 0 0 0 0 0 0 0 0
555 0 0 0 0 0 0 1 1 0 1
;
run;

proc transpose data=tbl out=Trans name=Var;
by id;
var c1-c10;
run;

data final(Keep=ID New_Var);
Format New_Var $30.;
do until (last.id);
set Trans(where=(col1=1));
by id;
New_Var=Strip(New_Var)||","||Strip(Var);
end;
run;

Since the data is already sorted order, I'm not sorting it. You may need to sort your data before PROC TRANSPOSE

 

 

Thanks,
Suryakiran
Valued Guide
Posts: 560

Re: Concatenate

Another alternate solution,

 

Data tbl;
input ID C1 C2 C3 C4 C5 C6 C7 C8 C9 C10;
cards;
111 1 0 0 0 0 0 0 1 1 0
222 0 0 0 0 0 0 0 0 1 1
333 0 1 0 0 0 0 0 0 0 0
444 0 0 0 0 0 0 0 0 0 0
555 0 0 0 0 0 0 1 1 0 1
;
run;

proc transpose data=tbl out=Trans name=Var Prefix=Binary_;
by id;
var c1-c10;
run;
proc sort data=trans;
by id Binary_1;
proc transpose data=trans out=Trans2(drop=_label_ _name_) ;
by id Binary_1;
var var;
run;

DATA want(keep=id New_Var);
set trans2;
New_Var=CATX(",",of Col1-Col10);
if Binary_1=1;/* You can change this to 0 also */
run;
Thanks,
Suryakiran
Frequent Contributor
Posts: 97

Re: Concatenate

Posted in reply to SuryaKiran

Thank you so much all with a very clever answers! I am learning a lot from you. Hope that one day I will be able to answer by myself...lol

Ask a Question
Discussion stats
  • 7 replies
  • 149 views
  • 1 like
  • 4 in conversation