Concatinate multiple values in one variable

Soham0707 · Posted 11-28-2019 03:39 AM

Hello Guys,

I have below dataset :

a b c
1000 1 C 1 - 605
1000 2 C 2 - 457
1000 3 C 3 - 457
1000 4 C 4 - 457
1000 5 C 5 - 457
1000 6 C 6 - 457
1001 1 C 1 - 606
1001 2 C 2 - 455
1001 3 C 3 - 455
1001 4 C 4 - 455
1001 5 C 5 - 455

and output dataset should be like :

a b c d
1001 6 C6 - 457 C1 - 605 C2 - 457 C3 - 457 C4 - 457 C5 - 457 C6 - 457
1002 5 C5 - 455 C1 - 606 C2 - 455 C3 - 455 C4 - 455 C5 - 455

can any one suggest program for this ?

Kurt_Bremser · Posted 11-28-2019 03:49 AM

Please supply example data in a usable form, in a data step with datalines; right now you have three variable names, but six data items below.

Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
The macro for direct download as ZIP
How to post code
Please vote for Provide Sequential Search Capability for Hash Objects
How to deal with locked files on UNIX

s_lassen · Posted 11-28-2019 03:54 AM

The CATX function does the right thing. But when repeatedly adding to the same variable, it is better to use CALL CATX:

data want;
  do until(last.a);
    set have;
    by a;
    length d $200; /* I put the length statement here, so that D gets defined after the other vars */
    call catx(' ',d,c);
    end;
run;

andreas_lds · Posted 11-28-2019 04:02 AM

@s_lassen wrote:

The CATX function does the right thing. But when repeatedly adding to the same variable, it is better to use CALL CATX

What's the advantage in using "call catx" instead of "catx"?

Tom · Posted 11-29-2019 01:18 PM

With CALL CATX() since it knows it is appending the first variable in the list it does not need to copy the resulting string around as many times.

In this test it averages about 1/8 the time to use CALL CATX() instead of CATX().

%let reps=1000;
%let words=200;
data test1;
do rep=1 to &reps ;
  length x $32767 ;
  call missing(x);
  msecs=datetime();
  do _n_=1 to &words;
  x=catx(',',repeat('X',rand('integer',0,10)));
  end;
  msecs=1000*(datetime()-msecs);
  output;
end;
  drop x;
run;
proc means; var msecs; run;
data test2;
do rep=1 to &reps ;
  length x $32767 ;
  call missing(x);
  msecs=datetime();
  do _n_=1 to &words;
  call catx(',',x,repeat('X',rand('integer',0,10)));
  end;
  msecs=1000*(datetime()-msecs);
  output;
end;
  drop x;
run;
proc means; var msecs; run;

andreas_lds · Posted 11-28-2019 03:57 AM

Fully concur with KurtBremser: please post data in usable form.

ballardw · Posted 11-29-2019 12:47 PM

Can you describe the advantage of having the data in that wide layout?

It doesn't really help when your example starting data does not include id=1002 but your output does and the Id=1000 in the starting data does not appear in the output. That raises lots of questions related to why is 1000 missing and where did 1002 come from. Are we supposed to manipulate the shown values of ID to get new ID values? If so, what are the rules.

Concatinate multiple values in one variable

Re: Concatinate multiple values in one variable

Re: Concatinate multiple values in one variable

Re: Concatinate multiple values in one variable

Re: Concatinate multiple values in one variable

Re: Concatinate multiple values in one variable

Re: Concatinate multiple values in one variable

Concatinate multiple values in one variable

Re: Concatinate multiple values in one variable

Re: Concatinate multiple values in one variable

Re: Concatinate multiple values in one variable

Re: Concatinate multiple values in one variable

Re: Concatinate multiple values in one variable

Re: Concatinate multiple values in one variable

Registration is open

SAS Training: Just a Click Away