BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
ryan_rodrig
Calcite | Level 5

Hi there!

I'm having a problem with concatenating a series of rows into a single row based on a group within a SAS dataset.

Basically a simplified version of what I want is to be able to is start with something like this:

data inTable;

input letter $ number $;

datalines;

a 1

a 2

a 3

b 4

b 5

b 6

c 7

d 8

e 9

e 10

f 11

;

run;

and end up with something like this:

LetterNumbers
a1,2,3
b4,5,6
c
d7,8
e9, 10
f11

The delimeter doesn't matter as long as I can see all the numbers that pertain to a group.  In this case I don't know how many 'numbers' would be applied to each letter (assume max 5) and I'd have 100K+ 'letters'.

Thanks in advance for your help!!

1 ACCEPTED SOLUTION

Accepted Solutions
Haikuo
Onyx | Level 15

Try this:

data inTable;

input letter $ number $;

datalines;

a 1

a 2

a 3

b 4

b 5

b 6

c 7

d 8

e 9

e 10

f 11

;

data want;

length cat $20.;

   do until (last.letter);

      set intable;

        by letter notsorted;

      cat=catx(',',cat,number);

   end;

   drop number;

run;

proc print;run;

Haikuo

View solution in original post

14 REPLIES 14
Haikuo
Onyx | Level 15

Try this:

data inTable;

input letter $ number $;

datalines;

a 1

a 2

a 3

b 4

b 5

b 6

c 7

d 8

e 9

e 10

f 11

;

data want;

length cat $20.;

   do until (last.letter);

      set intable;

        by letter notsorted;

      cat=catx(',',cat,number);

   end;

   drop number;

run;

proc print;run;

Haikuo

jobermon
Fluorite | Level 6
SO GREAT! VERY PRECISE AND EASY TO RUN! THANKS!
blinkwy
Calcite | Level 5

Love this solution and have found it extremely usefully. Is there a way to exand this so that the "numbers" columns is grouped using two columns?

 

I tried this but it didn't work

 

data want;

length cat $20.;

   do until (last.letter1 and last.letter2);

      set intable;

        by letter2 letter1 notsorted;

      cat=catx(',',cat,number);

   end;

   drop number;

run;

paganini59
Calcite | Level 5

For the do until line of code try the following instead. This will loop through different letter1 within each letter2.

do until (last.letter1);

 

Linlin
Lapis Lazuli | Level 10

or by hash:

data inTable;

input letter $ number $;

datalines;

a 1

a 2

a 3

b 4

b 5

b 6

c 7

d 8

e 9

e 10

f 11

;

data _null_;

length letter $8 cat $20;

  if _n_=1 then do;

    declare hash h(ordered:'a');

  h.definekey('letter');

  h.definedata('letter','cat');

  h.definedone();

  end;

  set intable end=last;

  if h.find() ne 0 then do;

   cat=number;

   h.add();

   end;

   else do;

   cat=catx(',',cat,number);

   h.replace();

   end;

   if last then h.output(dataset:'want');

   run;

   proc print data=want;run;

                             Obs    letter    cat

                               1       a       1,2,3

                               2       b       4,5,6

                               3       c       7

                               4       d       8

                               5       e       9,10

                               6       f       11

Linlin

SteveNZ
Obsidian | Level 7

Or:

data want ;

    length cat $200 ;

    retain cat ;

set intable ;

    by letter notsorted ;

        if first.letter then cat = cats(number) ;

            else cat = catx(',',cat,number) ;

        if last.letter then output ;

run ;

jobermon
Fluorite | Level 6
Excellent! Your way of doing it helped me a lot! I was wondering what I would do to see step-by-step before getting straight to the final output! I just erased the "if last.letter then output" and it worked! Thanks! 🙂
Ksharp
Super User
data inTable;
input letter $ number $;
datalines;
a 1
a 2
a 3
b 4
b 5
b 6
c 7
d 8
e 9
e 10
f 11
;
run;
data want(drop=number);
 set inTable ;
 by letter;
 length l $ 4000;
 retain l;
 l=catx(',',l,number);
 if last.letter then do; output;call missing(l);end;
run;

Ksharp

Howles
Quartz | Level 8

Or, use the SAS tools which match the needs, in this case PROC TRANSPOSE and the CATX function.

proc transpose data=intable out=tempwide ;

by letter ;

var number ;

run ;

data want(keep = letter cat) ;

set tempwide ;

length cat $20 ;

cat = catx( ',' , of col: ) ;

run ;

PhilG
Fluorite | Level 6

Good afternoon!

 

I thought this was really cool solution--thank you for posting it. I was curious about doing something similar, but rather than populating the transposed cell with the series of values (i.e., 1,2,3, etc.), I would want to populate the cell with the sum of values (i.e., 6). Could I modify this proc transpose syntax to accomplish this?

 

I appreciate your learned insight. 

 

Cheers!

 

P

SPChoudary
Fluorite | Level 6

Dear Phil,

 

I am really sorry I did not get your question, could you please share your question with examples?

 

Regards,

SP.

PhilG
Fluorite | Level 6

Sure thing. Basically, I want to be able to turn this:

 

ID       LOS       

a         10

a         20

a         30

b         20

b         20

b         20

c         20

c         10

c         50

 

into something like this:

 

ID       Total LOS       

a         60

b         60

c         80

 

or, even something like this:

 

ID       Mean LOS       

a         20

b         20

c         26.67

 

 

Does that make sense? I appreciate the follow up question.

 

Cheers!

 

P

 

 

SPChoudary
Fluorite | Level 6

Dear Phil,

 

Here is the solution for your question, Hope it will helps you.

 

data inds;
input id $ los;
cards;
a 10

a 20

a 30

b 20

b 20

b 20

c 20

c 10

c 50
;

proc sort data = inds;
by id;
run;

/* Sum by Group */
data outds1 (keep = id sumvar);
set inds;
by id;

if first.id then sumvar = 0;

sumvar + los;

if last.id;
run;

/* Mean by Group */
data outds2 (keep = id meanvar);
set inds;
by id;

if first.id then do;
sumvar = 0;
grpcnt = 0;
end;

sumvar + los;
grpcnt + 1;

meanvar = round(sumvar/grpcnt, .01);

if last.id;
run;

 

Regards,

SP Choudary.

SPChoudary
Fluorite | Level 6

Here is another example from SP Choudary.

 

data inds;
input grp $ invar $;
cards;
a 1
a 2
a 3
b 4
b 5
b 6
c 7
d .
e 8
e 9
f 10
g .
g .
;

data outds (drop = invar);
set inds;
by grp;
length outvar $200;
retain outvar;
if first.grp then call missing(outvar);
outvar = catx(', ', outvar, invar);
if last.grp;
run;

SAS Innovate 2025: Register Now

Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 14 replies
  • 100187 views
  • 29 likes
  • 11 in conversation