BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
ryan_rodrig
Calcite | Level 5

Hi there!

I'm having a problem with concatenating a series of rows into a single row based on a group within a SAS dataset.

Basically a simplified version of what I want is to be able to is start with something like this:

data inTable;

input letter $ number $;

datalines;

a 1

a 2

a 3

b 4

b 5

b 6

c 7

d 8

e 9

e 10

f 11

;

run;

and end up with something like this:

LetterNumbers
a1,2,3
b4,5,6
c
d7,8
e9, 10
f11

The delimeter doesn't matter as long as I can see all the numbers that pertain to a group.  In this case I don't know how many 'numbers' would be applied to each letter (assume max 5) and I'd have 100K+ 'letters'.

Thanks in advance for your help!!

1 ACCEPTED SOLUTION

Accepted Solutions
Haikuo
Onyx | Level 15

Try this:

data inTable;

input letter $ number $;

datalines;

a 1

a 2

a 3

b 4

b 5

b 6

c 7

d 8

e 9

e 10

f 11

;

data want;

length cat $20.;

   do until (last.letter);

      set intable;

        by letter notsorted;

      cat=catx(',',cat,number);

   end;

   drop number;

run;

proc print;run;

Haikuo

View solution in original post

14 REPLIES 14
Haikuo
Onyx | Level 15

Try this:

data inTable;

input letter $ number $;

datalines;

a 1

a 2

a 3

b 4

b 5

b 6

c 7

d 8

e 9

e 10

f 11

;

data want;

length cat $20.;

   do until (last.letter);

      set intable;

        by letter notsorted;

      cat=catx(',',cat,number);

   end;

   drop number;

run;

proc print;run;

Haikuo

jobermon
Fluorite | Level 6
SO GREAT! VERY PRECISE AND EASY TO RUN! THANKS!
blinkwy
Calcite | Level 5

Love this solution and have found it extremely usefully. Is there a way to exand this so that the "numbers" columns is grouped using two columns?

 

I tried this but it didn't work

 

data want;

length cat $20.;

   do until (last.letter1 and last.letter2);

      set intable;

        by letter2 letter1 notsorted;

      cat=catx(',',cat,number);

   end;

   drop number;

run;

paganini59
Calcite | Level 5

For the do until line of code try the following instead. This will loop through different letter1 within each letter2.

do until (last.letter1);

 

Linlin
Lapis Lazuli | Level 10

or by hash:

data inTable;

input letter $ number $;

datalines;

a 1

a 2

a 3

b 4

b 5

b 6

c 7

d 8

e 9

e 10

f 11

;

data _null_;

length letter $8 cat $20;

  if _n_=1 then do;

    declare hash h(ordered:'a');

  h.definekey('letter');

  h.definedata('letter','cat');

  h.definedone();

  end;

  set intable end=last;

  if h.find() ne 0 then do;

   cat=number;

   h.add();

   end;

   else do;

   cat=catx(',',cat,number);

   h.replace();

   end;

   if last then h.output(dataset:'want');

   run;

   proc print data=want;run;

                             Obs    letter    cat

                               1       a       1,2,3

                               2       b       4,5,6

                               3       c       7

                               4       d       8

                               5       e       9,10

                               6       f       11

Linlin

SteveNZ
Obsidian | Level 7

Or:

data want ;

    length cat $200 ;

    retain cat ;

set intable ;

    by letter notsorted ;

        if first.letter then cat = cats(number) ;

            else cat = catx(',',cat,number) ;

        if last.letter then output ;

run ;

jobermon
Fluorite | Level 6
Excellent! Your way of doing it helped me a lot! I was wondering what I would do to see step-by-step before getting straight to the final output! I just erased the "if last.letter then output" and it worked! Thanks! 🙂
Ksharp
Super User
data inTable;
input letter $ number $;
datalines;
a 1
a 2
a 3
b 4
b 5
b 6
c 7
d 8
e 9
e 10
f 11
;
run;
data want(drop=number);
 set inTable ;
 by letter;
 length l $ 4000;
 retain l;
 l=catx(',',l,number);
 if last.letter then do; output;call missing(l);end;
run;

Ksharp

Howles
Quartz | Level 8

Or, use the SAS tools which match the needs, in this case PROC TRANSPOSE and the CATX function.

proc transpose data=intable out=tempwide ;

by letter ;

var number ;

run ;

data want(keep = letter cat) ;

set tempwide ;

length cat $20 ;

cat = catx( ',' , of col: ) ;

run ;

PhilG
Fluorite | Level 6

Good afternoon!

 

I thought this was really cool solution--thank you for posting it. I was curious about doing something similar, but rather than populating the transposed cell with the series of values (i.e., 1,2,3, etc.), I would want to populate the cell with the sum of values (i.e., 6). Could I modify this proc transpose syntax to accomplish this?

 

I appreciate your learned insight. 

 

Cheers!

 

P

SPChoudary
Fluorite | Level 6

Dear Phil,

 

I am really sorry I did not get your question, could you please share your question with examples?

 

Regards,

SP.

PhilG
Fluorite | Level 6

Sure thing. Basically, I want to be able to turn this:

 

ID       LOS       

a         10

a         20

a         30

b         20

b         20

b         20

c         20

c         10

c         50

 

into something like this:

 

ID       Total LOS       

a         60

b         60

c         80

 

or, even something like this:

 

ID       Mean LOS       

a         20

b         20

c         26.67

 

 

Does that make sense? I appreciate the follow up question.

 

Cheers!

 

P

 

 

SPChoudary
Fluorite | Level 6

Dear Phil,

 

Here is the solution for your question, Hope it will helps you.

 

data inds;
input id $ los;
cards;
a 10

a 20

a 30

b 20

b 20

b 20

c 20

c 10

c 50
;

proc sort data = inds;
by id;
run;

/* Sum by Group */
data outds1 (keep = id sumvar);
set inds;
by id;

if first.id then sumvar = 0;

sumvar + los;

if last.id;
run;

/* Mean by Group */
data outds2 (keep = id meanvar);
set inds;
by id;

if first.id then do;
sumvar = 0;
grpcnt = 0;
end;

sumvar + los;
grpcnt + 1;

meanvar = round(sumvar/grpcnt, .01);

if last.id;
run;

 

Regards,

SP Choudary.

SPChoudary
Fluorite | Level 6

Here is another example from SP Choudary.

 

data inds;
input grp $ invar $;
cards;
a 1
a 2
a 3
b 4
b 5
b 6
c 7
d .
e 8
e 9
f 10
g .
g .
;

data outds (drop = invar);
set inds;
by grp;
length outvar $200;
retain outvar;
if first.grp then call missing(outvar);
outvar = catx(', ', outvar, invar);
if last.grp;
run;

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 14 replies
  • 99430 views
  • 29 likes
  • 11 in conversation