DATA Step, Macro, Functions and more

Concatenating variable variables problem

Accepted Solution Solved
Reply
Occasional Contributor
Posts: 8
Accepted Solution

Concatenating variable variables problem

Hello,

I'm stuck on the following problem. I have a dataset that looks like this.

Type          Value

animal         sheep

animal         dog

animal         cow

office           chair

office           table

building       skyscraper

building       appartment

building       flat

building       townhouse

I need to concatenate the values per type to 1 single string, seperated by a comma and a space (, ). So then I end up with somethinig like

Type          NewValue

animal          sheep, dog, cow

office            chair, table

building        skyscraper, appartment, flat, townhouse

I can transpose this dataset and then concatenate the values together. But the problem is that the number of values per type can vary up to an unknown number. So on one moment in time there could be 4 maximum values per type. On another moment in time there could be 10 values per type.

I figured out I can use ATTRN(NSVARS) to calculate the number of variables/columns that the transpose creates but I can't figure out the syntax mnsto use this number to write it down to something like

catx(', ',&Column1 to &Column) where is the last Column1 is the first column created by the transpose and Column is the last.

Does anyone have a solution for this?

Many thanks in advance.,


Accepted Solutions
Solution
‎07-23-2014 05:33 AM
Super Contributor
Posts: 297

Re: Concatenating variable variables problem

How about using a DOW loop?

DATA HAVE;

INFILE DATALINES;

LENGTH TYPE $8 VALUE $10;

INPUT TYPE $ VALUE $;

DATALINES;

ANIMAL SHEEP

ANIMAL DOG

ANIMAL COW

OFFICE CHAIR

OFFICE TABLE

BUILDING SKYSCRAPER

BUILDING APPARTMENT

BUILDING FLAT

BUILDING TOWNHOUSE

;

RUN;

PROC SORT DATA=HAVE;

BY TYPE;

RUN;

DATA WANT;

LENGTH VALUE2 $2000;

DO UNTIL (LAST.TYPE);

  SET HAVE;

  BY TYPE;

  VALUE2 = CATX(", ",VALUE2,VALUE);

END;

RUN;

View solution in original post


All Replies
Super User
Super User
Posts: 7,417

Re: Concatenating variable variables problem

Well, the simplest one which jumps to mind is (assuming sorted):

data want (keep=type newvalue);

     length newvalue $2000;

     set have;

     by type;

     if first.type then newvalue=value;

     else newvalue=strip(newvalue)||","||strip(value);

     if last.type then output;

run;

Super User
Super User
Posts: 7,417

Re: Concatenating variable variables problem

You could also use transpose, just create a template dataset with far more variables than ever will happen and then set that with the transpose  (do note however that one variable can only hold up to 2000 characters, thereofr if you have lots of data it will not all fit and you will get truncation - also on previous):

data temp;

     attrib val1-val2000 format=$200.;

     if _n_<0 then output;

run;

proc transpose data=have out=want prefix=val;

     by type;

run;

data want;

     set want;

     attrib newvalue format=$2000.;

     newvalue=catx(',',of val1-val2000);

run;

Occasional Contributor
Posts: 8

Re: Concatenating variable variables problem

This didn't work for me unfortunately.

Super User
Super User
Posts: 7,417

Re: Concatenating variable variables problem

Sorry, which post didn't work, what were the errors/warnings?

Occasional Contributor
Posts: 8

Re: Concatenating variable variables problem

DATA HAVE;

INFILE DATALINES;

LENGTH TYPE $8 VALUE $10;

INPUT TYPE $ VALUE $;

DATALINES;

ANIMAL SHEEP

ANIMAL DOG

ANIMAL COW

OFFICE CHAIR

OFFICE TABLE

BUILDING SKYSCRAPER

BUILDING APPARTMENT

BUILDING FLAT

BUILDING TOWNHOUSE

;

RUN;

PROC SORT DATA=HAVE;

BY TYPE;

RUN;

data want2 (keep=type newvalue);

     length newvalue $2000;

     set have;

     by type;

     if first.type then newvalue=value;

     else newvalue=strip(newvalue)||","||strip(value);

     if last.type then output;

run;

There are no errors. But the result looks like this:

,COWANIMAL
,TOWNHOUSEBUILDING
,TABLEOFFICE

And I couldn't really figure out how to make it work. So I went with Scott_Mitchell's solution :-)

Super User
Super User
Posts: 7,417

Re: Concatenating variable variables problem

Yep, I forget the retain statement:

data want2 ;

     length newvalue $2000;

     set have;

     retain newvalue;

     by type;

     if first.type then newvalue=value;

     else newvalue=strip(newvalue)||","||strip(value);

     if last.type then output;

run;

Occasional Contributor
Posts: 8

Re: Concatenating variable variables problem

Ah, there. Now it works yes. I assumed it was a retain and was messing around with it but I put it in the wrong place in your code. Thanks for the help!

Valued Guide
Posts: 3,208

Re: Concatenating variable variables problem

Use retaining and by processing in this well ordered dataset. Your questions to solve:

- maximum length of resulting variable

- output processing on last of the by group

- initialisation on first of the by group.

---->-- ja karman --<-----
Solution
‎07-23-2014 05:33 AM
Super Contributor
Posts: 297

Re: Concatenating variable variables problem

How about using a DOW loop?

DATA HAVE;

INFILE DATALINES;

LENGTH TYPE $8 VALUE $10;

INPUT TYPE $ VALUE $;

DATALINES;

ANIMAL SHEEP

ANIMAL DOG

ANIMAL COW

OFFICE CHAIR

OFFICE TABLE

BUILDING SKYSCRAPER

BUILDING APPARTMENT

BUILDING FLAT

BUILDING TOWNHOUSE

;

RUN;

PROC SORT DATA=HAVE;

BY TYPE;

RUN;

DATA WANT;

LENGTH VALUE2 $2000;

DO UNTIL (LAST.TYPE);

  SET HAVE;

  BY TYPE;

  VALUE2 = CATX(", ",VALUE2,VALUE);

END;

RUN;

Occasional Contributor
Posts: 8

Re: Concatenating variable variables problem

This seems to do the trick without needing the transpose but why? I'm not familliar with putting a do until statement in front of the set statement.

Valued Guide
Posts: 3,208

Re: Concatenating variable variables problem

My proposal to programming a retain is more easy to understand. It is sharing the by processing but the retaining is solved different.

The whole trick is based on some definitions with the PDV (program data vector) and the automatic record processing / reinitializing values of recordfields.

http://support.sas.com/resources/papers/proceedings12/052-2012.pdf    Scott Mitchels code can be copied and works great.    

---->-- ja karman --<-----
Occasional Contributor
Posts: 8

Re: Concatenating variable variables problem

Thank you very much for the link. I will go look into it :-)

☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 12 replies
  • 370 views
  • 7 likes
  • 4 in conversation