turn on suggestions

Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

Showing results for

Find a Community

- Home
- /
- SAS Programming
- /
- Base SAS Programming
- /
- Need help to combine multiple column into a single...

Topic Options

- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Printer Friendly Page

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

08-11-2008 09:20 PM

Hi All,

I have a problem. I have SAS table which has many variables (not fixed). I want to append this all variables into one variable. For example: I have table called MULTCOL in which I have variables A1, A2, A3, A4, A5, ... and so on. I want to create a table called CMBCOL in which I have just one variable A which is formed by appending all the variables from table MULTCOL. How can I do this. Any kind of help is appreciated.

Thanks.

I have a problem. I have SAS table which has many variables (not fixed). I want to append this all variables into one variable. For example: I have table called MULTCOL in which I have variables A1, A2, A3, A4, A5, ... and so on. I want to create a table called CMBCOL in which I have just one variable A which is formed by appending all the variables from table MULTCOL. How can I do this. Any kind of help is appreciated.

Thanks.

Accepted Solutions

Solution

09-28-2016
11:40 AM

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

08-12-2008 12:47 AM - last edited on 09-28-2016 11:46 AM by Kathryn_SAS

*Editor's Note: This solution provided by Cynthia best addressed the original question.*

Hi:

The program below shows the use of the CATT, CATS and CATX functions -- which you can read about in the documentation. In the "fake" data below, variables A1-A5 are character variables; and variables N1-N5 are numeric variables. In order to do the concatenate, you should use the explicit PUT function if you are going to create a character string from numeric values (to avoid conversion errors). There are, of course, many other ways to approach this problem -- but this is an example to get you started with the basic variable value concatenation concepts.

cynthia

```
data fakedata;
** Vars a1-a5 are character;
** vars n1-n5 are numeric;
length a1 $2 a2 $3 a3 $1 a4 $5 a5 $2;
infile datalines;
input a1 a2 a3 a4 a5 n1 n2 n3 n4 n5;
return;
datalines;
aa xyz q bbbbb tt 11 22 33 44 55
bb xyz r ccccc uu 99 88 77 66 55
;
run;
data cmbcol;
length A N $1000 A_oth N_oth $1000;
set fakedata;
A = catt(a1,a2,a3,a4,a5);
A_oth = catx(':',a1,a2,a3,a4,a5);
N = cats(put(n1, best8.), put(n2, best8.),
put(n3, best8.), put(n4, best8.),put(n5, best8.));
N_oth = catx(':',put(n1, best8.),put(n2, best8.),
put(n3, best8.), put(n4, best8.),put(n5, best8.));
run;
proc print data=cmbcol;
title 'concatenate separate variables into one big variable';
run;
```

All Replies

Solution

09-28-2016
11:40 AM

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

08-12-2008 12:47 AM - last edited on 09-28-2016 11:46 AM by Kathryn_SAS

*Editor's Note: This solution provided by Cynthia best addressed the original question.*

Hi:

The program below shows the use of the CATT, CATS and CATX functions -- which you can read about in the documentation. In the "fake" data below, variables A1-A5 are character variables; and variables N1-N5 are numeric variables. In order to do the concatenate, you should use the explicit PUT function if you are going to create a character string from numeric values (to avoid conversion errors). There are, of course, many other ways to approach this problem -- but this is an example to get you started with the basic variable value concatenation concepts.

cynthia

```
data fakedata;
** Vars a1-a5 are character;
** vars n1-n5 are numeric;
length a1 $2 a2 $3 a3 $1 a4 $5 a5 $2;
infile datalines;
input a1 a2 a3 a4 a5 n1 n2 n3 n4 n5;
return;
datalines;
aa xyz q bbbbb tt 11 22 33 44 55
bb xyz r ccccc uu 99 88 77 66 55
;
run;
data cmbcol;
length A N $1000 A_oth N_oth $1000;
set fakedata;
A = catt(a1,a2,a3,a4,a5);
A_oth = catx(':',a1,a2,a3,a4,a5);
N = cats(put(n1, best8.), put(n2, best8.),
put(n3, best8.), put(n4, best8.),put(n5, best8.));
N_oth = catx(':',put(n1, best8.),put(n2, best8.),
put(n3, best8.), put(n4, best8.),put(n5, best8.));
run;
proc print data=cmbcol;
title 'concatenate separate variables into one big variable';
run;
```

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

08-12-2008 01:33 AM

Hi Cynthia,

Thanks for your post. But this not I am looking for. You are considering a fixed number of variables in the data set. But in my case the dataset I am dealing with has number of variables not fixed. For one day it has 6 variables and for another day it has 10 variables.

I thought of creating macro variable. But I am not getting the idea.

I appreciate your help.

Thanks

Thanks for your post. But this not I am looking for. You are considering a fixed number of variables in the data set. But in my case the dataset I am dealing with has number of variables not fixed. For one day it has 6 variables and for another day it has 10 variables.

I thought of creating macro variable. But I am not getting the idea.

I appreciate your help.

Thanks

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

08-12-2008 02:02 AM

Hi:

In that case, you are going to have to jump into the deep end. You can use a PROC SQL query on DICTIONARY.COLUMNS to find out what the column names are (and how many there are) on any particular day. Then you'll probably have to work on the logic of how a macro program would operate and that would depend on whether you need macro conditional logic or not.

The key is having a working SAS program before you "macro-ize" it. And you have to resolve how you are going to pass the dataset name, how you are going to make the macro program, how you are pass in the output dataset name, etc. But you still have to start with a working SAS program and at the heart of the working SAS program will be one of the concatenate functions.

The program below concatenates all the variables from SASHELP.SHOES into two new variables.

cynthia

[pre]

proc sql;

select name into :varstr2 separated by ','

from dictionary.columns

where libname = "SASHELP" and

memname = "SHOES";

quit;

data makenew;

length newvar newvar2 $1000;

set sashelp.shoes;

newvar = catt(&varstr2);

newvar2 = catx(':',&varstr2);

run;

proc print data=makenew(obs=5);

title "Resolved Var List is: &varstr2";

run;

[/pre]

In that case, you are going to have to jump into the deep end. You can use a PROC SQL query on DICTIONARY.COLUMNS to find out what the column names are (and how many there are) on any particular day. Then you'll probably have to work on the logic of how a macro program would operate and that would depend on whether you need macro conditional logic or not.

The key is having a working SAS program before you "macro-ize" it. And you have to resolve how you are going to pass the dataset name, how you are going to make the macro program, how you are pass in the output dataset name, etc. But you still have to start with a working SAS program and at the heart of the working SAS program will be one of the concatenate functions.

The program below concatenates all the variables from SASHELP.SHOES into two new variables.

cynthia

[pre]

proc sql;

select name into :varstr2 separated by ','

from dictionary.columns

where libname = "SASHELP" and

memname = "SHOES";

quit;

data makenew;

length newvar newvar2 $1000;

set sashelp.shoes;

newvar = catt(&varstr2);

newvar2 = catx(':',&varstr2);

run;

proc print data=makenew(obs=5);

title "Resolved Var List is: &varstr2";

run;

[/pre]

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

08-12-2008 12:05 PM

Hi Cynthia,

Thank you for your quick response. I read the manual for CATT function. Actually CATT function concatenates the variable. What I understood is suppose I have two variables x and y then CATT will make it XY. My problem is not to concatenate but to append. I should have made it clear at my first posting. I am trying to re write it. Sorry for the confusion.

I have a dataset MULTICOL as given below.

A1 A2 A3 A4

1 2 3 4

5 6 7 8

I want to create a table COMCOL with one variable as

A

1

5

2

6

3

7

4

8.

Condition is : The number of variable is not fixed in the dataset MULTCOL. Everytime it keeps on changing. In the above case I have just 4 variables but in real it could be any number.

Thanks for your help.

rookie72.

Thank you for your quick response. I read the manual for CATT function. Actually CATT function concatenates the variable. What I understood is suppose I have two variables x and y then CATT will make it XY. My problem is not to concatenate but to append. I should have made it clear at my first posting. I am trying to re write it. Sorry for the confusion.

I have a dataset MULTICOL as given below.

A1 A2 A3 A4

1 2 3 4

5 6 7 8

I want to create a table COMCOL with one variable as

A

1

5

2

6

3

7

4

8.

Condition is : The number of variable is not fixed in the dataset MULTCOL. Everytime it keeps on changing. In the above case I have just 4 variables but in real it could be any number.

Thanks for your help.

rookie72.

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

08-12-2008 03:14 PM

Hi:

Ah, well, I wouldn't call that either an append or a combine or a concatenate. Basically it falls under the category, in my mind, of changing the structure of the data, based on some criteria that is unique to your needs.

In a lot of cases, folks use PROC TRANSPOSE for this kind of task. You can also use a DATA step program -- in your example (much easier to come up with the -right- solution with an example of the input file and the desired output file) -- and you don't even need any macro programming at all -- except maybe for the input file names, depending on whether they changed.

That's because SAS has a couple of tools available that allow you to references ALL the numeric variables or ALL the character variables in your dataset. This would let you build an array reference for the variables and you would not necessarily need to know how many columns there were from data file to data file.

Take a look at the program below and run it. Examine the output and see if that's more along the lines of what you were thinking of. I made the orig_obsnum and the orig_varname variables in the final dataset, because you may, someday, wish to trace the original origins of some of these values/measurements.

cynthia

[pre]

data multcol;

infile datalines;

input A1 A2 A3 A4;

return;

datalines;

1 2 3 4

5 6 7 8

;

run;

proc print data=multcol noobs;

title 'multcol';

run;

data cmbcol(keep=a orig_varname orig_obsnum);

set multcol;

array myvars _numeric_;

do i = 1 to dim(myvars);

orig_varname = vname(myvars(i));

orig_obsnum = _n_;

A = myvars(i);

output;

end;

run;

proc sort data=cmbcol;

by orig_varname a;

run;

proc print data=cmbcol noobs;

title 'cmbcol';

run;

[/pre]

Ah, well, I wouldn't call that either an append or a combine or a concatenate. Basically it falls under the category, in my mind, of changing the structure of the data, based on some criteria that is unique to your needs.

In a lot of cases, folks use PROC TRANSPOSE for this kind of task. You can also use a DATA step program -- in your example (much easier to come up with the -right- solution with an example of the input file and the desired output file) -- and you don't even need any macro programming at all -- except maybe for the input file names, depending on whether they changed.

That's because SAS has a couple of tools available that allow you to references ALL the numeric variables or ALL the character variables in your dataset. This would let you build an array reference for the variables and you would not necessarily need to know how many columns there were from data file to data file.

Take a look at the program below and run it. Examine the output and see if that's more along the lines of what you were thinking of. I made the orig_obsnum and the orig_varname variables in the final dataset, because you may, someday, wish to trace the original origins of some of these values/measurements.

cynthia

[pre]

data multcol;

infile datalines;

input A1 A2 A3 A4;

return;

datalines;

1 2 3 4

5 6 7 8

;

run;

proc print data=multcol noobs;

title 'multcol';

run;

data cmbcol(keep=a orig_varname orig_obsnum);

set multcol;

array myvars _numeric_;

do i = 1 to dim(myvars);

orig_varname = vname(myvars(i));

orig_obsnum = _n_;

A = myvars(i);

output;

end;

run;

proc sort data=cmbcol;

by orig_varname a;

run;

proc print data=cmbcol noobs;

title 'cmbcol';

run;

[/pre]

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

08-13-2008 02:52 PM

Thank you Cynthia. This is exactly what I wanted. Proc transpose came into my mind but since the variable number was not constant I couldn't think much. The ARRAY never came into my mind.

Thank you again for your help.

Thank you again for your help.

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

08-22-2014 05:19 AM

Hi Cynthia,

In continuation to earlier communication, I have similar kind of requirement as below but little change..hope you can help on this.

I have consumers name field with character length 80. that i have splited in to multiple columns based on spaces..and now i have got around 15 columns after spliting.

now, my requirement is I need to fit all the 15 columns(that were created above with different lengths) values in to 5 output columns of length 26 each.Can you please suggest and let me know if you need more inputs.

Thanks in advance.

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

03-20-2017 06:54 PM

Hi,

I need something similar to this, but I would like only UNIQUE Values populated into my new string variable.

What I'm doing is Transposing some information and then srunching all of that items into one variable. I have written a macro to do this, but I can get multiple items sent to my new string variable.

I have added a duplicate record in the datalines. (11 in line 1 and 99 in line 2)

How would I code it so that these duplicated items were only listed once?

**data** fakedata;

** Vars a1-a5 are character;

** vars n1-n5 are numeric;

length a1 $**2** a2 $**3** a3 $**1** a4 $**5** a5 $**2**;

infile datalines;

input a1 a2 a3 a4 a5 n1 n2 n3 n4 n5;

return;

datalines;

aa xyz q bbbbb tt 11 22 11 44 55

bb xyz r ccccc uu 99 88 99 66 55

;

**run**;

**data** cmbcol;

length A N $**1000** A_oth N_oth $**1000**;

set fakedata;

A_oth = catx(',',a1,a2,a3,a4,a5);

N_oth = catx(',',put(n1, best8.),put(n2, best8.),

put(n3, best8.), put(n4, best8.),put(n5, best8.));

**run**;

**proc** **print** data=cmbcol;

title 'concatenate separate variables into one big variable';

**run**;

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

08-12-2008 12:47 AM

Consider using the CATT or CATX function in a SAS 9 DATA step, if you need to concatenate variables together, for whatever obscure reason. Remember there is a max length limit to a SAS character variable. Consult the SAS Language DOC for pertinent details.

Scott Barry

SBBWorks, Inc.

Scott Barry

SBBWorks, Inc.