Is there a general purpose way of concatenating each variable in an observation into one larger variable whilst preserving the format of numeric/currency fields in terms of how it looks when you do a proc print on the dataset. (see sashelp.shoes for example)
Here is some code you can run, as you can see when looking at the log, using the catx function to produce a comma separated output removes both the $ currency sign as well as the period from the numeric variables
proc print data=sashelp.shoes (obs=10); run; proc sql; select name into :varstr2 separated by ',' from dictionary.columns where libname = "SASHELP" and memname = "SHOES"; quit; data stuff(); format all $5000.; set sashelp.shoes ; all = catx(',',&varstr2.) ; put all; run;
If you really wanted to do this you could use the CALL VNEXT() function and the VVALUEX() function to loop over all of the variables.
You could even add in some smarts to deal with missing values and adding quotes around values with special characters.
Let's make a sample little dataset to use.
data have;
set sashelp.shoes(obs=5);
if _n_=2 then call missing(region);
if _n_=3 then call missing(stores);
if _n_=4 then product='.';
run;
So we will need to make some new variables to drive the logic. At least one to take the name of the variable. But more to add more flexibility.
data want;
set have;
length __name $32 __type __dlm $1 __value __all $5000 ;
__dlm=',';
do _n_=1 by 1 while (1);
call vnext(__name,__type);
if __name='__name' then leave;
__value=vvaluex(__name);
if strip(__value) = getoption('missing') and __type='N' then __value=' ';
if indexc(__value,'"','0D0A'x,__dlm) then __value=quote(strip(__value));
if _n_=1 then __all=__value;
else __all=cats(__all,__dlm,__value);
end;
drop __name __type __value __dlm ;
run;
So the loop will use CALL VNEXT() to find the next variables name and then use VVALUEX() to get the formatted value.
It then checks to make sure that the formatted value for a numeric variable is just the character being used for missing values as set in the MISSING option. Then it adds quotes around values that have a quote or the delimiter or carriage return or a linefeed.
Then when it is the first variable is just stores the value, but for the additional variables at adds in the delimiter.
Finally it drops the extra variables it created.
Obs Region Product Subsidiary Stores Sales Inventory Returns 1 Africa Boot Addis Ababa 12 $29,761 $191,821 $769 2 Men's Casual Addis Ababa 4 $67,242 $118,036 $2,284 3 Africa Men's Dress Addis Ababa . $76,793 $136,273 $2,433 4 Africa . Addis Ababa 10 $62,819 $204,284 $1,861 5 Africa Slipper Addis Ababa 14 $68,641 $279,795 $1,771 Obs __all 1 Africa,Boot,Addis Ababa,12,"$29,761","$191,821",$769 2 ,Men's Casual,Addis Ababa,4,"$67,242","$118,036","$2,284" 3 Africa,Men's Dress,Addis Ababa,,"$76,793","$136,273","$2,433" 4 Africa,.,Addis Ababa,10,"$62,819","$204,284","$1,861" 5 Africa,Slipper,Addis Ababa,14,"$68,641","$279,795","$1,771"
Here is a version converted to a macro that uses a temporary array to reduce the chance of conflicts between the names of variables in the input dataset and the variables needed for the processing.
%macro cat_all
(inds= /* Input dataset name */
,outds= /* Output dataset name */
,dlm=',' /* Delimiter (as quoted string) */
,var=all /* New variable name */
,len=5000 /* Max length of new variable */
);
/*
The DO loop will stop when at reaches the new variable.
The _N_ counter is used to be able to tell when processing the first
variable.
Uses a temporary array to reduce potential name conflicts.
So only variable names not allowed in the input dataset are
the value of &VAR and the name ___ used for the array.
Meaning of the Array entries
___[1] is NAME of the current variable
___[2] is TYPE of the current variable
___[3] is VALUE of the current variable
Because the temporary array values are retained need to force
___[1] to empty so that CALL VNEXT() starts at first variable name each
iteration of the data step loop.
If the variable is numeric and its only value is the character used
by the MISSING option then force value to blanks.
When value contains quotes, CR, LF or the delimiter then add quotes
around the value.
Will not work with space as the delimiter.
*/
data &outds;
set &inds;
length &var $&len ;
array ___ [3] $&len _temporary_;
___[1]=' ';
do _n_=1 by 1 while (1);
call vnext(___[1],___[2]);
if ___[1]="&var" then leave;
___[3]=vvaluex(___[1]);
if strip(___[3]) = getoption('missing') and ___[2]='N' then ___[3]=' ';
if indexc(___[3],'"','0D0A'x,&dlm) then ___[3]=quote(strip(___[3]));
if _n_=1 then &var=___[3];
else &var=cats(&var,&dlm,___[3]);
end;
run;
%mend cat_all;
Why do you want all values in one variable?
The function vvalue returns the formatted value of a variable, and vvaluex allows the usage of expressions:
proc sql noprint;
select Name format=$quote32.
into :NameList separated by ','
from sashelp.vcolumn
where libname = 'SASHELP' and memname = 'SHOES'
;
quit;
%put &=NameList;
data narf;
set sashelp.shoes;
length everything anything $ 1000 _var $ 32;
everything = catx(',',of _all_);
do _var = &NameList.;
anything = catx(',', anything, vvaluex(_var));
end;
keep everything anything;
run;
proc print;run;
Is there a way to have a null field instead of a period for missing data?
So instead of A,B,.,D we would have A,B,,D
See if this comes close to what you want. This uses a temporary text file that you should set to a location and name of your choice.
May want to run this code once with your data set a length for the LONGVAR that is appropriate.
filename myfile "x:\data\junk.csv"; Proc export data=sashelp.shoes file= myfile dbms=csv replace; run; data want; infile myfile dlm=',' lrecl=1000; input; longvar= _infile_; run;
@taupirho wrote:
I don't have access to the filesystem to save files to
Just use the TEMP engine on the FILENAME statement to make a temporary file.
The PUT statement already knows how to do this.
data _null_;
set sashelp.shoes(obs=3);
file 'myfile.csv' dsd ;
put (_all_) (+0);
run;
What is the goal of this exercise? If you want to create a file why not just write the file directly?
data _null_;
set sashelp.shoes(obs=3);
file 'myfile.csv' dsd ;
put (_all_) (+0);
run;
If you really do want to get that string into a dataset variable then you could just read the file back in.
filename junk temp;
data _null_;
set sashelp.shoes(obs=3);
file junk dsd ;
put (_all_) (+0);
run;
data stuff ;
set sashelp.shoes(obs=3);
infile junk truncover ;
input all $5000.;
run;
You can even do it without creating the full text file. Instead just write one line at a time and save the line into a variable using the _FILE_
automatic variable.
filename junk temp;
data stuff;
set sashelp.shoes(obs=3);
file junk dsd lrecl=5000 ;
length all $5000;
put @1 (_all_) (+0) +(-2) ' ' @;
all = _file_;
output;
all=' ';
put @1 all $5000. @;
run;
I don't want to write a file!. When I concatenate the fields together, missing fields get represented by a period character. Can it be made so that nothing is output for a missing field (without writing files!). So instead of A,B.,.,.,.,.,Y I get A,B,,,,,Y
Also a supplementary question. Does the field separator have to be a printable character. Can it be something like a control character such as CTRL-A
@taupirho wrote:
I don't want to write a file!.
How are we supposed to know that?
@taupirho wrote:
When I concatenate the fields together, missing fields get represented by a period character. Can it be made so that nothing is output for a missing field (without writing files!). So instead of A,B.,.,.,.,.,Y I get A,B,,,,,Y
This depends on how you use the CAT functions. If you use OF it doesn't do that, if you use the full list it does.
@taupirho wrote:
Also a supplementary question. Does the field separator have to be a printable character. Can it be something like a control character such as CTRL-A
Did you try it? What happened? It doesn't have to be a printable character, but then it depends on how the output is displayed in your destination. Listing will likely show it it differently than HTML or PDF. So you may not see it either which makes it harder to debug.
Some one else has a very similar question right now, so I'm assuming this is homework or something?
There is a request to have this functionality built into SAS but it currently doesn't exist:
You can roll your own with PROC FCMP though or see the example of a solution in the comments.
If you really wanted to do this you could use the CALL VNEXT() function and the VVALUEX() function to loop over all of the variables.
You could even add in some smarts to deal with missing values and adding quotes around values with special characters.
Let's make a sample little dataset to use.
data have;
set sashelp.shoes(obs=5);
if _n_=2 then call missing(region);
if _n_=3 then call missing(stores);
if _n_=4 then product='.';
run;
So we will need to make some new variables to drive the logic. At least one to take the name of the variable. But more to add more flexibility.
data want;
set have;
length __name $32 __type __dlm $1 __value __all $5000 ;
__dlm=',';
do _n_=1 by 1 while (1);
call vnext(__name,__type);
if __name='__name' then leave;
__value=vvaluex(__name);
if strip(__value) = getoption('missing') and __type='N' then __value=' ';
if indexc(__value,'"','0D0A'x,__dlm) then __value=quote(strip(__value));
if _n_=1 then __all=__value;
else __all=cats(__all,__dlm,__value);
end;
drop __name __type __value __dlm ;
run;
So the loop will use CALL VNEXT() to find the next variables name and then use VVALUEX() to get the formatted value.
It then checks to make sure that the formatted value for a numeric variable is just the character being used for missing values as set in the MISSING option. Then it adds quotes around values that have a quote or the delimiter or carriage return or a linefeed.
Then when it is the first variable is just stores the value, but for the additional variables at adds in the delimiter.
Finally it drops the extra variables it created.
Obs Region Product Subsidiary Stores Sales Inventory Returns 1 Africa Boot Addis Ababa 12 $29,761 $191,821 $769 2 Men's Casual Addis Ababa 4 $67,242 $118,036 $2,284 3 Africa Men's Dress Addis Ababa . $76,793 $136,273 $2,433 4 Africa . Addis Ababa 10 $62,819 $204,284 $1,861 5 Africa Slipper Addis Ababa 14 $68,641 $279,795 $1,771 Obs __all 1 Africa,Boot,Addis Ababa,12,"$29,761","$191,821",$769 2 ,Men's Casual,Addis Ababa,4,"$67,242","$118,036","$2,284" 3 Africa,Men's Dress,Addis Ababa,,"$76,793","$136,273","$2,433" 4 Africa,.,Addis Ababa,10,"$62,819","$204,284","$1,861" 5 Africa,Slipper,Addis Ababa,14,"$68,641","$279,795","$1,771"
Here is a version converted to a macro that uses a temporary array to reduce the chance of conflicts between the names of variables in the input dataset and the variables needed for the processing.
%macro cat_all
(inds= /* Input dataset name */
,outds= /* Output dataset name */
,dlm=',' /* Delimiter (as quoted string) */
,var=all /* New variable name */
,len=5000 /* Max length of new variable */
);
/*
The DO loop will stop when at reaches the new variable.
The _N_ counter is used to be able to tell when processing the first
variable.
Uses a temporary array to reduce potential name conflicts.
So only variable names not allowed in the input dataset are
the value of &VAR and the name ___ used for the array.
Meaning of the Array entries
___[1] is NAME of the current variable
___[2] is TYPE of the current variable
___[3] is VALUE of the current variable
Because the temporary array values are retained need to force
___[1] to empty so that CALL VNEXT() starts at first variable name each
iteration of the data step loop.
If the variable is numeric and its only value is the character used
by the MISSING option then force value to blanks.
When value contains quotes, CR, LF or the delimiter then add quotes
around the value.
Will not work with space as the delimiter.
*/
data &outds;
set &inds;
length &var $&len ;
array ___ [3] $&len _temporary_;
___[1]=' ';
do _n_=1 by 1 while (1);
call vnext(___[1],___[2]);
if ___[1]="&var" then leave;
___[3]=vvaluex(___[1]);
if strip(___[3]) = getoption('missing') and ___[2]='N' then ___[3]=' ';
if indexc(___[3],'"','0D0A'x,&dlm) then ___[3]=quote(strip(___[3]));
if _n_=1 then &var=___[3];
else &var=cats(&var,&dlm,___[3]);
end;
run;
%mend cat_all;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.