Hi all
I have a requirement to write over 100 variables to a flat file. I am using PUT statement for that purpose.
Currently I give position and format for each and every variable while writing.
For eg:
DATA _NULL_;
FILE TEST;
SET OUTDATA;
PUT @1 FIELD1 $5.
@6 FIELD2 $10.
.........
@1000 FIELD100 $50;
RUN;
Do we have any simpler way to do this? I tried one approach of using _ALL_ with put statment.
DATA _NULL_;
FILE TEST;
SET OUTDATA;
PUT (_ALL_) (:);
RUN;
This works, but it doesn't give me the control over format of the variables. Giving formats while writing to a file may be optional, but it is required in my case as some fields have leading spaces. If they are written without giving correct format, the leading space gets automatically removed by SAS.
I tried using FORMAT _CHARACTER_ /_NUMERIC - But it doesn't seem to do the things as desired. I tried giving FORMAT statement before the DATA _NULL_ step for each variable. But that still doesn't work and they get overridden when written to the file. I also tried specifying formats for individual variable below PUT (_ALL_) statement. But still leading and trailing spaces were getting removed from the fields in the file.
Any ways to overcome this problem? Or is there any simpler way to write to big files while having control over the formats?
Are your desired formats already specified, for each variable, are you outputting all of the variables in that data set, is each supposed to be immediately adjacent to the previous one, and are they all in the same order as they exist in your data set?
Yes, I did. Still leading and trailing spaces get automatically removed.
for instance, consider this data in the SAS dataset:
Name Age
Alfred 15
Bob 20
Podolski 20
While I write this to the file, it is written as:
Alfred 15
Bob 20
Podolski 20
The leading and trailing spaces automatically go away.
But what I expect is,
Alfred 15
Bob 20
Podolski 20
I keep variables in the order while writing. Else I use retain statement to preserve the order that I wish.
I wasn't offering advice but, rather, asking a number of questions.
It looks like you are asking for a fixed column output. Which probably won't lend itself to a short solution. One basic approach is the @ symbol to specify exactly which colum you want a variable to start in.
put @1 name $10. @15 age f2.0 ;
You are using a belt and suspenders with that style of code.
You should be able to do what you want with just the formats without the column pointers.
Put your metadata into a file and generate the code by pulling the metadata into macro variables.
data structure;
length name $32 format $40 ;
input name format;
cards;
field1 $5.
field2 $10.
field3 date9.
run;
proc sql noprint ;
select name , format into :names separated by ' ' , :formats separated by ' '
from structure ;
quit;
data _null_;
file test;
set outdata ;
put (&names) (&formats);
run;
Tom, thanks for the suggestion.
Do we have any way to give formats only to few variables and not for the rest while using put statement without requiring to list all the other variables explicitly (something combined with _all_)?
Although I need to write all the fields in the dataset need to be written to the file, basically only one or two fields in the entire file would have leading/trailing spaces and applying formats for them alone would be sufficient for me. Other fields can be written as it is. I am not really sure why put statment still removes leading spaces even after applying format to _character_.
I have lost track of what your actual requirements are. If you just want to output a dataset to a text file without having to specify all of the variable names then it is much easier to use a delimited file than a fixed column file. It will also be much easier for whoever is going to have to use the file you generate.
You seemed to have some very specific format for the output file in mind. In that case I am not sure how you can not specify all of it. But you could specify the beginning columns instead of the formats.
format field3 date9.;
put @1 field1 @6 field2 @16 field3;
In that case your metadata could mainly be the starting column for each variable.
data structure;
length name $32 location 8 format $40.;
input name location format;
cards;
field1 1
field2 6
field3 16 date9.
run;
proc sql noprint ;
select catx(' ','@',location,name , format) into :names separated by ' '
from structure ;
quit;
data _null_;
file test;
set outdata ;
put &names ;
run;
Got that. Apologies for sounding confusing!
Thanks Tom.
Surely the most complex part is assembling the name|format pairs. (Obviously in a metadata table prior to generating the code).
Among that challenge I think only defaults require caution.
Already mentioned: use $char. rather than $. (Ofcourse care with user formats is needed too).
Should the resulting text be ansi, unicode and/or a NLS level?
Is default numeric output to appear as BEST8. or BEST18. or BEST32.?
For date and datetime formatted data should format length be set to ensure century appears?
For percent., TIME. and datetime. formated values, how many decimal places should appear? (One teradata timestamp default sets 6 decimal places!).
Should all percent, date and datetime formatted data not be presented in the style required by the receiving system - whatever that is?
I hope that might cover all the questions I would hsve to ask for building a generic macro to support this kind of data storage.
After all these have been resolved assembling code might become straightforward allbeit tedious ;-))
petrrC
I think that and I are thinking along the same lines. I used sashelp.class as a test, as it DOESN'T have any formats assigned, thus would need some work before this could be done. My approach would be:
data testdata;
format name $20.;
format sex $1.;
format age 2.0;
format height 4.1;
format weight 7.2;
set sashelp.class;
run;
data need;
set sashelp.vcolumn
(where=(libname eq 'WORK' and memname eq 'TESTDATA'));
len=int(compress(format,'.','kd'));
if _n_ eq 1 then pos=1;
output;
pos+len;
run;
proc sql noprint ;
select cat('@',pos,' ',trim(name),' ',trim(format))
into :names separated by ' '
from need ;
quit;
filename test "c:\testout.txt";
data _null_;
file test;
set testdata ;
put &names.;
run;
You have to realize that the code you are writing will automatically remove leading blanks. The $10. format removes leading blanks. If you want to preserve them, you have to use the $char10. format instead.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.