BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
fieldsa83
Quartz | Level 8

I have a .sas file with proc format for 200 variables (let's call it FORMATS.sas).

 

Now, when I read in actual data from an INFILE statement (let's call this DATAFILE), I %include the FORMATS.sas file.

 

Then, in the data step I have a format statement which then lists all 200 variable names and their format labels, so imagine:

 

format  
   var1 var1.
   var2 var2.
   var3 var3. ;
(...etc until all 200 are listed).
 

 

There are 2 issues with this process:

 

1. If I choose to only read in 10 of the 200 variables in the INFILE, the FORMAT statement will give 190 errors. So how can I only apply formats to variables that actually have been read in (without manually commenting out 190 lines in the format statement)?

 

2. How can I make this more efficient in general? I would like it to just have a database that it automatically grabs the format labels since the variable and its corresponding format is exactly the same name?

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
mkeintz
PROC Star

Make the dataset without formats, then run a macro calling dictionary.columns to get all the varnames and make a macrovar listing the varnames and matching format names. Then run PROC DATASETS to assign the formats, using that macrovar.

 

data test;
  input a b c;
datalines;
1 11 111
run;

proc format;
  value a other=[3.];
  value b other=[3.];
  value c other=[3.];
  value x other=[3.];
  value y other=[3.];
run;

%macro assign_fmts(dsn);
   %let dsn=%upcase(&dsn);
   %let dataset=%scan(&dsn,-1,%str(.));
   %if &dataset=&dsn %then %let libname=WORK;
   %else %let libname=%upcase(%scan(&dsn,1,%str(.)));
   %put _local_;
   proc sql noprint;
     select distinct catx(' ',name,cats(name,.)) into :fmts separated by " "
       from dictionary.columns where libname="&libname" and memname="&dataset";
   quit;
   proc datasets lib=&libname noprint ;
      modify &dataset;
      format &fmts;
      run;
   quit;
%mend ;

%assign_fmts(test);

 

 

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

View solution in original post

9 REPLIES 9
art297
Opal | Level 21

I would just assign the formats to the entire file and save the file with the formats attached. Isn't that possible?

 

Conversely, if that isn't possible, then I'd create a macro in which you could specify the desired variable, use proc sql to read all of the variables from dictionary.columns, have it write a macro variable that included all of the desired format assignments, and then apply them while selecting the desired file.

 

Art, CEO, AnalystFinder.com

 

fieldsa83
Quartz | Level 8

The dictionary.column macro sounds intriguing. Do you have a sample code of how that would work?

 

(the first option isn't possible since the source data is actually in non-writeable text files, and there are hundreds of gigabytes worth--so I don't want to make 1 single database. also new text files are added each month to the folder)

art297
Opal | Level 21

Since they're text files you would need to be able to read in all variables. As such, you would need to know the file layout.

 

If you kept the variable names, location, informats and relevant formats if any for all variables, then that file could be accessed and, using statements in sql like:

 

select catx(' ','input ',varname,column_range,informat,'@;',

          catx(' ',format,varname,format),

          etc. into : inputs separated by ' ', formats separated by ' '

            where varname in &macro_variable_listing_all_desired_variables.

 

those macro variables could be used to both input and format the data.

 

Art, CEO, AnalystFinder.com

 

fieldsa83
Quartz | Level 8

Thanks. I'm very interested in how I would use this to input variables, as you say.

 

If for example my text file has AGE at line 10 for 3 characters, how would I incorporate that in this?

mkeintz
PROC Star

Make the dataset without formats, then run a macro calling dictionary.columns to get all the varnames and make a macrovar listing the varnames and matching format names. Then run PROC DATASETS to assign the formats, using that macrovar.

 

data test;
  input a b c;
datalines;
1 11 111
run;

proc format;
  value a other=[3.];
  value b other=[3.];
  value c other=[3.];
  value x other=[3.];
  value y other=[3.];
run;

%macro assign_fmts(dsn);
   %let dsn=%upcase(&dsn);
   %let dataset=%scan(&dsn,-1,%str(.));
   %if &dataset=&dsn %then %let libname=WORK;
   %else %let libname=%upcase(%scan(&dsn,1,%str(.)));
   %put _local_;
   proc sql noprint;
     select distinct catx(' ',name,cats(name,.)) into :fmts separated by " "
       from dictionary.columns where libname="&libname" and memname="&dataset";
   quit;
   proc datasets lib=&libname noprint ;
      modify &dataset;
      format &fmts;
      run;
   quit;
%mend ;

%assign_fmts(test);

 

 

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
fieldsa83
Quartz | Level 8

This is perfect, thanks so much! Really appreciate it.

 

One thing: it seems to break if the varname has an underscore in it, do you think there's a way to solve that (not a big deal if not)?

ballardw
Super User

@fieldsa83 wrote:

This is perfect, thanks so much! Really appreciate it.

 

One thing: it seems to break if the varname has an underscore in it, do you think there's a way to solve that (not a big deal if not)?


You should provide an example of the variable name and the format you are want to assign and describe any symptoms in more detail than "seems to break". if you get errors post the log with the errors. Since a macro may be involved run the code with

OPTIONS MPRINT SYMBOLGEN;

to get more detail into the log. 

fieldsa83
Quartz | Level 8

Sorry I don't think the underscore is actually a problem as much as I am confused...

 

How can i add a suffix to the variable format name? I'm trying to add "en" to the end of every variable name format.

 

I tried to add it in the same section as the period but it's saying "the following columns were not found in the contribitng tables: en"

 

And if i try to add _en it says "syntax error, expecting one of the following: a name, *."

 

Ideally the result would be varname1 varname1_en.

 

 

mkeintz
PROC Star

Then instead of

                   cats(name,.)

use

                   cats(name,'en',.))

 

 

Study the CATS function, and the related CATX, etc.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 9 replies
  • 2173 views
  • 4 likes
  • 4 in conversation