01-09-2017 12:38 PM
I have a dataset with more than 200 variables and as many user-defined formats (I have a separate sas file with the formats). Each variable is associated with a separate format. I need to find out the format associated with each variable, but I am not able to do so. proc contents give me only the variable names, while proc catalog outputs only the formats. Thanks in advance!
01-09-2017 12:50 PM
Have you looked into proc datasets? see: https://support.sas.com/documentation/cdl/en/proc/65145/HTML/default/viewer.htm#p120panelmbpren1m0j2...
Art, CEO, AnalystFinder.com
01-09-2017 02:17 PM - edited 01-09-2017 04:17 PM
Keep looking. PROC CONTENTS provides the format names as well.
IF they have been assigned. I you don't see them then you may be out of luck as until SAS has been told which formats are associated there isn't any way to guess (though formats with names similar to a varaible name may be a clue).
By default the SAS supplied dataset doesn't have formats associated and uses defaults. So Proc contents doesn't show them. But when I explicitly assign a format to a couple variables in a different data set based on that set the format names appear for those variables.
Proc contents data=sashelp.class; run; data junk; set sashelp.class; format height f5.3 weight best7.; run; proc contents data=junk; run;
If you don't see the formats they aren't assigned.
01-09-2017 02:22 PM
ballardw, you could easily be right. It may just be a pile of variables, and a pile of formats, with no association between them (other than in the mind of an analyst).
Mahip, if you could see all the information on paper in front of you, how would you know which format belongs to which variable(s)?
01-09-2017 06:11 PM
Thanks! Looks like the formats are not getting assigned to the variables. I have permanently stored formats in a SAS catalog and I am calling them with "options fmtsearch=(libraryname)". But, the formats are not getting associated with the variables; the dataset doesn't show the formats when opened. I am not sure what's the problem here.
01-09-2017 07:00 PM - edited 01-10-2017 11:39 AM
It has been awhile since I used SPSS but I vaguely recall a couple of different options for exporting to SAS and at least one involved formats. Something like "change value labels to format". It may be that option wasn't selected from the SPSS side. Which I also vaguely recall created the SAS Proc format syntax for the value label to format and then assigned the formats to the variables.
So if you have access to SPSS, or what ever your source had, they may be able to re-export the data with format related options.
01-09-2017 07:22 PM
There is no requirement to permanently attach a format to a variable to be able to use the format with the variable. Just add the FORMAT statement in the step where you want to use it.
proc print data=have ; format sex sex. ; run;
If you don't have a data dictionary that explains the variables and their formats then you might have make one yourself. First place to look is to see if there are any FORMAT statements in the SAS code that you do have. Next is to look at the formats themselves and see if it is obvious which format go with which variables. The FMTLIB option of the PROC FORMAT statement is good for that. Or you can use CNTLOUT= option to create a dataset of the formats and look at the START and LABEL values. Perhaps the formats have names that match the variables? Or perhaps they have names that match the meaning of the format.
01-09-2017 10:27 PM
The problem is I have 200 variables and 200 formats. I have permanently stored those formats with cntlin option. My variable entries are numeric, while the formats applied to them are descriptions. Now, when I open the dataset - even after using option fmtsearch - the formats (which are descriptive in nature) are not there; rather, I find the same numeric entries. One way is I write all the variable names and the formats associated in data step. But isn't there any efficient way to do this, considering that I already have formats stored in permanent library? Hope I made myself clear. Thanks in advance!
01-09-2017 10:42 PM
FMTSEARCH only tells SAS where to retrieve a format definition from when it gets used. It doesn't tell SAS which formats need to be assigned to which variables and as there is a multitude of candidate formats for a variable, there is no way for SAS "to know" which format you would like to use if you haven't told SAS explicitely so (=permanent coded assignment).
Formats are an attribute of a variable in a table and that's the level where you need to permanently assign the format.
Issuing Proc Contents data=<table>; will report on the variables of a table and show if and which formats have been assigned to variables. If the Format column is blank in the report then no specific formats have been assigned and SAS will use defaults for printing.
the formats (which are descriptive in nature) are not there; rather, I find the same numeric entries.
So the formats haven't been assigned permanently.
There are multiple ways how you can assign a format permanently (ie. usind Proc Datasets) but you need to do this explicitely.
01-10-2017 12:01 PM
If you have the Cntlin/cntlout data set then the fun part is matching the distribution of start/end values with the values of the variables in the data set. It might be a fun programming exercise but I doubt that it would be short or even efficient.
That may be reduced if you have variable names that match some of the format names. Have you looked at that yet? Compare proc contents (dictionary.columnsoutput to a frequency of the CNTLIN data set FMTNAME
This may give you start:
/* this should find variable name matches to formatname*/ proc sql; create table namematch as select a.name,b.fmtname from (select name from dictionary.columns where libname='YOURLIB' and memname='YOURDATA') as a left join (select distinct fmtname from YOURCNTLINDATASET) as b on upcase(a.name)=(b.fmtname) where upcase(a.name)=(b.fmtname); quit; /* this should find variable name matches where the variable name is part of the format name*/ proc sql; create table nearmatch as select a.name,b.fmtname from (select name from dictionary.columns where libname='YOURLIB' and memname='YOURDATA') as a ,(select distinct fmtname from YOURCNTLINDATASET) as b where index(upcase(b.fmtname),upcase(a.name))>0; quit;
Note that hte libname and memname in dictionary.columns are uppercase. You will need to supply the data sets and libraries as needed.
If you get matches you should be able to create text of the format assignment statements. Easiest would be one statement for each matched variable. Note that the nearmatch will duplicate matches from the namematch.