Hi, I have the attached dataset and I would like to use it to create a proc format, similar to the value statement. So any value between the start and end values will be labelled with the value of the label column.
Is there any way I can do this without hard coding it (as in the code below)?
proc format;
value expenditure
low -675.86 = '332.89'
675.86<-889.91 = '350.97'
889.91<-1031.6 = '374.2'
1031.6<-1227.59 = 402.5'
1227.59<-1403.99 = '423.56'
1403.99<-1657.6 = '415.88'
1657.6<-1913.33 = '438.46'
1913.33<-2280.49 = '465.96'
2280.49<-2960.73 = '522.08'
2960.73<- high = '628.77';
run;
Your data even has the names proc format is expecting. You just need to add FMTNAME and TYPE. Since you example use LOW and HIGH in the range is did the same for low=0 and high=999999
data control;
retain fmtname 'expenditure' type 'N';
infile cards expandtabs firstobs=2;
input start end label $;
if start eq 0 then hlo = 'L';
if end eq 999999 then hlo='H';
cards;
Start End Label
0 675.86 332.89
675.87 889.91 350.97
889.92 1031.6 374.2
1031.61 1227.59 402.5
1227.6 1403.99 423.56
1404 1657.6 415.88
1657.61 1913.33 438.46
1913.34 2280.49 465.96
2280.5 2960.73 522.08
2960.74 999999 628.77
;;;;
run;
proc print;
run;
proc format cntlin=control cntlout=cntlout;
run;
proc print;
run;
Thanks for the reply. Is there any way I can do this without having to write out all the values (my aim here is that I might have to completely replace the label/start/end values with a new data set, and I'm interested in if there's a quick way of doing this without re-writing).
If you have your values in a SAS data set you can create a CNTLIN data set to provide the instructions on building your format(s).
Example:
data work.cntlin; input Start End Label ; length fmtname $ 32 HLO $ 13 type sexcl eexcl $ 1; fmtname='expenditure'; type='N'; If start=0 then do; HLO='L'; sexcl='N'; eexcl='N'; end; else if end=999999 then do; HLO='H'; sexcl='Y'; eexcl='N'; end; else do; sexcl='Y'; eexcl='N'; end; datalines; 0 675.86 332.89 675.87 889.91 350.97 889.92 1031.6 374.2 1031.61 1227.59 402.5 1227.6 1403.99 423.56 1404 1657.6 415.88 1657.61 1913.33 438.46 1913.34 2280.49 465.96 2280.5 2960.73 522.08 2960.74 999999 628.77 ; run; proc format cntlin=work.cntlin cntlout=work.cntrl; run;
Some details: The cntlin data set must contain the variables FMTNAME TYPE START and LABEL at a minimum. If you have actual ranges then you need the END variable. The TYPE variable can have the values N for numeric or C for character format. If you want to define multiple formats in a single data set all values for one format must be together, each record must have the format name in the FMTNAME variable, if the values are mixed numeric and character then START and END will have to be a character value.
IF you are specifying ranges the additional variables SEXCL (start exclusion) and EEXCL (end exclusion) come into play with values of N => do not exclude the endpoint and Y=>exclude the endpoint. The variable HLO contains additional information such as L, the 'low' value is used, H the 'high' value is used and O for 'other' range. Other values such as M appear for multilabel formats, I for informat, U used with informat to upcase values before applying the informat, J used with informat to Justify (align values basically to ignore leading spaces)
All of the options involved with a format can appear such as prefix characters, noedit, fill characters, decimal separator, are the values date/time/datetime type.
Best way to learn which options are needed for your specific use is to create syntax for a few values of a format or informat and run the code with the option CNTLOUT=LIB.DATASETNAME to create a data set that could be used as a CNTLIN option. That set will have all the options you set appear in the data set.
@ballardw: Good explanations. Minor correction: The combination of SEXCL='Y' and modified START values, e.g. 1913.34, causes unwanted (small) gaps between the ranges, e.g. put(1913.34,expenditure.)='1913.3', but should be '465.96'.
@FreelanceReinh wrote:
@ballardw: Good explanations. Minor correction: The combination of SEXCL='Y' and modified START values, e.g. 1913.34, causes unwanted (small) gaps between the ranges, e.g. put(1913.34,expenditure.)='1913.3', but should be '465.96'.
Right.
I didn't bother to check OP ranges mostly to demonstrate the CNTLIN.
Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.
Register today!Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Select SAS Training centers are offering in-person courses. View upcoming courses for: