Consider the following simple SAS code:
data myfmt;
retain fmtname "salegroup";
length start $20 label $20;
infile datalines delimiter='/';
input start $ label $;
datalines;
low-<700 / need improvement
700-<900 / good
900-high / top sale
;
run;
proc format cntlin=myfmt fmtlib;
run;
When I run the code, SAS complains.
When I write the same functionality in the most conventional way, as follows,
proc format fmtlib;
value salegroupv
low-<700 = 'need improvement'
700-<900 = 'good'
900-high = 'top sale';
run;
and it seems to work fine. Why the difference? How can I use the first approach (a CNTLIN= data set) to define numeric ranges for formatting my data?
The documentation for the CNTLIN= option of PROC FORMAT says, among other things:
If you are creating a format with ranges of input values, then you must specify the END variable. If range values are to be noninclusive, then the variables SEXCL and EEXCL must each have a value of Y. Inclusion is the default.
You don't have an END variable. Example:
Thanks for the hint. Here is the corrected SAS code, in case it makes life easier for newcomers:
data myfmt;
retain fmtname "salegroup";
length start SEXCL end EEXCL label $20;
infile datalines delimiter='/';
input start sexcl end eexcl label $;
datalines;
low / n / 700 / y / need improvement
700 / n / 900 / y / good
900 / n / 1000 / n / top sale
;
run;
proc format cntlin=myfmt fmtlib;
run;
When SAS begins to work with this DATA step code, the first time it encounters the variables START and END is in the LENGTH statement, where the trailing $20 applies to every variable listed before it. That makes them character variables, and the DATA step treats them as character variables from then on.
So, basically, $20 has been applied to all the variables before it? If that is the case, how can I restrict START, END, etc. to be numeric and let only the last variable have the $20 length?
In the example, START and END are character variables, because LOW, HIGH, and OTHER are valid values for them. If you don't want every variable to have length 20, you could do something like this:
length start $10 SEXCL $2 end $10 EEXCL $2 label $20;
Read the documentation of the LENGTH statement.
This will make START and END numeric, and SEXCL and EEXCL $1 instead of $20.
length start 8 SEXCL $1 end 8 EEXCL $1 label $20;
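To see how a LENGTH statement assigns type and length, a small check along these lines may help. This is only a sketch: the data set name DEMO and the variables A, B, C, X, Y, and Z are made up for illustration and do not appear in the original code.

```sas
/* Hypothetical demo data set: names are illustrative only. */
data demo;
  length a b c $20;    /* the trailing $20 applies to a, b and c          */
  length x y 8 z $5;   /* x and y become numeric (8 bytes), z is char $5  */
  call missing(of _all_);  /* initialize all variables to avoid log notes */
run;

/* The Type and Len columns show what each LENGTH specification produced. */
proc contents data=demo;
run;
```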
Also you don't need the $ in the INPUT statement if you have already defined LABEL as character.
input start sexcl end eexcl label ;
Of course you cannot read LOW and HIGH into START and END if they are numeric.
data myfmt;
length fmtname $32 start end 8 SEXCL EEXCL $1 HLO $3 label $20;
retain fmtname "salegroup";
infile datalines dsd dlm='/' truncover ;
input start sexcl end eexcl hlo label ;
datalines;
/ n / 700 / y / L / need improvement
700 / n / 900 / y / / good
900 / n / 1000 / n / / top sale
1000 / y / / n / H / WOW!!
;
proc format cntlin=myfmt fmtlib;
run;
----------------------------------------------------------------------------
|       FORMAT NAME: SALEGROUP LENGTH:   16                                |
|   MIN LENGTH:   1  MAX LENGTH:  40  DEFAULT LENGTH:  16  FUZZ: STD       |
|--------------------------------------------------------------------------|
|START           |END             |LABEL  (VER. 9.4     03NOV2023:13:08:47)|
|----------------+----------------+----------------------------------------|
|LOW             |             700<need improvement                        |
|             700|             900<good                                    |
|             900|            1000|top sale                                |
|            1000<HIGH            |WOW!!                                   |
----------------------------------------------------------------------------
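To confirm how the boundaries behave, the SALEGROUP format can be applied to a few test values. This is a minimal sketch that assumes the PROC FORMAT step above has already run in the same session; the variable names SALES and GROUP are made up for the example. Going by the FMTLIB listing, 700 should land in "good" and 1000 in "top sale", since only the range ends marked with < are excluded.

```sas
/* Assumes the SALEGROUP format was created by the CNTLIN= step above. */
data _null_;
  length group $20;
  /* Test values chosen to sit on and around the range boundaries. */
  do sales = 650, 700, 899.99, 900, 1000, 1200;
    group = put(sales, salegroup.);
    put sales= group=;   /* writes each value and its label to the log */
  end;
run;
```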
The "Low" and "High" range elements can also be specified in the data set by setting the HLO variable to include "L" for low and/or "H" for high.
"O" (a capital letter O) stands for the range value "Other".
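As a sketch of the "O" value, an OTHER row could be appended to the control data set like this. It assumes the MYFMT data set from the example with numeric START and END and an HLO variable; the data set name MYFMT_OTHER and the label text are made up for illustration. Because LOW does not cover missing numeric values, the OTHER row here would mainly catch missing values.

```sas
/* Assumes MYFMT has numeric START/END plus SEXCL, EEXCL, HLO and LABEL. */
data myfmt_other;
  set myfmt end=last;
  output;                      /* keep every existing range row          */
  if last then do;
    call missing(start, end);  /* START and END are not used for OTHER   */
    sexcl = 'N';
    eexcl = 'N';
    hlo   = 'O';               /* capital letter O marks the OTHER range */
    label = 'unclassified';    /* illustrative label, fits in $20        */
    output;
  end;
run;

proc format cntlin=myfmt_other fmtlib;
run;
```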
With the CNTLOUT= option you can write the complete definition of a format to a data set, for example:
proc format cntlin=myfmt cntlout=fmtout; run;
If you look in the Fmtout data set created you will see the HLO variable with the value L.
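One way to inspect the result is to print only the columns that describe the ranges. This sketch assumes the FMTOUT data set was created by the CNTLOUT= step above; the variables listed are standard columns of a CNTLOUT= data set.

```sas
/* Show only the range-related columns of the CNTLOUT= data set. */
proc print data=fmtout noobs;
  var fmtname start end sexcl eexcl hlo label;
run;
```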
The HLO variable is likely to be more useful when a single data set defines multiple formats at once.
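A rough sketch of one control data set that defines two formats might look like the following. The AGEGROUP format, its ranges, and the data set name MULTIFMT are hypothetical and only serve to illustrate the layout; rows belonging to the same format are kept together, and TYPE is set to N for numeric formats.

```sas
/* Hypothetical control data set defining two numeric formats at once. */
data multifmt;
  length fmtname $32 type $1 start end 8 sexcl eexcl $1 hlo $3 label $20;
  infile datalines dsd dlm='/' truncover;
  input fmtname type start sexcl end eexcl hlo label;
  datalines;
salegroup/N//N/700/Y/L/need improvement
salegroup/N/700/N/900/Y//good
salegroup/N/900/N//N/H/top sale
agegroup/N//N/18/Y/L/minor
agegroup/N/18/N//N/H/adult
;

/* Empty START or END fields are read as missing; HLO supplies LOW and HIGH. */
proc format cntlin=multifmt fmtlib;
run;
```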