I'm having an issue with proc format.
I create a format table using the following code:
data quote_policy_map;
infile map1 firstobs=2 dlm = ',' dsd missover lrecl = 10000 ;
format
start 8.
label $20.
;
input
start
label
;
fmtname = "QuotePolMap";
run;
proc format cntlin = quote_policy_map fmtlib; run;
A screenshot of the dataset is shown below:
I have another dataset called fleet_quote_pols (screenshot below)
I want to use the format table to read in value for the policy nunber where there is a matching quote number. Here is the code I'm using:
data fleet_quotes_pols1(drop = quote_no key rename = (quote_no2 = quote_no )) ;
set fleet_quotes_pols;
quote_no2 = input(quote_no,comma8.);
policy_no1 = put(quote_no2,QuotePolMap.);
run;
The code is returning policy numbers from the format table when there is a matching quote number BUT when there is not match it returns the quote number which I don't understand. Below is a screenshot of the dataset fleet_quotes_pols1. The cell with the red box is what I want but every other cell under the field policyno1 should be blank as there isn't a match.
Any ideas where I'm going wrong?
This is the documented behaviour of PROC FORMAT when no OTHER= entry is provided in the format definition : the entry is left unmodified for character strings or is given a default format for numeric values.
If you want unidentified values to have a blank value then you need to have your format include a start value of **OTHER** with a label of ' ' (there is a space between the quotes) in your cntlin data set. The keyword OTHER for use in a value list is somewhat buried in the Proc Format Specifying Values or Ranges documentation. And **OTHER** is what is stored when the option is used.
You can find some these nasty details out by creating a simple format by hand with the options and a cntlout statement to see what the actual values used by the format might be. Then you can include them in the cntlin data set.
Thanks Ballardw. I tried doing what you suggested but my start field is numeric and other is a character. Here is the code I used.
Just to clarify, should it be start = "**OTHER**" or start = "OTHER"
Thanks
data quote_policy_map ;
set quote_policy_map;
by fmtname;
if last.fmtname then do;
output;
start = "OTHER";
label = " ";
output;
end;
else output;
run;
Use variable HLO to indicate the OTHER= range :
data quote_policy_map ;
set quote_policy_map;
by fmtname;
if last.fmtname then do;
output;
HLO = "O";
label = " ";
output;
end;
else output;
run;
Alternatively, you could introduce HLO directly in your data step:
data quote_policy_map;
infile map1 firstobs=2 dlm = ',' dsd missover lrecl = 10000 end=last;
format start 8. label $20.;
input start label;
fmtname = "QuotePolMap";
output;
if last then do;
hlo='O';
label=' ';
output;
end;
run;
It has no impact on the format that variable START will repeat its previous value in the last observation.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.