Hello,
I'm trying to use proc format to read in a table using the following code:
data test_storm_categories;
	length Start $8  End $8 Label $15;
	retain fmtName "testStormFormat" type "n";
	input Start $ End $ Label $;
	infile datalines dlm = ",";
	datalines;
	low,63,No Category
	64,82,Category 1 
	83,95,Category 2 	
	96,112,Category 3 	
	113,136,Category 4 
	137,high,Category 5
	;
	run;
proc format cntlIn = test_storm_categories;
	run;But, I'm getting the following error:
ERROR: Cannot mix missing and nonmissing values in the same range: .-63.
Can someone explain to me why 'low' isn't being read as a keyword here?
Thanks,
kelbrosna
PS. Here are some more details:
1. Version number (per proc setinit) is 9.04.01M6P110718, and I'm using SAS Studio.
2. I'm aware that you can specify keywords for ranges using an 'HLO' column, and that fix works for me. I don't need this code to work, I'm just curious why it fails.
3. This code is based on an exercise from the SAS Programming 2: Data Manipulation Techniques course notes (Lesson 4.2, page 4-20 step 2).
a. The exercise uses a premade table (storm_categories.sas7bdat) from the course data (which I would include, but I don't think I'm allowed to share it), and I also read in that table using the following code:
data storm_categories;
	set pg2.storm_categories (rename=
	(Low = Start High = End Category = Label));
	retain fmtName "stormFormat";
	run;
proc format cntlIn = storm_categories;
	run;This code runs as expected, and it does not give any errors. What's perplexing about this is that the test_storm_categories dataset from the datalines statement in the original code is seemingly identical to this storm_categories dataset, but their behavior is different.
b. When I export and re-import the storm_categories.sas7bdat table as a CSV file (using proc export and proc import), then perform the necessary operations and feed it to proc format, I get the same error as that from test_storm_categories. This makes me think that there may be some kind of hidden metadata in the storm_categories.sas7bdat table that matters. Is that a possibility?
4. The reason why I think the 'low' keyword is being misread is because when changed to a number, e.g. 40, no error is raised. However, the 'high' keyword doesn't raise any error, even when left unchanged. My current understanding of what is happening is that the character string "low" is being read as a missing numeric value, but I'm not sure why this is.
@kelbrosna So your initial code doesn't work because you have leading spaces in front of the word low, so SAS isn't recognizing it as low. If you trim the field it works fine. Leading spaces don't show in the table view which makes this one a bit harder to spot initially.
data test_storm_categories;
	length Start $8  End $8 Label $15;
	retain fmtName "testStormFormat" type "n";
	input Start $ End $ Label $;
	start = trim(start);
	infile datalines dlm = ",";
	datalines;
low,63,No Category
64,82,Category 1 
83,95,Category 2 	
96,112,Category 3 	
113,136,Category 4 
137,high,Category 5
	;
	run;
	
	proc format cntlin = test_storm_categories;
	run;Thanks for the help with the issue resolution @Tom
High/Low keyword require the HLO flag set. Because you could have a value of High/Low you wanted to recode as well, in fact those are common in informants/formats so you need to tell SAS to not interpret it as value but as the keyword, which you do via the HLO flag.
One quick way I test questions like this is to generate the format manually and then pipe the format to a CNTLOUT dataset and check the data set it creates.
proc format;
value testStormFormat
low - 63 = "No Category"
64 - 82 = "Category 1"
83 - 95 = "Category 2"
96 - 112 = "Category 3"
112 - 136 = "Category 4"
137 - High = "Category 5";
run;
proc format cntlout=demo;
select testStormFormat;
run;
proc print data=demo;
run;
Regarding 3B -> Since you're not sharing the code from the storm program I cannot comment on that but I'm guessing it does set the HLO flag somewhere.
From the documentation:
If you specify START='LOW', and the HLO variable does not contain 'L', then the literal value of LOW is used. If you specify START='OTHER', and the HLO variable does not contain 'O', then the literal value of OTHER is used. If you specify END='HIGH', and the HLO variable does not contain 'H', then the literal value of HIGH is used.
@kelbrosna wrote:
Hello,
I'm trying to use proc format to read in a table using the following code:
data test_storm_categories; length Start $8 End $8 Label $15; retain fmtName "testStormFormat" type "n"; input Start $ End $ Label $; infile datalines dlm = ","; datalines; low,63,No Category 64,82,Category 1 83,95,Category 2 96,112,Category 3 113,136,Category 4 137,high,Category 5 ; run; proc format cntlIn = test_storm_categories; run;But, I'm getting the following error:
ERROR: Cannot mix missing and nonmissing values in the same range: .-63.Can someone explain to me why 'low' isn't being read as a keyword here?
Thanks,
kelbrosna
PS. Here are some more details:
1. Version number (per proc setinit) is 9.04.01M6P110718, and I'm using SAS Studio.
2. I'm aware that you can specify keywords for ranges using an 'HLO' column, and that fix works for me. I don't need this code to work, I'm just curious why it fails.
3. This code is based on an exercise from the SAS Programming 2: Data Manipulation Techniques course notes (Lesson 4.2, page 4-20 step 2).
a. The exercise uses a premade table (storm_categories.sas7bdat) from the course data (which I would include, but I don't think I'm allowed to share it), and I also read in that table using the following code:
data storm_categories; set pg2.storm_categories (rename= (Low = Start High = End Category = Label)); retain fmtName "stormFormat"; run; proc format cntlIn = storm_categories; run;This code runs as expected, and it does not give any errors. What's perplexing about this is that the test_storm_categories dataset from the datalines statement in the original code is seemingly identical to this storm_categories dataset, but their behavior is different.
b. When I export and re-import the storm_categories.sas7bdat table as a CSV file (using proc export and proc import), then perform the necessary operations and feed it to proc format, I get the same error as that from test_storm_categories. This makes me think that there may be some kind of hidden metadata in the storm_categories.sas7bdat table that matters. Is that a possibility?
4. The reason why I think the 'low' keyword is being misread is because when changed to a number, e.g. 40, no error is raised. However, the 'high' keyword doesn't raise any error, even when left unchanged. My current understanding of what is happening is that the character string "low" is being read as a missing numeric value, but I'm not sure why this is.
From the documentation:
"If you are creating a format with the CNTLIN= option, the value of LOW or OTHER for the START variable and the value of HIGH for the END variable are interpreted as their corresponding keywords of LOW, OTHER, or HIGH. This interpretation occurs whether you specify the values in uppercase or lowercase. If you want the explicit values of LOW, OTHER, or HIGH to be used, then specify the HLO variable with the values as described below."
Based on the wording, if there is no HLO variable in the CNTLIN table, shouldn't SAS be expected to interpret "low" and "high" as keywords?
Based on the error in the original post, it seems that "low" is being interpreted as the literal value "low". However, as mentioned in part 4, when the value "low" is changed to 40 in the input table, SAS seems to interpret "high" as a keyword, as shown by the code below:
data test_storm_categories;
	length Start $8  End $8 Label $15;
	retain fmtName "testStormFormat" type "n";
	input Start $ End $ Label $;
	infile datalines dlm = ",";
	datalines;
	40,63,No Category
	64,82,Category 1 
	83,95,Category 2 	
	96,112,Category 3 	
	113,136,Category 4 
	137,high,Category 5
	;
	run;
proc format cntlIn = test_storm_categories fmtlib;
	select testStormFormat;
	run;
	
proc format cntlOut = demo;
	select testStormFormat;
	run;The CNTLOUT table's HLO column correctly assigns a value of "H" for the row with "high". Is there any particular reason why "high" and "low" should be interpreted differently by SAS?
Two questions:
1. Is there a way to get SAS to display hidden columns?
2. If the HLO column is blank, why does "high" get interpreted as a keyword?
From my previous reply:
data test_storm_categories;
	length Start $8  End $8 Label $15;
	retain fmtName "testStormFormat" type "n";
	input Start $ End $ Label $;
	infile datalines dlm = ",";
	datalines;
	40,63,No Category
	64,82,Category 1 
	83,95,Category 2 	
	96,112,Category 3 	
	113,136,Category 4 
	137,high,Category 5
	;
	run;
proc format cntlIn = test_storm_categories fmtlib;
	select testStormFormat;
	run;
	
proc format cntlOut = demo;
	select testStormFormat;
	run;
proc print data = demo;
	run;
Running this code, we get a valid format, and the auto-generated HLO column has the value "H" in the row where the END column had value "high". Based on what you said above, the auto-generated HLO column should be blank. The fact that this code generates an HLO column with a non-blank entry seems to contradict that claim.
When you run the above code, do you get a different result?
If you don't want it to interpret 'high' as meaning HIGH then you would need to create an character format instead of a numeric format.
If you want to make a valid CNTLIN dataset that is clearer to you and more closely matches the CNTLOUT dataset then include the HLO variable.
data test_storm_categories;
  retain fmtName "testStormFormat" type "n" ;
  input Start $ End $ Label $15.;
  length hlo $3;
  if start='other' then do; hlo='O'; start=' '; end=' '; end;
  if start='low' then do; hlo='L'; start=' '; end;
  if end='high' then do; hlo=cats(hlo,'H'); end=' '; end;
datalines;
40 63 No Category
64 82 Category 1
83 95 Category 2
96 112 Category 3
113 136 Category 4
137 high Category 5
other . Invalid number
;
proc format cntlIn = test_storm_categories fmtlib cntlout=demo;
  select testStormFormat;
run;
proc print data=demo;
  var fmtname type start end hlo label;
run;DEMO output
Obs FMTNAME TYPE START END HLO LABEL 1 TESTSTORMFORMAT N 40 63 No Category 2 TESTSTORMFORMAT N 64 82 Category 1 3 TESTSTORMFORMAT N 83 95 Category 2 4 TESTSTORMFORMAT N 96 112 Category 3 5 TESTSTORMFORMAT N 113 136 Category 4 6 TESTSTORMFORMAT N 137 HIGH H Category 5 7 TESTSTORMFORMAT N **OTHER** **OTHER** O Invalid number
Low and OTHER seem to work fine for me:
data test_storm_categories;
  retain fmtName "testStormFormat" type "n" ;
  input Start $ End $ Label $15.;
datalines;
low 63 No Category
64 82 Category 1
83 95 Category 2
96 112 Category 3
113 136 Category 4
137 high Category 5
other . Invalid number
;
proc format cntlIn = test_storm_categories fmtlib cntlout=demo;
  select testStormFormat;
run;
proc print data=demo;
  var fmtname type start end hlo label;
run;
Once you digest Reeza's explanation, it is possible you just need to add a single statement to your DATA step:
if start="low" then HLO = "L";When HLO is set to "L", SAS ignores the value of START and assumes that you want to use the lowest undefined value as the beginning of your range.
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.
