BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
kelbrosna
Fluorite | Level 6

Hello,

     I'm trying to use proc format to read in a table using the following code:

data test_storm_categories;
	length Start $8  End $8 Label $15;
	retain fmtName "testStormFormat" type "n";
	input Start $ End $ Label $;
	infile datalines dlm = ",";
	datalines;
	low,63,No Category
	64,82,Category 1 
	83,95,Category 2 	
	96,112,Category 3 	
	113,136,Category 4 
	137,high,Category 5
	;
	run;

proc format cntlIn = test_storm_categories;
	run;

But, I'm getting the following error:

ERROR: Cannot mix missing and nonmissing values in the same range: .-63.

Can someone explain to me why 'low' isn't being read as a keyword here?

 

Thanks,

kelbrosna

 

 

 

 

 

PS. Here are some more details:

1. Version number (per proc setinit) is 9.04.01M6P110718, and I'm using SAS Studio.

2. I'm aware that you can specify keywords for ranges using an 'HLO' column, and that fix works for me. I don't need this code to work, I'm just curious why it fails.

3. This code is based on an exercise from the SAS Programming 2: Data Manipulation Techniques course notes (Lesson 4.2, page 4-20 step 2).

     a. The exercise uses a premade table (storm_categories.sas7bdat) from the course data (which I would include, but I don't think I'm allowed to share it), and I also read in that table using the following code:

data storm_categories;
	set pg2.storm_categories (rename=
	(Low = Start High = End Category = Label));
	retain fmtName "stormFormat";
	run;

proc format cntlIn = storm_categories;
	run;

This code runs as expected, and it does not give any errors. What's perplexing about this is that the test_storm_categories dataset from the datalines statement in the original code is seemingly identical to this storm_categories dataset, but their behavior is different.

    b. When I export and re-import the storm_categories.sas7bdat table as a CSV file (using proc export and proc import), then perform the necessary operations and feed it to proc format, I get the same error as that from test_storm_categories. This makes me think that there may be some kind of hidden metadata in the storm_categories.sas7bdat table that matters. Is that a possibility?

4. The reason why I think the 'low' keyword is being misread is because when changed to a number, e.g. 40, no error is raised. However, the 'high' keyword doesn't raise any error, even when left unchanged. My current understanding of what is happening is that the character string "low" is being read as a missing numeric value, but I'm not sure why this is.

1 ACCEPTED SOLUTION

Accepted Solutions
Reeza
Super User

@kelbrosna  So your initial code doesn't work because you have leading spaces in front of the word low, so SAS isn't recognizing it as low. If you trim the field it works fine. Leading spaces don't show in the table view which makes this one a bit harder to spot initially. 

 

data test_storm_categories;
	length Start $8  End $8 Label $15;
	retain fmtName "testStormFormat" type "n";
	input Start $ End $ Label $;
	start = trim(start);
	infile datalines dlm = ",";
	datalines;
low,63,No Category
64,82,Category 1 
83,95,Category 2 	
96,112,Category 3 	
113,136,Category 4 
137,high,Category 5
	;
	run;
	
	proc format cntlin = test_storm_categories;
	run;

Thanks for the help with the issue resolution @Tom 

 

 

 

 

View solution in original post

19 REPLIES 19
Reeza
Super User

High/Low keyword require the HLO flag set. Because you could have a value of High/Low you wanted to recode as well, in fact those are common in informants/formats so you need to tell SAS to not interpret it as value but as the keyword, which you do via the HLO flag. 

 

One quick way I test questions like this is to generate the format manually and then pipe the format to a CNTLOUT dataset and check the data set it creates. 

 

proc format;
value testStormFormat
low - 63 = "No Category"
64 - 82 = "Category 1"
83 - 95 = "Category 2"
96 - 112 = "Category 3"
112 - 136 = "Category 4"
137 - High = "Category 5";
run;

proc format cntlout=demo;
select testStormFormat;
run;

proc print data=demo;
run;

Regarding 3B -> Since you're not sharing the code from the storm program I cannot comment on that but I'm guessing it does set the HLO flag somewhere. 

 

From the documentation:

If you specify START='LOW', and the HLO variable does not contain 'L', then the literal value of LOW is used. If you specify START='OTHER', and the HLO variable does not contain 'O', then the literal value of OTHER is used. If you specify END='HIGH', and the HLO variable does not contain 'H', then the literal value of HIGH is used.

 

Spoiler

@kelbrosna wrote:

Hello,

     I'm trying to use proc format to read in a table using the following code:

data test_storm_categories;
	length Start $8  End $8 Label $15;
	retain fmtName "testStormFormat" type "n";
	input Start $ End $ Label $;
	infile datalines dlm = ",";
	datalines;
	low,63,No Category
	64,82,Category 1 
	83,95,Category 2 	
	96,112,Category 3 	
	113,136,Category 4 
	137,high,Category 5
	;
	run;

proc format cntlIn = test_storm_categories;
	run;

But, I'm getting the following error:

ERROR: Cannot mix missing and nonmissing values in the same range: .-63.

Can someone explain to me why 'low' isn't being read as a keyword here?

 

Thanks,

kelbrosna

 

 

 

 

 

PS. Here are some more details:

1. Version number (per proc setinit) is 9.04.01M6P110718, and I'm using SAS Studio.

2. I'm aware that you can specify keywords for ranges using an 'HLO' column, and that fix works for me. I don't need this code to work, I'm just curious why it fails.

3. This code is based on an exercise from the SAS Programming 2: Data Manipulation Techniques course notes (Lesson 4.2, page 4-20 step 2).

     a. The exercise uses a premade table (storm_categories.sas7bdat) from the course data (which I would include, but I don't think I'm allowed to share it), and I also read in that table using the following code:

data storm_categories;
	set pg2.storm_categories (rename=
	(Low = Start High = End Category = Label));
	retain fmtName "stormFormat";
	run;

proc format cntlIn = storm_categories;
	run;

This code runs as expected, and it does not give any errors. What's perplexing about this is that the test_storm_categories dataset from the datalines statement in the original code is seemingly identical to this storm_categories dataset, but their behavior is different.

    b. When I export and re-import the storm_categories.sas7bdat table as a CSV file (using proc export and proc import), then perform the necessary operations and feed it to proc format, I get the same error as that from test_storm_categories. This makes me think that there may be some kind of hidden metadata in the storm_categories.sas7bdat table that matters. Is that a possibility?

4. The reason why I think the 'low' keyword is being misread is because when changed to a number, e.g. 40, no error is raised. However, the 'high' keyword doesn't raise any error, even when left unchanged. My current understanding of what is happening is that the character string "low" is being read as a missing numeric value, but I'm not sure why this is.


kelbrosna
Fluorite | Level 6

From the documentation:

"If you are creating a format with the CNTLIN= option, the value of LOW or OTHER for the START variable and the value of HIGH for the END variable are interpreted as their corresponding keywords of LOW, OTHER, or HIGH. This interpretation occurs whether you specify the values in uppercase or lowercase. If you want the explicit values of LOW, OTHER, or HIGH to be used, then specify the HLO variable with the values as described below."

 

Based on the wording, if there is no HLO variable in the CNTLIN table, shouldn't SAS be expected to interpret "low" and "high" as keywords?

 

Based on the error in the original post, it seems that "low" is being interpreted as the literal value "low". However, as mentioned in part 4, when the value "low" is changed to 40 in the input table, SAS seems to interpret "high" as a keyword, as shown by the code below:

data test_storm_categories;
	length Start $8  End $8 Label $15;
	retain fmtName "testStormFormat" type "n";
	input Start $ End $ Label $;
	infile datalines dlm = ",";
	datalines;
	40,63,No Category
	64,82,Category 1 
	83,95,Category 2 	
	96,112,Category 3 	
	113,136,Category 4 
	137,high,Category 5
	;
	run;

proc format cntlIn = test_storm_categories fmtlib;
	select testStormFormat;
	run;
	
proc format cntlOut = demo;
	select testStormFormat;
	run;

The CNTLOUT table's HLO column correctly assigns a value of "H" for the row with "high". Is there any particular reason why "high" and "low" should be interpreted differently by SAS?

Reeza
Super User
The paragraph is a bit ambiguous, and the paragraph that follows contradicts that...and the SAS behaviour aligns with the second paragraph.
Reeza
Super User
And there's always a HLO value in the data set, SAS creates it automatically and it's blank.
kelbrosna
Fluorite | Level 6

Two questions:

1. Is there a way to get SAS to display hidden columns?

2. If the HLO column is blank, why does "high" get interpreted as a keyword?

 

Reeza
Super User
As mentioned previously, CNTLOUT. For your second question read the paragraph exactly AFTER the one you posted initially that describes how to set HIGH as a literal value, ie if you had HIGH, LOW, MEDIUM as text values and wanted to recode those to 1, 2, 3 via a format, how would you pass those values to SAS?

proc format cntlout=mydata;
select formatName;
run;
Reeza
Super User
FROM the documentation as well: If you specify END='HIGH', and the HLO variable does not contain 'H', then the literal value of HIGH is used.

Blanks are not H.
kelbrosna
Fluorite | Level 6

From my previous reply:

data test_storm_categories;
	length Start $8  End $8 Label $15;
	retain fmtName "testStormFormat" type "n";
	input Start $ End $ Label $;
	infile datalines dlm = ",";
	datalines;
	40,63,No Category
	64,82,Category 1 
	83,95,Category 2 	
	96,112,Category 3 	
	113,136,Category 4 
	137,high,Category 5
	;
	run;

proc format cntlIn = test_storm_categories fmtlib;
	select testStormFormat;
	run;
	
proc format cntlOut = demo;
	select testStormFormat;
	run;

proc print data = demo;
	run;

 

Running this code, we get a valid format, and the auto-generated HLO column has the value "H" in the row where the END column had value "high". Based on what you said above, the auto-generated HLO column should be blank. The fact that this code generates an HLO column with a non-blank entry seems to contradict that claim.

When you run the above code, do you get a different result?

Reeza
Super User
Open a ticket with SAS tech support.
Tom
Super User Tom
Super User

If you don't want it to interpret 'high' as meaning HIGH then you would need to create an character format instead of a numeric format.

 

If you want to make a valid CNTLIN dataset that is clearer to you and more closely matches the CNTLOUT dataset then include the HLO variable.

data test_storm_categories;
  retain fmtName "testStormFormat" type "n" ;
  input Start $ End $ Label $15.;
  length hlo $3;
  if start='other' then do; hlo='O'; start=' '; end=' '; end;
  if start='low' then do; hlo='L'; start=' '; end;
  if end='high' then do; hlo=cats(hlo,'H'); end=' '; end;
datalines;
40 63 No Category
64 82 Category 1
83 95 Category 2
96 112 Category 3
113 136 Category 4
137 high Category 5
other . Invalid number
;

proc format cntlIn = test_storm_categories fmtlib cntlout=demo;
  select testStormFormat;
run;

proc print data=demo;
  var fmtname type start end hlo label;
run;

DEMO output

Obs        FMTNAME        TYPE         START                END           HLO        LABEL

 1     TESTSTORMFORMAT     N                    40                  63           No Category
 2     TESTSTORMFORMAT     N                    64                  82           Category 1
 3     TESTSTORMFORMAT     N                    83                  95           Category 2
 4     TESTSTORMFORMAT     N                    96                 112           Category 3
 5     TESTSTORMFORMAT     N                   113                 136           Category 4
 6     TESTSTORMFORMAT     N                   137    HIGH                 H     Category 5
 7     TESTSTORMFORMAT     N      **OTHER**           **OTHER**            O     Invalid number

 

Reeza
Super User
High seems to work correctly, but Low does not - see first set of code ergo my recommendation to talk to Tech Support now.
Tom
Super User Tom
Super User

Low and OTHER seem to work fine for me:

data test_storm_categories;
  retain fmtName "testStormFormat" type "n" ;
  input Start $ End $ Label $15.;
datalines;
low 63 No Category
64 82 Category 1
83 95 Category 2
96 112 Category 3
113 136 Category 4
137 high Category 5
other . Invalid number
;

proc format cntlIn = test_storm_categories fmtlib cntlout=demo;
  select testStormFormat;
run;

proc print data=demo;
  var fmtname type start end hlo label;
run;
Reeza
Super User
FYI - if you're paying for the course you can ask SAS about 3a/b.

Astounding
PROC Star

Once you digest Reeza's explanation, it is possible you just need to add a single statement to your DATA step:

 

if start="low" then HLO = "L";

When HLO is set to "L", SAS ignores the value of START and assumes that you want to use the lowest undefined value as the beginning of your range.

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 19 replies
  • 1020 views
  • 6 likes
  • 4 in conversation