06-23-2014 12:55 PM
SAS version 9.3
I can write and read data with a pd3. informat as long as I don't use my own user defined informat.
I would like to use my own informat because I would like to do some testing prior to reading.
I define my own informats like this:
proc format;
invalue inthree default=3 max=3 low-high = [3.] ;
invalue inpdthree default=3 max=3 low-high = [pd3.] ;
run;
This data step writes 3 packed bytes in positions 5:7
put @1 x 3.
@5 x pd3.
This data step reads 3 packed bytes from positions 5:7
input @1 x1 3.
@5 x2 pd3.
proc print data=test3a; run;
This data step produces a lost card error:
input @1 x1 3.
@5 x2 inpdthree.
Does anyone know why I'm getting the lost card error?
This data step reads x1 correctly but produces an invalid data error for x2 in positions 5-7:
informat x1 inthree.
input @1 x1
Does anyone know why I'm getting the invalid data error for x2?
06-25-2014 08:08 AM
Agoldma and Data_Null_,
Did you intentionally leave the parentheses off "Default= Max="? When I leave off the parentheses in the invalue statement, the default= and max= options appear to become part of the 'start' and 'end' definitions of the format. When I put parentheses around the default and max options, the format looks the way I expect it to.
invalue inpdthree (default=3 max=3) other = [pd3.]
proc format; invalue inthree default=3 max=3
invalue inpdthree default=3 max=3
proc format; value inthree (default=3 max=3)
value inpdthree (default=3 max=3)
proc format fmtlib;
select @inthree inthree @inpdthree inpdthree ;
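For anyone who wants to repeat the check, here is a minimal sketch of the parenthesized form followed by an FMTLIB listing. It assumes the informats are stored in the default WORK.FORMATS catalog and uses the "other" range from the line above; the names are the ones already used in this thread.

```sas
/* Sketch: define both informats with the options in parentheses. */
proc format;
  invalue inthree   (default=3 max=3) other = [3.];
  invalue inpdthree (default=3 max=3) other = [pd3.];
run;

/* List the stored definitions; informats are selected with a leading @. */
/* Assumption: no LIBRARY= option, so the catalog is WORK.FORMATS.      */
proc format fmtlib;
  select @inthree @inpdthree;
run;
```

In the listing, the header of each informat shows its default and maximum lengths, so it should be easy to see whether default= and max= were taken as options or as part of a range.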
06-23-2014 01:04 PM
You might want to post some example data.
You may need to look at MISSOVER, TRUNCOVER, or a similar option on your INFILE statement.
06-23-2014 01:30 PM
The sample data is generated in the first data step.
There's only one record, so I don't see why there's an issue with the carriage return.
I don't have any difficulty using the pd3. informat -- as illustrated in the second data step, which runs perfectly well without any special carriage returns.
I get this error only when using PROC FORMAT to define my own informat.
06-23-2014 01:08 PM
When you use PD types on Windows, do not also use the default Windows record separator (CR-LF). Those byte values can be produced by the PD fields themselves.
This causes the same kind of confusion as using a comma as the decimal separator.
You can create your own format using the fcmp function.
06-23-2014 02:10 PM
What are the values in your .txt file? My txt file shows "-12 € ".
put @1 x 3.
@5 x pd3.
06-23-2014 02:27 PM
Jwillis, thank you for trying to run through this.
The text file is generated by the first data step.
(I'm not using any data that's not in my original post)
The "-12 € " is the way it looks on one line -- the "€ " is the packed representation of the value -12
The first data step generates one record with two representations of the value -12 (the regular representation and the packed representation separated by a space).
Please try running the second data step, which reads that text file correctly
Please try running the third data step, which generates an error, and... look at the log
In the log, you'll see the HEX representation of the packed value -12
06-24-2014 07:51 AM
I played with your code but could neither isolate the problem nor find documentation that explains how to "format a format". Essentially you are coding a format to choose which format to use. My workaround is to code a macro variable that contains the format I want to apply.
put @1 x 3.
@5 x &frmat.
proc sql; drop table work.test3a; quit;
input @1 x13 3.
@4 sep1 $1.
@5 x1pd &frmat.
proc print data=test3a; run;
06-24-2014 08:26 AM
Jwillis, thank you for putting some serious effort into this.
The example that I provided seems pointless, but it's sufficient to illustrate the behavior of SAS that appears to be inconsistent with documentation. This example is not the real code that I'm using in production, but it gives the same error.
This is not just a theoretical exercise because user-defined formats allow us to do fast and streamlined testing of the data prior to reading it. This is very useful for reading inconsistent data. There are many papers written on proc format (easy to find on the internet). Here are some examples in these forums:
Coincidentally, the third link above seems to be describing a similar problem that I'm having. I didn't find this link prior to posting my question because I was looking for the PDw.d informat. I didn't know that this problem applied to other informats.
Since SAS is very versatile, there are many ways to accomplish the same thing.
My current workaround is to break the input statement in half and do the testing in the middle there. It works, but I was hoping for a nicer solution.
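A minimal sketch of that split-input workaround, for readers who want to try it. The fileref pdsplit, the data set name test3d, and the test on x1 are all hypothetical; a temporary file stands in for the real data.

```sas
/* Assumption: pdsplit is a hypothetical fileref pointing at a temp file. */
filename pdsplit temp;

/* Write one record: -12 as text in 1-3 and as packed decimal in 5-7. */
data _null_;
  file pdsplit;
  x = -12;
  put @1 x 3.
      @5 x pd3.;
run;

data test3d;
  infile pdsplit truncover;
  /* The trailing @ holds the record so it can be tested before reading on. */
  input @1 x1 3. @;
  /* Hypothetical test: read the packed field only when x1 is not missing. */
  if x1 ne . then
    input @5 x2 pd3.;
run;

proc print data=test3d; run;
```

The trailing @ is what makes the test in the middle possible: the second INPUT statement reads from the same record instead of moving to the next one.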
06-24-2014 08:59 AM
To fix the LOSTCARD error, use the TRUNCOVER option on the INFILE statement. It seems the default width from the informat is not being honored in the INPUT statement. I am not sure why, since it does seem to be honored in the step where you use the INFORMAT statement.
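Put together with the "other" range, the fix might look like the sketch below. The fileref pdtest and the data set name test3c are hypothetical (the thread does not show the real file name), and I have only the reports in this thread to go on that the combination works.

```sas
/* Assumption: pdtest is a hypothetical fileref pointing at a temp file. */
filename pdtest temp;

proc format;
  /* "other" instead of low-high, options in parentheses. */
  invalue inpdthree (default=3 max=3) other = [pd3.];
run;

/* Write one record: -12 as text in 1-3 and as packed decimal in 5-7. */
data _null_;
  file pdtest;
  x = -12;
  put @1 x 3.
      @5 x pd3.;
run;

data test3c;
  /* TRUNCOVER keeps INPUT from moving to a new record on a short line. */
  infile pdtest truncover;
  input @1 x1 3.
        @5 x2 inpdthree.;
run;

proc print data=test3c; run;
```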
06-24-2014 09:28 AM
Data _null_, thank you very much for both suggestions: using "other" instead of low-high and using truncover.
Both were necessary.
I don't see why they should be necessary, but I'm glad it works.
This official Help article uses low-high as an example: Specifying Values or Ranges
06-24-2014 09:41 AM
I think low-high would be for "normal" numbers, but PD is, I think, what SAS calls hexadecimal input. I don't know.
06-24-2014 10:04 AM
Review the PD description. You are making wrong assumptions about it. It is an ancient type known from mainframes. Every byte is split into nibbles for the digits, with the sign as a separate nibble. The pd3. format describes 2 bytes, not 3. Reading a numeric from text takes one byte for each digit.
That is causing your problems with lengths. Writing 2 bytes and reading 3 is not logical.
06-24-2014 11:36 PM
Jaap, I agree that every byte is split into 2 nibbles, and the sign is in the first byte (unfortunately in Windows it takes a full byte to store the sign). Still, this leaves plenty of room to fit the value of -12 into the "pd3." format. Even pd2. has enough room to store the value of -12.
If you run through the code in my original post, you'll see that it includes a data step where the pd3. informat works perfectly as long as I don't use PROC FORMAT to define my own informat, so the problem is not in the use of the pd3. informat. It's in using it inside the proc format to create my own informat. If I write something with a pd3. format, the right way to read it is with a pd3. informat. As some previous posts implied, the PROC FORMAT doesn't handle some lengths correctly for user-defined informats, so we have to take additional measures.
If you still think that I assumed something incorrectly, please specify which assumption it is.
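One way to settle how many bytes pd3. produces on a given machine is to dump them in hex. This is only a sketch: the variable names are mine, and the byte layout depends on the operating environment, so I am not listing an expected output.

```sas
data _null_;
  x = -12;
  length raw $3;
  /* PUT with a numeric format returns the formatted bytes as a string. */
  raw = put(x, pd3.);
  /* Show the three bytes as six hex digits. If pd3. wrote fewer than   */
  /* three bytes, the rest would be blank padding (hex 20).             */
  put raw= $hex6.;
  /* Round trip: read the same bytes back with the pd3. informat. */
  y = input(raw, pd3.);
  put y=;
run;
```

On Windows, a first byte of 80 would be consistent with the "€" that shows up in the text file, since hex 80 is the euro sign in the Windows-1252 code page.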
Also, your suggestion about using the fcmp function sounds interesting, but I couldn't find that function.
Can you please check the spelling of "fcmp"?
06-25-2014 02:05 AM
The use of fcmp with formats is an example, see http://support.sas.com/documentation/cdl/en/proc/65145/HTML/default/viewer.htm#p1gg77jyhc9s42n1f1vjy...
The PD description is at http://support.sas.com/documentation/cdl/en/leforinforref/63324/HTML/default/viewer.htm#n0xnvrbp96w3... See the remark that every element, including the sign, is a half byte (a nibble), not a byte.
As a result, the pd3. format describes 2 bytes, whereas the common character representation of a number uses 3 bytes. That is causing the length problems in your program.