agoldma
Pyrite | Level 9

SAS version 9.3

I can write and read data with a pd3. informat as long as I don't use my own user defined informat.

I would like to use my own informat because I would like to do some testing prior to reading.


I define my own informats like this:

proc format; invalue inthree   default=3 max=3 low-high = [3.]   ;

             invalue inpdthree default=3 max=3 low-high = [pd3.] ; 

run;

This data step writes 3 packed bytes in positions 5:7

data _null_;

    x=-12;

    file 'C:\PROD\SAS93\test3.txt';

    put  @1  x  3.

         @5  x  pd3.

         ;

run;

This data step reads 3 packed bytes from positions 5:7

data test3a;

   infile 'C:\PROD\SAS93\test3.txt';

   input  @1 x1  3.

          @5 x2  pd3.

          ;

run;

proc print data=test3a; run;

This data step produces a lost card error:

data test3b;

   infile 'C:\PROD\SAS93\test3.txt';

   input  @1 x1  3.

          @5 x2  inpdthree.

          ;

run;

Does anyone know why I'm getting the lost card error?

This data step reads x1 correctly but produces an invalid data error for x2 in positions 5-7:

data test3b;

   infile 'C:\PROD\SAS93\test3.txt';

   informat x1 inthree.

            x2 inpdthree.

            ;

   input  @1 x1

          @5 x2

          ;

run;

Does anyone know why I'm getting the invalid data error for x2?

1 ACCEPTED SOLUTION
jwillis
Quartz | Level 8

Agoldma and Data_Null_,

Did you intentionally leave the parentheses off "Default= Max="?  When I leave off the parentheses in the invalue statement, it appears that default= and max= are treated as part of the 'start' and 'end' definitions of the format. When I put parentheses around the default and max options, the format looks the way I expect it to.

invalue inpdthree (default=3 max=3) other = [pd3.]


options nofmterr;
proc format; invalue inthree   default=3 max=3
              other=3. ;
             invalue inpdthree default=3 max=3
              other=pd3. ; 
run;

proc format; value inthree   (default=3 max=3)
              other=3. ;
             value inpdthree (default=3 max=3)
              other=pd3. ; 
run;


proc format fmtlib;
select @inthree inthree @inpdthree inpdthree ;
run;


25 REPLIES
ballardw
Super User

You might want to post some example data.

You will likely need MISSOVER, TRUNCOVER, or a similar option on your INFILE statement.

agoldma
Pyrite | Level 9

The sample data is generated in the first data step.

There's only one record, so I don't see why there's an issue with the carriage return.

I don't have any difficulty using the pd3. informat -- as illustrated in the second data step, which runs perfectly well without any special carriage returns.

I get this error only when using PROC FORMAT to define my own informat.

jakarman
Barite | Level 11

When you use PD (packed decimal) fields on Windows, do not also rely on the default Windows CR-LF record separator. The PD fields themselves can produce those byte values.

That causes the same kind of confusion as using a comma as the decimal separator.

You can create your own format using the fcmp  function

---->-- ja karman --<-----
jwillis
Quartz | Level 8

What are the values in your .txt file?  My txt file shows "-12 €  ". 

data _null_;

    x=-12;

    file 'E:\test3.txt';

    put  @1  x  3.

         @5  x  pd3.

         ;

run;


agoldma
Pyrite | Level 9

Jwillis, thank you for trying to run through this.

The text file is generated by the first data step.

(I'm not using any data that's not in my original post)

The "-12 €  " is the way it looks on one line -- the "€  " is the packed representation of the value -12

The first data step generates one record with two representations of the value -12 (the regular representation and the packed representation separated by a space).

Please try running the second data step, which reads that text file correctly

Please try running the third data step, which generates an error, and... look at the log

In the log, you'll see the HEX representation of the packed value -12

jwillis
Quartz | Level 8

Dear agoldma,

I played with your code but could neither pin down the problem nor find documentation that explains how to "format a format".  Essentially you are coding a format to choose which format to use. My workaround is to define a macro variable that holds the format I want to apply.
%let frmat=pd3.;

data _null_;
    x=-12;
    file 'E:\test4.txt';
    put  @1  x  3.
         @4 '*'
         @5  x  &frmat.
         ;
run;

proc sql; drop table work.test3a; quit;
data test3a;
   infile 'E:\test4.txt';
   input @1 x13 3.
         @4 sep1 $1.
         @5 x1pd &frmat.
         ;
run;

proc print data=test3a; run;

agoldma
Pyrite | Level 9

Jwillis, thank you for putting some serious effort into this.

The example that I provided seems pointless, but it's sufficient to illustrate the behavior of SAS that appears to be inconsistent with documentation. This example is not the real code that I'm using in production, but it gives the same error.

This is not just a theoretical exercise because user-defined formats allow us to do fast and streamlined testing of the data prior to reading it. This is very useful for reading inconsistent data. There are many papers written on proc format (easy to find on the internet). Here are some examples in these forums:

Must be a better way to do this (missing values)

Converting Raw Files with Dynamically Calculated Layout

Proc format issues: invalue statement ignores length specified with informats

Coincidentally, the third link above seems to describe a problem similar to the one I'm having. I didn't find this link prior to posting my question because I was searching for the PDw.d informat. I didn't know that this problem applied to other informats.

Since SAS is very versatile, there are many ways to accomplish the same thing.

My current workaround is to break the input statement in half and do the testing in the middle there. It works, but I was hoping for a nicer solution.

data_null__
Jade | Level 19

Your informat does not use the proper range.  Try OTHER.

invalue inpdthree default=3 max=3 other = [pd3.] ;
data_null__
Jade | Level 19

To fix the LOSTCARD use INFILE statement option TRUNCOVER.  It seems the default width is not being honored from the INFORMAT, not sure why as it does seem to be honored in the step where you use INFORMAT statement.

*This data step produces a lost card error:;
data test3b;
   infile '~/test3.txt' truncover;
   input  @1 x1  3.
          @5 x2  inpdthree.
          ;
run;
agoldma
Pyrite | Level 9

Data _null_, thank you very much for both suggestions: using "other" instead of low-high and using truncover.

Both were necessary.

I don't see why they should be necessary, but I'm glad it works.
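
For reference, the fixes discussed in this thread can be combined roughly as follows. This is only a sketch: it uses the parenthesized options from the accepted solution together with OTHER and TRUNCOVER, the data set name test3c is illustrative, and the thread did not confirm whether TRUNCOVER is still required once the parentheses are in place.

```sas
/* Sketch only: combines the suggestions made in this thread. */
proc format;
   /* Informat options go in parentheses; OTHER replaces LOW-HIGH */
   invalue inpdthree (default=3 max=3) other = [pd3.];
run;

data test3c;   /* test3c is an illustrative data set name */
   /* TRUNCOVER keeps INPUT from moving to the next record on a short read */
   infile 'C:\PROD\SAS93\test3.txt' truncover;
   input @1 x1 3.
         @5 x2 inpdthree.
         ;
run;

proc print data=test3c; run;
```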

This official Help article uses low-high as an example: Specifying Values or Ranges

data_null__
Jade | Level 19

I think low-high would be for "normal" numbers, but PD is, I think, what SAS calls hexadecimal input.   I don't know.

jakarman
Barite | Level 11

Review the PD description. You are making wrong assumptions about it. It is an ancient type known from mainframes. Every byte is split into nibbles for the digits, with the sign as a separate nibble. The pd3. format describes 2 bytes, not 3. Reading a numeric value from text takes one byte for each digit.

That is what causes your problems with lengths. Writing 2 bytes and reading 3 is not logical.

---->-- ja karman --<-----
agoldma
Pyrite | Level 9

Jaap, I agree that every byte is split into 2 nibbles, and the sign is in the first byte (unfortunately in Windows it takes a full byte to store the sign). Still, this leaves plenty of room to fit the value of -12 into the "pd3." format. Even pd2. has enough room to store the value of -12.
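
One way to check the byte count directly, rather than argue about it, is to dump the packed value in hex. A minimal sketch, assuming the same value -12 as in my original post (the variable name packed is only illustrative, and the exact bytes depend on the operating environment):

```sas
/* Sketch: show the bytes that the PD3. format produces for -12 */
data _null_;
   x = -12;
   length packed $3;          /* the w in PDw.d is the field width in bytes */
   packed = put(x, pd3.);     /* packed representation as a character string */
   put packed= $hex6.;        /* two hex digits per byte, written to the log */
run;
```

The log line shows all three bytes, which makes it easy to see where the sign and the digits are stored.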

If you run through the code in my original post, you'll see that it includes a data step where the pd3. informat works perfectly as long as I don't use PROC FORMAT to define my own informat, so the problem is not in the use of the pd3. informat. It's in using it inside the proc format to create my own informat. If I write something with a pd3. format, the right way to read it is with a pd3. informat. As some previous posts implied, the PROC FORMAT doesn't handle some lengths correctly for user-defined informats, so we have to take additional measures.

If you still think that I assumed something incorrectly, please specify which assumption it is.

Also, your suggestion about using the fcmp function sounds interesting, but I couldn't find that function.

Can you please check the spelling of "fcmp"?

jakarman
Barite | Level 11

The use of fcmp with formats is an example, see http://support.sas.com/documentation/cdl/en/proc/65145/HTML/default/viewer.htm#p1gg77jyhc9s42n1f1vjy...

The PD description is at http://support.sas.com/documentation/cdl/en/leforinforref/63324/HTML/default/viewer.htm#n0xnvrbp96w3...  See the remark that every element, including the sign, is a half byte (a nibble), not a full byte.

As a result, the pd3. format describes 2 bytes, whereas the ordinary character representation of the number uses 3 bytes. That is what causes the length problems in your program.

---->-- ja karman --<-----
