BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
YEGUser
Fluorite | Level 6

Hello,  I am looking for some suggestions.  I'm importing a .csv that has NULL values in numeric fields (using SAS 9.4). I have created the code below based on some google searches.  It works fine EXCEPT when I reference nulls in the informat statement, it crops my numeric values to two decimal places.  If I try to specify a numeric format (e.g. ConVol nulls. 9.5)  I get an error "Syntax error, expecting one of the following: a name, _ALL_, _CHARACTER_, _CHAR_, _NUMERIC_."  I would like to retain all decimal places. And suggestions on how I can modify this code?  Thanks in advance!  

 

PROC FORMAT;
invalue nulls 'NULL'=. ;
RUN;

 

DATA yield;
informat

FMAID 2.0
Model $10.0
YieldClass 3.0
ConVol nulls. ;

infile 'Z:\vol.csv' dsd dlm=',' missover firstobs=2 LRECL=32767 n=500;

 

input
FMAID
Model $
YieldClass
ConVol; 


RUN;

 

(Sample Data for reference)

 

FMAIDModelYieldClassConVol
1Vol/Age10
1Vol/Age10.74387
1Vol/Age1NULL
1Vol/Age117.6377
1Vol/Age136.11641
1Vol/Age1115.1822
1Vol/Age186.87595
1Vol/Age1NULL
1Vol/Age1143.0345
1 ACCEPTED SOLUTION

Accepted Solutions
Tom
Super User Tom
Super User

Can you show the actual TEXT of the CSV file?  You posted something that looks like it was copied from a SPREADSPEET, not a TEXT file.

 

The number of decimal places you SHOW has nothing to do with how you READ in the values.  How you PRINT the values is determined by what FORMAT specification you attach to the variable.

 

Your program looks ok, but there some things that are over complicated (or perhaps show a misunderstanding of how SAS works).

 

For example you are attaching an INFORMAT to a character variable.  Why?  SAS already knows how to read in character variables.  And why would you specify a number of decimal places on the $ informat??  Character strings do not have decimal places.  And you only ever need to specify a number of decimal places on a numeric informat when you want SAS to insert the decimal place for you.  If the strings being read already have a period in them then that is were the decimal place will be placed.  But if the string does not then the number of decimal places specified in the informat tell SAS what power or 10 to use to move the decimal point to the right place.

 

You normally never want to use MISSOVER instead of TRUNCOVER.  If you use MISSOVER and instruct the INPUT statement to read more characters than remain on the line then those characters are ignored (set to missing). But if you use TRUNCOVER then it uses the truncated string instead.  Since you are using LIST MODE input it does not matter in this particular program since LIST MODE ignores the width specified on the informat and just reads the whole next "word" on the line.

 

Your NULLS informat will only recognize NULL in all uppercase.  Which from your example might be ok.  If you want it to also recognize the string null or Null as a missing value then you should include the UPCASE option when defining it.

 

Or if you know that any value that does not actually represent a number should be set to missing then you can just not bother with the informat.  Instead use the ? or ?? modifier on the INPUT statement to tell it to not generate any errors when it cannot convert the value into a number.

 

So let's assume your actual CSV file looks like this.  Let's make a temporary file with your example so we have something to program with.

options parmcards=csv;
filename csv temp;

parmcards4;
FMAID,Model,YieldClass,ConVol
1,Vol/Age,1,0
1,Vol/Age,1,0.74387
1,Vol/Age,1,NULL
1,Vol/Age,1,17.6377
1,Vol/Age,1,36.11641
1,Vol/Age,1,115.1822
1,Vol/Age,1,86.87595
1,Vol/Age,1,NULL
1,Vol/Age,1,143.0345
;;;;

 

So to read it just use a data step with:

  1. INFILE statement using DSD, TRUNCOVER and FIRSTOBS= options.
  2. Define your variables using a LENGTH statement.
  3. Attach any required FORMAT or INFORMAT to your variables (most variables will need neither).
  4. Attach any LABEL values that you might need (if no label is attached the variable name will be used instead).
  5. INPUT the variables.

So perhaps something like this (notice the ?? after ConVol on the INPUT statement).  Also notice there is no need to add a $ after Model since it was already defined as character when was first seen in the LENGTH statement.

data want;
  infile csv dsd truncover firstobs=2;
  length FMAID 8 Model $10 YieldClass ConVol 8;
  input FMAID Model YieldClass ConVol ?? ;
run;

Which will print like this:

                           Yield
Obs    FMAID     Model     Class     ConVol

 1       1      Vol/Age      1        0.000
 2       1      Vol/Age      1        0.744
 3       1      Vol/Age      1         .
 4       1      Vol/Age      1       17.638
 5       1      Vol/Age      1       36.116
 6       1      Vol/Age      1      115.182
 7       1      Vol/Age      1       86.876
 8       1      Vol/Age      1         .
 9       1      Vol/Age      1      143.035

Since PROC PRINT will attempt to find a common number of decimal places that can best display all of the values for a variable on a page.

 

If you want ConVol to print with a specific number of decimal places then attach a format.  Either in the step that creates it .  Or perhaps just in the step that is doing the printing.

proc print data=want;
  format convol 12.6;
run;
                           Yield
Obs    FMAID     Model     Class          ConVol

 1       1      Vol/Age      1          0.000000
 2       1      Vol/Age      1          0.743870
 3       1      Vol/Age      1           .
 4       1      Vol/Age      1         17.637700
 5       1      Vol/Age      1         36.116410
 6       1      Vol/Age      1        115.182200
 7       1      Vol/Age      1         86.875950
 8       1      Vol/Age      1           .
 9       1      Vol/Age      1        143.034500

 

View solution in original post

7 REPLIES 7
Tom
Super User Tom
Super User

Can you show the actual TEXT of the CSV file?  You posted something that looks like it was copied from a SPREADSPEET, not a TEXT file.

 

The number of decimal places you SHOW has nothing to do with how you READ in the values.  How you PRINT the values is determined by what FORMAT specification you attach to the variable.

 

Your program looks ok, but there some things that are over complicated (or perhaps show a misunderstanding of how SAS works).

 

For example you are attaching an INFORMAT to a character variable.  Why?  SAS already knows how to read in character variables.  And why would you specify a number of decimal places on the $ informat??  Character strings do not have decimal places.  And you only ever need to specify a number of decimal places on a numeric informat when you want SAS to insert the decimal place for you.  If the strings being read already have a period in them then that is were the decimal place will be placed.  But if the string does not then the number of decimal places specified in the informat tell SAS what power or 10 to use to move the decimal point to the right place.

 

You normally never want to use MISSOVER instead of TRUNCOVER.  If you use MISSOVER and instruct the INPUT statement to read more characters than remain on the line then those characters are ignored (set to missing). But if you use TRUNCOVER then it uses the truncated string instead.  Since you are using LIST MODE input it does not matter in this particular program since LIST MODE ignores the width specified on the informat and just reads the whole next "word" on the line.

 

Your NULLS informat will only recognize NULL in all uppercase.  Which from your example might be ok.  If you want it to also recognize the string null or Null as a missing value then you should include the UPCASE option when defining it.

 

Or if you know that any value that does not actually represent a number should be set to missing then you can just not bother with the informat.  Instead use the ? or ?? modifier on the INPUT statement to tell it to not generate any errors when it cannot convert the value into a number.

 

So let's assume your actual CSV file looks like this.  Let's make a temporary file with your example so we have something to program with.

options parmcards=csv;
filename csv temp;

parmcards4;
FMAID,Model,YieldClass,ConVol
1,Vol/Age,1,0
1,Vol/Age,1,0.74387
1,Vol/Age,1,NULL
1,Vol/Age,1,17.6377
1,Vol/Age,1,36.11641
1,Vol/Age,1,115.1822
1,Vol/Age,1,86.87595
1,Vol/Age,1,NULL
1,Vol/Age,1,143.0345
;;;;

 

So to read it just use a data step with:

  1. INFILE statement using DSD, TRUNCOVER and FIRSTOBS= options.
  2. Define your variables using a LENGTH statement.
  3. Attach any required FORMAT or INFORMAT to your variables (most variables will need neither).
  4. Attach any LABEL values that you might need (if no label is attached the variable name will be used instead).
  5. INPUT the variables.

So perhaps something like this (notice the ?? after ConVol on the INPUT statement).  Also notice there is no need to add a $ after Model since it was already defined as character when was first seen in the LENGTH statement.

data want;
  infile csv dsd truncover firstobs=2;
  length FMAID 8 Model $10 YieldClass ConVol 8;
  input FMAID Model YieldClass ConVol ?? ;
run;

Which will print like this:

                           Yield
Obs    FMAID     Model     Class     ConVol

 1       1      Vol/Age      1        0.000
 2       1      Vol/Age      1        0.744
 3       1      Vol/Age      1         .
 4       1      Vol/Age      1       17.638
 5       1      Vol/Age      1       36.116
 6       1      Vol/Age      1      115.182
 7       1      Vol/Age      1       86.876
 8       1      Vol/Age      1         .
 9       1      Vol/Age      1      143.035

Since PROC PRINT will attempt to find a common number of decimal places that can best display all of the values for a variable on a page.

 

If you want ConVol to print with a specific number of decimal places then attach a format.  Either in the step that creates it .  Or perhaps just in the step that is doing the printing.

proc print data=want;
  format convol 12.6;
run;
                           Yield
Obs    FMAID     Model     Class          ConVol

 1       1      Vol/Age      1          0.000000
 2       1      Vol/Age      1          0.743870
 3       1      Vol/Age      1           .
 4       1      Vol/Age      1         17.637700
 5       1      Vol/Age      1         36.116410
 6       1      Vol/Age      1        115.182200
 7       1      Vol/Age      1         86.875950
 8       1      Vol/Age      1           .
 9       1      Vol/Age      1        143.034500

 

YEGUser
Fluorite | Level 6

Thanks so much for the quick response!  A few comments on the reply -

 

Many of us learn from having been handed down code from predecessors. And it's worked and we have continued to follow the template they have set.  This is some of the "complexity" you are seeing.  There decimal after the informat for character?  I'm OCD and I like how it looks, and to date is hasn't cause any problems with the informat, but thank you I'll try to stop doing it.  🙂

 

Thanks for the trick with the ?? and the note about proc print.  I was not aware of this truncation issue with proc print (for some reason I haven't run into it befoe) and it turns out that my original code was doing the job just fine (albeit in a clunky manner!).

YEGUser
Fluorite | Level 6

Apologies - new to this forum.  I was trying to accept @Tom's response as the solution but apparently I failed. Sigh.

Tom
Super User Tom
Super User

No problem. Your reply is a good answer also.

 

OCD is not a bad feature for programming.  Just make sure it doesn't turn into OWD , Obsessing on the Wrong Details.

YEGUser
Fluorite | Level 6
LOL - that is my new favourite acronym
FreelanceReinh
Jade | Level 19

@YEGUser wrote:

Apologies - new to this forum.  I was trying to accept @Tom's response as the solution but apparently I failed. Sigh.


Hello @YEGUser,

 

This has happened to many others before and can be easily corrected: Select Tom's post as the solution after clicking "Not the Solution" in the option menu (see icon below) of the current solution.

show_option_menu.png

sas-innovate-2024.png

 

Secure your spot at the must-attend AI and analytics event of 2024: SAS Innovate 2024! Get ready for a jam-packed agenda featuring workshops, super demos, breakout sessions, roundtables, inspiring keynotes and incredible networking events.

 

Register by March 1 to snag the Early Bird rate of just $695! Don't miss out on this exclusive offer. 

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 7 replies
  • 355 views
  • 4 likes
  • 3 in conversation