Nietzsche
Lapis Lazuli | Level 10

Hi, I am reading page 244 of the official specialist exam prep guide.

The book says the example demonstrates reading data with the MMDDYY10. informat.

But there is no MMDDYY10. anywhere in the code.

 

 

 

[Image: screenshot of the example code from the book]

 

But when I ran the code as shown in the book, the result does indeed look like what is in the book. So what is the book trying to say? Is MMDDYY10. the default informat?

How should I change the code if I want to use the DATE11. informat?

SAS Base Programming (2022 Dec), Preparing for SAS Advanced Programming (Cancelled).
ACCEPTED SOLUTION

Tom
Super User

@Nietzsche wrote:

here is a copy of the new_hires.csv

Can you just tell me the code if I want to use the DATE11. informat? I cannot find anywhere in the book that shows code to use an informat in PROC IMPORT.

Or do I have to use PROC IMPORT first, then change it within the DATA step?


You do not use informats with PROC IMPORT.   PROC IMPORT generates code that uses informats.  It GUESSES what informats to use based on what it sees in the text file.
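
A minimal sketch of how to see what PROC IMPORT guessed, assuming the file sits in the current directory. The output data set name WORK.NEW_HIRES is only a placeholder chosen for illustration. PROC CONTENTS lists the informat and format attached to each variable, and the log should also show the data step code that PROC IMPORT generated.

```sas
/* Let PROC IMPORT guess the variable types and informats. */
/* WORK.NEW_HIRES is a hypothetical output name.           */
proc import datafile='new_hires.csv'
            out=work.new_hires
            dbms=csv
            replace;
  getnames=yes;       /* take variable names from the first line */
  guessingrows=max;   /* scan every row before guessing          */
run;

/* Show which informats and formats ended up on each variable. */
proc contents data=work.new_hires;
run;
```

Depending on the VALIDVARNAME setting in your session, the date variables may come out as Hire_Date and Date_of_Birth or as name literals such as 'Hire Date'n.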

 

If you want to control how the file is read just write your own data step to read the file.  Since a CSV is a delimited text file you should read it with LIST MODE input style.  In that style the width on any informat is IGNORED.  So there is no difference between reading it using DATE. or DATE9. or DATE11. as the informat specification.  SAS will match the width to the width of the next word it sees based on the delimiters.
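
To illustrate the point about widths, here is a small self-contained sketch using made-up in-stream data (not the book's file). The three columns hold the same text but are read with DATE., DATE9. and DATE11.; because of the colon modifier, all three should produce the same date value on each row.

```sas
/* Made-up data: same date text in every column of a row. */
data width_demo;
  infile datalines dsd truncover;
  /* The colon modifier requests list mode, so the width  */
  /* written on each informat is not used to count columns. */
  input d1 :date. d2 :date9. d3 :date11.;
  format d1 d2 d3 date11.;
datalines;
01JAN2020,01JAN2020,01JAN2020
05-MAR-2021,05-MAR-2021,05-MAR-2021
;

proc print data=width_demo;
run;
```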

 

But note that the text in that particular file is in the MDY style, so you must read it using the MMDDYY informat.  The DATE informat would not understand what those characters mean.
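
A quick way to check this for yourself, sketched with the INPUT function on one value copied from the file. The ?? modifier is only there to keep the failed conversion from writing an error note to the log. The MMDDYY conversion is expected to succeed and the DATE conversion to come back missing.

```sas
data _null_;
  text = '8/21/1971';                 /* a Date of Birth value from the CSV */
  as_mdy  = input(text, mmddyy10.);   /* informat that matches the MDY style */
  as_date = input(text, ?? date11.);  /* informat that expects ddMONyyyy     */
  put as_mdy= date11. as_date=;
run;
```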

 

Of course you can attach the DATE11. format to the variable to have the values displayed in the style DD-MON-YYYY if you want. You could attach any of the many formats that know how to display date values once the variable has date values in it.

 

You will have to attach the format in a separate step from the PROC IMPORT step.  Either another data step that copies the data, or just by using PROC DATASETS to modify the format attribute of the variable. 
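
A sketch of the PROC DATASETS route, assuming PROC IMPORT wrote a data set called WORK.NEW_HIRES (a placeholder name) with variables named Hire_Date and Date_of_Birth. It changes only the format attribute and does not rewrite the data.

```sas
/* Assumes WORK.NEW_HIRES already exists from a PROC IMPORT step. */
proc datasets library=work nolist;
  modify new_hires;
    /* Change how the stored date values are displayed. */
    format Hire_Date Date_of_Birth date11.;
  run;
quit;
```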

 

Note that if you write the data step yourself you can tell it to read the strings from the file using the MMDDYY. informat and also attach the DATE11. format to the variable.


8 REPLIES
Tom
Super User

Not sure what they mean.  Did the book also show the actual text of the CSV file?

 

Since the code used PROC IMPORT to read the file then it was PROC IMPORT that made the decision that MMDDYY was the best informat to use for the example values of DATE_OF_BIRTH and HIRE_DATE that it saw in that particular CSV file.

 

I suspect the point they are trying to make is that the MMDDYY informat can be used to read strings that are in the style mm/dd/yyyy, as opposed to strings in some other style, which would require a different informat.  They are clearly NOT showing how YOU can use the MMDDYY informat, since it is not in the code. I have no idea how advanced the book you are using is intended to be or how nuanced it wants to be in what it is trying to teach.  They might have another section that is designed to teach about informats or about dates.   If you want to learn more about informats, then experiment with writing your own code.

 

You should also read about the DATESTYLE option to see how you can control what SAS does when the strings are ambiguous as to what date style they are using.  It definitely impacts how the ANYDT... series of informats work.  I am not sure whether it impacts the decisions that PROC IMPORT makes.
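
A small experiment along those lines, using a made-up ambiguous string rather than anything from the book's file. The same text is read with the ANYDTDTE informat under two DATESTYLE settings; the two PUT lines should show different dates. The last statement sets the option back to LOCALE, which is the documented default, so adjust that if your site uses something else.

```sas
/* Read 01/02/2003 as month/day/year. */
options datestyle=mdy;
data _null_;
  d = input('01/02/2003', anydtdte10.);
  put 'DATESTYLE=MDY: ' d date11.;
run;

/* Read the same text as day/month/year. */
options datestyle=dmy;
data _null_;
  d = input('01/02/2003', anydtdte10.);
  put 'DATESTYLE=DMY: ' d date11.;
run;

/* Put the option back (LOCALE is the documented default). */
options datestyle=locale;
```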

Nietzsche
Lapis Lazuli | Level 10

here is a copy of the new_hires.csv

Can you just tell me the code if I want to use the DATE11. informat? I cannot find anywhere in the book that shows code to use an informat in PROC IMPORT.

Or do I have to use PROC IMPORT first, then change it within the DATA step?

fja
Lapis Lazuli | Level 10

Hello Nietzsche!

Well ... if you use PROC IMPORT, you should see the equivalent data step code written to the log, with explicit INFORMAT statements.

You could use that code instead and set the informats explicitly (i.e. your DATE11.). Please have a look at the (probably) helpful examples in the online documentation.

As far as I know, there is no way to control the informat using PROC IMPORT directly.

--Fja

PS: I am suffering a bit from a hangover ... apologies for my English ...

Tom
Super User

There is no real reason to use PROC IMPORT to read a text file, especially one as simple as that.

First look at the file.

Name,Hire Date,Company,Country,Date of Birth                                    
Gisela S. Santos,8/12/17,Pede Nunc Sed Limited,Micronesia,8/21/1971             
Maxwell L. Cooley,9/4/17,A LLP,Somalia,4/30/1975                                
Thane P. Obrien,10/28/17,Consectetuer Limited,Jamaica,4/23/1988                 
Minerva C. Conley,1/5/18,Feugiat Tellus Lorem Institute,Fiji,2/18/1975 

So there are only 5 variables.

So just start writing the code to read the file.

data want ;
  infile 'new_hires.csv' dsd truncover firstobs=2;

You can then copy the first line and use it to generate names for the variables.  You can use a LENGTH statement to define the variables.

length 
  Name $30
  Hire_Date 8
  Company $40
  Country $30
  Date_of_Birth 8
;

Now read in the values.  You can add in-line informats in the INPUT statement if you want, just remember to prefix them with the colon modifier so that SAS will read the line in LIST MODE.

input Name Hire_Date :mmddyy. Company Country Date_of_Birth :mmddyy. ;
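
To see why the colon matters, here is a sketch with one made-up line of data. In the first step the colon modifier keeps the read in list mode, so the informat stops at the comma. In the second step the colon is left out, so MMDDYY10. is expected to read exactly 10 columns, run past the comma, and fail to produce a date (the ?? modifier just keeps the log quiet about it).

```sas
/* With the colon: list mode, the comma ends the date field. */
data with_colon;
  infile datalines dsd truncover;
  input Hire_Date :mmddyy10. Company :$20.;
  format Hire_Date date11.;
datalines;
8/12/17,Acme
;

/* Without the colon: formatted input, a fixed 10 columns are read. */
data without_colon;
  infile datalines dsd truncover;
  input Hire_Date ?? mmddyy10. Company :$20.;
  format Hire_Date date11.;
datalines;
8/12/17,Acme
;

proc print data=with_colon;
run;
proc print data=without_colon;
run;
```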

And finally attach any formats to variables that NEED them (SAS does not need to be given special instructions for how to display most variables).

  format Hire_Date Date_of_Birth date11. ;

You could also attach any labels you want to the variables.

1292  data want ;
1293    infile 'c:\downloads\new_hires.csv' dsd truncover firstobs=2;
1294    length
1295      Name $30
1296      Hire_Date 8
1297      Company $40
1298      Country $30
1299      Date_of_Birth 8
1300    ;
1301    input Name Hire_Date :mmddyy. Company Country Date_of_Birth :mmddyy. ;
1302    format Hire_Date Date_of_Birth date11. ;
1303  run;

NOTE: The infile 'c:\downloads\new_hires.csv' is:
      Filename=c:\downloads\new_hires.csv,
      RECFM=V,LRECL=32767,File Size (bytes)=15701,
      Last Modified=19Nov2022:12:41:59,
      Create Time=19Nov2022:12:41:59

NOTE: 100 records were read from the infile 'c:\downloads\new_hires.csv'.
      The minimum record length was 80.
      The maximum record length was 160.
NOTE: The data set WORK.WANT has 100 observations and 5 variables.
NOTE: DATA statement used (Total process time):
      real time           0.02 seconds
      cpu time            0.01 seconds


1304
1305  proc print;
1306  run;

NOTE: There were 100 observations read from the data set WORK.WANT.
NOTE: PROCEDURE PRINT used (Total process time):
      real time           0.01 seconds
      cpu time            0.01 seconds

[Image: PROC PRINT output of WORK.WANT]
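
For the optional labels mentioned above, a sketch of the same step with a LABEL statement added. The label text is simply copied from the header line of the CSV, and the relative path assumes the file is in the current directory. The LABEL option on PROC PRINT asks it to show labels instead of variable names.

```sas
data want;
  infile 'new_hires.csv' dsd truncover firstobs=2;
  length Name $30 Hire_Date 8 Company $40 Country $30 Date_of_Birth 8;
  input Name Hire_Date :mmddyy. Company Country Date_of_Birth :mmddyy.;
  format Hire_Date Date_of_Birth date11.;
  /* Labels taken from the column headers in the CSV. */
  label Hire_Date     = 'Hire Date'
        Date_of_Birth = 'Date of Birth';
run;

proc print data=want label;
run;
```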

 

Quentin
Super User

I think the book has a typo, and means to say "... reading a CSV with dates in MMDDYY10. format."  

 

They are saying the CSV has dates in mmddyy10 format in it, and the PROC IMPORT will recognize the values as dates and read them in as dates.

Tom
Super User

@Quentin wrote:

I think the book has a typo, and means to say "... reading a CSV with dates in MMDDYY10. format."  

 

They are saying the CSV has dates in mmddyy10 format in it, and the PROC IMPORT will recognize the values as dates and read them in as dates.


That makes sense.  I prefer to refer to the way the text looks as the STYLE of date (or pattern of date) used in the CSV file.

 

I avoid using the words FORMAT or INFORMAT for that meaning since those two words have a very specific meaning in SAS code.

 

A format is used to convert values to text.  The format determines the style in which the value is displayed. 

An informat is used to convert text to values.  An informat supports reading text that matches the styles or patterns that it understands.
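
The two directions can be sketched in a few lines with the INPUT and PUT functions, using one value from the CSV. The variable names are made up for the illustration. The log should show the original text, the stored numeric value (a count of days since 01JAN1960), and the text produced by the DATE11. format.

```sas
data _null_;
  text  = '8/21/1971';
  value = input(text, mmddyy10.);  /* informat: text -> date value */
  shown = put(value, date11.);     /* format:   date value -> text */
  put text= value= shown=;
run;
```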

Nietzsche
Lapis Lazuli | Level 10

I will update that in the errata thread.
