☑ This topic is solved.
YasodaJayaweera
Obsidian | Level 7

Hi,

I am aware of the difference between informats and formats. According to the new Base Programmer 9.4 exam prep guide there are no syntax differences when coding an informat and format. However, I am trying to run the below code and there is this colon sign added in front of the 2 date formats (e.g. Birth_Date :date.). I was told this is because these two are informats. If they were formats, I would not have to use them. I have attached the data file herewith.

 

I tried removing the colon and the output was indeed different. Without the colon, the values of Hire_Date were empty. But the prep guide doesn't seem to mention this, so I am a little confused.

 

I would appreciate it if someone could explain the purpose of the colon and how to decide when to use it with an informat.

data work.subset3;
   length First_Name $ 12 Last_Name $ 18 
          Gender $ 1 Job_Title $ 25 
          Country $ 2;
   infile 'sales.csv' dlm=',';
   input Employee_ID First_Name $ Last_Name $ 
         Gender $ Salary Job_Title $ Country $ 
         Birth_Date :date. 
         Hire_Date :mmddyy.;
run;

P.S. I executed code in SAS OnDemand for academics.

 

Thank you.

ACCEPTED SOLUTION
Astounding
PROC Star

Where did this practice problem come from?

 

In what I hope is plain English ...

 

Without using a colon, the DATE. informat means:

 

Read the next 7 characters.  Expect what you find to be in date7 form.  Automatically convert what you find in those 7 characters by applying the DATE7 informat.

 

Note that you need to know that the default width for DATE informats is 7 characters, which is neither intuitive nor an important part of the lesson when learning about the colon.

 

Adding a colon changes the instructions to the INPUT statement.  The colon says, "Read across all the commas (because commas are your delimiter here), until you find a non-comma.  Starting with that non-comma, read 7 characters expecting them to be in DATE7 form."

 

So the colon affects where to start reading characters from the raw data line.  Without a colon, SAS reads the next 7 characters.  With a colon, it finds a non-comma and starts reading at that point.


10 REPLIES
Kurt_Bremser
Super User

The colon is a feature of the INPUT statement. It tells the statement to apply the delimiter first and then apply the informat. Without it, the statement would read as many characters as the informat prescribes, disregarding any delimiters.
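
A minimal sketch of that difference, assuming a small made-up set of comma-delimited records (the names and dates are hypothetical, not taken from sales.csv). The first step reads the date with the colon modifier; the second reads it without. In the second step Hire_Date is expected to come out missing, as the original poster observed, because MMDDYY. with its default width covers only part of the field.

```sas
/* Hypothetical comma-delimited records, not the poster's sales.csv. */
data work.with_colon;
   infile datalines dlm=',';
   /* Colon: scan to the next comma, then apply MMDDYY. to that whole field. */
   input Name :$12. Hire_Date :mmddyy.;
   format Hire_Date date9.;
   datalines;
Ann,03/15/2010
Bartholomew,01/01/1999
;
run;

data work.without_colon;
   infile datalines dlm=',';
   /* No colon: formatted input reads a fixed number of columns for MMDDYY.
      (its default width), even though the field is longer than that. */
   input Name :$12. Hire_Date mmddyy.;
   format Hire_Date date9.;
   datalines;
Ann,03/15/2010
Bartholomew,01/01/1999
;
run;

proc print data=work.with_colon;    run;
proc print data=work.without_colon; run;
```

Comparing the two PROC PRINT listings and the notes in the log should make the effect of the colon visible.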

ballardw
Super User

Without seeing the exact statement that you are interpreting as "there are no syntax differences when coding an informat and format", I have to assume the intended meaning is "there is no difference in how they are used", or possibly that it refers only to the INFORMAT and FORMAT statements, which are not quite the same as the INPUT statement you show.

 

There are significant differences in the syntax of the INVALUE and VALUE statements that PROC FORMAT uses to create informats and formats.

Tom
Super User

Not quite.  

The : modifier says to read the line using LIST MODE, which means honor the delimiters.  When you do that, any width on the INFORMAT is IGNORED.  The INPUT statement will use all of the next "word" on the line.  So the default width of the DATE informat does not matter; the width that actually matters is how many characters are there on the line.  Whether that is 7 or 11 does not matter.  Even if there are more characters than that particular informat could handle, all of them will be "used".
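
One way to check this, sketched with made-up records of 7, 9, and 11 characters: the informat is deliberately written as DATE7., yet in list mode each complete word should be handed to it regardless of length. Whether the two-digit year lands in 1960 depends on the session's YEARCUTOFF option, so that part is an assumption.

```sas
data _null_;
   /* A width of 7 is named, but with the colon the whole "word" is read. */
   input d :date7.;
   put d= date9.;
   datalines;
01JAN60
01JAN1960
01-JAN-1960
;
```

If the width mattered, the second and third records would be cut off after 7 characters; the log shows whether they were.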

Astounding
PROC Star

For the moment, I regret not having SAS on my home machine anymore so I could test this myself.  Here is a simple test that I hope you (or someone) can run:

data _null_;
input date : date. next_character $1.;
put _all_;
cards;
01jan1960 abc
01jan1960abc
;
Tom
Super User

@Astounding wrote:

For the moment, I regret not having SAS on my home machine anymore so I could test this myself.  Here is a simple test that I hope you (or someone) can run:

data _null_;
input date : date. next_character $1.;
put _all_;
cards;
01jan1960 abc
01jan1960abc
;

What do you think that is testing?

date=0 next_character=a _ERROR_=0 _N_=1
date=0 next_character=  _ERROR_=0 _N_=2

You have shown two things:

The DATE informat will ignore trailing letters.

data test;
  date = input('01jan1960abc',date11.);
  format date date9.;
  put date=;
run;

date=01JAN1960

The cursor is located after the (first) delimiter when a value is read in list mode.

data test;
  input date :date. next $char5. ;
  format date date9. next $quote.;
cards;
01FEB-1960 123456
01MAR1960xxx   444
;

Results

Obs         date     next

 1     01FEB1960    "12345"
 2     01MAR1960    "  444"

 

Astounding
PROC Star

@Tom ,

 

There were a few things I imagined as possibilities in this program.

 

Since the default width for the DATE informat is 7, would it take all 9 characters to be the date?  Looks like it did since the year was 1960.

 

Would it find that extra characters were invalid?  Looks like it read the extra characters, but ignored them when assigning a value to the variable DATE.

 

 

YasodaJayaweera
Obsidian | Level 7

Thank you for the code. I ran it with and without the colon and I clearly understood the difference.

YasodaJayaweera
Obsidian | Level 7

This is a question from the old exam material, from about two years ago, I believe. Thank you for the detailed explanation.

Tom
Super User

@YasodaJayaweera wrote:

Hi,

I am aware of the difference between informats and formats. According to the new Base Programmer 9.4 exam prep guide there are no syntax differences when coding an informat and format. However, I am trying to run the below code and there is this colon sign added in front of the 2 date formats (e.g. Birth_Date :date.). I was told this is because these two are informats. If they were formats, I would not have to use them. I have attached the data file herewith.

That statement is flat-out wrong.  The reason there is a colon in the INPUT statement is to tell the INPUT statement to read the next field using LIST MODE even though you have included an informat in the INPUT statement.  You can also use the colon modifier in a PUT statement to tell the PUT statement to output the data in LIST MODE even though you have included a format in the PUT statement.  So the colon usage is consistent between INPUT/INFORMAT and PUT/FORMAT.
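
A small sketch of the PUT side, using made-up values (the variable names are hypothetical). The first PUT uses plain formatted output, where each value fills its format width; the second adds the colon, which should keep the formats but write each value with surrounding blanks trimmed and a single blank after it.

```sas
data _null_;
   name   = 'Ann';
   hired  = '15MAR2010'd;
   salary = 52000;
   /* Formatted output: each value occupies the full width of its format. */
   put name $12. hired date9. salary dollar10.;
   /* Modified list output: same formats, but values are written in list mode. */
   put name :$12. hired :date9. salary :dollar10.;
run;
```

The two lines in the log can be compared directly to see the spacing difference.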
