daid13
Calcite | Level 5

Hi, I'm new round here, started using SAS two months ago for a new job.

 

The context is that I'm working on a macro with similar functionality to minlevel from this paper, but recognising that SAS has at least three valid ways (informally, formats) of storing ISO 8601 dates. I have all the human-readable formats within the ISO spec covered reasonably well, but I need to convert the SAS internal storage to one of those to make the calculations. So the plan is to use vinformat to check whether the informat is $N8601E and, if so, use put to convert it.

 

My problem is that vinformat does not seem to recognise the informat when it is specified in the INPUT statement, and I don't know why. Can someone explain? As you can see in the code and image below, vinformat returns the informat name if an INFORMAT statement was used, but returns just $ if the same informat was used in the INPUT statement.

%macro trial(in, out);
    &out.=vinformatn(&in.);

%mend trial;


data inputset3;
    INFORMAT dt $N8601E20.;
    INPUT dt ;
    %trial(dt, informat);
    DATALINES;
1999
1999-12
1999-12-25
1999-12-25T12
1999-12-25T12:13
1999-12-25T12:13:30
;

data expectedset3;
    INPUT dt :$N8601E20.;
    
    %trial(dt, informat);
    DATALINES;
1999
1999-12
1999-12-25
1999-12-25T12
1999-12-25T12:13
1999-12-25T12:13:30
;


PROC print data=inputset3;
    TITLE informat;
RUN;

PROC PRINT data=expectedset3;
    TITLE input;
RUN;

daid13_0-1723720000081.png

Thanks

 

SASJedi
SAS Super FREQ

VINFORMAT returns the name of the INFORMAT permanently assigned to a variable in the variable's metadata. The code that creates inputset3 contains an INFORMAT statement that modifies the variable's metadata; the code that creates expectedset3 does not. If you run this code after yours:

ods select variables;
proc contents data=inputset3;
run;

ods select variables;
proc contents data=expectedset3;
run;

The output shows why you are seeing this behavior:

inputset3

Alphabetic List of Variables and Attributes
#  Variable  Type  Len  Informat
1  dt        Char   20  $N8601E20.
2  informat  Char  200

 

expectedset3

Alphabetic List of Variables and Attributes
#  Variable  Type  Len
1  dt        Char   20
2  informat  Char  200
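
The same comparison can also be made programmatically. The following is a minimal sketch, assuming both datasets from the question were created in the WORK library; it reads the stored variable attributes from DICTIONARY.COLUMNS, which is where a permanently assigned informat would appear:

```sas
/* Sketch only: assumes INPUTSET3 and EXPECTEDSET3 exist in WORK,  */
/* as created by the code in the question.                         */
proc sql;
    select memname, name, type, length, informat
        from dictionary.columns
        where libname = 'WORK'
          and memname in ('INPUTSET3', 'EXPECTEDSET3');
quit;
```

The INFORMAT column should be populated only for the variable whose DATA step included an INFORMAT statement.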
Check out my Jedi SAS Tricks for SAS Users
daid13
Calcite | Level 5

Thanks SASJedi, that makes sense, though it does leave me wondering why SAS is so inconsistent with its metadata, and this behaviour doesn't seem to be in the docs, at least not at the level I've looked at.

 

I'm going to have to move this elsewhere in my functionality, or possibly run a regex straight on the SAS ISO 8601 encoding, though that encoding, while it appears simple, doesn't seem to be clearly documented, so I'll have to do it by eye.

SASJedi
SAS Super FREQ

@daid13 - there is no inconsistency here. The DATA step produces the exact output it was directed to produce. Your program code for the dataset inputset3 included a directive (INFORMAT statement) that assigned a default informat to the variable dt in the variable's metadata, much as LABEL and FORMAT statements modify the metadata with label text and a default format. It is reasonable for code to use a specific informat to read the current input, and to use an INFORMAT statement to assign a different default informat in the metadata for future use. Without an INFORMAT (or FORMAT or LABEL) statement, no modification is made to the metadata, as in your DATA step code that produced dataset expectedset3. It's just that simple.

Consider the following code:

data test;
	infile datalines truncover;
	input @1 V1 
		  @1 V2 binary3.
		  @5 V3 $3.
		  @5 V4 $UPCASE3.
	;
	informat V2 32. V3 V4 $UPCASE.;
	format   V1 V2 z3. V4 $REVERS3.;
datalines;
011 abc
;
ods select variables;
proc contents data=test;
run;
proc print;
run;

In the code, the INPUT statement reads the value for V2 using the BINARY informat, and the INFORMAT statement modifies the metadata for V2 to use the 32. informat by default. The metadata shows the appropriate results:

Alphabetic List of Variables and Attributes
#  Variable  Type  Len  Format     Informat
1  V1        Num     8  Z3.
2  V2        Num     8  Z3.        32.
3  V3        Char    3             $UPCASE.
4  V4        Char    3  $REVERS3.  $UPCASE.

 

The values, read in with various informats, are later displayed using different formats:

Obs  V1   V2   V3   V4
1    011  003  abc  CBA

 

This is not inconsistency - it's flexibility. But with flexibility comes complexity, and with complexity the need to understand how the system works in order to achieve the desired result. If you want specific informat information in your metadata, you must use an INFORMAT statement to assign it.  

 

daid13
Calcite | Level 5

I guess the inconsistency comes in when you consider that the fundamental computer science way to look at an informat is as a type. That is what it is analogous to in every other programming language I have used. It seems to be implemented quite differently, yet it is described in documentation and guides as the tool you use for typing. Therefore I expect (rightly or wrongly) that it will behave like a type. I can definitely see how what is described is internally consistent, but it is not industry consistent. Perhaps I'm falling into a bit of a trap in assuming that SAS will behave like a mainstream programming language.

 

I have finally found a piece of documentation stating that informats used in the INPUT statement are not stored, but it wasn't on the base INPUT statement documentation, where it really should be; it was buried away in a page on formats in the INPUT statement. I appreciate that it is there, but it is frustrating to find something this important hidden.

 

At this point I don't think I'm looking for solutions; thank you all for the assistance provided. The built-in function that should have done exactly what I needed, vinformat, is inadequate because of this strange design in which SAS does not store this metadata. I'm going to have to either reject the SAS informat way of storing ISO dates or build separate regexes for it.

Tom
Super User

SAS has two types of variables. Floating point numbers and fixed length character strings.

 

daid13
Calcite | Level 5
Yes I am aware of that. Informats are then expressed as the tool to have an item or collection of data in a more restricted form than the two types allow, that is why I am saying SAS uses informats as other languages would use types.
Tom
Super User

@daid13 wrote:
Yes I am aware of that. Informats are then expressed as the tool to have an item or collection of data in a more restricted form than the two types allow, that is why I am saying SAS uses informats as other languages would use types.

INFORMATs convert text into values.

FORMATs convert values into text.

 

Neither is the same as variable TYPE.

 

This is not really any different than how other languages do formatted input and output.
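
As a minimal sketch of that distinction (the variable names here are illustrative, not from the original posts), the INPUT function applies an informat to turn text into a numeric value, and the PUT function applies a format to turn that value back into text. The variable's type stays numeric or character throughout:

```sas
data _null_;
    text  = '1999-12-25';
    value = input(text, yymmdd10.);  /* informat: text -> numeric SAS date */
    shown = put(value, date9.);      /* format: numeric value -> text      */
    put value= shown=;               /* value=14603 shown=25DEC1999        */
run;
```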

 

Computer jargon appropriates words from English and gives them special meaning.  https://www.merriam-webster.com/dictionary/format

 

In this case you seem to be trying to use the dictionary definition of the NOUN.  But the concept in SAS is closer to the definition of the VERB.

Quentin
Super User

@daid13 wrote:
Informats are then expressed as the tool to have an item or collection of data in a more restricted form than the two types allow...

I don't think I would say that at all.  Most typically, an informat is used to read a character value into a numeric value.  The official definition in the docs is more general: "An informat is a type of SAS language element that applies a pattern to or executes instructions for a data value to be read as input."

 

TBH, I've never really understood why SAS stores informat information as part of the metadata in a dataset, because it's not a constraint on the values that can be stored in a variable (see integrity constraints), and I don't think you can easily "re-use" an informat attached to a SAS dataset.  I think of the informat used to read in data as a temporary construct.

 

The code below creates a SAS dataset with one variable, DT.  It's just a numeric variable, so it can store any numeric value that is allowed in a SAS dataset (within precision limits etc.).

 

data have ;
  dt=input("19Aug2024",date9.) ; output ;
  dt=input("08/19/2024",mmddyy10.) ; output ;
  dt=23607 ;output ;
  dt=0 ;output ;
  dt=constant('pi') ;output ;
run ;  

proc print data=have ;
run ;
The Boston Area SAS Users Group (BASUG) is hosting our in person SAS Blowout on Oct 18!
This full-day event in Cambridge, Mass features four presenters from SAS, presenting on a range of SAS 9 programming topics. Pre-registration by Oct 15 is required.
Full details and registration info at https://www.basug.org/events.
Tom
Super User

Attached informats had more value when using SAS/FSP for interactive entry/review of data.

 

You can still use them to help read data.  Say you have a CSV file and you want to read it into a dataset.  If you have a template dataset with all of the variables defined in the same order and appropriate informats attached then the code to read it could be as simple as:

data want;
  infile csv dsd truncover firstobs=2;
  if 0 then set template;
  input (_all_) (+0);
run;
Kurt_Bremser
Super User

If you want to recognize a sub-type (date, time, something else), do not look at informats, look at formats.

You can define a SAS date variable as "is numeric, with a date format attached". Similar for times, currency amounts, ....
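
A minimal sketch of that idea, assuming SAS 9.4 where the FMTINFO function is available: VFORMATN returns the name of the format attached to a variable, and FMTINFO with the 'cat' argument reports the category of that format. The variable names are made up for illustration.

```sas
data _null_;
    d = '25DEC1999'd;      /* numeric value that we treat as a date      */
    n = 42;                /* plain numeric with no format attached      */
    format d date9.;
    length fmt_d fmt_n cat_d cat_n $32;
    fmt_d = vformatn(d);                  /* name of the attached format */
    fmt_n = vformatn(n);                  /* default format name         */
    cat_d = fmtinfo(strip(fmt_d), 'cat'); /* category of that format     */
    cat_n = fmtinfo(strip(fmt_n), 'cat');
    put fmt_d= cat_d= / fmt_n= cat_n=;
run;
```

The category for d should identify it as a date, while n should fall into a general numeric category. This check depends on a format actually being attached, just as VINFORMAT depends on an INFORMAT statement.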

Kurt_Bremser
Super User

Regarding "inconsistency": this has to be that way.

 

It is perfectly valid to have multiple INPUT statements with different informats for the same variable. In such a case, which of the informats should be stored in the dataset's metadata? So it makes sense not to use the informat from an INPUT statement.

OTOH, only one INFORMAT assignment for a variable will take effect (multiple informat assignments for the same variable are in fact a coding mistake, but SAS generously does not throw an ERROR and takes only the last).

ballardw
Super User

Here is an only moderately contrived example of @Kurt_Bremser's point, showing why a single variable may have multiple INFORMATS in actual use.

 

data example;
   input repsource $ @;
   if repsource='Source1' then input date :date9.;
   else if repsource='Source2' then input date :yymmdd10.;
   else if repsource='Source3' then input date :julian.;
   format date date9. ;
datalines;
Source1 22AUG2024
Source2 2024/08/22:00:00:00.000000
Source3 24235
;

You may think this is completely made up. It is actually a simplified example of just one data source I dealt with. The "data" file to read was a text file made by appending years of report files whose headers were similar but changed slightly as time went on. So I had the joy of PARSING multiple report formats to apply standard rules. You may have the luxury of never dealing with such files, but the first INPUT with the trailing @ holds the input on that line of the file. That way you can examine some of the contents to determine 1) whether this is the header record I need, 2) which header record type it is, and then 3) read other values that have different read requirements.

 

So, from that code, which informat should be reported as "the informat" by VINFORMAT?
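
If a program really does need to know which informat read each value, one option is to record it explicitly in the data. This is a sketch under assumptions: the dataset name example_tracked and the variable used_informat are hypothetical, and the data lines are simplified.

```sas
data example_tracked;
   length used_informat $ 10;
   input repsource $ @;               /* trailing @ holds the record */
   if repsource = 'Source1' then do;
      input date :date9.;
      used_informat = 'DATE9.';       /* remember what was applied   */
   end;
   else if repsource = 'Source2' then do;
      input date :yymmdd10.;
      used_informat = 'YYMMDD10.';
   end;
   format date date9.;
datalines;
Source1 22AUG2024
Source2 2024-08-22
;

proc print data=example_tracked;
run;
```

This keeps the information per observation, which a single metadata attribute on the variable could not do.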

Quentin
Super User

Since you're new to SAS, can you say more about the big picture of what you're trying to do?  

 

Typically, SAS stores both dates, and date-times, in numeric variables.  A date is number of days since Jan 1, 1960.  A date-time is number of seconds since Jan 1, 1960.  So when you talk about doing calculations with dates, you would want a numeric variable.  To represent the date Jan 4, 1960, the value of the variable would be 3.  But you could use a format to display it as a human-readable string.
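
A minimal sketch of those stored values, using date and datetime literals (the variable names d and dt are illustrative):

```sas
data _null_;
    d  = '04JAN1960'd;               /* date literal: days since 01JAN1960        */
    dt = '04JAN1960:00:00:00'dt;     /* datetime literal: seconds since 01JAN1960 */
    put d= dt=;                      /* d=3 dt=259200                             */
    put d date9. +1 dt datetime19.;  /* same values shown through formats         */
run;
```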

 

In your code, you've used an informat to read the data, but it's a character informat, which reads the value into a character variable.  The below log snippet shows the character (string) values in the variable DT:

92   data inputset3;
93       INFORMAT dt $N8601E20.;
94       INPUT dt ;
95       put dt= ;
96       DATALINES;

dt=1999FFFFFFFFFFFD
dt=1999CFFFFFFFFFFD
dt=1999C25FFFFFFFFD
dt=1999C2512FFFFFFD
dt=1999C251213FFFFD
dt=1999C25121330FFD
NOTE: The data set WORK.INPUTSET3 has 6 observations and 1 variables.

103  ;

That DT variable created is just a character variable.  You won't be able to do date calculations with it.  

Typically, when reading in data from a text file, you would know the format of the data you are reading in, and then read it with a numeric informat, which creates a numeric date or date-time variable.

For this data, I might read it in with the B8601DT informat, which will create a numeric date-time variable.

data inputset3;
    INFORMAT dt B8601DT.;
    INPUT dt ;
    put "Date value is: " dt "The value can be displayed as " dt datetime19.;
    DATALINES;
1999
1999-12
1999-12-25
1999-12-25T12
1999-12-25T12:13
1999-12-25T12:13:30
;

The PUT statement there will show the actual numeric value stored in the variable, and the displayed value when it is formatted using the DATETIME format.


Log is:

227  data inputset3;
228      INFORMAT dt B8601DT.;
229      INPUT dt ;
230      put "Date value is: " dt "The value can be displayed as " dt datetime19.;
231      DATALINES;

Date value is: 1230768000 The value can be displayed as  01JAN1999:00:00:00
Date value is: 1259625600 The value can be displayed as  01DEC1999:00:00:00
Date value is: 1261699200 The value can be displayed as  25DEC1999:00:00:00
Date value is: 1261742400 The value can be displayed as  25DEC1999:12:00:00
Date value is: 1261743180 The value can be displayed as  25DEC1999:12:13:00
Date value is: 1261743210 The value can be displayed as  25DEC1999:12:13:30
NOTE: The data set WORK.INPUTSET3 has 6 observations and 1 variables.

238  ;

 
