Dear friends, this is my first step into the SAS world, and while following the examples in "The Little SAS Book",
I am struggling with a date informat.
I have uploaded the data, pumpkin2.txt, and here is the code I typed in.
DATA contest;
infile '/folders/myfolders/sasuser.v94/pumpkin2.txt';
input Name $16. Age 3. +1 Type $1. +1 DATE MMDDYYYY10. (Score1 Score2 Score3 Score4 Score5) 4.1;
RUN;
PROC PRINT DATA = contest;
RUN;
and this is what my SAS studio says:
1 OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
72
73 DATA contest;
74 infile '/folders/myfolders/sasuser.v94/pumpkin2.txt';
75 input Name $16. Age 3. +1 Type $1. +1 DATE MMDDYYYY10. (Score1 Score2 Score3 Score4 Score5) 4.1;
___________ ___
485         79
            76
NOTE 485-185: Informat MMDDYYYY was not found or could not be loaded.
ERROR 79-322: Expecting a (.
ERROR 76-322: Syntax error, statement will be ignored.
76 RUN;
NOTE: The SAS System stopped processing this step because of errors.
WARNING: The data set WORK.CONTEST may be incomplete. When this step was stopped there were 0 observations and 4 variables.
NOTE: DATA statement used (Total process time):
real time 0.00 seconds
cpu time 0.01 seconds
77
78 PROC PRINT DATA = contest;
79 RUN;
NOTE: No observations in data set WORK.CONTEST.
NOTE: PROCEDURE PRINT used (Total process time):
real time 0.00 seconds
cpu time 0.00 seconds
80
81
82 OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
Please help. P.S. In some rows of the data, the Name field is shorter than 16 characters, but in my code I have defined it as 16 characters long. Is this the reason?
Accepted Solutions
@jimmychoi wrote:
Hi Tom, I apologize if my wording caused confusion. Should the two be differentiated: a date informat vs. the MMDDYY informat?
P.S. The original data is written in MM-DD-YYYY format (e.g. 10-31-2013)
If the data is in the form 10-31-2013 then you definitely want the MMDDYY informat. Since you are using fixed format input you need to set the width to match the number of columns that are reserved in the input line for that field. 8 digits and two hyphens means you want width of 10.
But it looks like your sample data file is not in fixed columns.
Alicia Grossman 13 C 10-28-2012 7.8 6.5 7.2 8.0 7.9
Matthew Lee 9 D 10-30-2012 6.5 5.9 6.8 6.0 8.1
Elizabeth Garcia 10 C 10-29-2012 8.9 7.9 8.5 9.0 8.8
Lori Newcombe 6 D 10-30-2012 6.7 5.6 4.9 5.2 6.1
Jose Martinez 7 d 10-31-2012 8.9 9.5 10.0 9.7 9.0
Brian Williams 11 C 10-29-2012 7.8 8.4 8.5 7.9 8.0
So you cannot use a formatted input statement to read those lines. For example, the date does not appear in the same place on every line.
You need to use list mode input instead. Since you don't need to put informats in the INPUT statement, it will be easier if you define the variables BEFORE using them in the INPUT statement. Since SAS knows how to read numbers and character strings, you only need to specify an informat for the date. And since you are using list mode, you don't need to specify the width of the informat, because SAS will ignore it anyway.
Also, since your data is not in fixed columns and you don't have an extra space after the NAME field, you will need to read the name as two variables. If any of the rows have middle names or initials, then you will have problems.
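DATA contest;
infile '/folders/myfolders/sasuser.v94/pumpkin2.txt';
/* Define the variables first so the INPUT statement needs no inline informats */
length Fname Lname $16 Age 8 Type $1 Date 8 Score1-Score5 8 ;
/* No width on the informat: list mode input ignores it */
informat Date mmddyy. ;
format Date yymmdd10.;
/* Read every variable from Fname through Score5, in the order defined above */
input Fname -- Score5 ;
run;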
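If some rows might carry a middle name or initial, one possible workaround is to parse each line from the right, since the last eight words (age, type, date, five scores) have a fixed count. This is only a sketch under assumptions: it supposes one contestant per line with single-word fields separated by blanks, and the variable names Name, n, and i are made up for illustration.

```sas
data contest;
  infile '/folders/myfolders/sasuser.v94/pumpkin2.txt' truncover;
  length Name $32 Type $1;
  format Date yymmdd10.;
  array Score[5] Score1-Score5;

  input;                                  /* load the raw line into _INFILE_ */
  n = countw(_infile_, ' ');              /* number of blank-separated words */
  if n < 9 then delete;                   /* need at least 1 name word + 8 fields */

  /* The last five words are the scores */
  do i = 1 to 5;
    Score[i] = input(scan(_infile_, n - 5 + i, ' '), 32.);
  end;

  /* Working leftward: date, type, age */
  Date = input(scan(_infile_, n - 5, ' '), mmddyy10.);
  Type = scan(_infile_, n - 6, ' ');
  Age  = input(scan(_infile_, n - 7, ' '), 32.);

  /* Everything before the age is the name, however many words it has */
  Name = ' ';
  do i = 1 to n - 8;
    Name = catx(' ', Name, scan(_infile_, i, ' '));
  end;

  drop i n;
run;
```

The trade-off is that this gives up the simplicity of plain list input in exchange for tolerating names of any word count.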
Also, do NOT add a decimal part to an informat unless you know that the decimal point was purposely NOT included in the text of the source file. When you add a decimal part to an informat, you are telling SAS where the implied decimal point is. So if any of your score values were written into the text file WITHOUT a period, for example integer values, then the result of reading them with the 4.1 informat would be to divide the value by 10.
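To see the implied-decimal behaviour in isolation, here is a small sketch with made-up in-stream values (not the contest file). The variable names WithD and Plain are hypothetical; each line is read twice from column 1, once with 4.1 and once with 4.

```sas
data decimals;
  /* Read the same four columns with and without a decimal part */
  input @1 WithD 4.1 @1 Plain 4.;
  datalines;
7.8
78
10.0
;
run;

proc print data=decimals;
run;
```

The values that already contain a period (7.8 and 10.0) come out the same either way, while the bare 78 becomes 7.8 under 4.1 and stays 78 under 4.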
I'm not sure about the parentheses around the score variables. Maybe just "score1-score5 4.1". For the date variable, try the format "date9." or "date10."
I have never used parentheses in that way and never had a problem, but the book would certainly be right; I may be out of touch.
NOTE 485-185: Informat MMDDYYYY was not found or could not be loaded.
Seems like a reasonably clear error message to me.
You probably meant to ask for the MMDDYY informat instead.
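As a quick check of the informat name, a minimal sketch with one made-up in-stream value (the data set name demo is hypothetical) reads a 10-column MM-DD-YYYY date with MMDDYY10.:

```sas
data demo;
  /* MMDDYY is the informat name; 10 is the width (8 digits plus 2 hyphens) */
  input Date mmddyy10.;
  format Date yymmdd10.;
  datalines;
10-31-2013
;
run;

proc print data=demo;
run;
```

Here the informat is MMDDYY with a width of 10, rather than a separate informat called MMDDYYYY.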
P.S. The original data is written in MM-DD-YYYY format (e.g. 10-31-2013)
No, I would replace the mmddyy format with date9.; I've had problems with ddmmyyyy-type formats before, so I always use date9. Give it a try and see if it works.