hi,
I am trying to read the a datafile with the following fields.
137 3.35 Afghanistan 2009-07-08
My code look like below
informat casedt1id :$3. contntid :4.2. country :&$32. casedt1 :yymmdd10.
I am getting error reading the value 3.35 for which I have used :4.2 or just 4.2. the error I am receiving is in the attached png file.
this has no special characters like dollar/comma/percentage a standard number, if I am getting error using w.d what have i not understood, please help.
regards
jana
Please post the complete data-step used to read the data and the log with the error message. Both as text using the "insert sas code" button above the text-box.
data abc123 replace ;
informat casedt1id :$3. contntid 4.2 . country :&$32. casedt1 yymmdd10. cumrep4 cumrep5 cumrep6 cumrep7 cumrep8 5.
dthdtid1 :$3. dthcontntid 4. dthdt1 yymmdd10. cumdth5 cumdth6 cumdth7 cumdth8 cumdth9 cumdth10 cumdth11 cumdth12 4.2. ;
infile 'c:\abc.dat' dsd truncover;
input casedt1id contntid country casedt1 cumrep4 cumrep5 cumrep6 cumrep7 cumrep8 dthdtid1 dthcontntid dthdt1 cumdth5 cumdth6 cumdth7 cumdth8 cumdth9 cumdth10 cumdth11 cumdth12 ;
run;
DATA:
137 3.35 Afghanistan 2009-07-08 32 32 99 5.33 2009-10-30 1 16
HI, There you have the code and the log.
1. how to read this second column 3.35 data using w.d format ? I used 4.2 that didn't work now I tried using 4 then again it doesn't work .
thanks for the reply.
regards
jana
Your data looks to be tab delimited, so use that information rather than explicit position e.g.:
data abc123; infile '...' dlm='09'x dsd truncover; input casedt1id $ contntid country $ casedt1 cumrep4 cumrep5 cumrep6 cumrep7 cumrep8 dthdtid1 $ dthcontntid dthdt1 cumdth5 cumdth6 cumdth7 cumdth8 cumdth9 cumdth10 cumdth11 cumdth12; run;
hi,
Checked using notepad++ and this is not a tab delimited file . All the fields are separated by more than ten to 12 blank spaces.
thanks for your help.
regards
jana
Remove the cruft, correct the glaring syntax errors, and it will work:
data abc123 replace ;
informat
casedt1id $3.
contntid 4.2
country $32.
casedt1 yymmdd10.
;
format casedt1 yymmddd10.;
input casedt1id contntid country casedt1;
cards;
137 3.35 Afghanistan 2009-07-08
;
run;
If you have tabs in the infile (which did NOT show up in the code you posted), set them as correct delimiter.
hi
What do you mean by "remove cruft " ?
I used notepad++ to search for tabs and notepad++ returned 0 matches . this file is not tab delimited.
Thanks for your help.
regards
jana
@ariyurjana wrote:
hi
What do you mean by "remove cruft " ?
I used notepad++ to search for tabs and notepad++ returned 0 matches . this file is not tab delimited.
Thanks for your help.
regards
jana
"cruft" the bad parts of code generating syntax errors: incorrect characters in the informat statement
informat casedt1id :$3. contntid :4.2. country :&$32. casedt1 :yymmdd10.
The characters in red above are errors. If the file is delimited then the : modifier is likely not needed. Inclusion of a decimal in the informat for numerics may create unexpected values.
Other errors in the informat are the use of () which is not needed and list of variables followed by an informat is sufficient:
inform cumrep4 - cumrep8 5. ; would assume that 5 variables get the informat (or format).
on you INPUT statement if you place variables inside ( ) then there is expected informat or column control information to follow to match. If you have supplied the informat information and are not going to specify column controls then remove the () around variable lits in the Input statement.
What you show did not read any values from the file but failed. So it is very hard to diagnose a specific issue with some "value"
There is absolutely NO reason to post a PDF or picture file of the log. Copy the text directly from the log and paste into one of the boxes opened with either the {I} or the "running man". The running man will have some syntax coloring, the {I} is plain text but maintains the positions of the diagnostic characters from the error messages.
Your infile statement does not include any delimiter at all which means that default used is a space. If there are multiple spaces between values then the DSD option on the infile is treating each pair of spaces to indicate a missing value for a variable.
Below is an example using dlm='|'.
See what happens to the "numeric" value on the 4th row when there are two delimters before the value.
data example; infile datalines dsd dlm='|'; informat casedt1id $3. contntid 4.2 country $32.; input casedt1id contntid country; datalines; 123|456|France 124| 1234|France 125|3.45|Germany 126||678|USA ; run;
The example data you posted above
137 3.35 Afghanistan 2009-07-08
has 10 spaces appearing before the 3.35 value. If the data is NOT tab delimited and actually has 10 spaces then using the DSD option on your infile statement means that the value of 3.35 would be attempted to be read into the eleventh variable on your input statement (if there are that many). 10 more spaces to Afghanistan means that you'll likely have that read into the 21st variable.
If there are not 10 spaces, then copy a few lines of data from the text file and paste it into a box opened with the {I} to show us.
Hi,
Thanks for your detailed reply.
The country name field also has data like the ones below which have embedded blanks which makes the "&" to be used in the informat statement.
137 3.35 Afghanistan 2009-07-08 101 1.22 Antigua and Barbuda 2009-06-24 124 2.39 Bosnia and Herzegovina 2009-06-03 155 4.13 Federated States of Micronesia 2009-07-22 52 1.1 Dominican Republic 2009-05-29 9 2.03 Austria 2009-04-29 107 3.28 Cambodia 2009-06-24
From the above data there are actually 10 or more blank spaces between the columns .
My file has the following characteristics
1. blank is the delimiter
2. there are multiple blanks between columns
3. columns can have missing values but this is not denoted by a missing value
4. numeric data has varying length
it is this kind of file I am trying to read into sas..
BTW, the poster wanted the code and log and since the option was available to upload a file I chose to upload the log as a pdf. Aslo the size of the file was small only 50k. I think I did not commit a super critical error by uploading the log file.Anyway from next time on I will take care not to do the same.
This will work for the file as posted:
data want;
input
casedt1id :$3.
contntid :4.
country &$32.
casedt1 :yymmdd10.
;
format casedt1 yymmddd10.;
cards;
137 3.35 Afghanistan 2009-07-08
101 1.22 Antigua and Barbuda 2009-06-24
124 2.39 Bosnia and Herzegovina 2009-06-03
155 4.13 Federated States of Micronesia 2009-07-22
52 1.1 Dominican Republic 2009-05-29
9 2.03 Austria 2009-04-29
107 3.28 Cambodia 2009-06-24
;
run;
Note that I kept my code to the bare minimum needed.
Frankly, if someone sends me data like this, I ask them why they stopped their treatments. They should resume ASAP.
Note that in trying to accomodate this galloping foolishness, the responsibility for the correctness of the data in SAS lies on you. But the responsibilty to send data in an unambiguous form should always be with the source of the data. It's not your job to clean other's diapers.
Hi,
Thanks to everybody for replying . I am closing this question.
I will use the column input method to read the data instead of informat method.
regards
jana
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.