DATA Step, Macro, Functions and more

Data Read in error- Variables repeat and made up

Accepted Solution Solved
Reply
Occasional Contributor
Posts: 7
Accepted Solution

Data Read in error- Variables repeat and made up

I have been using the same data set, multiple times now, just updating as needed. Suddenly, after 28 variables read in, SAS is now reading the titles correctly, but the data in the cells is incorrect. No log messages are appearing that are in reference to this event or not. It is not new data nor have I seen any notices on a possibly corrupted file. And the data appears to be interpreted as correct by SAS, it is not recognizing any other errors. 

 

Example: What data file looks like

Obs   ID  Time  ....... (28 columns later).... Date5  Weight 5

1      b1    12:00     ................................... Jun2    25

2      b2    02:00     ................................... Jun4    35   

3      b3    02:00     ................................... Jun1    15

4      b4    12:00     ................................... Jun2    25

 

 

What SAS data file looks likes

Obs   ID  Time  ....... (28 columns later).... Date5  Weight 5

1      b1    12:00     ................................... 9          1

2      b2    02:00     ................................... 9          1  

3      b3    02:00     ................................... 9          1

4      b4    12:00     ................................... 9          1

 

All intermitten columns match, so nothing is out of order. and 9 or 1 are not even relevant number in the data set. I've restarted and tried forming a new file. Nothing has worked so far. 

 


Accepted Solutions
Solution
‎02-03-2017 05:48 PM
Super User
Posts: 10,483

Re: Data Read in error- Variables repeat and made up

[ Edited ]

You might want to try posting code and log results in a code box opened with the {i} icon. The forum sometimes does things to pasted code in the main body. For instance, do you actually have a variable named 24dead?

The line in the log:

Mortality 24dead Dead24 PWdEAD PCTDeadX makes it lok like you have a variable 24dead but the data error only shows a variable "dead"

 

If I count correctly your input statement is attempting to read 57 variables but I only can see 54 values in the data.

 

From the first record error this line

Mating=0.15 WEI=0 Estrus=2 Estrus7=2 Removed=. Rem_Date=9 RemRsn

Tells me it was reading these values (start at red for the Mating variable)

4 10001,11,240,205,LATE,349,1,352,3,16,.,359,28,119,14,7,1,6,1.55,0.05,10.85,1.44,0.15,0
87 ,2,2,Yes,9,1,0,1,1,0,8,16,375,72,4.95,39.6,15,0.66,3.4,0.23,74.17,381,0,1,6,1,1,0,,, 170

 

And your input statement has Removed as numeric so reading 'YES' generates the error.

Soe you either have something shifting columns or whoever provided the data may have changed a numeric, possibly 0/1 coded field, to character Yes/No (or change the display property in Excel and your export to csv used that displayed text). You might look at the cell's format properties for the Removed column in the Excel file.

 

if that is the case then change to the 0/1 format and re-export and re-read the data.

View solution in original post


All Replies
Valued Guide
Posts: 2,174

Re: Data Read in error- Variables repeat and made up

Has this data been transferred fro windows to unix as binary instead of text? That adds a CR at end of line which causes SAS a bit of trouble reading the last column.
Occasional Contributor
Posts: 7

Re: Data Read in error- Variables repeat and made up

No conversions. All in the same file types (MS Excel, CSV (Comma Separated File), which I make the files as a .xls file and save as CSV. Full response below

Super User
Posts: 10,483

Re: Data Read in error- Variables repeat and made up

HOW are you reading the data? Data step, proc import or something else? What file format is the source data?

 

Were there any notes, warning or errors when reading the data?

It may help to post the log of the code and result messages. Paste into a code box opened with the {i} icon.

Occasional Contributor
Posts: 7

Re: Data Read in error- Variables repeat and made up

Almost all of the errors seem to be targeted at incorrect data for what is indicated in the input line. I went back and 'fixed' a few of these just to see what it would read out in the log. Same error, different column, all after the 'Mortality' column. and the reason SAS thought it was wrong is it appeared to be trying to fill the 'Removed' error column with 'Yes' or 'No" Data from a previous variable (which would have been incorrect for that column, supposed to be 1 or 0. 

Full response below. 

Occasional Contributor
Posts: 7

Re: Data Read in error- Variables repeat and made up

[ Edited ]

CSV file type, nothing converted, made on in microsoft excel and run on the same computer/OS. 

 

data read in as::

data arginine;
infile '16_01 Individual Sow Data Set 3Feb2017 TEST.csv' dsd delimiter=',' firstobs=4 lrecl=500000;
input ID $ WK $ BRED_D GDU_PEN $ TRT $ SHIP_D GBARN $ GBARN_D GBARN_T FAR_RM FAR_HI FAR_D FarWk GEST_L
TB BA SB MM BA_Wtav BA_Wtvar BA_LitBW TB_Av TB_Wtvar X_Off X_On CROSS_ON X_Status $ REAR_OPP Mortality Dead24 Dead Weaned WnRm WnDate WN_HI WnWt Wn_LitWt WnWtAge WnWtVar PWGain PWADG LactHiT Estrus_d HNS Mating WEI Estrus Estrus7;
CtW= WnDate-BRED_D;
FulForm = TB - MM;
run;

 

 

I guess i'm looking to see if anyone knows of any other infile limits i could try?there is nothing spectacular about this data set that makes it different than any other i have run before, and it simply stops reading correctly at the Mortality variable.

I've run this code and other files just like this one before, and they have worked just fine. The file isn't flagging itself as corrupt either. 

Here, the log lists errors for Removed variable, but, as far as I know, the data in that column is correct. Also, that column/variable doesn't show up until far after where the problems start (if the software reads the file from left to right)

 

LOG:

188 title '16_01 Arginine Trial Sow Evaluation, 3Feb2017
189 File: 16_01 SAS Data File 3Feb2017.csv';
190
191 data arginine;
192 infile '16_01 SAS Data File 3Feb2017.csv' dsd delimiter=',' firstobs=4 lrecl=500000;
193 input ID $ WK $ BRED_D GDU_PEN $ TRT $ SHIP_D GBARN $ GBARN_D GBARN_T FAR_RM
193! FAR_HI FAR_D FarWk GEST_L
194 TB BA SB MM BA_Wtav BA_Wtvar BA_LitBW TB_Av TB_Wtvar X_Off X_On
194! CROSS_ON X_Status $ REAR_OPP
195 Mortality 24dead Dead24 PWdEAD PCTDeadX
196 Weaned WnRm WnDate WN_HI WnWt Wn_LitWt WnWtAge WnWtVar PWGain PWADG LactHiT
197 Estrus_d HNS Mating WEI Estrus Estrus7 Removed Rem_Date RemRsn $ Note $
198 ;
199 CtW= WnDate-BRED_D;
200 FulForm = TB - MM;
201 run;

NOTE: The infile '16_01 SAS Data File 3Feb2017.csv' is:

Filename=C:\Users\ehines\Desktop\Box Sync\16_01 Arginine Trial Files.Shared\16_01 SAS
Data File 3Feb2017.csv,
RECFM=V,LRECL=500000,File Size (bytes)=99719,
Last Modified=03Feb2017:14:17:02,
Create Time=03Feb2017:07:43:34

NOTE: Invalid data for Removed in line 4 92-94.
RULE: ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8----+-
4 10001,11,240,205,LATE,349,1,352,3,16,.,359,28,119,14,7,1,6,1.55,0.05,10.85,1.44,0.15,0
87 ,2,2,Yes,9,1,0,1,1,0,8,16,375,72,4.95,39.6,15,0.66,3.4,0.23,74.17,381,0,1,6,1,1,0,,, 170
ID=10001 WK=11 BRED_D=240 GDU_PEN=205 TRT=LATE SHIP_D=349 GBARN=1 GBARN_D=352 GBARN_T=3
FAR_RM=16 FAR_HI=. FAR_D=359 FarWk=28 GEST_L=119 TB=14 BA=7 SB=1 MM=6 BA_Wtav=1.55 BA_Wtvar=0.05
BA_LitBW=10.85 TB_Av=1.44 TB_Wtvar=0.15 X_Off=0 X_On=2 CROSS_ON=2 X_Status=Yes REAR_OPP=9
Mortality=4 dead=9 Dead24=1 PWdEAD=352 PCTDeadX=3 Weaned=16 WnRm=. WnDate=359 WN_HI=28 WnWt=119
Wn_LitWt=14 WnWtAge=7 WnWtVar=1 PWGain=6 PWADG=1.55 LactHiT=0.05 Estrus_d=10.85 HNS=1.44
Mating=0.15 WEI=0 Estrus=2 Estrus7=2 Removed=. Rem_Date=9 RemRsn=1 Note=0 CtW=119 FulForm=8
_ERROR_=1 _N_=1
NOTE: Invalid data for Removed in line 5 90-92.
5 10002,11,240,205,LATE,349,1,352,3,16,.,356,27,116,17,12,1,4,1.02,0.01,12.05,1,0.01,0,2
87 ,2,Yes,14,2,1,1,2,0,12,16,375,72,5.16,61.95,18,0.6,4.14,0.23,74.17,381,0,1,6,1,1,0,,, 171
ID=10002 WK=11 BRED_D=240 GDU_PEN=205 TRT=LATE SHIP_D=349 GBARN=1 GBARN_D=352 GBARN_T=3
FAR_RM=16 FAR_HI=. FAR_D=356 FarWk=27 GEST_L=116 TB=17 BA=12 SB=1 MM=4 BA_Wtav=1.02
BA_Wtvar=0.01 BA_LitBW=12.05 TB_Av=1 TB_Wtvar=0.01 X_Off=0 X_On=2 CROSS_ON=2 X_Status=Yes
REAR_OPP=14 Mortality=4 dead=9 Dead24=1 PWdEAD=352 PCTDeadX=3 Weaned=16 WnRm=. WnDate=356
WN_HI=27 WnWt=116 Wn_LitWt=17 WnWtAge=12 WnWtVar=1 PWGain=4 PWADG=1.02 LactHiT=0.01
Estrus_d=12.05 HNS=1 Mating=0.01 WEI=0 Estrus=2 Estrus7=2 Removed=. Rem_Date=14 RemRsn=2 Note=1
CtW=116 FulForm=13 _ERROR_=1 _N_=2
NOTE: Invalid data for Removed in line 6 92-94.
6 10003,11,240,205,LATE,349,1,352,3,16,.,356,27,116,12,10,0,2,1.16,0.02,11.6,1.16,0.02,0
87 ,2,2,Yes,12,,,,,.,.,16,375,72,5.04,55.45,18,2.43,3.88,0.22,74.17,380,0,1,5,1,1,0,,, 169
ID=10003 WK=11 BRED_D=240 GDU_PEN=205 TRT=LATE SHIP_D=349 GBARN=1 GBARN_D=352 GBARN_T=3
FAR_RM=16 FAR_HI=. FAR_D=356 FarWk=27 GEST_L=116 TB=12 BA=10 SB=0 MM=2 BA_Wtav=1.16
BA_Wtvar=0.02 BA_LitBW=11.6 TB_Av=1.16 TB_Wtvar=0.02 X_Off=0 X_On=2 CROSS_ON=2 X_Status=Yes
REAR_OPP=12 Mortality=4 dead=9 Dead24=1 PWdEAD=352 PCTDeadX=3 Weaned=16 WnRm=. WnDate=356
WN_HI=27 WnWt=116 Wn_LitWt=12 WnWtAge=10 WnWtVar=0 PWGain=2 PWADG=1.16 LactHiT=0.02
Estrus_d=11.6 HNS=1.16 Mating=0.02 WEI=0 Estrus=2 Estrus7=2 Removed=. Rem_Date=12 RemRsn=
Note= CtW=116 FulForm=10 _ERROR_=1 _N_=3
NOTE: Invalid data for Removed in line 7 92-93.
7 10004,11,240,205,LATE,349,1,352,3,16,.,354,27,114,16,16,0,0,1.23,0.09,19.7,1.23,0.09,0
87 ,0,0,No,16,2,0,2,2,.,15,16,375,72,4.36,65.4,20,2.4,3.13,0.16,74.17,383,1,0,8,1,0,0,,,1
173 dead after wean weights collected- considered weaned and dead 234
ID=10004 WK=11 BRED_D=240 GDU_PEN=205 TRT=LATE SHIP_D=349 GBARN=1 GBARN_D=352 GBARN_T=3
FAR_RM=16 FAR_HI=. FAR_D=354 FarWk=27 GEST_L=114 TB=16 BA=16 SB=0 MM=0 BA_Wtav=1.23
BA_Wtvar=0.09 BA_LitBW=19.7 TB_Av=1.23 TB_Wtvar=0.09 X_Off=0 X_On=0 CROSS_ON=0 X_Status=No
REAR_OPP=16 Mortality=4 dead=9 Dead24=1 PWdEAD=352 PCTDeadX=3 Weaned=16 WnRm=. WnDate=354
WN_HI=27 WnWt=114 Wn_LitWt=16 WnWtAge=16 WnWtVar=0 PWGain=0 PWADG=1.23 LactHiT=0.09
Estrus_d=19.7 HNS=1.23 Mating=0.09 WEI=0 Estrus=0 Estrus7=0 Removed=. Rem_Date=16 RemRsn=2
Note=0 CtW=114 FulForm=16 _ERROR_=1 _N_=4
NOTE: Invalid data for Removed in line 8 91-92.
8 10006,11,240,205,LATE,349,1,352,3,16,.,358,27,118,12,12,0,0,1.4,0.02,16.85,1.4,0.02,0,
87 0,0,No,12,2,0,2,2,.,10,16,375,72,5.66,56.55,16,0.2,4.25,0.27,74.17,393,0,1,18,1,0,0,,, 172
ID=10006 WK=11 BRED_D=240 GDU_PEN=205 TRT=LATE SHIP_D=349 GBARN=1 GBARN_D=352 GBARN_T=3
FAR_RM=16 FAR_HI=. FAR_D=358 FarWk=27 GEST_L=118 TB=12 BA=12 SB=0 MM=0 BA_Wtav=1.4 BA_Wtvar=0.02
BA_LitBW=16.85 TB_Av=1.4 TB_Wtvar=0.02 X_Off=0 X_On=0 CROSS_ON=0 X_Status=No REAR_OPP=12
Mortality=4 dead=9 Dead24=1 PWdEAD=352 PCTDeadX=3 Weaned=16 WnRm=. WnDate=358 WN_HI=27 WnWt=118
Wn_LitWt=12 WnWtAge=12 WnWtVar=0 PWGain=0 PWADG=1.4 LactHiT=0.02 Estrus_d=16.85 HNS=1.4
Mating=0.02 WEI=0 Estrus=0 Estrus7=0 Removed=. Rem_Date=12 RemRsn=2 Note=0 CtW=118 FulForm=12
_ERROR_=1 _N_=5
NOTE: Invalid data for Removed in line 9 94-95.
9 10007,11,240,205,LATE,349,1,352,3,16,77,355,27,115,17,16,1,0,1.22,0.03,19.55,1.17,0.07
87 ,0,0,0,No,16,,,,,.,.,16,375,72,4.84,53.2,19,2.69,3.61,0.19,74.17,380,0,1,5,1,1,0,,, 169
ID=10007 WK=11 BRED_D=240 GDU_PEN=205 TRT=LATE SHIP_D=349 GBARN=1 GBARN_D=352 GBARN_T=3
FAR_RM=16 FAR_HI=77 FAR_D=355 FarWk=27 GEST_L=115 TB=17 BA=16 SB=1 MM=0 BA_Wtav=1.22
BA_Wtvar=0.03 BA_LitBW=19.55 TB_Av=1.17 TB_Wtvar=0.07 X_Off=0 X_On=0 CROSS_ON=0 X_Status=No
REAR_OPP=16 Mortality=4 dead=9 Dead24=1 PWdEAD=352 PCTDeadX=3 Weaned=16 WnRm=77 WnDate=355
WN_HI=27 WnWt=115 Wn_LitWt=17 WnWtAge=16 WnWtVar=1 PWGain=0 PWADG=1.22 LactHiT=0.03
Estrus_d=19.55 HNS=1.17 Mating=0.07 WEI=0 Estrus=0 Estrus7=0 Removed=. Rem_Date=16 RemRsn=
Note= CtW=115 FulForm=17 _ERROR_=1 _N_=6
NOTE: Invalid data for Removed in line 10 92-94.
10 10008,11,240,205,LATE,349,1,352,3,16,.,357,27,117,16,16,0,0,1.3,0.06,20.85,1.3,0.06,2,
87 0,-2,Yes,14,1,0,1,1,.,13,16,375,72,3.66,47.55,17,0.29,2.35,0.14,74.17,382,0,1,7,1,1,0,
173 ,, 174
ID=10008 WK=11 BRED_D=240 GDU_PEN=205 TRT=LATE SHIP_D=349 GBARN=1 GBARN_D=352 GBARN_T=3
FAR_RM=16 FAR_HI=. FAR_D=357 FarWk=27 GEST_L=117 TB=16 BA=16 SB=0 MM=0 BA_Wtav=1.3 BA_Wtvar=0.06
BA_LitBW=20.85 TB_Av=1.3 TB_Wtvar=0.06 X_Off=2 X_On=0 CROSS_ON=-2 X_Status=Yes REAR_OPP=14
Mortality=4 dead=9 Dead24=1 PWdEAD=352 PCTDeadX=3 Weaned=16 WnRm=. WnDate=357 WN_HI=27 WnWt=117
Wn_LitWt=16 WnWtAge=16 WnWtVar=0 PWGain=0 PWADG=1.3 LactHiT=0.06 Estrus_d=20.85 HNS=1.3
Mating=0.06 WEI=2 Estrus=0 Estrus7=-2 Removed=. Rem_Date=14 RemRsn=1 Note=0 CtW=117 FulForm=16
_ERROR_=1 _N_=7
NOTE: Invalid data for Removed in line 11 92-93.
11 10009,11,240,205,LATE,349,1,352,3,17,.,358,27,118,14,11,0,3,1.52,0.12,16.7,1.52,0.12,0
87 ,0,0,No,11,0,0,0,0,.,11,17,376,.,5.75,63.2,17,0.79,4.23,0.25,76.86,381,0,1,5,1,1,0,,, 171
ID=10009 WK=11 BRED_D=240 GDU_PEN=205 TRT=LATE SHIP_D=349 GBARN=1 GBARN_D=352 GBARN_T=3
FAR_RM=17 FAR_HI=. FAR_D=358 FarWk=27 GEST_L=118 TB=14 BA=11 SB=0 MM=3 BA_Wtav=1.52
BA_Wtvar=0.12 BA_LitBW=16.7 TB_Av=1.52 TB_Wtvar=0.12 X_Off=0 X_On=0 CROSS_ON=0 X_Status=No
REAR_OPP=11 Mortality=4 dead=9 Dead24=1 PWdEAD=352 PCTDeadX=3 Weaned=17 WnRm=. WnDate=358
WN_HI=27 WnWt=118 Wn_LitWt=14 WnWtAge=11 WnWtVar=0 PWGain=3 PWADG=1.52 LactHiT=0.12
Estrus_d=16.7 HNS=1.52 Mating=0.12 WEI=0 Estrus=0 Estrus7=0 Removed=. Rem_Date=11 RemRsn=0
Note=0 CtW=118 FulForm=11 _ERROR_=1 _N_=8
NOTE: Invalid data for Removed in line 12 91-93.
12 10010,11,240,205,LATE,349,1,352,3,17,.,358,27,118,16,15,0,1,1.48,0.09,22.2,1.48,0.1,0,
87 1,1,Yes,16,2,2,0,2,0,14,17,376,.,5.23,73.2,17,0.58,3.75,0.22,76.86,382,0,1,6,1,1,0,,, 171
ID=10010 WK=11 BRED_D=240 GDU_PEN=205 TRT=LATE SHIP_D=349 GBARN=1 GBARN_D=352 GBARN_T=3
FAR_RM=17 FAR_HI=. FAR_D=358 FarWk=27 GEST_L=118 TB=16 BA=15 SB=0 MM=1 BA_Wtav=1.48
BA_Wtvar=0.09 BA_LitBW=22.2 TB_Av=1.48 TB_Wtvar=0.1 X_Off=0 X_On=1 CROSS_ON=1 X_Status=Yes
REAR_OPP=16 Mortality=4 dead=9 Dead24=1 PWdEAD=352 PCTDeadX=3 Weaned=17 WnRm=. WnDate=358
WN_HI=27 WnWt=118 Wn_LitWt=16 WnWtAge=15 WnWtVar=0 PWGain=1 PWADG=1.48 LactHiT=0.09
Estrus_d=22.2 HNS=1.48 Mating=0.1 WEI=0 Estrus=1 Estrus7=1 Removed=. Rem_Date=16 RemRsn=2 Note=2
CtW=118 FulForm=15 _ERROR_=1 _N_=9
NOTE: Invalid data for Removed in line 13 93-95.
13 10011,11,240,205,LATE,349,1,352,3,17,.,357,27,117,13,13,0,0,1.57,0.07,20.35,1.57,0.07,
87 0,3,3,Yes,16,1,0,1,1,1,15,17,376,.,4.82,72.35,18,0.52,3.26,0.18,76.86,382,0,1,6,1,1,0,
173 ,, 174
ID=10011 WK=11 BRED_D=240 GDU_PEN=205 TRT=LATE SHIP_D=349 GBARN=1 GBARN_D=352 GBARN_T=3
FAR_RM=17 FAR_HI=. FAR_D=357 FarWk=27 GEST_L=117 TB=13 BA=13 SB=0 MM=0 BA_Wtav=1.57
BA_Wtvar=0.07 BA_LitBW=20.35 TB_Av=1.57 TB_Wtvar=0.07 X_Off=0 X_On=3 CROSS_ON=3 X_Status=Yes
REAR_OPP=16 Mortality=4 dead=9 Dead24=1 PWdEAD=352 PCTDeadX=3 Weaned=17 WnRm=. WnDate=357
WN_HI=27 WnWt=117 Wn_LitWt=13 WnWtAge=13 WnWtVar=0 PWGain=0 PWADG=1.57 LactHiT=0.07
Estrus_d=20.35 HNS=1.57 Mating=0.07 WEI=0 Estrus=3 Estrus7=3 Removed=. Rem_Date=16 RemRsn=1
Note=0 CtW=117 FulForm=13 _ERROR_=1 _N_=10
NOTE: Invalid data for Removed in line 14 94-95.
14 10012,11,240,205,LATE,349,1,352,3,17,77,355,27,115,19,16,3,0,1.27,0.04,20.35,1.24,0.36
87 ,0,0,0,No,16,,,,,.,.,17,376,.,5.53,71.85,20,0.55,4.26,0.21,76.86,389,0,1,13,1,0,0,,, 170
ID=10012 WK=11 BRED_D=240 GDU_PEN=205 TRT=LATE SHIP_D=349 GBARN=1 GBARN_D=352 GBARN_T=3
FAR_RM=17 FAR_HI=77 FAR_D=355 FarWk=27 GEST_L=115 TB=19 BA=16 SB=3 MM=0 BA_Wtav=1.27
BA_Wtvar=0.04 BA_LitBW=20.35 TB_Av=1.24 TB_Wtvar=0.36 X_Off=0 X_On=0 CROSS_ON=0 X_Status=No
REAR_OPP=16 Mortality=4 dead=9 Dead24=1 PWdEAD=352 PCTDeadX=3 Weaned=17 WnRm=77 WnDate=355
WN_HI=27 WnWt=115 Wn_LitWt=19 WnWtAge=16 WnWtVar=3 PWGain=0 PWADG=1.27 LactHiT=0.04
Estrus_d=20.35 HNS=1.24 Mating=0.36 WEI=0 Estrus=0 Estrus7=0 Removed=. Rem_Date=16 RemRsn=
Note= CtW=115 FulForm=19 _ERROR_=1 _N_=11
NOTE: Invalid data for Removed in line 15 92-93.
15 10015,11,240,205,LATE,349,1,352,3,17,.,356,27,116,13,12,1,0,1.28,0.03,15.38,1.3,0.03,0
87 ,0,0,No,12,1,0,1,1,.,11,17,376,.,6.15,67.7,19,0.14,4.87,0.26,76.86,382,0,1,6,1,1,0,,, 171
ID=10015 WK=11 BRED_D=240 GDU_PEN=205 TRT=LATE SHIP_D=349 GBARN=1 GBARN_D=352 GBARN_T=3
FAR_RM=17 FAR_HI=. FAR_D=356 FarWk=27 GEST_L=116 TB=13 BA=12 SB=1 MM=0 BA_Wtav=1.28
BA_Wtvar=0.03 BA_LitBW=15.38 TB_Av=1.3 TB_Wtvar=0.03 X_Off=0 X_On=0 CROSS_ON=0 X_Status=No
REAR_OPP=12 Mortality=4 dead=9 Dead24=1 PWdEAD=352 PCTDeadX=3 Weaned=17 WnRm=. WnDate=356
WN_HI=27 WnWt=116 Wn_LitWt=13 WnWtAge=12 WnWtVar=1 PWGain=0 PWADG=1.28 LactHiT=0.03
Estrus_d=15.38 HNS=1.3 Mating=0.03 WEI=0 Estrus=0 Estrus7=0 Removed=. Rem_Date=12 RemRsn=1
Note=0 CtW=116 FulForm=13 _ERROR_=1 _N_=12
NOTE: Invalid data for Removed in line 16 92-93.
16 10016,11,240,206,FULL,349,1,352,3,17,.,356,27,116,14,13,1,0,1.24,0.06,16.1,1.21,0.07,0
87 ,0,0,No,13,1,0,1,1,.,12,17,376,.,6.24,74.9,19,1.02,5,0.26,76.86,382,0,1,6,1,1,0,,, 168
ID=10016 WK=11 BRED_D=240 GDU_PEN=206 TRT=FULL SHIP_D=349 GBARN=1 GBARN_D=352 GBARN_T=3
FAR_RM=17 FAR_HI=. FAR_D=356 FarWk=27 GEST_L=116 TB=14 BA=13 SB=1 MM=0 BA_Wtav=1.24
BA_Wtvar=0.06 BA_LitBW=16.1 TB_Av=1.21 TB_Wtvar=0.07 X_Off=0 X_On=0 CROSS_ON=0 X_Status=No
REAR_OPP=13 Mortality=4 dead=9 Dead24=1 PWdEAD=352 PCTDeadX=3 Weaned=17 WnRm=. WnDate=356
WN_HI=27 WnWt=116 Wn_LitWt=14 WnWtAge=13 WnWtVar=1 PWGain=0 PWADG=1.24 LactHiT=0.06
Estrus_d=16.1 HNS=1.21 Mating=0.07 WEI=0 Estrus=0 Estrus7=0 Removed=. Rem_Date=13 RemRsn=1
Note=0 CtW=116 FulForm=14 _ERROR_=1 _N_=13
NOTE: Invalid data for Removed in line 17 92-93.
17 10017,11,240,206,FULL,349,1,352,3,17,.,354,27,114,16,14,2,0,1.38,0.08,19.3,1.34,0.08,0
87 ,0,0,No,14,1,0,1,1,.,13,17,375,76,6.41,83.35,20,1.08,5.03,0.25,76.86,380,0,1,5,1,1,0,,
173 , 173
ID=10017 WK=11 BRED_D=240 GDU_PEN=206 TRT=FULL SHIP_D=349 GBARN=1 GBARN_D=352 GBARN_T=3
FAR_RM=17 FAR_HI=. FAR_D=354 FarWk=27 GEST_L=114 TB=16 BA=14 SB=2 MM=0 BA_Wtav=1.38
BA_Wtvar=0.08 BA_LitBW=19.3 TB_Av=1.34 TB_Wtvar=0.08 X_Off=0 X_On=0 CROSS_ON=0 X_Status=No
REAR_OPP=14 Mortality=4 dead=9 Dead24=1 PWdEAD=352 PCTDeadX=3 Weaned=17 WnRm=. WnDate=354
WN_HI=27 WnWt=114 Wn_LitWt=16 WnWtAge=14 WnWtVar=2 PWGain=0 PWADG=1.38 LactHiT=0.08
Estrus_d=19.3 HNS=1.34 Mating=0.08 WEI=0 Estrus=0 Estrus7=0 Removed=. Rem_Date=14 RemRsn=1
Note=0 CtW=114 FulForm=16 _ERROR_=1 _N_=14
NOTE: Invalid data for Removed in line 18 93-95.
18 10018,11,240,206,FULL,349,1,352,3,17,.,357,27,117,16,16,0,0,1.21,0.03,19.3,1.17,0.06,3
87 ,0,-3,Yes,13,1,1,0,1,.,12,17,375,76,4.99,59.9,17,0.39,3.78,0.22,76.86,383,1,0,8,1,0,0,
173 ,, 174
ID=10018 WK=11 BRED_D=240 GDU_PEN=206 TRT=FULL SHIP_D=349 GBARN=1 GBARN_D=352 GBARN_T=3
FAR_RM=17 FAR_HI=. FAR_D=357 FarWk=27 GEST_L=117 TB=16 BA=16 SB=0 MM=0 BA_Wtav=1.21
BA_Wtvar=0.03 BA_LitBW=19.3 TB_Av=1.17 TB_Wtvar=0.06 X_Off=3 X_On=0 CROSS_ON=-3 X_Status=Yes
REAR_OPP=13 Mortality=4 dead=9 Dead24=1 PWdEAD=352 PCTDeadX=3 Weaned=17 WnRm=. WnDate=357
WN_HI=27 WnWt=117 Wn_LitWt=16 WnWtAge=16 WnWtVar=0 PWGain=0 PWADG=1.21 LactHiT=0.03
Estrus_d=19.3 HNS=1.17 Mating=0.06 WEI=3 Estrus=0 Estrus7=-3 Removed=. Rem_Date=13 RemRsn=1
Note=1 CtW=117 FulForm=16 _ERROR_=1 _N_=15
NOTE: Invalid data for Removed in line 19 93-94.
19 10019,11,240,206,FULL,349,1,352,3,17,.,354,27,114,14,13,1,0,1.24,0.06,15.65,1.19,0.07,
87 0,0,0,No,13,1,1,0,1,.,12,17,376,.,6.15,73.8,21,0.61,4.91,0.23,76.86,381,0,1,5,1,1,0,,, 172
ID=10019 WK=11 BRED_D=240 GDU_PEN=206 TRT=FULL SHIP_D=349 GBARN=1 GBARN_D=352 GBARN_T=3
FAR_RM=17 FAR_HI=. FAR_D=354 FarWk=27 GEST_L=114 TB=14 BA=13 SB=1 MM=0 BA_Wtav=1.24
BA_Wtvar=0.06 BA_LitBW=15.65 TB_Av=1.19 TB_Wtvar=0.07 X_Off=0 X_On=0 CROSS_ON=0 X_Status=No
REAR_OPP=13 Mortality=4 dead=9 Dead24=1 PWdEAD=352 PCTDeadX=3 Weaned=17 WnRm=. WnDate=354
WN_HI=27 WnWt=114 Wn_LitWt=14 WnWtAge=13 WnWtVar=1 PWGain=0 PWADG=1.24 LactHiT=0.06
Estrus_d=15.65 HNS=1.19 Mating=0.07 WEI=0 Estrus=0 Estrus7=0 Removed=. Rem_Date=13 RemRsn=1
Note=1 CtW=114 FulForm=14 _ERROR_=1 _N_=16
NOTE: Invalid data for Removed in line 20 93-94.
20 10020,11,240,206,FULL,349,1,352,3,17,77,355,27,115,15,13,2,0,1.2,0.06,15.55,1.21,0.06,
87 0,0,0,No,13,0,0,0,0,.,13,17,375,76,5.59,72.65,19,0.48,4.39,0.23,76.86,382,0,1,7,1,1,0,
173 ,, 174
ID=10020 WK=11 BRED_D=240 GDU_PEN=206 TRT=FULL SHIP_D=349 GBARN=1 GBARN_D=352 GBARN_T=3
FAR_RM=17 FAR_HI=77 FAR_D=355 FarWk=27 GEST_L=115 TB=15 BA=13 SB=2 MM=0 BA_Wtav=1.2
BA_Wtvar=0.06 BA_LitBW=15.55 TB_Av=1.21 TB_Wtvar=0.06 X_Off=0 X_On=0 CROSS_ON=0 X_Status=No
REAR_OPP=13 Mortality=4 dead=9 Dead24=1 PWdEAD=352 PCTDeadX=3 Weaned=17 WnRm=77 WnDate=355
WN_HI=27 WnWt=115 Wn_LitWt=15 WnWtAge=13 WnWtVar=2 PWGain=0 PWADG=1.2 LactHiT=0.06
Estrus_d=15.55 HNS=1.21 Mating=0.06 WEI=0 Estrus=0 Estrus7=0 Removed=. Rem_Date=13 RemRsn=0
Note=0 CtW=115 FulForm=15 _ERROR_=1 _N_=17
NOTE: Invalid data for Removed in line 21 91-92.
21 10021,11,240,206,FULL,349,1,352,3,17,.,354,27,114,15,11,4,0,1.5,0.07,16.5,1.43,0.07,0,
87 0,0,No,11,0,0,0,0,.,11,17,376,.,6.68,73.45,21,1.32,5.18,0.25,76.86,381,0,1,5,1,1,0,,, 171
ID=10021 WK=11 BRED_D=240 GDU_PEN=206 TRT=FULL SHIP_D=349 GBARN=1 GBARN_D=352 GBARN_T=3
FAR_RM=17 FAR_HI=. FAR_D=354 FarWk=27 GEST_L=114 TB=15 BA=11 SB=4 MM=0 BA_Wtav=1.5 BA_Wtvar=0.07
BA_LitBW=16.5 TB_Av=1.43 TB_Wtvar=0.07 X_Off=0 X_On=0 CROSS_ON=0 X_Status=No REAR_OPP=11
Mortality=4 dead=9 Dead24=1 PWdEAD=352 PCTDeadX=3 Weaned=17 WnRm=. WnDate=354 WN_HI=27 WnWt=114
Wn_LitWt=15 WnWtAge=11 WnWtVar=4 PWGain=0 PWADG=1.5 LactHiT=0.07 Estrus_d=16.5 HNS=1.43
Mating=0.07 WEI=0 Estrus=0 Estrus7=0 Removed=. Rem_Date=11 RemRsn=0 Note=0 CtW=114 FulForm=15
_ERROR_=1 _N_=18
NOTE: Invalid data for Removed in line 22 93-94.
22 10022,11,240,206,FULL,349,1,352,3,17,77,355,27,115,15,15,0,0,1.43,0.13,21.5,1.43,0.13,
87 0,0,0,No,15,0,0,0,0,.,15,17,376,.,5.15,77.2,20,1.29,3.71,0.19,76.86,383,0,1,7,1,1,0,,, 172
ID=10022 WK=11 BRED_D=240 GDU_PEN=206 TRT=FULL SHIP_D=349 GBARN=1 GBARN_D=352 GBARN_T=3
FAR_RM=17 FAR_HI=77 FAR_D=355 FarWk=27 GEST_L=115 TB=15 BA=15 SB=0 MM=0 BA_Wtav=1.43
BA_Wtvar=0.13 BA_LitBW=21.5 TB_Av=1.43 TB_Wtvar=0.13 X_Off=0 X_On=0 CROSS_ON=0 X_Status=No
REAR_OPP=15 Mortality=4 dead=9 Dead24=1 PWdEAD=352 PCTDeadX=3 Weaned=17 WnRm=77 WnDate=355
WN_HI=27 WnWt=115 Wn_LitWt=15 WnWtAge=15 WnWtVar=0 PWGain=0 PWADG=1.43 LactHiT=0.13
Estrus_d=21.5 HNS=1.43 Mating=0.13 WEI=0 Estrus=0 Estrus7=0 Removed=. Rem_Date=15 RemRsn=0
Note=0 CtW=115 FulForm=15 _ERROR_=1 _N_=19
NOTE: Invalid data for Removed in line 23 92-93.
WARNING: Limit set by ERRORS= option reached. Further errors of this type will not be printed.
23 10024,11,240,206,FULL,349,1,352,3,16,.,354,27,114,14,11,3,0,1.26,0.01,13.9,1.26,0.01,0
87 ,0,0,No,11,,,,,.,.,16,375,72,6,60,20,1.11,4.74,0.24,74.17,381,0,1,6,1,1,0,,, 162
ID=10024 WK=11 BRED_D=240 GDU_PEN=206 TRT=FULL SHIP_D=349 GBARN=1 GBARN_D=352 GBARN_T=3
FAR_RM=16 FAR_HI=. FAR_D=354 FarWk=27 GEST_L=114 TB=14 BA=11 SB=3 MM=0 BA_Wtav=1.26
BA_Wtvar=0.01 BA_LitBW=13.9 TB_Av=1.26 TB_Wtvar=0.01 X_Off=0 X_On=0 CROSS_ON=0 X_Status=No
REAR_OPP=11 Mortality=4 dead=9 Dead24=1 PWdEAD=352 PCTDeadX=3 Weaned=16 WnRm=. WnDate=354
WN_HI=27 WnWt=114 Wn_LitWt=14 WnWtAge=11 WnWtVar=3 PWGain=0 PWADG=1.26 LactHiT=0.01
Estrus_d=13.9 HNS=1.26 Mating=0.01 WEI=0 Estrus=0 Estrus7=0 Removed=. Rem_Date=11 RemRsn=
Note= CtW=114 FulForm=14 _ERROR_=1 _N_=20
NOTE: 549 records were read from the infile '16_01 SAS Data File 3Feb2017.csv'.
The minimum record length was 132.
The maximum record length was 240.
NOTE: Missing values were generated as a result of performing an operation on missing values.
Each place is given by: (Number of times) at (Line)Smiley SadColumn).
41 at 199:12 2 at 200:14
NOTE: The data set WORK.ARGININE has 549 observations and 56 variables.
NOTE: DATA statement used (Total process time):
real time 0.16 seconds
cpu time 0.14 seconds


202
203 proc sort;
204 by ID;
205 run;

NOTE: There were 549 observations read from the data set WORK.ARGININE.
NOTE: The data set WORK.ARGININE has 549 observations and 56 variables.
NOTE: PROCEDURE SORT used (Total process time):
real time 0.01 seconds
cpu time 0.01 seconds


206 proc print data=arginine (obs=10);
207 run;

NOTE: There were 10 observations read from the data set WORK.ARGININE.
NOTE: PROCEDURE PRINT used (Total process time):
real time 0.12 seconds
cpu time 0.04 seconds

Solution
‎02-03-2017 05:48 PM
Super User
Posts: 10,483

Re: Data Read in error- Variables repeat and made up

[ Edited ]

You might want to try posting code and log results in a code box opened with the {i} icon. The forum sometimes does things to pasted code in the main body. For instance, do you actually have a variable named 24dead?

The line in the log:

Mortality 24dead Dead24 PWdEAD PCTDeadX makes it lok like you have a variable 24dead but the data error only shows a variable "dead"

 

If I count correctly your input statement is attempting to read 57 variables but I only can see 54 values in the data.

 

From the first record error this line

Mating=0.15 WEI=0 Estrus=2 Estrus7=2 Removed=. Rem_Date=9 RemRsn

Tells me it was reading these values (start at red for the Mating variable)

4 10001,11,240,205,LATE,349,1,352,3,16,.,359,28,119,14,7,1,6,1.55,0.05,10.85,1.44,0.15,0
87 ,2,2,Yes,9,1,0,1,1,0,8,16,375,72,4.95,39.6,15,0.66,3.4,0.23,74.17,381,0,1,6,1,1,0,,, 170

 

And your input statement has Removed as numeric so reading 'YES' generates the error.

Soe you either have something shifting columns or whoever provided the data may have changed a numeric, possibly 0/1 coded field, to character Yes/No (or change the display property in Excel and your export to csv used that displayed text). You might look at the cell's format properties for the Removed column in the Excel file.

 

if that is the case then change to the 0/1 format and re-export and re-read the data.

Occasional Contributor
Posts: 7

Re: Data Read in error- Variables repeat and made up

Thank you for the tip on posting, i'll try to remember that. 
I was thinking along the same lines as you suggested. Something was shifting, somewhere, but I just could not find it or it was restricted some how by my code. I copied a new file and manually changed excel formatting. I removed a few extraneous variables, and that fixed it. I'm not sure why though, As I have run them before without issue. I do have variables named 24dead and dead24. I also did not see that those were not being read in as 2 separate variables, but fixing those column titles did not fix the issue. 

 

Thanks for the suggestions!

Super User
Posts: 10,483

Re: Data Read in error- Variables repeat and made up


ehines wrote:

Thank you for the tip on posting, i'll try to remember that. 
I was thinking along the same lines as you suggested. Something was shifting, somewhere, but I just could not find it or it was restricted some how by my code. I copied a new file and manually changed excel formatting. I removed a few extraneous variables, and that fixed it. I'm not sure why though, As I have run them before without issue. I do have variables named 24dead and dead24. I also did not see that those were not being read in as 2 separate variables, but fixing those column titles did not fix the issue. 

 

Thanks for the suggestions!


Depending on who is providing Excel data shifting columns and changing formats is a very common problem. A strictly database driven export may work but if a human ever intervenes in the process, such as they are copying items from somewhere else to compile into your data set, then things like this happen. Frequently. Also changes in one useres Excel default settings can do things such as the 0/1 to No/Yes.

I will givfe you positive points for transferring the data to CSV before reading though.

 

Learning to read a log and what may cause certain types of errors is a learned skill so don't feel to bad about recognizing where to look. I've been looking at these logs off and on for nearly 30 years. I have had many instances of data sources changing layouts and so had a rough idea of where to look.

 

The text I highlighted in green would tell me there is someone I need to hunt down and administer lashes with wet noodles for changing file content.

Super User
Super User
Posts: 6,499

Re: Data Read in error- Variables repeat and made up

You should definitely add the TRUNCOVER option to your INFILE statement. Otherwise if one of the lines has too few values SAS will try to get the values from the next line.  That type of mixup will be much harder to handle.

 

infile '16_01 Individual Sow Data Set 3Feb2017 TEST.csv' 
    dsd delimiter=',' firstobs=4 lrecl=500000 truncover
;

Most likely the cause of your trouble is that someone insert a line break into the value of one of the cells in the Excel file.  Say a cell was supposed to have 'A B' but instead of space the user added a line break. So  instead of line like

col1,col2,A B,col4,col5

You will end up with two lines like

col1,col2,A
B,col4,col5

You might have some luck checking the file to make sure that every line has the same number of fields.

 

But a better solution is to either not use Excel as you data entry system, or if you do then format the sheet like a table. So remove the three extra rows at the top that you are skipping with FIRSTOBS=4 and just keep one header row. The values in the header row should all be valid variable names.  Then you can just have SAS read the XLS or XLSX file directly and skip the CSV file and the data step.

 

Valued Guide
Posts: 2,174

Re: Data Read in error- Variables repeat and made up

Is it possible that a line-feed character (0Ax) has been injected into some column?
The failure occurs because the input is attempting to read the status column (yes or no) as if it should be a number.
As the maximum linewidth was only 240 in this run you could reduce the LRECL value below 32K Although using 500000 should work, it is a bit larger than needed. Wider than 32767 the infile handling internally might be a bit different.
Occasional Contributor
Posts: 7

Re: Data Read in error- Variables repeat and made up

What is a line-feed character? I only know to denote Text vs. Numeric with $, so I am not sure how that changes what SAS sees other than calculable or class. 

And thank you for the thought on the LRECL. I have 33450 points, and I am usually around there +/- 10000. It was set at 100000, which is where I usually keep just to cover my data set. I did not realize that it might be handled differently with increasing length, I will keep that in mind. 

 

Thanks!

Valued Guide
Posts: 2,174

Re: Data Read in error- Variables repeat and made up

In the recent releases of SAS9 the INFILE option TERMSTR=CRLF was added for this kind of problem.

As mentioned earlier by @Tom, the most important INFILE option to clarify the way SAS should read your data, is TRUNCOVER
☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 12 replies
  • 152 views
  • 2 likes
  • 4 in conversation