BookmarkSubscribeRSS Feed
MisterJenn
Fluorite | Level 6

I am trying to input my data into SAS. But I keep getting blank space in my data file. 

data hw4.bone;
infile "\\apporto.com\dfs\GWU\Users\kennedyhinnant_gwu\Downloads\Assignment 4\bone.txt";
input dob 1-10 dov 11-20 wt 21-26 ht 27-30 peri 31-35 
endo 35-40 tibia 41-44 tanner $ 45 sex $ 46;
age = (dov - dob)/365;
logperi = log(peri);
run;
206  infile "\\apporto.com\dfs\GWU\Users\kennedyhinnant_gwu\Downloads\Assignment 4\bone.txt";
207  input dob 1-10 dov 11-20 wt 21-26 ht 27-30 peri 31-35
208  endo 35-40 tibia 41-44 tanner $ 45 sex $ 46;
209  age = (dov - dob)/365;
210  logperi = log(peri);
211  run;

NOTE: The infile "\\apporto.com\dfs\GWU\Users\kennedyhinnant_gwu\Downloads\Assignment 4\bone.txt" is:
      Filename=\\apporto.com\dfs\GWU\Users\kennedyhinnant_gwu\Downloads\Assignment 4\bone.txt,
      RECFM=V,LRECL=32767,File Size (bytes)=37854,
      Last Modified=04Oct2022:16:09:23,
      Create Time=04Oct2022:16:09:23

NOTE: Invalid data for dob in line 1 1-10.
NOTE: Invalid data for dov in line 1 11-20.
NOTE: Invalid data for wt in line 1 21-26.
NOTE: Invalid data for peri in line 1 31-35.
RULE:     ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8----+----9--
1         11MAR1975 19MAR2004 55.7 167.8333 67.722 35.05 392 5 2 54
dob=. dov=. wt=. ht=67.8 peri=. endo=67.722 tibia=35 tanner=0 sex=5 age=. logperi=. _ERROR_=1 _N_=1
NOTE: Invalid data for dob in line 2 1-10.
NOTE: Invalid data for dov in line 2 11-20.
NOTE: Invalid data for ht in line 2 27-30.
NOTE: Invalid data for endo in line 2 35-40.
2         11OCT1978 29MAR2004 61.93333 160.8667 75.179 31.42 370 5 2 58
dob=. dov=. wt=61.933 ht=. peri=60.86 endo=. tibia=0.179 tanner=  sex=3 age=. logperi=4.1085761445
_ERROR_=1 _N_=2
NOTE: Invalid data for dob in line 3 1-10.
NOTE: Invalid data for dov in line 3 11-20.
NOTE: Invalid data for wt in line 3 21-26.
NOTE: Invalid data for peri in line 3 31-35.
3         25JAN1975 16AUG2004 85.5 188.4333 78.923 42.373 451 5 1 55
dob=. dov=. wt=. ht=88.4 peri=. endo=78.923 tibia=42 tanner=3 sex=7 age=. logperi=. _ERROR_=1 _N_=3
NOTE: Invalid data for dob in line 4 1-10.
NOTE: Invalid data for dov in line 4 11-20.
NOTE: Invalid data for wt in line 4 21-26.
NOTE: Invalid data for peri in line 4 31-35.
4         17APR1978 26OCT2004 58.3 160.4333 65.595 32.551 353 5 2 55
dob=. dov=. wt=. ht=60.4 peri=. endo=65.595 tibia=32 tanner=5 sex=5 age=. logperi=. _ERROR_=1 _N_=4
NOTE: Invalid data for dob in line 5 1-10.
NOTE: Invalid data for dov in line 5 11-20.
NOTE: Invalid data for ht in line 5 27-30.
NOTE: Invalid data for endo in line 5 35-40.
5         21FEB1978 18NOV2004 59.13334 160.0333 63.761 38.858 351 5 2 59
dob=. dov=. wt=59.133 ht=. peri=60.03 endo=. tibia=0.761 tanner=  sex=3 age=. logperi=4.0948444373
_ERROR_=1 _N_=5
NOTE: Invalid data for dob in line 6 1-10.
NOTE: Invalid data for dov in line 6 11-20.
NOTE: Invalid data for ht in line 6 27-30.
NOTE: Invalid data for endo in line 6 35-40.
6         24AUG1977 29NOV2004 70.63333 184.7667 80.437 33.072 406 5 1 59
dob=. dov=. wt=70.633 ht=. peri=84.76 endo=. tibia=0.437 tanner=  sex=3 age=. logperi=4.4398237334
_ERROR_=1 _N_=6
NOTE: Invalid data for dob in line 7 1-10.
NOTE: Invalid data for dov in line 7 11-20.
NOTE: Invalid data for ht in line 7 27-30.
NOTE: Invalid data for endo in line 7 35-40.
7         28OCT1973 03JAN2005 64.36667 171.7333 66.205 35.9 382 5 2 57
dob=. dov=. wt=64.366 ht=. peri=71.73 endo=. tibia=0.205 tanner=  sex=3 age=. logperi=4.2729090701
_ERROR_=1 _N_=7
NOTE: Invalid data for dob in line 8 1-10.
NOTE: Invalid data for dov in line 8 11-20.
NOTE: Invalid data for wt in line 8 21-26.
NOTE: Invalid data for peri in line 8 31-35.
8         04JAN1981 09MAR2005 93.3 187.4667 79.986 35.279 421 5 1 55
dob=. dov=. wt=. ht=87.4 peri=. endo=79.986 tibia=35 tanner=2 sex=7 age=. logperi=. _ERROR_=1 _N_=8
NOTE: Invalid data for dob in line 9 1-10.
NOTE: Invalid data for dov in line 9 11-20.
NOTE: Invalid data for wt in line 9 21-26.
NOTE: Invalid data for peri in line 9 31-35.
9         19SEP1970 06APR2005 81.4 185.7667 80.011 47.433 412 5 1 55
dob=. dov=. wt=. ht=85.7 peri=. endo=80.011 tibia=47 tanner=4 sex=3 age=. logperi=. _ERROR_=1 _N_=9
NOTE: Invalid data for dob in line 10 1-10.
NOTE: Invalid data for dov in line 10 11-20.
NOTE: Invalid data for wt in line 10 21-26.
NOTE: Invalid data for peri in line 10 31-35.
10        06OCT1979 07APR2005 86.1 177.5667 73.965 33.133 377 5 1 55
dob=. dov=. wt=. ht=77.5 peri=. endo=73.965 tibia=33 tanner=1 sex=3 age=. logperi=. _ERROR_=1 _N_=10
NOTE: Invalid data for dob in line 11 1-10.
NOTE: Invalid data for dov in line 11 11-20.
NOTE: Invalid data for wt in line 11 21-26.
NOTE: Invalid data for peri in line 11 31-35.
11        21MAR1975 08APR2005 77.5 187.3333 79.986 38.468 439 5 1 55
dob=. dov=. wt=. ht=87.3 peri=. endo=79.986 tibia=38 tanner=4 sex=6 age=. logperi=. _ERROR_=1 _N_=11
NOTE: Invalid data for dob in line 12 1-10.
NOTE: Invalid data for dov in line 12 11-20.
NOTE: Invalid data for ht in line 12 27-30.
NOTE: Invalid data for tibia in line 12 41-44.
12        29OCT1970 08JUN2005 70.23333 172.4 83.708 45.794 400 5 1 56
dob=. dov=. wt=70.233 ht=. peri=72.4 endo=83.7 tibia=. tanner=  sex=7 age=. logperi=4.2822062994
_ERROR_=1 _N_=12
NOTE: Invalid data for dob in line 13 1-10.
NOTE: Invalid data for dov in line 13 11-20.
NOTE: Invalid data for wt in line 13 21-26.
NOTE: Invalid data for endo in line 13 35-40.
13        26NOV1975 25JUN2005 90.1 175.1 76.557 36.84 390 5 1 51
dob=. dov=. wt=. ht=75.1 peri=76.5 endo=. tibia=0.84 tanner=3 sex=9 age=. logperi=4.3372907408
_ERROR_=1 _N_=13
NOTE: Invalid data for dob in line 14 1-10.
NOTE: Invalid data for dov in line 14 11-20.
NOTE: Invalid data for wt in line 14 21-26.
NOTE: Invalid data for endo in line 14 35-40.
NOTE: Invalid data for tibia in line 14 41-44.
14        30DEC1982 29JUL2005 56.4 164.3 67.2 27.385 372 5 2 50
dob=. dov=. wt=. ht=64.3 peri=67.2 endo=. tibia=. tanner=7 sex=2 age=. logperi=4.2076732475 _ERROR_=1
_N_=14
NOTE: Invalid data for dob in line 15 1-10.
NOTE: Invalid data for dov in line 15 11-20.
NOTE: Invalid data for wt in line 15 21-26.
NOTE: Invalid data for peri in line 15 31-35.
15        16AUG1985 25JAN2002 59.6 155.1333 63.682 29.743 360 3 2 55
dob=. dov=. wt=. ht=55.1 peri=. endo=63.682 tibia=29 tanner=7 sex=4 age=. logperi=. _ERROR_=1 _N_=15
NOTE: Invalid data for dob in line 16 1-10.
NOTE: Invalid data for dov in line 16 11-20.
NOTE: Invalid data for ht in line 16 27-30.
NOTE: Invalid data for endo in line 16 35-40.
16        25MAY1977 26AUG2005 70.96667 173.0333 72.815 37.274 402 5 1 59
dob=. dov=. wt=70.966 ht=. peri=73.03 endo=. tibia=0.815 tanner=  sex=3 age=. logperi=4.2908703156
_ERROR_=1 _N_=16
NOTE: Invalid data for dob in line 17 1-10.
NOTE: Invalid data for dov in line 17 11-20.
NOTE: Invalid data for ht in line 17 27-30.
NOTE: Invalid data for endo in line 17 35-40.
17        23NOV1988 08FEB2002 61.56667 157.2667 62.696 29.335 339 3 2 59
dob=. dov=. wt=61.566 ht=. peri=57.26 endo=. tibia=0.696 tanner=  sex=2 age=. logperi=4.0476022997
_ERROR_=1 _N_=17
NOTE: Invalid data for dob in line 18 1-10.
NOTE: Invalid data for dov in line 18 11-20.
NOTE: Invalid data for wt in line 18 21-26.
NOTE: Invalid data for peri in line 18 31-35.
18        18DEC1975 02NOV2005 71.7 167.3667 65.748 29.198 384 5 2 55
dob=. dov=. wt=. ht=67.3 peri=. endo=65.748 tibia=29 tanner=1 sex=9 age=. logperi=. _ERROR_=1 _N_=18
NOTE: Invalid data for dob in line 19 1-10.
NOTE: Invalid data for dov in line 19 11-20.
NOTE: Invalid data for wt in line 19 21-26.
NOTE: Invalid data for peri in line 19 31-35.
19        19JAN1982 07NOV2005 73.3 180.3667 77.807 35.506 411 5 1 55
dob=. dov=. wt=. ht=80.3 peri=. endo=77.807 tibia=35 tanner=5 sex=0 age=. logperi=. _ERROR_=1 _N_=19
NOTE: Invalid data for dob in line 20 1-10.
NOTE: Invalid data for dov in line 20 11-20.
NOTE: Invalid data for wt in line 20 21-26.
NOTE: Invalid data for peri in line 20 31-35.
WARNING: Limit set by ERRORS= option reached.  Further errors of this type will not be printed.
20        30NOV1969 08NOV2005 83.9 164.5667 66.129 33.163 411 5 2 55
dob=. dov=. wt=. ht=64.5 peri=. endo=66.129 tibia=33 tanner=1 sex=6 age=. logperi=. _ERROR_=1 _N_=20
NOTE: 665 records were read from the infile
      "\\apporto.com\dfs\GWU\Users\kennedyhinnant_gwu\Downloads\Assignment 4\bone.txt".
      The minimum record length was 48.
      The maximum record length was 59.
NOTE: Missing values were generated as a result of performing an operation on missing values.
      Each place is given by: (Number of times) at (Line):(Column).
      665 at 209:12   263 at 210:11
NOTE: The data set HW4.BONE has 665 observations and 11 variables.
NOTE: DATA statement used (Total process time):
      real time           0.28 seconds
      cpu time            0.09 seconds

Screen Shot 2022-10-10 at 9.13.36 PM.png
5 REPLIES 5
Patrick
Opal | Level 21

See here.

MisterJenn
Fluorite | Level 6
This code is not working. Part of the data set is still missing.
Patrick
Opal | Level 21

The code I've shared here works for the data layout I've shared.

You haven't shared a representative sample of your source data (the text file, i.e. as an attachment). May-be spend now a bit of time to read-up all the docu links I've shared. 

 

To get quickly to a first template of a valid infile/input statement you can also use either the SAS EG or SAS Studio import wizard OR Proc Import. 

Proc Import will then write the generated data step to the SAS log.

 

NB: If for example DoB is missing then you will likely need a different informat - one that's suitable for the text string in your source data. Consult the documentation to find the matching SAS informat.

ballardw
Super User

Yet another teaching point. How to read a log with invalid data. The 1, 2 , 3 etc represent columns 10, 20, 30 to 90 in the source file. The + sign is at the 5 following, so represent 15, 25, 35, etc. So we can easily tell that your column specifications point to the wrong place multiple times.

NOTE: Invalid data for dob in line 1 1-10.
NOTE: Invalid data for dov in line 1 11-20.
NOTE: Invalid data for wt in line 1 21-26.
NOTE: Invalid data for peri in line 1 31-35.
RULE:     ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8----+----9--
1         11MAR1975 19MAR2004 55.7 167.8333 67.722 35.05 392 5 2 54

DOB in the source is 11MAR1975. Your Input statement attempts to read it as a numerical value that appears in columns 1 to 10. Without a specific informat SAS uses a default like 12. Meaning 12 numeric characters plus possibly minus signs, decimals or E as part of exponential values such as 1.3E6 (1300000). So you have to tell SAS the informat of the data to read the values. In this case you want a DATE9. format. The same for DOV.

WT does not get read correctly because column 26 is the 1 of the ht value by position (your height has the wrong value because it starts reading at column 27, which is the 6 of the actual 167.8333 of the Ht value) and SAS will not read a "number" with a space in the middle using the default informat for numbers.

Peri does not get read for the exact same reason that Wt doesn't just different columns: you tell SAS to read Peri starting at 31. The character at column 31 is the first 3 of the Ht value. So reading columns 31 to 35 attempts to read "333 6" into Peri. Spaces in the middle of the value are not valid for numbers.

You have either counted columns incorrectly or were given column positions that were incorrect OR were incorrectly told the data was fixed columns. It isn't. Your values have differing numbers of characters in the source all over the place. So depending on which line of data the columns misalign differently.

 

I have a strong feeling that you may have better luck with an input statement like this:

input dob :date9. dov :date9. wt  ht  peri  
endo  tibia  tanner $  sex $ ;
/* and a format for the dates so people can read them*/
Format dob dov date9. ;

The : in front of the Date9. informat tells SAS to read the next LIST item found with that format, regardless of actual column in the file. It appears that your file is actually space delimited text.

 

If you had included an example of 5 lines of text from your file 5 or 6 questions ago this would have been pointed out as well.

Tom
Super User Tom
Super User

You have to write an INPUT statement that at least attempts to match the layout of your data.

From your SAS log you posted you can see that the first 19 lines of your file does NOT have data lined up into neat columns like your code is expecting.

RULE:     ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8----+----9--
1         11MAR1975 19MAR2004 55.7 167.8333 67.722 35.05 392 5 2 54
2         11OCT1978 29MAR2004 61.93333 160.8667 75.179 31.42 370 5 2 58
3         25JAN1975 16AUG2004 85.5 188.4333 78.923 42.373 451 5 1 55
4         17APR1978 26OCT2004 58.3 160.4333 65.595 32.551 353 5 2 55
5         21FEB1978 18NOV2004 59.13334 160.0333 63.761 38.858 351 5 2 59
6         24AUG1977 29NOV2004 70.63333 184.7667 80.437 33.072 406 5 1 59
7         28OCT1973 03JAN2005 64.36667 171.7333 66.205 35.9 382 5 2 57
8         04JAN1981 09MAR2005 93.3 187.4667 79.986 35.279 421 5 1 55
9         19SEP1970 06APR2005 81.4 185.7667 80.011 47.433 412 5 1 55
10        06OCT1979 07APR2005 86.1 177.5667 73.965 33.133 377 5 1 55
11        21MAR1975 08APR2005 77.5 187.3333 79.986 38.468 439 5 1 55
12        29OCT1970 08JUN2005 70.23333 172.4 83.708 45.794 400 5 1 56
13        26NOV1975 25JUN2005 90.1 175.1 76.557 36.84 390 5 1 51
14        30DEC1982 29JUL2005 56.4 164.3 67.2 27.385 372 5 2 50
15        16AUG1985 25JAN2002 59.6 155.1333 63.682 29.743 360 3 2 55
16        25MAY1977 26AUG2005 70.96667 173.0333 72.815 37.274 402 5 1 59
17        23NOV1988 08FEB2002 61.56667 157.2667 62.696 29.335 339 3 2 59
18        18DEC1975 02NOV2005 71.7 167.3667 65.748 29.198 384 5 2 55
19        19JAN1982 07NOV2005 73.3 180.3667 77.807 35.506 411 5 1 55

Instead the values are just flowed into the line separated by a space.

So tell SAS that in the INPUT statement.

data hw4.bone;
  infile "\\apporto.com\dfs\GWU\Users\kennedyhinnant_gwu\Downloads\Assignment 4\bone.txt";
  input dob :date. dov :date. wt ht peri endo tibia tanner sex;
  age = (dov - dob)/365;
  logperi = log(peri);
  format dob dov date9.;
run;

Tom_0-1665462467021.png

 

 

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 5 replies
  • 543 views
  • 0 likes
  • 4 in conversation