BLT2023

I am a (very) new SAS user working on a class project. I am graded on whether the number of observations in my output data set matches the number of observations in the raw data file. The raw data file has 1,242 observations, but when I run the code below I get 1,238 observations. What are possible reasons for this discrepancy, and how can I fix it?

 

DATA CoImpt.Disabilities;
INFILE "&CourseRoot/CDPHE Study/Data/1_Source/disabilities.txt" FIRSTOBS = 2 DELIMITER = '&' ;
INPUT tract_fips            $10.
              pctn_disability :$4. ;
RUN;
 
PROC CONTENTS;

Accepted Solution

Tom (Super User)

Text files have LINES.   Datasets have OBSERVATIONS.

 

So do the 1,242 lines include the header line you are skipping with the FIRSTOBS= option? If so, you should expect 1,241 observations in the dataset.
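
One way to check the line count is to have SAS count the lines itself. The step below is a minimal sketch, assuming the file path from the question; it reads every line without creating any variables and writes the total to the log.

```sas
/* Count the lines in the raw text file, header line included. */
DATA _NULL_;
  INFILE "&CourseRoot/CDPHE Study/Data/1_Source/disabilities.txt" END = eof;
  INPUT;                                   /* load one line, read no variables */
  IF eof THEN PUT 'Lines in file: ' _N_;   /* _N_ = number of lines read so far */
RUN;
```

Subtract one for the header to get the number of observations to expect. The log of the original step also reports how many records were read from the infile, which can be compared with the number of observations written.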

 

There are a few things that can cause your code to create fewer observations than lines of text read.

 

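One definite issue is that you are reading the first 10 bytes into tract_fips with the $10. informat and no colon modifier, so the delimiter is ignored for that variable. That is a problem if any value is shorter than 10 characters, and especially if any line has fewer than 10 bytes on it.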

 

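You also did not include the TRUNCOVER option on your INFILE statement. TRUNCOVER prevents SAS from going to the next line to find a value when the current line does not have enough data to satisfy the INPUT statement. Without it, the default FLOWOVER behavior can use up two lines to build one observation, and the log will show a note that SAS went to a new line.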
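
The effect of a short line can be reproduced on a small scale. The sketch below writes a hypothetical four-line file (the demo fileref and the sample values are made up for illustration, not taken from the real data) and reads it twice: once as in the question, once with TRUNCOVER and the colon modifier.

```sas
/* Hypothetical sample file: a header plus three data lines.    */
/* The second data line is only 9 bytes long, shorter than $10. */
FILENAME demo TEMP;
DATA _NULL_;
  FILE demo;
  PUT 'tract_fips&pctn_disability';
  PUT '0800100010&12.3';
  PUT '08001&4.5';
  PUT '0800100020&7.8';
RUN;

/* As in the question: formatted input, default FLOWOVER. */
DATA flow;
  INFILE demo FIRSTOBS = 2 DELIMITER = '&';
  INPUT tract_fips $10. pctn_disability :$4.;
RUN;

/* With TRUNCOVER and the colon modifier on both variables. */
DATA trunc;
  INFILE demo FIRSTOBS = 2 DELIMITER = '&' TRUNCOVER;
  INPUT tract_fips :$10. pctn_disability :$4.;
RUN;
```

On this sample the first step should write two observations, because the short line makes SAS jump to the next line to fill tract_fips, while the second step should write three. Comparing the two logs shows the same kind of shortfall as in the question.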

 

Another possibility is that the lines are not actually delimited with the ampersand character but with something else.
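
To see which delimiter the file really uses, the first few raw lines can be dumped to the log. This is a minimal sketch, assuming the path from the question; the choice of five lines is arbitrary.

```sas
/* Print the first 5 raw lines of the file to the log. */
DATA _NULL_;
  INFILE "&CourseRoot/CDPHE Study/Data/1_Source/disabilities.txt" OBS = 5;
  INPUT;   /* load the line into the input buffer */
  LIST;    /* write the buffer to the log */
RUN;
```

If a line contains unprintable characters such as tabs, LIST also shows the hexadecimal values, which makes a tab (hex 09) easy to tell apart from spaces.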

 

Try fixing your code first:

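DATA CoImpt.Disabilities;
  /* TRUNCOVER keeps each observation on its own line of the file */
  INFILE "&CourseRoot/CDPHE Study/Data/1_Source/disabilities.txt"
     FIRSTOBS = 2 DELIMITER = '&'  truncover
  ;
  /* The colon modifier makes both reads stop at the delimiter */
  INPUT tract_fips      :$10.
        pctn_disability :$4.
  ;
RUN;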
