Durlov
Obsidian | Level 7

Question:

The following SAS program is submitted:
data WORM.INFO;
     infile 'DATAFILE.TXT';
     input @1 Company $20. @25 State $2. @;
     if State='  '  then input @30 Year;
     else input @30 City Year;
     input NumEmployees;
run; 
How many raw data records are read during each iteration of the DATA step?
The Answer is 2.
 
Why? Isn't a record one line of data with one or many observations? So every iteration reads only one line to input into a SAS dataset. Is my thinking not correct?
 
My thinking:
Record 1 = Obs1 Obs2 Obs3
Record 2 = Obs4 Obs5 Obs6
PGStats
Opal | Level 21

This data step implements the case where two records from file datafile.txt are read to form a single observation in dataset worm.info.

An input statement with a trailing '@' stays on the same line so that the next input within the same iteration continues reading from that line.
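A minimal sketch of the trailing '@' on its own, assuming made-up in-stream data rather than datafile.txt (the second company's state is invented for illustration). The PUT statements only write to the log so you can see which iteration each INPUT ran in:

```sas
data _null_;
    infile datalines;
    input @1 Company $20. @;    /* trailing @ holds the current record */
    put 'After 1st INPUT: ' _n_= Company=;
    input @25 State $2.;        /* reads from the same held record, then releases it */
    put 'After 2nd INPUT: ' _n_= State=;
    datalines;
Some Place Inc.         CA
Nowhere Ltd             NY
;
```

Both PUT lines for a given company should show the same value of _N_, since the two INPUT statements share one record per iteration here.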

PG
Durlov
Obsidian | Level 7

Oh, I see what you're saying. The first record being read is the one with the @ and the second is the next record in the data set. Is that correct? 

 

Just to note, for the very first iteration all the data would be from the same record, so that would be one record, right?

PGStats
Opal | Level 21

The first line read would contain either

 

Company Blank Year

or

Company State City Year

 

The second line read would contain 

 

NumEmployees

 

The output dataset will thus contain all those fields, with missing values for State and City for some observations.

PG
Tom
Super User

You will execute three input statements (there are actually four in the code, but two are in the same IF/THEN/ELSE block so only one of them will execute).  But the first one uses the trailing @ that will keep the same line of input. So the second INPUT that executes reads from the same input line as the first INPUT statement used.  You then read a second line of input with the final INPUT statement.  That is why each iteration of the data step reads two lines of the input data file.  So if your data file has 20 lines your resulting SAS dataset will have 10 observations.
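One way to check this yourself is to write the input buffer to the log after each line is consumed. This is only a sketch: it uses in-stream DATALINES with invented values instead of DATAFILE.TXT, and DATA _NULL_ so no dataset is created.

```sas
data _null_;
    infile datalines;
    input @1 Company $20. @25 State $2. @;
    if State = ' ' then input @30 Year;
    else input @30 City Year;
    /* _INFILE_ is the current input buffer: still the first record of the pair */
    put _n_= 'first record:  ' _infile_;
    input NumEmployees;
    /* the last INPUT moved on to a new record */
    put _n_= 'second record: ' _infile_;
    datalines;
Some Place Inc.         CA   123456     2012
10
Nowhere Ltd                             2013
100
;
```

With four data lines, the log should show two messages for _N_=1 and two for _N_=2, that is, two records per iteration.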

 

The only wrinkle that could happen to change this would be if some of the requested fields are not populated.  This could cause some of the INPUT statements to go to the next line to look for values of STATE, CITY or YEAR. You can add the TRUNCOVER option to the INFILE statement to prevent that.
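A hedged sketch of that wrinkle, again with invented in-stream data: the first company's line has no Year. TRUNCOVER is a real INFILE option; everything else here is just for illustration.

```sas
data INFO;
    /* TRUNCOVER: a value not found on the current record becomes missing */
    infile datalines truncover;
    input @1 Company $20. @25 State $2. @;
    if State = ' ' then input @30 Year;
    else input @30 City Year;
    input NumEmployees;
    datalines;
Some Place Inc.         CA   123456
10
Nowhere Ltd                             2013
100
;

proc print data=info noobs; run;
```

With TRUNCOVER, Year should simply be missing for the first observation and the lines stay paired. If you take the option out, the default FLOWOVER behavior sends INPUT to the next line to look for Year, and the records no longer pair up.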

Durlov
Obsidian | Level 7

So, the data set would be set up as follows? 

 

Record 1: Company State City Year

Record 2: NumEmployees

 

Record 3: CompanyObs1 StateObs1 CityObs1 YearObs1

Record 4: NumEmployeesObs1

Record 5: CompanyObs2 StateObs2 CityObs2 YearObs2

Record 6: NumEmployeesObs2

 

I'm understanding from your explanations that the only reason one would use 2 "input" statements (note, I'm skipping over the IF/ELSE statements) is if the data is split up as above. Is that correct? The "input" statements I'm talking about are the first and the last ones in the code below.

 

data WORM.INFO;
     infile 'DATAFILE.TXT';
     input @1 Company $20. @25 State $2. @;
     if State='  '  then input @30 Year;
     else input @30 City Year;
     input NumEmployees;
run; 

 

 

 

 

PGStats
Opal | Level 21

The best way to understand SAS programming is to play with small examples, such as:

 

data INFO;
input @1 Company $20. @25 State $2. @;
if State='  '  then input @30 Year;
else input @30 City Year;
input NumEmployees;
datalines;
Some Place Inc.         CA   123456     2012
10
Nowhere Ltd                             2013
100
Every Where Corp.       VT   654321     2014
1000
;

proc print data=info noobs; run;

It's only when I tried this that I realized that City was defined as a number.

PG
Durlov
Obsidian | Level 7
Great! Thanks much for your help. I usually try to do the examples in SAS, but don't have access to it at work, so I need to run them after I get home.
