Practice Question

Contributor
Posts: 24

Question:

The following SAS program is submitted:
data WORM.INFO;
     infile 'DATAFILE.TXT';
     input @1 Company $20. @25 State $2. @;
     if State='  '  then input @30 Year;
     else input @30 City Year;
     input NumEmployees;
run; 
How many raw data records are read during each iteration of the DATA step?
The Answer is 2.
 
Why? Isn't a record one line of data containing one or more observations? If so, every iteration should read only one line into the SAS data set. Is my thinking incorrect?
 
My thinking:
Record 1 = Obs1 Obs2 Obs3
Record 2 = Obs4 Obs5 Obs6

All Replies
Respected Advisor
Posts: 4,921

Re: Practice Question

This data step implements the case where two records from file datafile.txt are read to form a single observation in dataset worm.info.

An INPUT statement with a trailing '@' holds the current line, so that the next INPUT statement within the same iteration continues reading that line from the current position.
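
A minimal sketch of the difference the trailing '@' makes. The data set names no_hold and hold and the values are made up for illustration only:

```sas
/* Without a trailing @: each INPUT statement starts on a new line */
data no_hold;
   input A;      /* reads A from line 1 */
   input B;      /* moves to line 2 and reads B there */
   datalines;
1 2
3 4
;

/* With a trailing @: the second INPUT continues on the held line */
data hold;
   input A @;    /* reads A and holds the line */
   input B;      /* reads B from the same line, then releases it */
   datalines;
1 2
3 4
;

proc print data=no_hold noobs; run;
proc print data=hold noobs; run;
```

As far as I can tell, no_hold should end up with a single observation (A=1, B=3) because it uses two lines per iteration, while hold should get two observations, one per line.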

PG
Contributor
Posts: 24

Re: Practice Question

Oh, I see what you're saying. The first record being read is the one with the @ and the second is the next record in the data set. Is that correct? 

 

Just to note, for the very first iteration all the data would be from the same record, so that would be one record, right?

Respected Advisor
Posts: 4,921

Re: Practice Question

The first line read would contain either

 

Company Blank Year

or

Company State City Year

 

The second line read would contain 

 

NumEmployees

 

The output dataset will thus contain all those fields, with missing values for State and City for some observations.
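
One way to watch this happen is to write the input buffer to the log after each read. This is only a sketch: it uses DATA _NULL_ with in-stream sample data instead of the real datafile.txt, and the label text in the PUT statements is my own.

```sas
data _null_;
   input @1 Company $20. @25 State $2. @;
   if State='  ' then input @30 Year;
   else input @30 City Year;
   /* The trailing @ held the line, so the buffer is still the first line of the pair */
   put _N_= 'first line:  ' _infile_;
   input NumEmployees;
   /* This INPUT started on a new line, so the buffer now holds the second line */
   put _N_= 'second line: ' _infile_;
   datalines;
Some Place Inc.         CA   123456     2012
10
Nowhere Ltd                             2013
100
;
```

Each value of _N_ should show up in the log with two different lines, which is the "2 records per iteration" from the question.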

PG
Super User
Super User
Posts: 7,046

Re: Practice Question

You will execute three INPUT statements. (There are actually four in the code, but two are in the same IF/THEN/ELSE block, so only one of them will execute.) The first one uses the trailing @, which holds the current line of input, so the second INPUT that executes reads from the same input line as the first INPUT statement. You then read a second line of input with the final INPUT statement. That is why each iteration of the data step reads two lines of the input data file. So if your data file has 20 lines, your resulting SAS dataset will have 10 observations.

 

The only wrinkle that could happen to change this would be if some of the requested fields are not populated.  This could cause some of the INPUT statements to go to the next line to look for values of STATE, CITY or YEAR. You can add the TRUNCOVER option to the INFILE statement to prevent that.
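
A small sketch of that wrinkle, assuming in-stream data in which the first company line has no Year value. The data set name info_trunc and the sample lines are made up for illustration:

```sas
data info_trunc;
   /* TRUNCOVER: a value that is not on the current line becomes missing */
   /* instead of being looked for on the next line                       */
   infile datalines truncover;
   input @1 Company $20. @25 State $2. @;
   if State='  ' then input @30 Year;
   else input @30 City Year;
   input NumEmployees;
   datalines;
Some Place Inc.         CA   123456
10
Nowhere Ltd                             2013
100
;

proc print data=info_trunc noobs; run;
```

With TRUNCOVER I would expect Year to be missing for the first company and NumEmployees to still be 10. If you drop the INFILE statement, the default behavior lets the INPUT statement go to the next line for Year, so it picks up the 10 and the following values get out of step.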

Contributor
Posts: 24

Re: Practice Question

So, the data set would be set up as follows? 

 

Record 1: Company State City Year

Record 2: NumEmployees

 

Record 3: CompanyObs1 StateObs1 CityObs1 YearObs1

Record 4: NumEmployeesObs1

Record 5: CompanyObs2 StateObs2 CityObs2 YearObs2

Record 6: NumEmployeesObs2

 

I understand from your explanations that the only reason to use two INPUT statements (setting aside the ones in the IF/ELSE) is that the data is split across lines as above. Is that correct? I highlighted the INPUT statements I'm talking about below.

 

data WORM.INFO;
     infile 'DATAFILE.TXT';
     input @1 Company $20. @25 State $2. @;
     if State='  '  then input @30 Year;
     else input @30 City Year;
     input NumEmployees;
run; 

 

 

 

 

Solution
‎12-30-2015 11:27 AM
Respected Advisor
Posts: 4,921

Re: Practice Question

The best way to understand SAS programming is to play with small examples, such as:

 

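data INFO;
/* Read Company and State, then hold the line with the trailing @ */
input @1 Company $20. @25 State $2. @;
/* Finish reading the held line; the layout depends on State */
if State='  '  then input @30 Year;
else input @30 City Year;
/* No line is held any more, so this INPUT reads the next line */
input NumEmployees;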
datalines;
Some Place Inc.         CA   123456     2012
10
Nowhere Ltd                             2013
100
Every Where Corp.       VT   654321     2014
1000
;

proc print data=info noobs; run;

It was only when I tried this that I realized that City is defined as a number, since the INPUT statement reads it without a $ informat.
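
If you want to check the variable types yourself, something like this should do it, assuming the INFO data set from the step above is still in your WORK library:

```sas
/* List the variables in INFO along with their types and lengths */
proc contents data=info;
run;
```

City, Year and NumEmployees should be listed as Num, and Company and State as Char.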

PG
Contributor
Posts: 24

Re: Practice Question

Great! Thanks very much for your help. I usually try to run the examples in SAS, but I don't have access to it at work, so I need to run them after I get home.