Data step input question

Contributor
Posts: 27
I hope you are all having a good day. I am currently studying for my Base SAS certification and I have two questions about the INPUT statement in SAS. First, here is my raw data:

 

RANCH,1250,2
SPLIT,1190,1
CONDO,1400,2
TWOSTORY,1810,4
RANCH,1500,3
SPLIT,1615,4
SPLIT,1305,3

 

Here is my code:

 

data work.condo_ranch;
    infile file1 dsd;
    input style $ @;
    if style="CONDO" or style="RANCH" then
        input sqfeet bedrooms;
run;

 

My questions:

 

1. Can someone explain the @ sign and why it is used here?

2. Why does my dataset contain all 7 observations rather than the 2 RANCH observations and 1 CONDO observation?


All Replies
Super User
Posts: 11,343

Re: Data step input question

The trailing @ tells the INPUT statement to keep the read pointer on the current line, so you can read the first value and then do something conditional with the rest of the line.

 

To exclude the other records you need a separate statement. Nothing in your step tells SAS to drop them, so every iteration of the DATA step writes an observation.

 

If style in ('CONDO' 'RANCH');

would subset the output. This line should probably go after the INPUT statement for sqfeet and bedrooms.

 

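Or incorporate an explicit OUTPUT statement:

if style="CONDO" or style="RANCH" then do;
   input sqfeet bedrooms;
   output;
end;

which sends data to the output data set only when the style matches. Once a step contains an explicit OUTPUT, SAS no longer writes an observation automatically at the end of each iteration.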

 

Note that the IN operator is equivalent to multiple OR comparisons on the same variable.
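Putting those pieces together, here is a minimal sketch of the whole step. It assumes the raw data is supplied in-stream with DATALINES rather than through the file1 fileref from the original post, so that it can be run as-is.

```sas
/* Sketch only: DATALINES stands in for the original file1 fileref. */
data work.condo_ranch;
  infile datalines dsd;
  input style $ @;                        /* read style, hold the line      */
  if style in ('CONDO', 'RANCH') then do;
    input sqfeet bedrooms;                /* continue on the held line      */
    output;                               /* write matching records only    */
  end;
datalines;
RANCH,1250,2
SPLIT,1190,1
CONDO,1400,2
TWOSTORY,1810,4
RANCH,1500,3
SPLIT,1615,4
SPLIT,1305,3
;
run;
```

With the seven lines above this should leave three observations: the two RANCH records and the one CONDO record.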

Contributor
Posts: 27

Re: Data step input question

Thank you for your response. I still do not understand the purpose of the @ sign. When SAS goes through the DATA step, I thought the following was happening:

It starts at the line "RANCH,1250,2", evaluates whether the style is equal to RANCH or CONDO, and then decides whether it should input sqfeet and bedrooms.

Am I wrong in my assumption? What steps does SAS go through when it reaches the @ sign?

 

Solution
03-03-2017 05:19 PM
Super User
Posts: 11,343

Re: Data step input question

If you look at the output data set you will see that sqfeet and bedrooms are missing for every style other than RANCH or CONDO, because they were never read for those records.

 

The trailing @ holds the input pointer just past the STYLE value (in effect pointing at the comma) in the input buffer until the hold is released or the pointer is moved. When the style matches, the second INPUT CONTINUES reading from that position on the SAME input line. The second INPUT, when executed, releases the hold because it does not end with @ or @@. Reaching the bottom of the DATA step also releases a single-@ hold.
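To watch this happen, here is a sketch of the original step with comments at each point where the pointer matters. It assumes the data is read in-stream with DATALINES instead of from file1, and the PROC PRINT is added only so the result can be inspected.

```sas
/* Sketch only: DATALINES stands in for the original file1 fileref. */
data work.condo_ranch;
  infile datalines dsd;
  input style $ @;              /* read style; trailing @ holds this line  */
  if style="CONDO" or style="RANCH" then
    input sqfeet bedrooms;      /* same line, continuing after style       */
  /* no explicit OUTPUT, so an observation is written on every iteration   */
datalines;
RANCH,1250,2
SPLIT,1190,1
CONDO,1400,2
TWOSTORY,1810,4
RANCH,1500,3
SPLIT,1615,4
SPLIT,1305,3
;
run;

proc print data=work.condo_ranch;
run;
```

The listing should show all seven observations, with sqfeet and bedrooms filled in only on the RANCH and CONDO rows.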

 

 

Remove the @ and see what the result looks like. You will get invalid-data messages in the log: without the hold, the line is released as soon as style has been read, so after reading RANCH from the first row the second INPUT moves to the second row and tries to read sqfeet and bedrooms from it. The first value on the second row, SPLIT, is character, so reading it as the numeric sqfeet fails, and the second value on that row, which is really the square footage, is read into bedrooms. Also note that the number of output rows will be about one-half the number of lines in the input data, because each matching record also consumes the line that follows it (the exact count depends on how many styles match and on whether a line is left to read after the last style).
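A sketch of that experiment, again assuming in-stream DATALINES in place of file1. The data set name work.no_hold is made up for this illustration.

```sas
/* Sketch only: same step without the trailing @.                     */
/* work.no_hold is a hypothetical name used for this illustration.    */
data work.no_hold;
  infile datalines dsd;
  input style $;                /* no @: the line is released here        */
  if style="CONDO" or style="RANCH" then
    input sqfeet bedrooms;      /* reads from the NEXT line of the data   */
datalines;
RANCH,1250,2
SPLIT,1190,1
CONDO,1400,2
TWOSTORY,1810,4
RANCH,1500,3
SPLIT,1615,4
SPLIT,1305,3
;
run;

proc print data=work.no_hold;
run;
```

With these seven lines the step should produce four observations. The three RANCH and CONDO rows have a missing sqfeet and a square-footage value sitting in bedrooms, and the log carries an invalid-data note for each of them.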

 

For extra fun, replace the @ in the first INPUT with @@ to see how a hold persists across multiple iterations of the DATA step.
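Before trying @@ on the housing data, it may help to see it in the situation it is usually used for: several observations packed onto one line. This is a separate toy example, not the data from the question, and the names work.pairs, x and y are made up.

```sas
/* Sketch only: work.pairs, x and y are hypothetical names.           */
data work.pairs;
  input x y @@;       /* @@ keeps the line held across iterations     */
datalines;
1 2 3 4 5 6
;
run;

proc print data=work.pairs;
run;
```

Each iteration of the DATA step picks up where the previous one stopped on the same line, so the single data line yields three observations: (1,2), (3,4) and (5,6).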

 

 

 

Contributor
Posts: 27

Re: Data step input question

Thank you so much!

☑ This topic is solved.
