Question:
The best way to understand SAS programming is to play with small examples, such as:
data INFO;
input @1 Company $20. @25 State $2. @;   /* trailing @ holds the line */
if State=' ' then input @30 Year;        /* no State: only Year follows */
else input @30 City Year;                /* State present: City, then Year */
input NumEmployees;                      /* read from the next line */
datalines;
Some Place Inc.         CA   123456 2012
10
Nowhere Ltd                  2013
100
Every Where Corp.       VT   654321 2014
1000
;
proc print data=info noobs; run;
It's only when I tried this that I realized that City was defined as a number.
This data step implements the case where two records from file datafile.txt are read to form a single observation in dataset worm.info.
An INPUT statement with a trailing '@' holds the current line, so that the next INPUT statement within the same iteration continues on that line at the same position.
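To see the trailing '@' in isolation, here is a minimal sketch. The dataset name HELD and the variables Code and Value are made up for illustration and are not part of the original program.

```sas
/* Sketch: two INPUT statements that read from the same held line. */
data held;
  input Code $ @;   /* trailing @ keeps this line available */
  input Value;      /* continues on the same line, right after Code */
datalines;
A 1
B 2
;
proc print data=held noobs; run;
```

Each iteration consumes one line here, because the second INPUT has no trailing '@' and releases the line when it finishes. Without the '@' on the first INPUT, the second INPUT would start on the next line instead.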
Oh, I see what you're saying. The first record being read is the one with the @ and the second is the next record in the data set. Is that correct?
Just to note, for the very first iteration all the data would be from the same record, so that would be one record, right?
The first line read would contain either
Company Blank Year
or
Company State City Year
The second line read would contain
NumEmployees
The output dataset will thus contain all those fields, with missing values for State and City for some observations.
You will execute three INPUT statements (there are actually four in the code, but two are in the same IF/THEN/ELSE block, so only one of them will execute). The first one uses the trailing @, which holds the current line of input, so the second INPUT that executes reads from the same input line as the first. You then read a second line of input with the final INPUT statement. That is why each iteration of the data step reads two lines of the input data file. So if your data file has 20 lines, your resulting SAS dataset will have 10 observations.
The only wrinkle that could happen to change this would be if some of the requested fields are not populated. This could cause some of the INPUT statements to go to the next line to look for values of STATE, CITY or YEAR. You can add the TRUNCOVER option to the INFILE statement to prevent that.
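As a sketch of that suggestion, the small example can be rerun with TRUNCOVER on an INFILE statement that points at the in-stream data. The first record below deliberately has no Year; that record is an assumption added for illustration and is not in the original data.

```sas
/* Sketch: TRUNCOVER keeps INPUT from moving to the next line */
/* when a requested field is not populated.                   */
data info;
  infile datalines truncover;
  input @1 Company $20. @25 State $2. @;
  if State=' ' then input @30 Year;
  else input @30 City Year;   /* Year is absent in the first record */
  input NumEmployees;
datalines;
Some Place Inc.         CA   123456
10
Nowhere Ltd                  2013
100
;
proc print data=info noobs; run;
```

With TRUNCOVER, Year is set to missing for the first company and NumEmployees is still read from the following line. For the external file, the same option would go on the existing statement: infile 'DATAFILE.TXT' truncover;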
So, the data set would be set up as follows?
Record 1: Company State City Year
Record 2: NumEmployees
Record 3: CompanyObs1 StateObs1 CityObs1 YearObs1
Record 4: NumEmployeesObs1
Record 5: CompanyObs2 StateObs2 CityObs2 YearObs2
Record 6: NumEmployeesObs2
I'm understanding from your explanations, that the only reason one would use 2 "input" statements (note, I'm skipping over the IF/ELSE statements) is if the data is split up as above. Is that correct? I highlighted the "input" statements I'm talking about below.
data WORM.INFO;
infile 'DATAFILE.TXT';
input @1 Company $20. @25 State $2. @;
if State=' ' then input @30 Year;
else input @30 City Year;
input NumEmployees;
run;