Hi, I am having trouble importing my text file; my output is not the same as the one given in my lecture.
Below are my code and the attached files.
Thanks!
data ass.titaniccsv;
length row_names 3 pclass $3 survived 3 name $82 age 3
embarked $3 home_dest $70 room $5 ticket $20 boat $3 sex $10;
infile '/home/titanic.txt' dlm=',' dsd;
input row_names pclass $ survived name ~ $82. age
embarked $ home_dest $ room $ ticket $ boat $ sex $;
run;
Any errors in your log?
No errors, just warnings.
The "invalid data" message appeared for every line.
The code in this question differs from the code in your other question.
I would suggest using PROC IMPORT and setting GUESSINGROWS to the number of observations. The code it generates to read the data will be in your log. Compare that to your current code to find the errors.
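For what it's worth, a minimal sketch of that suggestion could look like the following. The path is the one from the question; the output name work.titanic_check and GETNAMES=NO (no header row in the file) are assumptions on my part. If your release does not accept GUESSINGROWS=MAX, put in a number at least as large as the row count.

```sas
/* Sketch only: let PROC IMPORT guess the column types and lengths. */
/* work.titanic_check is a hypothetical output name.               */
proc import datafile='/home/titanic.txt'
        out=work.titanic_check
        dbms=csv
        replace;
    getnames=no;        /* assumption: the file has no header row */
    guessingrows=max;   /* scan all rows before deciding on types */
run;
```

After it runs, the log should contain the generated data step, which you can compare with your own INFILE and INPUT statements.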
Alright, noted with thanks!
Just to relate to the error message:
NOTE: Invalid data for age in line 73 41-42.
In your code you defined: length ... age 3 ... which means that numeric data is expected.
The input value for age in line 73 (and many others) is NA, which is character data,
and as such it is invalid. Therefore the output is age=. which means age has a missing value.
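If you want to see the same note on something small, here is a sketch with three made-up rows (not the real titanic.txt; the dataset name work.age_demo and the sample names are hypothetical). Reading the NA row into a numeric variable should produce an "Invalid data for age" note and a missing value for that row.

```sas
/* Hypothetical sample data, only to reproduce the note. */
data work.age_demo;
    infile datalines dlm=',' dsd;
    length name $20 age 3;   /* age is numeric, as in the question */
    input name $ age;        /* NA cannot be read as a number      */
datalines;
Allen,29
Baxter,NA
Carter,47
;
run;
```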
In your first post you wrote that your output does not match what you were given in your lecture.
What have you been given? Are there other discrepancies?
If you want a column to be numeric, but have non-numeric data in the input stream that signals a missing value, your best choice is to read into a temporary character variable and conditionally convert from that:
data ....;
  infile .....;
  length ...... _age $3 age 3 ......;
  input ..... _age .....;
  if _age = 'NA' then age = .;
  else age = input(_age,3.);
  drop _age;
run;
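In case a complete step is easier to follow than the outline above, here is the same pattern as a self-contained sketch. The rows, the dataset name work.age_clean and the name column are made up for illustration; only the _age / age handling is taken from the outline.

```sas
/* Hypothetical sample data; the _age/age logic follows the outline above. */
data work.age_clean;
    infile datalines dlm=',' dsd;
    length name $20 _age $3 age 3;
    input name $ _age $;           /* read age as text first          */
    if _age = 'NA' then age = .;   /* NA marks a missing value        */
    else age = input(_age, 3.);    /* otherwise convert text to number */
    drop _age;                     /* temporary variable not kept     */
datalines;
Allen,29
Baxter,NA
Carter,47
;
run;

proc print data=work.age_clean;
run;
```

The printed dataset should have a numeric age column with a missing value in the NA row, and the log should stay free of invalid-data notes for age.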
@yewkeong wrote:
Hey, it works! But can you explain this code in more detail? I don't really understand why there must be two variables. First the value is stored in the character variable, which is _age $3. If it is 'NA', the numeric variable is set to a missing value; if it is not 'NA', _age is converted to numeric. Is that right?
I'm sorry, I'm very new to SAS Studio. =')
You are completely right about how the step works.
As for your question: in a dataset, a column can only have one type throughout the dataset, just like in database systems. This is where SAS differs from a spreadsheet program like Excel or OpenOffice Calc. Since the data step's data structures are taken from the contributing datasets, the same is true for the data step: a variable can only be of one type.
So one has to read mixed-type values into a separate character variable and then conditionally convert them to numeric values. The character variable can be kept for future reference or dropped.
You could avoid the temporary variable, but trying to read mixed data directly into numeric will cause a lot of messages and finally a WARNING or ERROR in the log. Clean programming avoids those, so when you have them appearing in the log unexpectedly, you know something's wrong in the input data.