abuanuazu
Fluorite | Level 6

Dear All,

 

The data contains a variable called AJCC with the values Stage I, Stage II, Stage III, Stage IV, and Unknown Stage. But "Stage III" keeps going missing from my output. See the attached document for details. Thank you for any help with this problem.

 

abuanuazu

 

5 REPLIES
FreelanceReinh
Jade | Level 19

Hello @abuanuazu,

 

First of all, something must be missing in your code, probably a SET statement reading another dataset. Otherwise, where should variable AJCC come from?

 

The reason why SAS seems to omit "Stage III" is most likely that you did not specify a sufficient length for character variable AJCC in the other dataset. There should be a statement like the following in the data step creating the dataset which contains AJCC:

length AJCC $9;

If there was no LENGTH statement for AJCC, this variable would be assigned the default length for character variables, which is 8, or the length of the first value assigned to it. But the string "Stage III" has length 9. The last Roman numeral "I" would be truncated if AJCC had length 8. Hence, all values which should read "Stage III" would get the value "Stage II". Not surprisingly, "Stage 2" has a particularly large percentage in your output (because most probably it is in fact the union of Stage II and Stage III).

 

The same goes for variable MYSTAGE. The value "Unknown Stage" is truncated to "Unknown" due to the missing length specification.
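
To see both truncations in isolation, here is a minimal sketch. The dataset name truncation_demo and the variables ajcc8 and ajcc9 are made up for illustration; only the string values and MYSTAGE come from this thread.

```sas
data truncation_demo;
  /* Hypothetical variables: same value, different declared lengths */
  length ajcc8 $8 ajcc9 $9;
  ajcc8 = "Stage III";   /* 9 characters into length 8: stored as "Stage II" */
  ajcc9 = "Stage III";   /* fits: stored in full */

  /* No LENGTH statement: the first assignment seen at compile time */
  /* ("Stage 1", 7 characters) fixes the length of mystage at 7.    */
  if 0 then mystage = "Stage 1";
  mystage = "Unknown Stage";   /* stored as "Unknown" */

  put ajcc8= ajcc9= mystage=;  /* writes the stored values to the log */
run;
```

The log should show that ajcc8 lost its last "I" and that mystage kept only its first seven characters.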

abuanuazu
Fluorite | Level 6
Thank you for your prompt response.

This is the actual code:
Data cancer;
Infile "C:\sas\bcancer.csv" dlm="," firstobs=2;
Input AJCC $13;
Run;

Data status;
Set cancer;
if AJCC="Stage I" then mystage="Stage 1";
else if AJCC="Stage II" then mystage="Stage 2";
else if AJCC="Stage III" then mystage="Stage 3";
else if AJCC="Stage IV" then mystage="Stage 4";
else
mystage="Unknown Stage";
Run;

proc freq data=status;
table mystage/chisq;
run;

Initially, to test the data, I didn't assign a length. After your response, I added a length to AJCC ($13) and ran the code. The output now puts 100% of the data in "Unknown".

Any suggestion?
FreelanceReinh
Jade | Level 19

Your INPUT statement is invalid. [Edit: No, as pointed out by Tom, it's actually valid syntactically, although incorrect, in that it requests column input (for column 13) both senselessly and in a misleading way.]  Either correct it to read

input AJCC :$13.;

or insert a LENGTH statement before the INPUT statement:

length AJCC $13;

Then you can omit the informat specification completely and just write

input AJCC;

 

The LENGTH statement for MYSTAGE is still missing in the second data step.
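
For illustration, the second data step with that LENGTH statement added could look roughly like this. It is a sketch that assumes dataset cancer already holds correct AJCC values; the length 13 is chosen because "Unknown Stage" has 13 characters.

```sas
data status;
  set cancer;
  /* Declare the length before the first assignment so that the */
  /* longest value, "Unknown Stage" (13 characters), fits.      */
  length mystage $13;
  if AJCC = "Stage I" then mystage = "Stage 1";
  else if AJCC = "Stage II" then mystage = "Stage 2";
  else if AJCC = "Stage III" then mystage = "Stage 3";
  else if AJCC = "Stage IV" then mystage = "Stage 4";
  else mystage = "Unknown Stage";
run;
```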

Tom
Super User

Look carefully at your INPUT statement.

 

input AJCC $13;

 

Since there is no period after the 13 it is taken to mean a column number instead of an informat. So it means to read the 13th character.  That is why you do not see any of the values.
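
A small sketch of the difference, using in-stream data instead of the CSV file. The dataset name column_vs_informat and the variables col13 and full are made up for illustration.

```sas
data column_vs_informat;
  infile datalines truncover;
  /* $13  (no period): column input, reads only column 13          */
  /* $13. (period):    formatted input, reads 13 columns from @1   */
  input @1 col13 $13 @1 full $13.;
  put col13= full=;   /* compare the two readings in the log */
  datalines;
Unknown Stage
Stage III
;
run;
```

For "Unknown Stage" the 13th character is the final "e", and for "Stage III" column 13 is blank, so col13 never contains a usable stage value, while full holds the complete text.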

 

Also, why did you set the delimiter to a comma on the INFILE statement? If the file really is a CSV file with more than one column, then you should use the DSD option to make sure missing values are properly handled. If you just have a single column of values, then it is easiest to just read the value and not worry about delimiters. But you might want to add a TRUNCOVER option so that it properly handles lines with fewer than 13 characters.

 

data cancer;
  infile "C:\sas\bcancer.csv" firstobs=2 truncover;
  input AJCC $13.;
run;

abuanuazu
Fluorite | Level 6

Dear All,

 

Thank you for all the help with my data step question. I made a mistake when converting the real data from .xlsx to .csv, and that caused all the problems.

I imported the original data into SAS and it works fine. Thank you all again. Please see the attached result.

 

Abuanuazu
