I was practicing with distribution tables using a random set of data. Instead of using the values in the first column as the "states"/categories, which SAS does automatically in some cases, the grouping appeared to be random. Could someone explain why this is?
P.S. Images demonstrating this are in the attached document below.
It is not at all clear what you are talking about. There is no need to attach a word processing file. To include text just use the Insert Code or Insert SAS Code buttons. To include a picture use the Insert Photos button, or just paste it in.
Data;
Input state $ inactivity diabetes @@ ;
Rate=inactivity/diabetes;
datalines;
AL 29.74 11.26
AL 24.74 9.54
AL 33.34 13.52
AL 35.02 11.4
AL 31.94 11.74
AL 32 16.02
AL 35.2 14.02
AL 32.58 12.78
AL 35.18 13.76
AL 32.06 11.62
AL 31.06 11.3
AL 34.02 14.26
AL 33.56 13.16
AL 36.04 12.26
AL 32.48 11.02
AL 29.12 12
AL 30.7 12.5
AL 34.3 14.32
AL 31.72 13
AL 35.78 11.64
AL 31.42 12.66
AL 29.42 11.58
AL 29.46 12.96
AL 33.38 14.3
AL 29.58 11.66
AL 30.26 11.56
AL 35.58 12.44
AL 32.86 12.22
AL 30.68 13
AL 34.8 12.32
AL 32.48 11.66
AL 34.56 16.34
AL 35.36 14.06
AL 29.06 12.62
AL 30.86 11.82
AL 30.48 11.84
AL 28.3 11.34
AL 33.78 11.9
AL 30.62 11.38
AL 33.62 12.8
AL 25.84 10.76
AL 30.66 10.4
AL 33.82 17.92
AL 31.82 15.22
AL 25.22 11.12
AL 34 15.06
AL 36.04 11.72
AL 31.98 11.6
AL 29.64 11.78
AL 31.86 12.74
AL 28.48 13.24
AL 28 10.94
AL 32.88 16.02
AL 32.52 13.86
AL 31.12 14.18
AL 31.94 12.62
AL 33.98 12.76
AL 31.2 11.82
AL 23.54 8.28
AL 31.26 15.16
AL 34.64 12.56
AL 30.46 12.06
AL 28.62 11.26
AL 35.48 12.88
AL 31.8 12.3
AL 32.62 16.32
AL 31.52 10.04
;
ods html close;
ods html;
proc glm;
class state;
model diabetes=state;
means state/lsd bon sidak tukey snk duncan scheffe lines;
Run;
The code above is what I entered into SAS, hoping that AL and AZ would serve as my two main classes. I assumed this would work because similar values entered in the same column in previous versions of this program had shown up as classes. However, the distribution table in my results, shown in the following image, proved otherwise.
As you can see, the system used seemingly random numbers as categories/states. On review, this is likely because it read only part of each data entry, as shown below.
So, my two questions are as follows:
1. How do I control what ends up as a class in SAS, beyond what is named in the "input state" section of my code?
2. How do I make sure that the program accepts all my data entries and uses them?
P.S. I apologize for using a word document in my initial post. I did that previously because I was unfamiliar with how to upload images into my posts.
P.P.S.
I shall also include the "successful" ideal of what I was trying to do here:
Data;
Input state $ inactivity diabetes @@ ;
Rate=inactivity/diabetes;
datalines;
FL 42 65
FL 43 66
FL 41 64
FL 40 63
GA 63 44
GA 62 47
GA 68 50
GA 69 51
PA 22 33
PA 42 46
PA 23 42
PA 55 34
;
Ods html close;
Ods html;
proc glm;
class state;
model diabetes=state;
means state/lsd bon sidak tukey snk duncan scheffe lines;
Run;
P.P.P.S.
I also noticed that whenever I copy-paste something that came from an Excel spreadsheet, some of the data shifts out of its column, even if I'm not copying from the spreadsheet itself. Could that also be part of what's going on?
An important step in debugging any problem is for you to LOOK AT the SAS data set being used with your own eyes. You haven't done this. If you had, you would see that sometimes STATE has a value of 'AL' and other times it has values that look like numbers such as '12.5' and '16.02'. Your data set has not been created properly. You need to fix this before you can run any PROC to analyze the data.
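A minimal sketch of that check, assuming the DATA statement is changed from "Data;" to "data health;" so the data set has a name to refer to (the name health is made up for this example and is not in the original program):

```sas
/* Assumption: the DATA step was renamed to "data health;" (hypothetical name) */

/* Print the first few observations and inspect STATE by eye */
proc print data=health (obs=10);
run;

/* Count every distinct value of STATE, including missing values */
proc freq data=health;
  tables state / missing;
run;
```

If STATE has been read correctly, the PROC FREQ table should list only state codes. Values that look like numbers in that table mean the INPUT statement is not splitting the lines the way you expect.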
Read the SAS log.
69   Data;
70   Input state $ inactivity diabetes @@ ;
71   Rate=inactivity/diabetes;
72   datalines;
NOTE: Invalid data for inactivity in line 74 4-14.
NOTE: Invalid data for diabetes in line 75 1-2.
RULE:      ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8----+----9----+----0
75   CHAR  AL 24.74.9.54
     ZONE  44233233032332222222222222222222222222222222222222222222222222222222222222222222
     NUMR  1C024E7499E540000000000000000000000000000000000000000000000000000000000000000000
NOTE: Invalid data errors for file CARDS occurred outside the printed range.
NOTE: Increase available buffer lines with the INFILE n= option.
state=AL inactivity=. diabetes=. Rate=. _ERROR_=1 _N_=1
NOTE: Invalid data for inactivity in line 76 1-2.
NOTE: Invalid data for diabetes in line 76 4-14.
RULE:      ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8----+----9----+----0
...
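It looks like you have somehow gotten TAB characters into your lines of data. In the listing below, a '09'x byte (ZONE 0 over NUMR 9) is a tab, and it sits between the two numeric values on each line.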
69   data _null_;
70   input;
71   if index(_infile_,'09'x) then list;
72   datalines;
RULE:      ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8----+----9----+----0
74   CHAR  AL 29.74.11.26
     ZONE  44233233033233222222222222222222222222222222222222222222222222222222222222222222
     NUMR  1C029E74911E26000000000000000000000000000000000000000000000000000000000000000000
75   CHAR  AL 24.74.9.54
     ZONE  44233233032332222222222222222222222222222222222222222222222222222222222222222222
     NUMR  1C024E7499E540000000000000000000000000000000000000000000000000000000000000000000
76   CHAR  AL 33.34.13.52
     ZONE  44233233033233222222222222222222222222222222222222222222222222222222222222222222
     NUMR  1C033E34913E52000000000000000000000000000000000000000000000000000000000000000000
RULE:      ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8----+----9----+----0
Note that if you are using SAS Studio there is a handy option that automatically converts any tab characters you try to add to your programs into the number of spaces needed to advance to the next tab stop. Look for the option called "Substitute spaces for tabs" under the "Code and Log" section of the user Preferences. If you have that set and paste your data step into the editor window, then it works fine.
If you cannot figure out how to replace the tabs with spaces you could add an INFILE statement and use the EXPANDTABS option.
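For reference, here is a sketch of how the posted data step might look with that option added. The data set name health is made up for this example, and only the first three data rows from the post are shown. In the original paste a tab sits between the two numbers on each row.

```sas
data health;                     /* hypothetical data set name */
  /* EXPANDTABS converts each tab to spaces before INPUT parses the line */
  infile datalines expandtabs;
  input state $ inactivity diabetes @@;
  rate = inactivity / diabetes;
datalines;
AL 29.74	11.26
AL 24.74	9.54
AL 33.34	13.52
;
```

The option is harmless when the lines contain no tabs, so it can stay in place even after the data has been cleaned up.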
You have a tab as a delimiter mixed with spaces ...
you needed to add a line "infile cards dlm=" ";" with a space and a tab in between the quotes
Data;
infile cards dlm=" ";
Input state $ inactivity diabetes @@ ;
Rate=inactivity/diabetes;
datalines;
...
or you could replace tabs manually by spaces ...
--fja
edit: As we have doubled .. or you could use the solution of @Tom above. 🙂
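If you go the DLM= route, one way to avoid typing an invisible tab between the quotes is a hex literal. This is only a sketch of that variant, not the code from the post above: the data set name health is made up, and only the first three data rows are shown.

```sas
data health;                     /* hypothetical data set name */
  /* '20'x is a space and '09'x is a tab, so both act as delimiters */
  infile datalines dlm='2009'x;
  input state $ inactivity diabetes @@;
  rate = inactivity / diabetes;
datalines;
AL 29.74	11.26
AL 24.74	9.54
AL 33.34	13.52
;
```

Because the hex literal is visible in the source, it survives editors that silently convert tabs to spaces, which a quoted literal tab would not.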
@fja wrote:
You have a tab as a delimiter mixed with spaces ...
you needed to add a line "infile cards dlm=" ";" with a space and a tab in between the quotes
Data;
infile cards dlm=" ";
Input state $ inactivity diabetes @@ ;
Rate=inactivity/diabetes;
datalines;
...
or you could replace tabs manually by spaces ...
--fja
edit: As we have doubled .. or you could use the solution of @Tom above. 🙂
Adding even more tab characters into the code is just going to compound the problem. Instead try using the EXPANDTABS option :
infile datalines expandtabs;