I'm trying to follow the code from the article "Test for the equality of two proportions in SAS" on The DO Loop, specifically the section called "A chi-square test for association in SAS." I need to compare the proportion tested for something in one area with the proportion tested in another area and see whether the two proportions are significantly different, but I can't get the code to work. I get this error:
Can't replicate with the posted example.
The main message windows on this forum will reformat text so it is possible that the code you posted has been modified in such a way that the error won't appear.
The code you show cannot generate the error shown.
The INPUT statement you posted does not create a variable named SEQ, yet the message you showed includes one:
Group=CountyA Seq=No N=. _ERROR_=1 _N_=2
Your code has variables Group, Test and N.
I suggest that you copy your code and paste into a text box opened on the forum using the </> icon above the message window and we can see if that will behave the same.
data underfive;
length Group $9 Test $3;
input Group Test N;
datalines;
Worcester Yes 55
Worcester No 45027
NonWor Yes 71
NonWor No 311726
;
Sorry, I had copied in part of an old version. This is my current code. It gives this error:
NOTE: Invalid data for N in line 79 1-6.
It does create a dataset, but not all of the numbers are included.
As posted, that does not generate any error or invalid-data note:
3468 data underfive;
3469 length Group $9 Test $3;
3470 input Group Test N;
3471 datalines;
NOTE: The data set USER.UNDERFIVE has 4 observations and 3 variables.
NOTE: DATA statement used (Total process time):
real time 0.01 seconds
cpu time 0.01 seconds
3476 ;
So you have to be running something different to generate such an invalid data message.
Be sure to put the semicolon after the DATALINES statement on a line by itself:
data underfive;
length Group $15 Test $3;
input Group Test N;
datalines;
CountyA Yes 55
CountyA No 45027
CountyB Yes 71
CountyB No 311726
;
proc freq data=underfive order=data;
weight N;
tables Group*Test/chisq;
run;
Thank you! That helped it create a dataset, but now the variables don't have the values they should: there are only 3 observations, and N is blank for one of them.
I don't know how you are running the code, but I assure you that the DATA step generates four observations:
data underfive;
length Group $15 Test $3;
input Group Test N;
datalines;
CountyA Yes 55
CountyA No 45027
CountyB Yes 71
CountyB No 311726
;
proc print data=underfive;
run;
Does the number of spaces on the data lines (CountyA Yes, etc.) matter? I'm trying to figure out why it tells me there are only three observations.
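On the spacing question: with list input, as in input Group Test N;, SAS treats one or more consecutive blanks as a single delimiter and skips leading blanks, so extra ordinary spaces should not change the result. Below is a minimal sketch to check that. The dataset name spacing_check and the uneven spacing are made up for illustration. If this version also yields four observations on your system, ordinary spaces are probably not the cause.

```sas
/* Illustrative check: the same four records with deliberately uneven spacing. */
data spacing_check;              /* hypothetical dataset name */
   length Group $15 Test $3;
   input Group Test N;           /* list input: runs of blanks act as one delimiter */
datalines;
CountyA Yes 55
CountyA    No     45027
   CountyB Yes 71
CountyB No        311726
;

proc print data=spacing_check;   /* expect four rows with N populated in each */
run;
```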
Post your code by doing the following:
1. Click the "Insert SAS Code" icon (looks like a running man). A dialog box will pop up.
2. Paste the EXACT code that generates the error into the dialog box.
3. Click OK to display the code in the thread.
4. Click Post so we can see the code.
data underfive;
length Group $15 Test $3;
input Group Test N;
datalines;
Worcester Yes 55
Worcester No 45027
NonWor Yes 71
NonWor No 311726
;
OK, here is my log for the code you posted. Show us yours.
7033 data underfive;
7034 length Group $15 Test $3;
7035 input Group Test N;
7036 datalines;
NOTE: The data set WORK.UNDERFIVE has 4 observations and 3 variables.
NOTE: DATA statement used (Total process time):
real time 0.02 seconds
cpu time 0.03 seconds
7041 ;
Hmm, my guess (and it's only a guess) is that there might be an unprintable character in your datalines. This can happen if you copied/pasted the data from a MS Word file. Word and other "rich text" formats might have a CR/LF line ending, which could make SAS think there is a blank line between the 2nd and 3rd lines of data.
My suggestion: type the program verbatim into the SAS editor. Don't copy and paste it. Do you get the same error, or does the DATA step now run correctly?
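If retyping is not practical, one way to test the hidden-character guess is to dump each data line in hexadecimal and look for unexpected byte values. This is only a sketch: the variable name line and its 30-byte width are assumptions chosen for this data, and the byte values mentioned in the comments assume an ASCII-based session.

```sas
/* Diagnostic sketch: print each data line as hex to spot hidden characters. */
data _null_;
   length line $30;            /* assumed width; widen it for longer records */
   input;                      /* read the raw record into _INFILE_ without parsing it */
   line = _infile_;            /* copy the record so a format can be applied */
   put _n_= line= $hex60.;     /* 20 is a blank; 09 (tab), 0D (carriage return), */
                               /* or A0 (Latin-1 non-breaking space) is suspect  */
datalines;
CountyA Yes 55
CountyA No 45027
CountyB Yes 71
CountyB No 311726
;
```

Replace the four data lines with the ones pasted into your own program. A record that shows up as nothing but 20s is a blank line, which would fit the symptom of SAS skipping ahead and producing fewer observations than expected.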
Ahh thank you! I opened a new program and typed it out and it worked just fine, thanks so much for your help!