art297
Opal | Level 21

I don't think there is anything wrong with the statement you used, other than the "else;" isn't needed.

You should have gotten just a bunch of ones printed.

Have you learned, yet, what to put in an infile statement to show that you have reached the end of your data file?

And, do you want the total errors where areacode doesn't match your list?  If so, you shouldn't initialize it to 0 for each new record.

Leon27607
Fluorite | Level 6

"Have you learned, yet, what to put in an infile statement to show that you have reached the end of your data file?"

We have learned about the @ and @@ symbols, but I'm not sure what you mean by this question. We just assumed that whenever SAS runs out of records (or out of "full observations") it automatically knows it has reached the end of the file.

"And, do you want the total errors where areacode doesn't match your list?  If so, you shouldn't initialize it to 0 for each new record."

Yes, this is what I want, I want my total errors to equal however many do not match the list. I will test it now without initializing it to 0 each time.

Edit, tried this but it just gives me Errorcount = my observations which was 6914...

retain Errorcount 0;

if (areacode ne 828) or (areacode ne 704) or (areacode ne 980) or (areacode ne 336) or (areacode ne 919) or (areacode ne 910) or (areacode ne 252)  then Errorcount + 1;

put Errorcount;

Like I said it's acting like my logic is wrong because in this case it will say that none of my area codes = to any numbers in this list, I used a put areacode earlier in my code just to see which values it gives me.

Tom
Super User

Try making a truth table to see what your condition evaluates to.  Use the values listed and one or two that are not in the list.  That should help you figure out how to fix your logic.
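A minimal sketch of such a truth table, assuming a shortened three-code list and made-up test values (the variable names or_version and and_version are only for illustration). It writes both versions of the condition to the log so they can be compared side by side:

```sas
/* Compare the OR and AND versions of the test for a few sample area codes. */
data _null_;
  input areacode;
  /* OR version: false only if areacode equalled every listed value at once */
  or_version  = (areacode ne 828) or (areacode ne 704) or (areacode ne 980);
  /* AND version: true only when areacode matches none of the listed values */
  and_version = (areacode ne 828) and (areacode ne 704) and (areacode ne 980);
  put areacode= or_version= and_version=;
  datalines;
828
704
555
;
run;
```

For 828 and 704 the OR version is 1 while the AND version is 0; only 555, which is not in the list, gives 1 for both. Since no value can equal two different codes at the same time, the OR version is true for every record.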

art297
Opal | Level 21

You have 41,484 observations, of which 6,914 don't meet the condition you specified.

If you only wanted to see that number once, you could change your infile statement to read:

infile 'c:\art\child4.txt' pad end=eof; *Reading the file in and padding the end with spaces so it can tell when a new record begins*/;

and then change your put Errorcount; statement to be:

if eof then put Errorcount;
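Here is a small self-contained sketch of the end= idea. It assumes a temporary stand-in file (the fileref demo and its four values are made up) rather than the real child4.txt, and a shortened two-code list:

```sas
/* Write a tiny stand-in file; the values are made up for illustration. */
filename demo temp;
data _null_;
  file demo;
  put '828';
  put '555';
  put '704';
  put '123';
run;

data _null_;
  infile demo end=eof;   /* eof is set to 1 when the last record is read */
  input areacode;
  /* Sum statement: Errorcount is retained automatically and starts at 0 */
  if areacode ne 828 and areacode ne 704 then Errorcount + 1;
  if eof then put Errorcount=;   /* written to the log once, at the end */
run;
```

With these four records the log shows Errorcount=2. Because the sum statement already retains the variable and starts it at 0, a separate retain Errorcount 0; is not required.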

Leon27607
Fluorite | Level 6

Well, actually you said it a bit wrong, but I understand what you mean. It's not counting a group of my 8 variables as one "observation" (which is why, if I use obs= with a number less than 6 in the infile statement, I get a lost card error). If I run this code it says there are 41,484 records but 6,914 observations.

I tried what you said but first off yeah, we never used anything like end=eof, and also doing what you said just makes it output one number which is still the 6914 like before. I have sent an email to my teacher about this part but I don't know just how much he can help me.

EDIT: Ok I found out what was the problem, I was supposed to use "and" instead of "or" -_-. So dumb I wasted hours trying to figure this out too...

art297
Opal | Level 21

OK, my error!  However, you need a lesson in Logic 101.  I think what you wanted to ask in your code was:

if areacode ne 828 and areacode ne 704 and areacode ne 980 and areacode ne 336 and areacode ne 919 and areacode ne 910 and areacode ne 252  then Errorcount + 1;

if eof then put Errorcount;
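The same test can also be written with the IN operator, which may be easier to read than a long chain of comparisons. A minimal sketch, using made-up test values read from datalines instead of the real file:

```sas
data _null_;
  input areacode;
  /* Same meaning as the chain of "ne ... and ne ..." tests */
  if areacode not in (828, 704, 980, 336, 919, 910, 252) then Errorcount + 1;
  put areacode= Errorcount=;
  datalines;
828
555
252
123
;
run;
```

Here Errorcount stays at 0 for 828, goes to 1 at 555, stays at 1 for 252, and reaches 2 at 123.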

Leon27607
Fluorite | Level 6

Blarg... I've almost got this done, just having trouble with the last 2 parts, which basically want me to once again use only parts of some data. I understand they're in certain columns but I don't know the syntax or how to "call" on these variables in these columns. The second-to-last part of this HW is to make a frequency table of stars and the zip code prefix, which is the first 3 digits of the zip code. The final question is similar: use the 1st or first 2 digits of the license numbers and either the zip code prefix or the area code in a frequency table. I also have to test for independence (I know I just use a chi-squared test here).

You guys keep saying that they're in certain columns but I still don't understand the correct formatting to use it correctly.

I know I should be doing something like this though...

proc freq data=child;

by stars zip;

run;

proc freq data=child;

by centerid areacode;

run;

but like for zip since it should be only the first 3 digits... I tried doing zip 3 but that's obviously wrong. I don't know the syntax if there is one. Perhaps I could get around this some other way but I think it would screw up my answers for all the previous questions.

art297
Opal | Level 21

Can't you just read those columns as extra variables?  Just because you already read them doesn't mean you can't re-read them.

P.S.  Sorry to everyone for using up so much of our bandwidth, but I'd really like to see the OP solve this on his own and I only know how to accomplish that by providing answers to questions.

Leon27607
Fluorite | Level 6

Yes, that's how I'm approaching this problem. Hmm.. can I not sort multiple variables in one proc sort? Here's my code, which is practically done (I just need to edit the proc sort and the last 2 procs, and put in a chi-squared test). It is telling me that my stars and area code are not sorted in ascending order, but I put them into the proc sort.

data child;

infile 'E:\School Stuff\ST445\child4.txt' pad ; *Reading the file in and padding the end with spaces so it can tell when a new record begins*/;

label centeridprefix='License Number Prefix' centerid = 'Full center license number' name = 'Name of child care'

address = 'Address' townstate = 'Town and State' zipprefix = 'Zip prefix' zip = 'Full zip code' areacode = 'Area code of Telephone number'

phone = 'Phone number' class= 'Class of Center' License = 'What kind of license they have' stars = 'Star of class' ; *Labeling my variables;

input centeridprefix 1-2 centerid 1-8; *Inputting every variable in one by one because they are given vertically*/;

input name $ 1-36;

input address $ 1-36;

input townstate $ 1-30 zipprefix 31-33 zip 31-35 ;

input areacode 2-4 phone $ 1-36 ; *put areacode;

input class $ License $1-36;

stars = 0; *Just simply creating a stars variable, this may not be needed;

if (class = 'One') then stars = 1; *Logic to change from character variable to numeric;

else if (class = 'Two') then stars = 2;

else if (class = 'Three') then stars = 3;

else if (class = 'Four') then stars = 4;

else if (class = 'Five') then stars = 5;

else stars = .;

*put stars;

retain Errorcount 0;

if (areacode ne 828) and (areacode ne 704) and (areacode ne 980) and (areacode ne 336) and (areacode ne 919) and (areacode ne 910) and (areacode ne 252)

then Errorcount + 1;

if (areacode ne 828) and (areacode ne 704) and (areacode ne 980) and (areacode ne 336) and (areacode ne 919) and (areacode ne 910) and (areacode ne 252)

then delete;

*put Errorcount;

if (areacode = 980) then areacode = 704;

run;

proc freq data=child;

table areacode;

run;

proc sort data=child;

by areacode;

by stars;

by zipprefix;

by centeridprefix;

run;

proc means data=child mean;

by areacode;

var stars;

run;

proc freq data=child;

by stars zipprefix;

run;

proc freq data=child;

by centeridprefix areacode;

run;

EDIT: Never mind, I think I should just use table (with whatever variables I need) instead of sorting, and it should work.
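A minimal sketch of that approach, using sashelp.class as a stand-in for the child data set (sex and age are just placeholders for variables like stars and zipprefix):

```sas
/* sashelp.class stands in for the child data set here */
proc freq data=sashelp.class;
  /* Two-way frequency table plus a chi-square test of independence */
  tables sex*age / chisq;
run;
```

Unlike a BY statement, the TABLES statement does not need the data to be sorted first.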

2nd Edit: Ok I'm finished, thank you everyone for your help. Going to mark this question as "solved."

art297
Opal | Level 21

I'll answer your sort question anyhow.  You can sort as many variables as you want within one proc sort.  e.g., taking the file sashelp.class, if you use:

  proc sort data=sashelp.class out=test;

    by age height weight;

  run;

you will end up with a file called work.test that has all of the students with the same age together, from lowest to highest age. Within each age, the students will be ordered by height and, within each height, by weight.
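As a rough sketch of why the sort order matters, the sorted file can then be used for BY-group processing. The proc means step below is only an illustration and is not part of the original assignment:

```sas
proc sort data=sashelp.class out=test;
  by age height weight;
run;

/* BY-group processing expects the data in order of the BY variable(s). */
/* A file sorted by age height weight is also in order by age alone.    */
proc means data=test mean;
  by age;
  var height;
run;
```

This prints the mean height separately for each age group.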
