For one of my questions I was asked to create a separate dataset with only female names. The question is out of 10, and she never specified what step to use. I used PROC SORT like this:
proc sort data=Names out=work.Female_names (drop=gender);
by Name Count;
where Gender='F';
run;
NOTE: There were 89749 observations read from the data set WORK.NAMES.
      WHERE Gender='F';
NOTE: The data set WORK.FEMALE_NAMES has 89749 observations and 2 variables.
NOTE: PROCEDURE SORT used (Total process time):
      real time 0.09 seconds
      cpu time 0.00 seconds
I got 0/10 for this question even though my friend, who got 10/10, used a DATA step and got the same result I did (that being a separate dataset). My instructor's reasoning was that it is a major issue not to use a DATA step for something like this, even though it got the same result. I asked ChatGPT if this is a major issue and it said that it's not an issue if the code to use was never specified. Before I appeal, I would like to hear from an expert whether it truly is a major issue that I used PROC SORT instead of a DATA step.
Definitely worth discussing with your instructor. Your code is one way to subset the data. If you need to SORT the data in anticipation of a later step, using PROC SORT with a WHERE statement would be more efficient than using a DATA step followed by a PROC SORT step. There are lots of ways in SAS to subset data.
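To make the efficiency point concrete, here is a rough sketch of the two approaches. It assumes the Names data set and the Gender='F' coding shown in the log above; the output names female_sorted, female_subset, and female_sorted2 are made up for illustration.

```sas
/* One step: filter, drop Gender, and sort in a single PROC SORT */
proc sort data=Names out=work.female_sorted(drop=Gender);
  by Name Count;
  where Gender='F';
run;

/* Two steps: subset in a DATA step, then sort the result */
data work.female_subset;
  set Names;
  where Gender='F';
  drop Gender;
run;

proc sort data=work.female_subset out=work.female_sorted2;
  by Name Count;
run;
```

Both routes should end with the same sorted rows; the second simply writes an intermediate data set and reads the data one extra time.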
I would have to see the exact instructions provided. As in letter by letter.
Proc Sort is a bit of overkill but without knowing the grading considerations it is hard to evaluate.
I suspect there may have been some semi-automatic grading applied, such as comparing your resulting data set to a "standard" expected output. If that is the case then you may have "failed" because you removed the Gender variable. Depending on the comparison method used, the change in order may also have been a factor.
As far as "Her reasoning was that it is a major issue to not use data step for something like this" goes: unless you are given a very specific reason why a separate data set is needed, there really is no reason to create one. For any report or analysis you can subset the data with either a WHERE statement, as you used, or the equivalent WHERE= data set option. Proliferation of data sets is actually often a symptom of poor design.
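As a small sketch of subsetting without creating a new data set, the two steps below filter inside the procedure itself. They assume the Names data set from the original post, with Gender coded 'F' and Count stored as a numeric variable (the last point is an assumption, not something stated in the thread).

```sas
/* WHERE statement inside the procedure: no new data set is created */
proc means data=Names sum;
  where Gender='F';
  var Count;
run;

/* Same subset using the WHERE= data set option on the input */
proc means data=Names(where=(Gender='F')) sum;
  var Count;
run;
```

Either form restricts the analysis to the female rows while leaving WORK.NAMES untouched.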
I do understand the frustration when what appear to be identical results are downgraded because you used a different method than the grader expected. I have seen people get lower scores on the use of programs like Excel because the test taker knew the keyboard shortcuts to accomplish something instead of pointing and clicking through 5 sub-menus to do the same thing, or did the steps in a different order than expected.
These were her instructions:
Create a new data set (female_names) that only contains those names which were given to female babies. Include only the variables name and count (10)
With instructions like that I would ask the instructor if Proc SQL would have been acceptable.
If you haven't been taught Proc SQL then we get into the discussion of what the grading is based on: taught material or results. It used to be that to get A grades in some of the classes I took you had to demonstrate going beyond what was in the lecture/class sessions (learn on your own).
There is a lot left out of those instructions. Perhaps it was explained just before the question?
What is the variable that can be used to indicate if baby is female? What is the value of that variable that indicates female?
Is COUNT one of the existing variables?
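/* DATA step with a WHERE statement: rows are filtered as they are read */
data female_names;
  set have;
  where sex='FEMALE';
  keep name count;
run;

/* DATA step with a subsetting IF: rows are filtered after they are read */
data female_names;
  set have;
  if sex='FEMALE';
  keep name count;
run;

/* Same as the first, with a KEEP= data set option instead of a KEEP statement */
data female_names(keep=name count);
  set have;
  where sex='FEMALE';
run;

/* PROC SQL version */
proc sql;
  create table female_names as
  select name, count
  from have
  where sex='FEMALE'
  ;
quit;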
Or do you also need to count something?
proc freq data=have;
  tables name / out=female_names(keep=name count);
  where sex='FEMALE';
run;
@hz16g22 wrote:
It was, yes. We were taught mostly PROC procedures. I didn't use ChatGPT as it would likely use more advanced, problematic code.
We have a thread where someone had used ChatGPT to generate code. It had so many errors it was hard to even figure out what the code (or the questioner) was attempting.
Your program risked changing the original order of names.
This effectively presumes that either the original order is unimportant, or that there would be a way to reproduce the original order if needed.
The fact that it matched your co-student's results was by chance, since the original data happened to be in name/count alphabetic order.
I would not give full credit for this answer to the lecturer's question.
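One possible way to keep the original order reproducible, sketched under the assumption that the input is the Names data set from the original post: record each row's position before sorting. The variable orig_order and the data set names names_seq, female_sorted, and female_original_order are hypothetical, introduced only for this illustration.

```sas
/* Record each row's original position before any sorting */
data work.names_seq;
  set Names;
  orig_order = _n_;
run;

/* Subset and sort as in the original post, keeping orig_order */
proc sort data=work.names_seq out=work.female_sorted;
  by Name Count;
  where Gender='F';
run;

/* Restore the original row order later if it is needed */
proc sort data=work.female_sorted out=work.female_original_order;
  by orig_order;
run;
```

A plain DATA step subset avoids the question entirely, since it writes rows in the order it reads them.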