Is there any way to get IF First.ID result with PROC SQL?
Please let me know.
data readin;
input ID Name $ Score;
cards;
1 David 45
1 David 74
2 Sam 45
2 Ram 54
3 Bane 87
3 Mary 92
3 Bane 87
4 Dane 23
5 Jenny 87
5 Ken 87
6 Simran 63
8 Priya 72
;
run;
PROC SORT DATA = READIN;
BY ID;
RUN;
DATA READIN1;
SET READIN;
BY ID;
IF FIRST.ID;
PROC PRINT;
RUN;
Desired output:
Thank you!!!
I think you're over thinking this....
Create table AVG_SCORE_TABLE as
Select name, mean (score) as AVG_SCORE, min(Score) as MIN_SCORE
from readin1
Group by name
Order by AVG_SCORE DESC;
QUIT;
In general, no. There is no given order that SQL will return records.
If there is another, or a combination of other variables, that control order, such as a DATE or sequence number you might be able to restrict the output using a HAVING clause with the minimum of that value(better be a numeric value, character minimums seldom match expectations).
If you example data had a variable like test number something like this:
data readin; input ID Name $ Score test; cards; 1 David 45 1 1 David 74 2 2 Sam 45 3 2 Ram 54 4 3 Bane 87 5 3 Mary 92 6 3 Bane 87 7 4 Dane 23 8 5 Jenny 87 9 5 Ken 87 10 6 Simran 63 11 8 Priya 72 12 ; run; proc sql; create table want as select * from readin group by id having test=min(test) ; quit;
It would make more sense to have sequence for each ID but they don't have to be such and the above was easy to enter in the editor since your values were grouped in the data step (thank you).
Why bother? The DATA step gives you the result with very little coding while SQL needs additional effort, so stay with it.
Maxim 14. Use the Right Tool.
I totally get it what both of you are saying, that's why after researching a lot, I ended up asking here.
It is not the first time that I receive a reply saying it is not possible to do this or that with Proc SQL... I guess my teacher have no clue what he is asking for... and probably he is not an expert on the subject himself... Difficult will be for me to question him about it without saying that I asked a ton of experts... Thank you a lot @ballardw and @Kurt_Bremser !
The current SAS program code does make a lot of sense. The question would make more sense if you were sorting by some other variable in addition to ID.
So if the goal is to find the minimum (or maximum) SCORE for each ID then just tell PROC SQL to do that and forget about FIRST. processing.
create table readin1 as
select id,name, min(score) as score
from readin
group by id,name
;
Hi Tom,
Is it possible to include Min or Max if I already have another query in between? In my task I have to create an extra column with the average, and then if there is a duplicate, I need to keep the row that had the min original score, example:
Create table readin1 as
Select id, name, mean (score) as score_average, min(score) as score
from readin1
Group by id, name;
Run;
@LucasW wrote:
Hi Tom,
Is it possible to include Min or Max if I already have another query in between? In my task I have to create an extra column with the average, and then if there is a duplicate, I need to keep the row that had the min original score, example:
Create table readin1 as
Select id, name, mean (score) as score_average, min(score) as score
from readin1
Group by id, name;
Run;
The new request does not make any sense either. If you want the average and the minimum your code is fine. I do not see how ties come into the question. If you collapse to one observation per group there is nothing that can be tied.
You might want to show example input and the expected example. Make sure to include cases that test the boundary conidiations of your logic.
Hi Tom,
Sorry to not return with a table. Below is the question that I received, I hope it clarifies:
With PROC SQL, find the average SCORE present in the data, in a single query. Call the column containing the result AVG_SCORE and store the result in a table called AVG_SCORE_TABLE.
NOTE: A student (NAME) can be present twice, in this case, remember only keep the minimum score recorded.
So far the code that I did worked (below), but the I have duplicate rows with only the SCORE column with different values.
Create table AVG_SCORE_TABLE as
Select name, score, mean (score) as AVG_SCORE
from readin1
Group by name, score
Order by AVG_SCORE DESC;
Run;
I don't know where to include the query to keep only the minimum score.
Thanks for your help!
From this description:
With PROC SQL, find the average SCORE present in the data, in a single query. Call the column containing the result AVG_SCORE and store the result in a table called AVG_SCORE_TABLE.
NOTE: A student (NAME) can be present twice, in this case, remember only keep the minimum score recorded.
I think the answer will be table with a single row, correct? For example, you would select the minimum score for each student, and then calculate the average of those scores.
Is that what you want?
Have you learned about using sub-queries in PROC SQL? This sounds like a problem where you could use a subquery to calculate the minimum score for each student, then the main query could calculate the average of those scores. Would a query with a sub-query count as a single query?
Hi @Quentin,
I've learned only simple sub-queries, but I believe it counts as one query. Unfortunately, the school I'm in has a tendency the teach the simple stuff and then request on exams tasks that were never saw in class... They say we need to figure it out ourselves...
So, with the code that I did, I got this:
ID | Name | Score | AVG_SCORE |
1 | David | 45 | 59.5 |
1 | David | 74 | 59.5 |
2 | Sam | 45 | 45 |
3 | Ram | 54 | 54 |
4 | Bane | 87 | 88.5 |
5 | Mary | 92 | 92 |
4 | Bane | 90 | 87 |
6 | Dane | 23 | 23 |
7 | Jenny | 87 | 87 |
8 | Ken | 87 | 87 |
9 | Simran | 63 | 63 |
10 | Priya | 72 | 72 |
But I need to eliminate the duplicate students and keep the lowest score, like this:
ID | Name | Score | AVG_SCORE |
1 | David | 45 | 59.5 |
2 | Sam | 45 | 45 |
3 | Ram | 54 | 54 |
4 | Bane | 87 | 88.5 |
5 | Mary | 92 | 92 |
6 | Dane | 23 | 23 |
7 | Jenny | 87 | 87 |
8 | Ken | 87 | 87 |
9 | Simran | 63 | 63 |
10 | Priya | 72 | 72 |
Thank you for your help!
Hmm, Tom's solution is what I was thinking, and matches the description of the problem as I would read it. But Reeza's solution matches the table you want. So I think you'll be happy with one of those. : )
That is more straight forward. You want the average of the minimums.
The logic to find the minimum you already had, but you don't need both the ID and the NAME, just enough to unique identify the grouping.
select id,min(score) as min_score from HAVE group by id
Now you just need use that as the input to tha query to find the average.
Assuming that you want estimate the "average" using the MEAN statistic (as opposed to some other statistic such as MEDIAN) you would do:
create table as AVG_SCORE_TABLE
select mean(min_score) as AVG_SCORE
from (select id, min(score) as min_score from HAVE group by id)
;
I think you're over thinking this....
Create table AVG_SCORE_TABLE as
Select name, mean (score) as AVG_SCORE, min(Score) as MIN_SCORE
from readin1
Group by name
Order by AVG_SCORE DESC;
QUIT;
Hi @Reeza thank you soooo much!
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
What’s the difference between SAS Enterprise Guide and SAS Studio? How are they similar? Just ask SAS’ Danny Modlin.
Find more tutorials on the SAS Users YouTube channel.