Solved: Re: IF FIRST.ID with Proc SQL

LucasW · Posted 03-07-2023 01:18 PM

Is there any way to get IF First.ID result with PROC SQL?

Please let me know.

data readin;
input ID Name $ Score;
cards;
1 David 45
1 David 74
2 Sam 45
2 Ram 54
3 Bane 87
3 Mary 92
3 Bane 87
4 Dane 23
5 Jenny 87
5 Ken 87
6 Simran 63
8 Priya 72
;
run;

PROC SORT DATA = READIN;
BY ID;
RUN;

DATA READIN1;
SET READIN;
BY ID;
IF FIRST.ID;
PROC PRINT;
RUN;

Desired output:

Thank you!!!

Reeza · Posted 03-07-2023 06:10 PM

I think you're over thinking this....

 

Create table AVG_SCORE_TABLE as

Select name, mean (score) as AVG_SCORE, min(Score) as MIN_SCORE  

from readin1

Group by name

Order by AVG_SCORE DESC;

QUIT;

View solution in original post

ballardw · Posted 03-07-2023 01:42 PM

In general, no. There is no given order that SQL will return records.

If there is another, or a combination of other variables, that control order, such as a DATE or sequence number you might be able to restrict the output using a HAVING clause with the minimum of that value(better be a numeric value, character minimums seldom match expectations).

If you example data had a variable like test number something like this:

data readin;
input ID Name $ Score test;
cards;
1 David 45 1
1 David 74 2
2 Sam 45   3
2 Ram 54   4
3 Bane 87  5
3 Mary 92  6
3 Bane 87  7
4 Dane 23  8
5 Jenny 87 9
5 Ken 87   10
6 Simran 63 11 
8 Priya 72  12
;
run;

proc sql;
   create table want as
   select *
   from readin
   group by id
   having test=min(test)
   ;
quit;

It would make more sense to have sequence for each ID but they don't have to be such and the above was easy to enter in the editor since your values were grouped in the data step (thank you).

Kurt_Bremser · Posted 03-07-2023 02:54 PM

Why bother? The DATA step gives you the result with very little coding while SQL needs additional effort, so stay with it.

Maxim 14. Use the Right Tool.

Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
The macro for direct download as ZIP
How to post code
Please vote for Provide Sequential Search Capability for Hash Objects
How to deal with locked files on UNIX

LucasW · Posted 03-07-2023 03:30 PM

I totally get it what both of you are saying, that's why after researching a lot, I ended up asking here.

It is not the first time that I receive a reply saying it is not possible to do this or that with Proc SQL... I guess my teacher have no clue what he is asking for... and probably he is not an expert on the subject himself... Difficult will be for me to question him about it without saying that I asked a ton of experts... Thank you a lot @ballardw and @Kurt_Bremser !

Tom · Posted 03-07-2023 03:39 PM

The current SAS program code does make a lot of sense. The question would make more sense if you were sorting by some other variable in addition to ID.

So if the goal is to find the minimum (or maximum) SCORE for each ID then just tell PROC SQL to do that and forget about FIRST. processing.

create table readin1 as 
  select id,name, min(score) as score
  from readin
  group by id,name
;

LucasW · Posted 03-07-2023 04:16 PM

Hi Tom,

Is it possible to include Min or Max if I already have another query in between? In my task I have to create an extra column with the average, and then if there is a duplicate, I need to keep the row that had the min original score, example:

Create table readin1 as

Select id, name, mean (score) as score_average, min(score) as score

from readin1

Group by id, name;

Run;

Tom · Posted 03-07-2023 04:21 PM

@LucasW wrote:

Hi Tom,

Is it possible to include Min or Max if I already have another query in between? In my task I have to create an extra column with the average, and then if there is a duplicate, I need to keep the row that had the min original score, example:

Create table readin1 as

Select id, name, mean (score) as score_average, min(score) as score

from readin1

Group by id, name;

Run;

The new request does not make any sense either. If you want the average and the minimum your code is fine. I do not see how ties come into the question. If you collapse to one observation per group there is nothing that can be tied.

You might want to show example input and the expected example. Make sure to include cases that test the boundary conidiations of your logic.

LucasW · Posted 03-07-2023 04:51 PM

Hi Tom,

Sorry to not return with a table. Below is the question that I received, I hope it clarifies:

With PROC SQL, find the average SCORE present in the data, in a single query. Call the column containing the result AVG_SCORE and store the result in a table called AVG_SCORE_TABLE.
NOTE: A student (NAME) can be present twice, in this case, remember only keep the minimum score recorded.

So far the code that I did worked (below), but the I have duplicate rows with only the SCORE column with different values.

Create table AVG_SCORE_TABLE as

Select name, score, mean (score) as AVG_SCORE

from readin1

Group by name, score

Order by AVG_SCORE DESC;

Run;

I don't know where to include the query to keep only the minimum score.

Thanks for your help!

Quentin · Posted 03-07-2023 05:04 PM

From this description:

With PROC SQL, find the average SCORE present in the data, in a single query. Call the column containing the result AVG_SCORE and store the result in a table called AVG_SCORE_TABLE.
NOTE: A student (NAME) can be present twice, in this case, remember only keep the minimum score recorded.

I think the answer will be table with a single row, correct? For example, you would select the minimum score for each student, and then calculate the average of those scores.

Is that what you want?

Have you learned about using sub-queries in PROC SQL? This sounds like a problem where you could use a subquery to calculate the minimum score for each student, then the main query could calculate the average of those scores. Would a query with a sub-query count as a single query?

The Boston Area SAS Users Group is hosting free webinars!

Register now at https://www.basug.org/events.

LucasW · Posted 03-07-2023 05:37 PM

Hi @Quentin,

I've learned only simple sub-queries, but I believe it counts as one query. Unfortunately, the school I'm in has a tendency the teach the simple stuff and then request on exams tasks that were never saw in class... They say we need to figure it out ourselves...

So, with the code that I did, I got this:

ID	Name	Score	AVG_SCORE
1	David	45	59.5
1	David	74	59.5
2	Sam	45	45
3	Ram	54	54
4	Bane	87	88.5
5	Mary	92	92
4	Bane	90	87
6	Dane	23	23
7	Jenny	87	87
8	Ken	87	87
9	Simran	63	63
10	Priya	72	72

But I need to eliminate the duplicate students and keep the lowest score, like this:

ID	Name	Score	AVG_SCORE
1	David	45	59.5
2	Sam	45	45
3	Ram	54	54
4	Bane	87	88.5
5	Mary	92	92
6	Dane	23	23
7	Jenny	87	87
8	Ken	87	87
9	Simran	63	63
10	Priya	72	72

Thank you for your help!

Quentin · Posted 03-07-2023 06:32 PM

Hmm, Tom's solution is what I was thinking, and matches the description of the problem as I would read it. But Reeza's solution matches the table you want. So I think you'll be happy with one of those. : )

The Boston Area SAS Users Group is hosting free webinars!

Register now at https://www.basug.org/events.

Tom · Posted 03-07-2023 05:36 PM

That is more straight forward. You want the average of the minimums.

The logic to find the minimum you already had, but you don't need both the ID and the NAME, just enough to unique identify the grouping.

select id,min(score) as min_score from HAVE group by id

Now you just need use that as the input to tha query to find the average.

Assuming that you want estimate the "average" using the MEAN statistic (as opposed to some other statistic such as MEDIAN) you would do:

create table as AVG_SCORE_TABLE 
  select mean(min_score) as AVG_SCORE
  from (select id, min(score) as min_score from HAVE group by id)
;

Reeza · Posted 03-07-2023 06:10 PM

I think you're over thinking this....

 

Create table AVG_SCORE_TABLE as

Select name, mean (score) as AVG_SCORE, min(Score) as MIN_SCORE  

from readin1

Group by name

Order by AVG_SCORE DESC;

QUIT;

LucasW · Posted 03-07-2023 06:43 PM

Hi @Reeza thank you soooo much!

Catch up on SAS Innovate 2026

SAS Training: Just a Click Away