BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
LucasW
Calcite | Level 5

Is there any way to get IF First.ID result with PROC SQL?

Please let me know.

 

data readin;
input ID Name $ Score;
cards;
1 David 45
1 David 74
2 Sam 45
2 Ram 54
3 Bane 87
3 Mary 92
3 Bane 87
4 Dane 23
5 Jenny 87
5 Ken 87
6 Simran 63
8 Priya 72
;
run;

 

PROC SORT DATA = READIN;
BY ID;
RUN; 

 

DATA READIN1;
SET READIN;
BY ID;
IF FIRST.ID;
PROC PRINT;
RUN;

 

Desired output:

LucasW_0-1678213075074.png

 Thank you!!!

 

1 ACCEPTED SOLUTION

Accepted Solutions
Reeza
Super User

I think you're over thinking this....

 

 

Create table AVG_SCORE_TABLE as

Select name, mean (score) as AVG_SCORE, min(Score) as MIN_SCORE  

from readin1

Group by name

Order by AVG_SCORE DESC;

QUIT;

View solution in original post

13 REPLIES 13
ballardw
Super User

In general, no. There is no given order that SQL will return records.

If there is another, or a combination of other variables, that control order, such as a DATE or sequence number you might be able to restrict the output using a HAVING clause with the minimum of that value(better be a numeric value, character minimums seldom match expectations).

 

If you example data had a variable like test number something like this:

data readin;
input ID Name $ Score test;
cards;
1 David 45 1
1 David 74 2
2 Sam 45   3
2 Ram 54   4
3 Bane 87  5
3 Mary 92  6
3 Bane 87  7
4 Dane 23  8
5 Jenny 87 9
5 Ken 87   10
6 Simran 63 11 
8 Priya 72  12
;
run;

proc sql;
   create table want as
   select *
   from readin
   group by id
   having test=min(test)
   ;
quit;

It would make more sense to have sequence for each ID but they don't have to be such and the above was easy to enter in the editor since your values were grouped in the data step (thank you).

 

 

 

Kurt_Bremser
Super User

Why bother? The DATA step gives you the result with very little coding while SQL needs additional effort, so stay with it.

Maxim 14. Use the Right Tool.

LucasW
Calcite | Level 5

I totally get it what both of you are saying, that's why after researching a lot, I ended up asking here.

It is not the first time that I receive a reply saying it is not possible to do this or that with Proc SQL... I guess my teacher have no clue what he is asking for... and probably he is not an expert on the subject himself... Difficult will be for me to question him about it without saying that I asked a ton of experts...  Thank you a lot @ballardw and @Kurt_Bremser !

Tom
Super User Tom
Super User

The current SAS program code does make a lot of sense.  The question would make more sense if you were sorting by some other variable in addition to ID.  

 

So if the goal is to find the minimum (or maximum) SCORE for each ID then just tell PROC SQL to do that and forget about FIRST. processing.

create table readin1 as 
  select id,name, min(score) as score
  from readin
  group by id,name
;

 

LucasW
Calcite | Level 5

Hi Tom, 

 

Is it possible to include Min or Max if I already have another query in between? In my task I have to create an extra column with the average, and then if there is a duplicate, I need to keep the row that had the min original score, example:

 

Create table readin1 as

Select id, name, mean (score) as score_average, min(score) as score  

from readin1

Group by id, name; 

Run;

Tom
Super User Tom
Super User

@LucasW wrote:

Hi Tom, 

 

Is it possible to include Min or Max if I already have another query in between? In my task I have to create an extra column with the average, and then if there is a duplicate, I need to keep the row that had the min original score, example:

 

Create table readin1 as

Select id, name, mean (score) as score_average, min(score) as score  

from readin1

Group by id, name; 

Run;


The new request does not make any sense either.  If you want the average and  the minimum your code is fine.  I do not see how ties come into the question.  If you collapse to one observation per group there is nothing that can be tied.

 

You might want to show example input and the expected example. Make sure to include cases that test the boundary conidiations of your logic.

LucasW
Calcite | Level 5

Hi Tom, 

 

Sorry to not return with a table. Below is the question that I received, I hope it clarifies:

 

With PROC SQL, find the average SCORE present in the data, in a single query. Call the column containing the result AVG_SCORE and store the result in a table called AVG_SCORE_TABLE.
NOTE: A student (NAME) can be present twice, in this case, remember only keep the minimum score recorded.

 

So far the code that I did worked (below), but the I have duplicate rows with only the SCORE column with different values.

 

Create table AVG_SCORE_TABLE as

Select name, score, mean (score) as AVG_SCORE  

from readin1

Group by name, score

Order by AVG_SCORE DESC;

Run;

 

I don't know where to include the query to keep only the minimum score.

Thanks for your help!

 

Quentin
Super User

From this description:

With PROC SQL, find the average SCORE present in the data, in a single query. Call the column containing the result AVG_SCORE and store the result in a table called AVG_SCORE_TABLE.
NOTE: A student (NAME) can be present twice, in this case, remember only keep the minimum score recorded.

I think the answer will be table with a single row, correct?  For example, you would select the minimum score for each student, and then calculate the average of those scores.  

 

Is that what you want?

 

Have you learned about using sub-queries in PROC SQL?  This sounds like a problem where you could use a subquery to calculate the minimum score for each student, then the main query could calculate the average of those scores.  Would a query with a sub-query count as a single query?

BASUG is hosting free webinars Next up: Don Henderson presenting on using hash functions (not hash tables!) to segment data on June 12. Register now at the Boston Area SAS Users Group event page: https://www.basug.org/events.
LucasW
Calcite | Level 5

Hi @Quentin,

 

I've learned only simple sub-queries, but I believe it counts as one query. Unfortunately, the school I'm in has a tendency the teach the simple stuff and then request on exams tasks that were never saw in class... They say we need to figure it out ourselves... 

 

So, with the code that I did, I got this:

 

ID

Name

Score

AVG_SCORE

1

David

45

59.5

1

David

74

59.5

2

Sam

45

45

3

Ram

54

54

4

Bane

87

88.5

5

Mary

92

92

4

Bane

90

87

6

Dane

23

23

7

Jenny

87

87

8

Ken

87

87

9

Simran

63

63

10

Priya

72

72

 

But I need to eliminate the duplicate students and keep the lowest score, like this:  

 

ID

Name

Score

AVG_SCORE

1

David

45

59.5

2

Sam

45

45

3

Ram

54

54

4

Bane

87

88.5

5

Mary

92

92

6

Dane

23

23

7

Jenny

87

87

8

Ken

87

87

9

Simran

63

63

10

Priya

72

72

 

Thank you for your help!

Quentin
Super User

Hmm, Tom's solution is what I was thinking, and matches the description of the problem as I would read it.  But Reeza's solution matches the table you want.  So I think you'll be happy with one of those. : )

BASUG is hosting free webinars Next up: Don Henderson presenting on using hash functions (not hash tables!) to segment data on June 12. Register now at the Boston Area SAS Users Group event page: https://www.basug.org/events.
Tom
Super User Tom
Super User

That is more straight forward.  You want the average of the minimums.

The logic to find the minimum you already had, but you don't need both the ID and the NAME,  just enough to unique identify the grouping.

select id,min(score) as min_score from HAVE group by id

Now you just need use that as the input to tha query to find the average. 

 

Assuming that you want estimate the "average" using the MEAN statistic (as opposed to some other statistic such as MEDIAN) you would do:

create table as AVG_SCORE_TABLE 
  select mean(min_score) as AVG_SCORE
  from (select id, min(score) as min_score from HAVE group by id)
;
Reeza
Super User

I think you're over thinking this....

 

 

Create table AVG_SCORE_TABLE as

Select name, mean (score) as AVG_SCORE, min(Score) as MIN_SCORE  

from readin1

Group by name

Order by AVG_SCORE DESC;

QUIT;
LucasW
Calcite | Level 5

Hi @Reeza thank you soooo much! 

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

SAS Enterprise Guide vs. SAS Studio

What’s the difference between SAS Enterprise Guide and SAS Studio? How are they similar? Just ask SAS’ Danny Modlin.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 13 replies
  • 1520 views
  • 2 likes
  • 6 in conversation