BookmarkSubscribeRSS Feed
mhoward2
Obsidian | Level 7

I tried a few solutions online and kept getting errors. How do output the top 10 observations by grouping? Current code:

 

PROC SQL;
	CREATE TABLE WORK.WANT AS
		SELECT  t1.BUCKET1,
				t1.BUCKET2,
				t1.CODE,
				SUM(CASE WHEN t1.CRITERIA1='A' THEN t1.SUM_OF_CNT
					END) AS A_CNT
		FROM WORK.HAVE t1
		WHERE t1.CRITERIA2 = 'B'
		GROUP BY 1,2,3
		ORDER BY t1.BUCKET1,
				 t1.BUCKET2,
				 CALCULATED A_CNT DESC;
QUIT;

There can be any number of combinations of BUCKET1 and BUCKET2. With this code, each combination gets as many rows as there are CODES within that combination. How do I only output the top 10 codes for each combination?

 

Thanks in advance!

5 REPLIES 5
PaigeMiller
Diamond | Level 26
proc rank data=have descending;
    var a_cnt;
    by bucket1 buckt2;
    ranks a_cnt_ranks;
run;

 

When a_ctn_ranks is <= 10, those are the observations you want.

--
Paige Miller
mhoward2
Obsidian | Level 7
Thank you! Is there no way to do it within the same SQL Step?
PaigeMiller
Diamond | Level 26

I would not even try to do this in SQL. It is the wrong tool for this problem. SAS has already programmed this for you in PROC RANK.

 

Can it be done in SQL? I'm sure it can, I'm sure it is more difficult than PROC RANK, and I don't know how to do this in SQL (nor do I want to know).

--
Paige Miller
ballardw
Super User

I really hate requests involving an adjective like "top" without defining it. What does top mean in this context? The COUNT of combinations with the most observations is only ONE possible interpretation.

Top could be by alphabetic or numeric value order the bucket variable(s)

By the observations with the largest count of Bucket1 for each value of Bucket2 or vice versa.

 

I'm sure I could come up with a few.

 

Do not expect use to use your code for definition because you have said it is getting errors. So it obviously does not contain a workable definition of "top".

 

BTW, any question about errors in code should include the LOG with the code and all the messages as well as the errors.

Example data can go a long way as well. What if your data does not have any observations where Criteria1='A' and Criteria2='B'?

sbxkoenk
SAS Super FREQ

Hello,

 

in case you are using SAS Viya and CAS tables (datasets in CASLIBs)  ...

 

A useful action is simple.topK, which selects the top K and bottom K values for variables in a data set, based on a user-specified ranking order. The example below returns the top 5 and bottom 5 values for two variables based on their frequency:

 

proc cas;
simple.topk / table="TABLE_NAME" 
              aggregator="N",                        
              inputs={"VAR1","VAR2"},
              topk=5,
              bottomk=5;
quit;

 

Simple is a rich action set with heaps of useful options covered in the documentation.

 

Koen

 

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 5 replies
  • 626 views
  • 0 likes
  • 4 in conversation