BookmarkSubscribeRSS Feed
GVeers
Calcite | Level 5
Hey guys,

I figured out the following way to get my data to sum and print exactly the way I want it. However, it is quite inelegant coding and I could use some input on better ways.

Basically I'm using PROC SQL with a CASE statement to summarize my data in a particular way, and then using PROC FORMAT to make sure the data groupings are ordered properly. As I say, the following code works, but it seems pretty clunky.

proc format;
value $Cost '0' = 'a'
'1 - 4,500' = 'b'
'4,501 - 6,000' = 'c'
run;

proc sql;
SELECT CASE
WHEN missing(ORIG_COST) THEN '.'
WHEN ORIG_COST = 0 THEN '0'
WHEN ORIG_COST BETWEEN 1 AND 4500 THEN '1 - 4,500'
WHEN ORIG_COST BETWEEN 4501 AND 6000 THEN '4,501 - 6,000'
END AS Cost,
Count(*)
FROM CL.Cleaned_cauto_2007
GROUP BY Cost ORDER BY put(Cost, $Cost.);
quit;
5 REPLIES 5
GVeers
Calcite | Level 5
Oh, and I should also note this is just a sample. The reason I need the PROC FORMAT is because there are a lot more ranges in my real data and I end up getting ugly results like '10,001 - 25,000' showing up before '6,001 - 10,000' (for example).
data_null__
Jade | Level 19
You can achieve that result of a VALUE statement option and two OPTIONS in PROC SUMMARY.

[pre]
proc format;
value Cost(notsorted)
0 = '0'
1-4500 = '1-45,00'
4500-6000 = '4,500-6,000'
6000-10000 = '6,000-10,000'
10000-11000 = '10,000-11,000';
;
run;

proc plan seed=1169604082;
factors r=100 ordered cost = 1 of 11000 / noprint;
output out=cost;
Run;
proc summary data=cost nway;
class cost / order=data preloadfmt;
format cost cost.;
output out=test;
run;
proc print;
run;[/pre]
GVeers
Calcite | Level 5
Thanks, dall_null. I appreciate the response, particularly since I did not know about the power of PROC SUMMARY.

Still, I think this doesn't quite help me. My data set has a number of variables that I need to seperate into ranges, output with a specific character field, and print in numeric order. I think the way you describe will require sorting the data for each variable I need to output, which seems about as efficient as my approach.

I guess what would be ideal is if there was some procedure that could assign a label to a variable (like PROC FORMAT) but also assign a value for sorting purposes. If anyone knows of such a thing, I'd love to hear it, but I can stick to what I have as well.

Thanks!
data_null__
Jade | Level 19
How about an option to create an index variable associated with the levels of the CLASS variable(s).

[pre]
dm 'clear log; clear output;';
proc format;
value Cost(notsorted)
0 = '0'
1-4500 = '1-45,00'
4500-6000 = '4,500-6,000'
6000-10000 = '6,000-10,000'
10000-11000 = '10,000-11,000';
;
value xCost(notsorted)
0 = '0'
3500-7000 = '3,500-7,000' /*for example*/
1-3500 = '1-3,400'
7000-10000 = '6,000-10,000'
10000-11000 = '10,000-11,000';
;

run;

proc plan seed=1169604082;
factors r=100 ordered cost = 1 of 11000 / noprint;
output out=cost;
Run;
data cost;
set cost;
xcost = cost;
run;
proc summary data=cost;
class cost xcost/ order=data preloadfmt;
ways 1;
format cost cost. xcost xcost.;
output out=test / levels;
run;
proc print;
run;
proc print;
format _all_;
run;
[/pre]
GVeers
Calcite | Level 5
Thanks again, data_null. That could do it.

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 5 replies
  • 1659 views
  • 0 likes
  • 2 in conversation