Hello
I want that the order of categories in desired output will be in same order as in proc format.
what is the way to do it please by using proc sql code as below?
One solution is to write proc format with (a) (b) (c) (d) and so on but it doesnt look nice in output
proc format; value agefmt low-18='(a) TILL 18' 18-19='(b) 18-19' 19-21='(c) 19-21' 21-high='(d) 21+' ; Run;
Data have;
input ID age;
cards;
1 15
2 17
3 18
4 21
5 22
6 19
7 19
8 31
9 39
10 48
11 54
12 17
13 18
;
Run;
proc format;
value agefmt
low-18='TILL 18'
18-19='18-19'
19-21='19-21'
21-high='21+'
;
Run;
title;
PROC SQL;
select put(age,agefmt.) as age ,
count(*) as nr ,
calculated nr/(select count(*) as total_nr from have)*100 as pctN
from have
group by calculated age
;
QUIT;
Hello @Ronein,
It's true that PROC FREQ would be more suitable for this particular example, but let's assume that your real task calls for PROC SQL.
With a suitable ORDER BY clause you can specify virtually every conceivable sort order of records in PROC SQL. For example, insert this between the GROUP BY clause and the semicolon:
order by age net 'T', 1
This means: The output records are first sorted by the Boolean value (cf. Operators in PROC SQL expressions)
1 if age does not start with "T" 0 otherwise
(i.e., "TILL 18" is in group 0, the other age categories are in group 1 -- so "TILL 18" will be the first category in the output)
and, within these groups, records are sorted by the first item in the SELECT statement, i.e., by (calculated) age itself. This ensures the correct sort order of "18-19", "19-21" and "21+" because alphabetical order is appropriate for them.
Use PROC FREQ with your format:
proc freq data=have;
format age agefmt.;
tables age;
run;
And to get your dataset, add the NOCUM and OUT= options:
proc freq data=have;
format age agefmt.;
tables age / nocum out=want;
run;
Hello @Ronein,
It's true that PROC FREQ would be more suitable for this particular example, but let's assume that your real task calls for PROC SQL.
With a suitable ORDER BY clause you can specify virtually every conceivable sort order of records in PROC SQL. For example, insert this between the GROUP BY clause and the semicolon:
order by age net 'T', 1
This means: The output records are first sorted by the Boolean value (cf. Operators in PROC SQL expressions)
1 if age does not start with "T" 0 otherwise
(i.e., "TILL 18" is in group 0, the other age categories are in group 1 -- so "TILL 18" will be the first category in the output)
and, within these groups, records are sorted by the first item in the SELECT statement, i.e., by (calculated) age itself. This ensures the correct sort order of "18-19", "19-21" and "21+" because alphabetical order is appropriate for them.
Thanks!!
Your solution is perfect
May you explain please what does it mean
order by age net 'T', 1
Data have;
input ID age;
cards;
1 15
2 17
3 18
4 21
5 22
6 19
7 19
8 31
9 39
10 48
11 54
12 17
13 18
;
Run;
proc format;
value agefmt
low-18='TILL 18'
18-19='18-19'
19-21='19-21'
21-high='21+'
;
Run;
title;
PROC SQL;
select put(age,agefmt.) as age ,
count(*) as nr ,
calculated nr/(select count(*) as total_nr from have)*100 as pctN
from have
group by calculated age
order by age net 'T', 1
;
QUIT;
PROC REPORT:
proc report data=have;
column age id n percent;
define age / group format=agefmt. order=data;
define id / noprint;
define percent / computed format=percent7.2;
compute before;
all = n;
endcomp;
compute percent;
percent = n / all;
endcomp;
run;
PROC SUMMARY/MEANS:
proc means data=have n percent;
format age agefmt.;
class age / order=data;
run;
You're right, I was fooled by your data ordered in a manner that made it work.
It's INTERNAL in PROC REPORT and UNFORMATTED in PROC MEANS.
INTERNAL is a valid alias for UNFORMATTED in PROC MEANS, and so at least in my mind, its easier to remember use INTERNAL for all PROCs where ORDER= is an option.
@PaigeMiller wrote:
INTERNAL is a valid alias for UNFORMATTED in PROC MEANS, and so at least in my mind, its easier to remember use INTERNAL for all PROCs where ORDER= is an option.
++
Sorry, I didnt understand.
In order to sort by unformatted values need to do:
In PROC MEAN need to use data=Unformatted
in proc report need to use Data=internal
Am I correct???
@Ronein wrote:
Sorry, I didnt understand.
In order to sort by unformatted values need to do:
In PROC MEAN need to use data=Unformatted
in proc report need to use Data=internal
Am I correct???
PROC MEANS: you can use ORDER=INTERNAL or ORDER=UNFORMATTED, these are the same. In PROC REPORT, I have only ever used ORDER=INTERNAL. (And not DATA= as you typed)
TABULATE:
proc tabulate data=have;
format age agefmt.;
class age;
tables age,(n pctn);
run;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.