- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hello
I want that the order of categories in desired output will be in same order as in proc format.
what is the way to do it please by using proc sql code as below?
One solution is to write proc format with (a) (b) (c) (d) and so on but it doesnt look nice in output
proc format; value agefmt low-18='(a) TILL 18' 18-19='(b) 18-19' 19-21='(c) 19-21' 21-high='(d) 21+' ; Run;
Data have;
input ID age;
cards;
1 15
2 17
3 18
4 21
5 22
6 19
7 19
8 31
9 39
10 48
11 54
12 17
13 18
;
Run;
proc format;
value agefmt
low-18='TILL 18'
18-19='18-19'
19-21='19-21'
21-high='21+'
;
Run;
title;
PROC SQL;
select put(age,agefmt.) as age ,
count(*) as nr ,
calculated nr/(select count(*) as total_nr from have)*100 as pctN
from have
group by calculated age
;
QUIT;
Accepted Solutions
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hello @Ronein,
It's true that PROC FREQ would be more suitable for this particular example, but let's assume that your real task calls for PROC SQL.
With a suitable ORDER BY clause you can specify virtually every conceivable sort order of records in PROC SQL. For example, insert this between the GROUP BY clause and the semicolon:
order by age net 'T', 1
This means: The output records are first sorted by the Boolean value (cf. Operators in PROC SQL expressions)
1 if age does not start with "T" 0 otherwise
(i.e., "TILL 18" is in group 0, the other age categories are in group 1 -- so "TILL 18" will be the first category in the output)
and, within these groups, records are sorted by the first item in the SELECT statement, i.e., by (calculated) age itself. This ensures the correct sort order of "18-19", "19-21" and "21+" because alphabetical order is appropriate for them.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Use PROC FREQ with your format:
proc freq data=have;
format age agefmt.;
tables age;
run;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
And to get your dataset, add the NOCUM and OUT= options:
proc freq data=have;
format age agefmt.;
tables age / nocum out=want;
run;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hello @Ronein,
It's true that PROC FREQ would be more suitable for this particular example, but let's assume that your real task calls for PROC SQL.
With a suitable ORDER BY clause you can specify virtually every conceivable sort order of records in PROC SQL. For example, insert this between the GROUP BY clause and the semicolon:
order by age net 'T', 1
This means: The output records are first sorted by the Boolean value (cf. Operators in PROC SQL expressions)
1 if age does not start with "T" 0 otherwise
(i.e., "TILL 18" is in group 0, the other age categories are in group 1 -- so "TILL 18" will be the first category in the output)
and, within these groups, records are sorted by the first item in the SELECT statement, i.e., by (calculated) age itself. This ensures the correct sort order of "18-19", "19-21" and "21+" because alphabetical order is appropriate for them.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Thanks!!
Your solution is perfect
May you explain please what does it mean
order by age net 'T', 1
Data have;
input ID age;
cards;
1 15
2 17
3 18
4 21
5 22
6 19
7 19
8 31
9 39
10 48
11 54
12 17
13 18
;
Run;
proc format;
value agefmt
low-18='TILL 18'
18-19='18-19'
19-21='19-21'
21-high='21+'
;
Run;
title;
PROC SQL;
select put(age,agefmt.) as age ,
count(*) as nr ,
calculated nr/(select count(*) as total_nr from have)*100 as pctN
from have
group by calculated age
order by age net 'T', 1
;
QUIT;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
PROC REPORT:
proc report data=have;
column age id n percent;
define age / group format=agefmt. order=data;
define id / noprint;
define percent / computed format=percent7.2;
compute before;
all = n;
endcomp;
compute percent;
percent = n / all;
endcomp;
run;
PROC SUMMARY/MEANS:
proc means data=have n percent;
format age agefmt.;
class age / order=data;
run;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
The ages in the data are not sorted from low to high...
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
You're right, I was fooled by your data ordered in a manner that made it work.
It's INTERNAL in PROC REPORT and UNFORMATTED in PROC MEANS.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
INTERNAL is a valid alias for UNFORMATTED in PROC MEANS, and so at least in my mind, its easier to remember use INTERNAL for all PROCs where ORDER= is an option.
Paige Miller
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
@PaigeMiller wrote:
INTERNAL is a valid alias for UNFORMATTED in PROC MEANS, and so at least in my mind, its easier to remember use INTERNAL for all PROCs where ORDER= is an option.
++
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Sorry, I didnt understand.
In order to sort by unformatted values need to do:
In PROC MEAN need to use data=Unformatted
in proc report need to use Data=internal
Am I correct???
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
@Ronein wrote:
Sorry, I didnt understand.
In order to sort by unformatted values need to do:
In PROC MEAN need to use data=Unformatted
in proc report need to use Data=internal
Am I correct???
PROC MEANS: you can use ORDER=INTERNAL or ORDER=UNFORMATTED, these are the same. In PROC REPORT, I have only ever used ORDER=INTERNAL. (And not DATA= as you typed)
Paige Miller
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
TABULATE:
proc tabulate data=have;
format age agefmt.;
class age;
tables age,(n pctn);
run;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content