Hi,
I am trying to order the values in proc format & not able to succed in it. Is there a work around to get it done? Below is the sample data & the code I am using. Help will be highly appreciated.
data test1;
infile datalines dsd missover dlm="#";
input VAR1$ VAR2$ VAR3$ ID;
datalines;
XXXX#1#ND#101
XXXX#2#2454#101
XXXX#3#0#102
YYYY#4#195#101
YYYY#5#5123#103
XXXX#6#354#102
ZZZZ#7#25000#104
ZZZZ#8#12000#105
VVVV#9#35000#106
VVVV#10#45000#110
ZZZZ#11#55000#101
VVVV#12#70000#103
XXXX#13#226#107
VVVV#14#325#101
YYYY#15#154#106
AAAA#16#ND#104
AAAA#17#.#103
AAAA#18#7000#102
AAAA#19#ND#102
AAAA#19#0#102
ZZZZ#19#0#104
;
run;
proc format;
value gfmt
low-0 = " 0"
1-1000 = " 1-1,000"
1001-10000 = " 1,001-10,000"
10001-20000 = " 10,001-20,000"
20001-30000 = " 20,001-30,000"
30001-40000 = " 30,001-40,000"
40001-50000 = " 40,001-50,000"
50001-60000 = " 50,001-60,000"
60001-high = " GT 60,000"
other = "ND"
;
data test2;
set test1;
if VAR3 = "ND" then VAR4=. ;
else VAR4=input(VAR3,12.);
run;
%macro M_Format (app);
proc sql;
create table test3 as
select count(VAR2) as Freq_Month1
,strip(&app.) as Group
,case when VAR1 = "XXXX" and ID in (101,102,103,104,105,106) then "USA"
when VAR1 = "YYYY" and ID in (101,102,103,106) then "UK"
when VAR1 = "ZZZZ" and ID in (101,102,103,104) then "EUROPE"
when VAR1 = "VVVV" and ID ne . then "ROI"
when VAR1 = "AAAA" and ID in (101,102,103,104) then "JAPAN"
else VAR1 end as Region
from test2
group by Group, Region
order by group, Region
;
quit;
%mend;
%M_Format(case when VAR4 = . then "ND" else put(input(VAR3,12.),gfmt.) end);
Current Output:
Freq_Month1 | Group | Region |
1 | 0 | EUROPE |
1 | 0 | JAPAN |
1 | 0 | USA |
1 | 1,001-10,000 | JAPAN |
1 | 1,001-10,000 | UK |
1 | 1,001-10,000 | USA |
1 | 1-1,000 | ROI |
2 | 1-1,000 | UK |
1 | 1-1,000 | USA |
1 | 1-1,000 | XXXX |
1 | 10,001-20,000 | ZZZZ |
1 | 20,001-30,000 | EUROPE |
1 | 30,001-40,000 | ROI |
1 | 40,001-50,000 | ROI |
1 | 50,001-60,000 | EUROPE |
1 | GT 60,000 | ROI |
3 | ND | JAPAN |
1 | ND | USA |
Expected Output:
Freq_Month1 | Group | Region |
1 | 0 | EUROPE |
1 | 0 | JAPAN |
1 | 0 | USA |
1 | 1-1,000 | ROI |
2 | 1-1,000 | UK |
1 | 1-1,000 | USA |
1 | 1-1,000 | XXXX |
1 | 1,001-10,000 | JAPAN |
1 | 1,001-10,000 | UK |
1 | 1,001-10,000 | USA |
1 | 10,001-20,000 | ZZZZ |
1 | 20,001-30,000 | EUROPE |
1 | 30,001-40,000 | ROI |
1 | 40,001-50,000 | ROI |
1 | 50,001-60,000 | EUROPE |
1 | GT 60,000 | ROI |
3 | ND | JAPAN |
1 | ND | USA |
Thanks
I also tried with the below format as well, but no luck.
proc format;
value gfmt
low-0 = " 0"
1-1000 = " 1-1,000"
1001-10000 = " 1,001-10,000"
10001-20000 = " 10,001-20,000"
20001-30000 = " 20,001-30,000"
30001-40000 = " 30,001-40,000"
40001-50000 = " 40,001-50,000"
50001-60000 = " 50,001-60,000"
60001-high = " GT 60,000"
other = "ND"
;
run;
Sql orders by raw values; to achieve a different order, create a new variable with formatted values and order by that (keyword calculated).
Hi,
Thanks for the response. If you notice my code I am ordering by the formatted values (which is "order by group, Region") but still not getting the results as I am expecting.
Thanks,
@Data_User wrote:
Hi,
Thanks for the response. If you notice my code I am ordering by the formatted values (which is "order by group, Region") but still not getting the results as I am expecting.
Thanks,
I see. Well, in the ASCII table, the comma comes before the dash, so 1.001 comes before 1-1000. I'd add another (numeric) variable that gets you the correct order.
@Kurt_Bremser How it can be solved by adding another numeric value?
@ballardw Please see below. Current Output & the expected outputs are shown below.
Current Output:
Freq_Month1 | Group | Region |
1 | 0 | EUROPE |
1 | 0 | JAPAN |
1 | 0 | USA |
1 | 1,001-10,000 | JAPAN |
1 | 1,001-10,000 | UK |
1 | 1,001-10,000 | USA |
1 | 1-1,000 | ROI |
2 | 1-1,000 | UK |
1 | 1-1,000 | USA |
1 | 1-1,000 | XXXX |
1 | 10,001-20,000 | ZZZZ |
1 | 20,001-30,000 | EUROPE |
1 | 30,001-40,000 | ROI |
1 | 40,001-50,000 | ROI |
1 | 50,001-60,000 | EUROPE |
1 | GT 60,000 | ROI |
3 | ND | JAPAN |
1 | ND | USA |
Expected Output:
Freq_Month1 | Group | Region |
1 | 0 | EUROPE |
1 | 0 | JAPAN |
1 | 0 | USA |
1 | 1-1,000 | ROI |
2 | 1-1,000 | UK |
1 | 1-1,000 | USA |
1 | 1-1,000 | XXXX |
1 | 1,001-10,000 | JAPAN |
1 | 1,001-10,000 | UK |
1 | 1,001-10,000 | USA |
1 | 10,001-20,000 | ZZZZ |
1 | 20,001-30,000 | EUROPE |
1 | 30,001-40,000 | ROI |
1 | 40,001-50,000 | ROI |
1 | 50,001-60,000 | EUROPE |
1 | GT 60,000 | ROI |
3 | ND | JAPAN |
1 | ND | USA |
Solution with numeric, formatted value:
data test1;
infile datalines dsd missover dlm="#";
input VAR1$ VAR2$ VAR3$ ID;
datalines;
XXXX#1#ND#101
XXXX#2#2454#101
XXXX#3#0#102
YYYY#4#195#101
YYYY#5#5123#103
XXXX#6#354#102
ZZZZ#7#25000#104
ZZZZ#8#12000#105
VVVV#9#35000#106
VVVV#10#45000#110
ZZZZ#11#55000#101
VVVV#12#70000#103
XXXX#13#226#107
VVVV#14#325#101
YYYY#15#154#106
AAAA#16#ND#104
AAAA#17#.#103
AAAA#18#7000#102
AAAA#19#ND#102
AAAA#19#0#102
ZZZZ#19#0#104
;
run;
proc format;
value gfmt_num
low-0 = 0
1-1000 = 1
1001-10000 = 2
10001-20000 = 3
20001-30000 = 4
30001-40000 = 5
40001-50000 = 6
50001-60000 = 7
60001-high = 8
other = 9
;
run;
proc format;
value gfmt
0 = " 0"
1 = " 1-1,000"
2 = " 1,001-10,000"
3 = " 10,001-20,000"
4 = " 20,001-30,000"
5 = " 30,001-40,000"
6 = " 40,001-50,000"
7 = " 50,001-60,000"
8 = " GT 60,000"
9 = "ND"
;
data test2;
set test1;
if VAR3 = "ND"
then VAR4 = .;
else VAR4 = input(VAR3,12.);
groupn = input(put(var4,gfmt_num.),best.);
format groupn gfmt.;
run;
proc sql;
create table test3 as
select
count(VAR2) as Freq_Month1,
groupn,
case
when VAR1 = "XXXX" and ID in (101,102,103,104,105,106) then "USA"
when VAR1 = "YYYY" and ID in (101,102,103,106) then "UK"
when VAR1 = "ZZZZ" and ID in (101,102,103,104) then "EUROPE"
when VAR1 = "VVVV" and ID ne . then "ROI"
when VAR1 = "AAAA" and ID in (101,102,103,104) then "JAPAN"
else VAR1
end as Region
from test2
group by groupn, calculated Region
order by groupn, calculated Region
;
quit;
proc print data=test3 noobs;
run;
Result:
Freq_ Month1 groupn Region 1 0 EUROPE 1 0 JAPAN 1 0 USA 1 1-1,000 ROI 2 1-1,000 UK 1 1-1,000 USA 1 1-1,000 XXXX 1 1,001-10,000 JAPAN 1 1,001-10,000 UK 1 1,001-10,000 USA 1 10,001-20,000 ZZZZ 1 20,001-30,000 EUROPE 1 30,001-40,000 ROI 1 40,001-50,000 ROI 1 50,001-60,000 EUROPE 1 GT 60,000 ROI 3 ND JAPAN 1 ND USA
I merged the threads, to keep all suggested solutions in one place.
@Kurt_Bremser Thanks a lot for the solution on this 🙂
Hi,
I am trying to order the values in proc format & not able to succed in it. Is there a work around to get it done? Below is the sample data & the code I am using. Help will be highly appreciated.
data test1;
infile datalines dsd missover dlm="#";
input VAR1$ VAR2$ VAR3$ ID;
datalines;
XXXX#1#ND#101
XXXX#2#2454#101
XXXX#3#0#102
YYYY#4#195#101
YYYY#5#5123#103
XXXX#6#354#102
ZZZZ#7#25000#104
ZZZZ#8#12000#105
VVVV#9#35000#106
VVVV#10#45000#110
ZZZZ#11#55000#101
VVVV#12#70000#103
XXXX#13#226#107
VVVV#14#325#101
YYYY#15#154#106
AAAA#16#ND#104
AAAA#17#.#103
AAAA#18#7000#102
AAAA#19#ND#102
AAAA#19#0#102
ZZZZ#19#0#104
;
run;
proc format;
value gfmt
low-0 = " 0"
1-1000 = " 1-1,000"
1001-10000 = " 1,001-10,000"
10001-20000 = " 10,001-20,000"
20001-30000 = " 20,001-30,000"
30001-40000 = " 30,001-40,000"
40001-50000 = " 40,001-50,000"
50001-60000 = " 50,001-60,000"
60001-high = " GT 60,000"
other = "ND"
;
data test2;
set test1;
if VAR3 = "ND" then VAR4=. ;
else VAR4=input(VAR3,12.);
run;
%macro M_Format (app);
proc sql;
create table test3 as
select count(VAR2) as Freq_Month1
,strip(&app.) as Group
,case when VAR1 = "XXXX" and ID in (101,102,103,104,105,106) then "USA"
when VAR1 = "YYYY" and ID in (101,102,103,106) then "UK"
when VAR1 = "ZZZZ" and ID in (101,102,103,104) then "EUROPE"
when VAR1 = "VVVV" and ID ne . then "ROI"
when VAR1 = "AAAA" and ID in (101,102,103,104) then "JAPAN"
else VAR1 end as Region
from test2
group by Group, Region
order by group, Region
;
quit;
%mend;
%M_Format(case when VAR4 = . then "ND" else put(input(VAR3,12.),gfmt.) end);
Current Output:
Freq_Month1 | Group | Region |
1 | 0 | EUROPE |
1 | 0 | JAPAN |
1 | 0 | USA |
1 | 1,001-10,000 | JAPAN |
1 | 1,001-10,000 | UK |
1 | 1,001-10,000 | USA |
1 | 1-1,000 | ROI |
2 | 1-1,000 | UK |
1 | 1-1,000 | USA |
1 | 1-1,000 | XXXX |
1 | 10,001-20,000 | ZZZZ |
1 | 20,001-30,000 | EUROPE |
1 | 30,001-40,000 | ROI |
1 | 40,001-50,000 | ROI |
1 | 50,001-60,000 | EUROPE |
1 | GT 60,000 | ROI |
3 | ND | JAPAN |
1 | ND | USA |
Expected Output:
Freq_Month1 | Group | Region |
1 | 0 | EUROPE |
1 | 0 | JAPAN |
1 | 0 | USA |
1 | 1-1,000 | ROI |
2 | 1-1,000 | UK |
1 | 1-1,000 | USA |
1 | 1-1,000 | XXXX |
1 | 1,001-10,000 | JAPAN |
1 | 1,001-10,000 | UK |
1 | 1,001-10,000 | USA |
1 | 10,001-20,000 | ZZZZ |
1 | 20,001-30,000 | EUROPE |
1 | 30,001-40,000 | ROI |
1 | 40,001-50,000 | ROI |
1 | 50,001-60,000 | EUROPE |
1 | GT 60,000 | ROI |
3 | ND | JAPAN |
1 | ND | USA |
Hi,
I am trying to order the values in proc format & not able to succed in it. Is there a work around to get it done? Below is the sample data & the code I am using. Help will be highly appreciated.
data test1;
infile datalines dsd missover dlm="#";
input VAR1$ VAR2$ VAR3$ ID;
datalines;
XXXX#1#ND#101
XXXX#2#2454#101
XXXX#3#0#102
YYYY#4#195#101
YYYY#5#5123#103
XXXX#6#354#102
ZZZZ#7#25000#104
ZZZZ#8#12000#105
VVVV#9#35000#106
VVVV#10#45000#110
ZZZZ#11#55000#101
VVVV#12#70000#103
XXXX#13#226#107
VVVV#14#325#101
YYYY#15#154#106
AAAA#16#ND#104
AAAA#17#.#103
AAAA#18#7000#102
AAAA#19#ND#102
AAAA#19#0#102
ZZZZ#19#0#104
;
run;
proc format;
value gfmt
low-0 = " 0"
1-1000 = " 1-1,000"
1001-10000 = " 1,001-10,000"
10001-20000 = " 10,001-20,000"
20001-30000 = " 20,001-30,000"
30001-40000 = " 30,001-40,000"
40001-50000 = " 40,001-50,000"
50001-60000 = " 50,001-60,000"
60001-high = " GT 60,000"
other = "ND"
;
data test2;
set test1;
if VAR3 = "ND" then VAR4=. ;
else VAR4=input(VAR3,12.);
run;
%macro M_Format (app);
proc sql;
create table test3 as
select count(VAR2) as Freq_Month1
,strip(&app.) as Group
,case when VAR1 = "XXXX" and ID in (101,102,103,104,105,106) then "USA"
when VAR1 = "YYYY" and ID in (101,102,103,106) then "UK"
when VAR1 = "ZZZZ" and ID in (101,102,103,104) then "EUROPE"
when VAR1 = "VVVV" and ID ne . then "ROI"
when VAR1 = "AAAA" and ID in (101,102,103,104) then "JAPAN"
else VAR1 end as Region
from test2
group by Group, Region
order by group, Region
;
quit;
%mend;
%M_Format(case when VAR4 = . then "ND" else put(input(VAR3,12.),gfmt.) end);
Current Output:
Freq_Month1 | Group | Region |
1 | 0 | EUROPE |
1 | 0 | JAPAN |
1 | 0 | USA |
1 | 1,001-10,000 | JAPAN |
1 | 1,001-10,000 | UK |
1 | 1,001-10,000 | USA |
1 | 1-1,000 | ROI |
2 | 1-1,000 | UK |
1 | 1-1,000 | USA |
1 | 1-1,000 | XXXX |
1 | 10,001-20,000 | ZZZZ |
1 | 20,001-30,000 | EUROPE |
1 | 30,001-40,000 | ROI |
1 | 40,001-50,000 | ROI |
1 | 50,001-60,000 | EUROPE |
1 | GT 60,000 | ROI |
3 | ND | JAPAN |
1 | ND | USA |
Expected Output:
Freq_Month1 | Group | Region |
1 | 0 | EUROPE |
1 | 0 | JAPAN |
1 | 0 | USA |
1 | 1-1,000 | ROI |
2 | 1-1,000 | UK |
1 | 1-1,000 | USA |
1 | 1-1,000 | XXXX |
1 | 1,001-10,000 | JAPAN |
1 | 1,001-10,000 | UK |
1 | 1,001-10,000 | USA |
1 | 10,001-20,000 | ZZZZ |
1 | 20,001-30,000 | EUROPE |
1 | 30,001-40,000 | ROI |
1 | 40,001-50,000 | ROI |
1 | 50,001-60,000 | EUROPE |
1 | GT 60,000 | ROI |
3 | ND | JAPAN |
1 | ND | USA |
Thanks
Slight mod to your code in that creating Test2 data set is unneeded, added the var4 to test1.
This duplicates your desired output:
data test1; infile datalines dsd missover dlm="#"; input VAR1$ VAR2$ VAR3$ ID; if VAR3 = "ND" then VAR4=. ; else VAR4=input(VAR3,12.); datalines; XXXX#1#ND#101 XXXX#2#2454#101 XXXX#3#0#102 YYYY#4#195#101 YYYY#5#5123#103 XXXX#6#354#102 ZZZZ#7#25000#104 ZZZZ#8#12000#105 VVVV#9#35000#106 VVVV#10#45000#110 ZZZZ#11#55000#101 VVVV#12#70000#103 XXXX#13#226#107 VVVV#14#325#101 YYYY#15#154#106 AAAA#16#ND#104 AAAA#17#.#103 AAAA#18#7000#102 AAAA#19#ND#102 AAAA#19#0#102 ZZZZ#19#0#104 ; run; proc format library=work; value gfmt low-0 = " 0" 1-1000 = " 1-1,000" 1001-10000 = " 1,001-10,000" 10001-20000 = " 10,001-20,000" 20001-30000 = " 20,001-30,000" 30001-40000 = " 30,001-40,000" 40001-50000 = " 40,001-50,000" 50001-60000 = " 50,001-60,000" 60001-high = " GT 60,000" other = "ND" ; run; proc sql; create table test3 as select count(VAR2) as Freq_Month1 ,put(vAR4,gfmt.) as Group ,case when VAR1 = "XXXX" and ID in (101,102,103,104,105,106) then "USA" when VAR1 = "YYYY" and ID in (101,102,103,106) then "UK" when VAR1 = "ZZZZ" and ID in (101,102,103,104) then "EUROPE" when VAR1 = "VVVV" and ID ne . then "ROI" when VAR1 = "AAAA" and ID in (101,102,103,104) then "JAPAN" else VAR1 end as Region from test1 group by Group, Region order by group, Region ; quit; proc print data=test3 noobs; var freq_month1 group region; run;
The STRIP function is removing the leading spaces before the ORDER is implemented so you get a typical sort for character values.
Or just create the group and region variables in a data step without the leading spaces in the format (or strip if you prefer) and use Proc Sort with Sortseq=linquistic(numeric_collation=on) :
data test2; set test1; /*MY test1*/ group = strip(put(var4,gfmt.)); run; proc sort data=test2 sortseq=linguistic( numeric_collation=on); by group; run;
It is really a good idea to post code and logs using a code box the {i} menu icon brings up. The main message body of the forum can have html stuff added/removed when formatted by the forum sometimes introducing artifacts that will cause code to fail when copied into the SAS editor.
I made the following changes to your code when you initially posted it earlier this afternoon. I haven't re-read your latest post, but does the following do what you were trying to achieve?:
proc format; value gfmt low-0 = " 0" 1-1000 = " 1-1,000" 1001-10000 = " 1,001-10,000" 10001-20000 = " 10,001-20,000" 20001-30000 = " 20,001-30,000" 30001-40000 = " 30,001-40,000" 40001-50000 = " 40,001-50,000" 50001-60000 = " 50,001-60,000" 60001-high = " GT 60,000" other = " ND" ; data test2; set test1; if VAR3 = "ND" then VAR4=. ; else VAR4=input(VAR3,12.); run; %macro M_Format (app); proc sql; create table test3 as select count(VAR2) as Freq_Month1 ,strip(&app.) as Group ,case when VAR1 = "XXXX" and ID in (101,102,103,104,105,106) then "USA" when VAR1 = "YYYY" and ID in (101,102,103,106) then "UK" when VAR1 = "ZZZZ" and ID in (101,102,103,104) then "EUROPE" when VAR1 = "VVVV" and ID ne . then "ROI" when VAR1 = "AAAA" and ID in (101,102,103,104) then "JAPAN" else VAR1 end as Region from test2 group by Group, Region order by group, Region ; quit; %mend; %M_Format(case when VAR4 = . then "ND" else put(input(VAR3,12.),gfmt.) end);
Art, CEO, AnalystFinder.com
This code is still not giving the results as I expected. Am I missing anything?
@Data_User wrote:
This code is still not giving the results as I expected. Am I missing anything?
Which code do you mena by "this code"?
My code produces exactly your expected result, see the proc print output.
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.
What’s the difference between SAS Enterprise Guide and SAS Studio? How are they similar? Just ask SAS’ Danny Modlin.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.