Hi, I have a problem with PROC REPORT output. I want to use the ORDER usage to suppress repeated values, but I got unexpected output. Here's my code:
data x1;
informat subject armcd period $20.;
input subject armcd period;
cards;
C007 RT 2
C007 RT 2
C010 RT 1
C010 RT 1
C010 RT 1
C010 RT 1
C010 RT 2
C010 RT 2
C010 RT 2
C010 RT 2
C010 RT 2
;
proc sort;
by subject armcd period;
run;
proc report data=x1;
column subject armcd period;
define subject/order order=data;
define armcd/order order=data;
define period/order order=data;
run;
I thought the output would be:
C007  RT  2
C010  RT  1
          2
I was surprised I got:
C007  RT  2
C010  RT  2
          1
So why does the output look like this? Why is the value "1" showing below the value "2"? Could someone explain this to me? Any help will be greatly appreciated.
I very rarely use PROC REPORT. But I believe that while ORDER=DATA means to report values in the order they are encountered, it's a global concept, not a "within group" concept. Since the first value of PERIOD encountered (for C007 RT) is a 2, it is reported first not only for C007 RT, but also for subsequent subject/armcd combinations.
You can test this by changing PERIOD from 2 to 1 for the first (or both) C007 RT record(s). Then you will see period 1 preceding period 2 for C010 RT.
Perhaps you can fix this by specifying ORDER=INTERNAL (ordering by the stored, unformatted values) for PERIOD. Perhaps that's what you really want.
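A minimal sketch of that test, assuming the same layout as the original X1 data. The data set name X2 and the shortened list of records are made up for illustration. Both C007 RT records now carry PERIOD 1, so 1 is the first PERIOD value encountered, and by the reasoning above it should be listed ahead of 2 for C010 RT even with ORDER=DATA:

```sas
/* Hypothetical variant of X1: PERIOD is 1 for both C007 records */
data x2;
  informat subject armcd period $20.;
  input subject armcd period;
  cards;
C007 RT 1
C007 RT 1
C010 RT 1
C010 RT 1
C010 RT 2
C010 RT 2
;
run;

proc report data=x2 nowd;
  column subject armcd period;
  /* ORDER=DATA ranks values by first appearance in the whole data set */
  define subject / order order=data;
  define armcd   / order order=data;
  define period  / order order=data;
run;
```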
Thanks a lot! I tried the ORDER=INTERNAL option and it works; I got the result I want. I think you're right that ORDER=DATA is applied globally and not "within group". I had thought the ORDER usage worked like PROC SORT. When PROC SORT is given multiple variables, there is a priority: it sorts by A first, then by B within A, and so on. So I thought ORDER would work the same way: first suppress repeated values of SUBJECT, then suppress repeated values of ARMCD within SUBJECT, then PERIOD. But it doesn't seem to work like that. I'm wondering how to specify the ORDER= option so that the output is sorted the same as my data set while repeated values are still suppressed.
You used ORDER=DATA.
Try ORDER=INTERNAL for PERIOD:
proc report data=x1 nowd;
column subject armcd period;
define subject/order order=data;
define armcd/order order=data;
define period/order order=internal;
run;
Hi:
If you change PERIOD to be numeric, create a helper "order" variable that holds the observation number, and then use SPANROWS, you will see that SPANROWS puts all of subject C007 into one ordered group and then C010 into another ordered group. You really do not need to overcontrol the order; it seems to come out fine if you use neither ORDER=DATA nor ORDER=INTERNAL. At any rate, I used ORDVAR for the first output, but I ran 3 tests and, except for ORDVAR appearing in the first report, the ordering of the reports was the same whether PERIOD was character or numeric. So by using SPANROWS you can see the side effect of using ORDER as the usage for SUBJECT and ARMCD, and where the groups and subgroups are:
Here's what I changed to create the above output:
data x1;
* period is numeric, cper is character;
length subject armcd $20 cper $2;
input subject armcd period;
ordvar = _n_;
cper = put(period,2.);
cards;
C007 RT 2
C007 RT 2
C010 RT 1
C010 RT 1
C010 RT 1
C010 RT 1
C010 RT 2
C010 RT 2
C010 RT 2
C010 RT 2
C010 RT 2
;
proc sort data=x1;
by subject armcd period;
run;
proc report data=x1 spanrows;
title '1 No order specified, with ORDVAR';
column subject armcd period ordvar;
define subject/order ;
define armcd/order ;
define period/order ;
define ordvar / display;
run;
proc report data=x1 spanrows;
title '2 no order = option specified not using ordvar';
column subject armcd period ;
define subject/order ;
define armcd/order ;
define period/order ;
run;
title;
proc report data=x1 spanrows;
title '3 character var cper for period';
column subject armcd cper ;
define subject/order ;
define armcd/order ;
define cper/order ;
run;
title;
And here's the output from #2 and #3, which is basically the same, without using any ORDER= option:
Hope this helps,
Cynthia
Thanks a lot! I tried not specifying the ORDER= option, letting SAS choose the default, and got the right result. Is the default ORDER=FORMATTED? In another case, though, some of my observations have the ARMCD value "TR", and I want "TR" to show above "RT". Even though I sort the data set by descending ARMCD, if I don't specify a value for the ORDER= option I get unwanted output where "RT" shows above "TR". I just want the output sorted the same as my original data set, with repeated values suppressed.
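One possible approach, offered as an untested sketch rather than a confirmed answer: keep the default ordering and add the DESCENDING option to the DEFINE statement for ARMCD, so that "TR" sorts ahead of "RT" while SUBJECT and PERIOD stay ascending. The data set X3 and its records are made up for illustration.

```sas
/* Hypothetical data: one subject with records in both arms */
data x3;
  informat subject armcd period $20.;
  input subject armcd period;
  cards;
C010 RT 1
C010 RT 2
C010 TR 1
C010 TR 2
;
run;

proc report data=x3 nowd;
  column subject armcd period;
  define subject / order;
  /* DESCENDING reverses the sort of ARMCD, so TR is listed before RT */
  define armcd   / order descending;
  define period  / order;
run;
```

This only covers the case where the wanted arm order happens to be reverse alphabetical; an arbitrary arm order would need something else, such as a numeric helper variable to order by.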