why colon operator with a prefix does not work to reorder variables?

hanfei28 · Posted 01-06-2020 05:07 PM

Hi All,

I try to use code like this

data test;

retain ID ACO_: ;

set datasetName;

run;

to put all variables beginning with ACO_ after variable ID. But it does not work. Could anyone give me a clue? Thanks.

Fei

Tom · Posted 01-06-2020 05:14 PM

For the colon suffix to match any variables they have to already be known to the data step compiler. But if they are already known then their position is already set. So you cannot use that syntax to change their order.

Note that your test will only change the position of ID since when the RETAIN statement is compiled there are NO variables defined yet.

To see this change the RETAIN statement to a PUT statement.

data test;
  put 'BEFORE SET: ' (ID ACO_: ) (=);
  set datasetName (obs=2);
  put 'AFTER SET: ' (ID ACO_: ) (=);
run;

mkeintz · Posted 01-06-2020 05:22 PM

@Tom wrote:

For the colon suffix to match any variables they have to already be known to the data step compiler. But if they are already known then their position is already set. So you cannot use that syntax to change their order.

I don't think so. The order in data set HAVE below shows ID to the right of ACO_ variables. But using the retain statement in the DATA TEST step re-orders them

data have;
  input ACO_1-ACO_3 ID;
  put (_all_) (=);
datalines;
11 111 1111  1
run;

data test;
  retain ID ACO_: ;
  set have;
  put (_all_) (=);
run;

generates this log:

142  data have;
143    input ACO_1-ACO_3 ID;
144    put (_all_) (=);
145  datalines;

ACO_1=11 ACO_2=111 ACO_3=1111 ID=1
NOTE: The data set WORK.HAVE has 1 observations and 4 variables.
NOTE: DATA statement used (Total process time):
      real time           0.01 seconds
      cpu time            0.00 seconds


147  run;
148
149  data test;
150    retain ID ACO_: ;
151    set have;
152    put (_all_) (=);
153  run;

ID=1 ACO_1=11 ACO_2=111 ACO_3=1111
NOTE: There were 1 observations read from the data set WORK.HAVE.
NOTE: The data set WORK.TEST has 1 observations and 4 variables.
NOTE: DATA statement used (Total process time):
      real time           0.00 seconds
      cpu time            0.00 seconds

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

Tom · Posted 01-06-2020 07:05 PM

All you did was move ID to the front since it was the only variable listed in your RETAIN statement.

Define something before the ACO variables and you will see that they do not "move".

13    data have;
14      input XX ACO_1-ACO_3 ID;
15      put (_all_) (=);
16    datalines;

XX=5 ACO_1=11 ACO_2=111 ACO_3=1111 ID=1
NOTE: The data set WORK.HAVE has 1 observations and 5 variables.
NOTE: DATA statement used (Total process time):
      real time           0.02 seconds
      cpu time            0.00 seconds


18    ;
19
20    data test;
21      retain ID ACO_: ;
22      set have;
23      put (_all_) (=);
24    run;

ID=1 XX=5 ACO_1=11 ACO_2=111 ACO_3=1111
NOTE: There were 1 observations read from the data set WORK.HAVE.
NOTE: The data set WORK.TEST has 1 observations and 5 variables.
NOTE: DATA statement used (Total process time):
      real time           0.00 seconds
      cpu time            0.00 seconds

s

mkeintz · Posted 01-06-2020 11:35 PM

@Tom wrote:

All you did was move ID to the front since it was the only variable listed in your RETAIN statement.

Define something before the ACO variables and you will see that they do not "move".

13    data have;
14      input XX ACO_1-ACO_3 ID;
15      put (_all_) (=);
16    datalines;

XX=5 ACO_1=11 ACO_2=111 ACO_3=1111 ID=1
NOTE: The data set WORK.HAVE has 1 observations and 5 variables.
NOTE: DATA statement used (Total process time):
      real time           0.02 seconds
      cpu time            0.00 seconds


18    ;
19
20    data test;
21      retain ID ACO_: ;
22      set have;
23      put (_all_) (=);
24    run;

ID=1 XX=5 ACO_1=11 ACO_2=111 ACO_3=1111
NOTE: There were 1 observations read from the data set WORK.HAVE.
NOTE: The data set WORK.TEST has 1 observations and 5 variables.
NOTE: DATA statement used (Total process time):
      real time           0.00 seconds
      cpu time            0.00 seconds

Thanks @Tom

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

mkeintz · Posted 01-06-2020 05:16 PM

Could you please show a sample of the data you are reading, and the results that shows that ID is not to the left of the ACO_ variables? Help us help you.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

unison · Posted 01-06-2020 08:17 PM

Whenever I need to reorder columns in a specific order, I use dictionary.columns. See example here:

data have;
	input aco_1 aco_3 aco_2;
	datalines;
1 3 2
;
run;

data desired_output;
	input aco_1 aco_2 aco_3;
	datalines;
1 2 3
;
run;

data idea1;
	retain aco_:;
	set have;
run;

/*Idea1 does not work, as you have experienced*/
proc compare 
	base=sashelp.vcolumn(where=(libname='WORK' and memname='DESIRED_OUTPUT'))
	compare=sashelp.vcolumn(where=(libname='WORK' and memname='IDEA1'))
;
run;

/*Get ACO varnames into macro var 'aco_vars' (which is ordered numerically)*/
proc sql noprint;
	select name into :aco_vars separated by " "
		from dictionary.columns 
			where 
				upcase(libname)='WORK' and 
				upcase(memname)='HAVE' and
				upcase(name) like 'ACO%'
			order by name
	;
quit;

data idea2;
	retain &aco_vars;
	set have;
run;

/*Idea2 works*/
proc compare
	base=sashelp.vcolumn(where=(libname='WORK' and memname='DESIRED_OUTPUT'))
	compare=sashelp.vcolumn(where=(libname='WORK' and memname='IDEA2'))
;
run;

-unison

why colon operator with a prefix does not work to reorder variables?

Re: why colon operator with a prefix does not work to reorder variables?

Re: why colon operator with a prefix does not work to reorder variables?

Re: why colon operator with a prefix does not work to reorder variables?

Re: why colon operator with a prefix does not work to reorder variables?

Re: why colon operator with a prefix does not work to reorder variables?

Re: why colon operator with a prefix does not work to reorder variables?