suggested the answer to this puzzle earlier today on SAS-L. Since I never thought of doing what Matt suggested in the more than 40 years I have been using SAS, I thought it might be useful to see other's responses.
You have the following dataset:
data have;
input a b c $ Zip d e f;
cards;
1 2 x 12345 6 7 8
2 3 y 54321 7 8 9
;
You want a dataset called 'want' that only contains a b c d e and zip AND in that specific order.
What would be the easiest (i.e., least code and quickest running) way of accomplishing the task.
P.S. If you've Matt or have already seen Matt's SAS-L post, don't respond!
All right then, how about:
data want;
set have(keep=a--c d--e);
set have(keep=zip);
run;
Not quite as efficient, but more generalizable.
PG
data want(drop=zip rename=f=zip);
set have;
f=zip;
run;
All right then, how about:
data want;
set have(keep=a--c d--e);
set have(keep=zip);
run;
Not quite as efficient, but more generalizable.
PG
Thanks Art! I agree, FriedEgg's solution is more efficient. - PG
My try
data want;
retain a b c d e zip;
set have(drop=f);
run;
stat@sas: Yes, that would definitely work, but imagine if a thru e actually represented about 500 variables. However, good suggestion!
Arthur.T ,
I think the fastest way is to create a view for that code ,and not to copy real table again.
Xia Keshan
Here's my/yet another approach:
* problem as stated ;
data have;
input a b c $ Zip d e f;
cards;
1 2 x 12345 6 7 8
2 3 y 54321 7 8 9
;
run;
%let vars=a b c d e zip;
data want1;
format &vars;
set have (keep=&vars);
run;
* more realistic example ;
data have;
input b c $ f Zip d a e;
cards;
2 x 8 12345 6 1 7
3 y 9 54321 7 2 8
;
run;
%let vars=a b c d e zip;
data want2;
format &vars;
set have (keep=&vars);
run;
proc compare base=want1 compare=want2;
run;
* last approach, use a "skeleton" dataset as metadata ;
data skel;
length a b 8 c $8 d e Zip 8; * these must be in the desired order for the target dataset ;
call missing(of _all_); * could also add formats, labels, etc. ;
stop;
run;
proc contents data=skel out=columns (keep=name varnum) noprint;
run;
proc sql noprint;
select name into :vars separated by " " from columns order by varnum;
quit;
data want3;
* if any of your metadata columns are derived columns (i.e. not in your source dataset) ;
* remember the implied retain on dataset variables, or this can come back to bite you! ;
* IOW if skel contains a variable "foo" in order to set its PDV order, ;
* and "foo" is calculated in the data step, remember that it is retained in the data step. ;
if 0 then set skel;
set have (keep=&vars);
run;
proc compare base=want1 compare=want3;
run;
For the problem as stated, where your source data has the PDV in almost the exact order you want in the target dataset, approaches such as variable lists may use less code. However, they are prone to undesired results if the source data structure changes.
In a perhaps more realistic example where the source data has its PDV in an order unrelated to the desired target dataset, you're gonna have to list the variables out. If I had 500 variables, I'd proc print the data from proc contents output, reorder the variables in an editor, and create my program.
The last example is yet another approach. Not the least amount of code, but an approach I've sometimes used in the past.
HTH,
Scott
Added comments about implied retain of dataset variables
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.