BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
art297
Opal | Level 21

suggested the answer to this puzzle earlier today on SAS-L. Since I never thought of doing what Matt suggested in the more than 40 years I have been using SAS, I thought it might be useful to see other's responses.

You have the following dataset:

data have;

  input a b c $ Zip d e f;

  cards;

1 2 x 12345 6 7 8

2 3 y 54321 7 8 9

;

You want a dataset called 'want' that only contains a b c d e and zip AND in that specific order.

What would be the easiest (i.e., least code and quickest running) way of accomplishing the task.

P.S. If you've Matt or have already seen Matt's SAS-L post, don't respond!

1 ACCEPTED SOLUTION

Accepted Solutions
PGStats
Opal | Level 21

All right then, how about:

data want;

set have(keep=a--c d--e);

set have(keep=zip);

run;

Not quite as efficient, but more generalizable.

PG

PG

View solution in original post

10 REPLIES 10
PGStats
Opal | Level 21

data want(drop=zip rename=f=zip);

set have;

f=zip;

run;

PG
art297
Opal | Level 21

: Impressive! But FriedEgg's solution was more generalizable and didn't require having a variable available with the same case.

I won't post his solution just yet.

PGStats
Opal | Level 21

All right then, how about:

data want;

set have(keep=a--c d--e);

set have(keep=zip);

run;

Not quite as efficient, but more generalizable.

PG

PG
art297
Opal | Level 21

: Close enough! FriedEgg's solution was an efficient way of doing the same thing:

data want;

  if 0 then set have (drop=zip keep=a--e);

  set have (keep=a--e);

run;

PGStats
Opal | Level 21

Thanks Art! I agree, FriedEgg's solution is more efficient. - PG

PG
stat_sas
Ammonite | Level 13

My try

data want;

retain a b c d e zip;

set have(drop=f);

run;

art297
Opal | Level 21

stat@sas: Yes, that would definitely work, but imagine if a thru e actually represented about 500 variables. However, good suggestion!

Ksharp
Super User

Arthur.T ,

I think the fastest way is to create a view for that code ,and not to copy real table again. Smiley Happy

Xia Keshan

art297
Opal | Level 21

: Possibly. Compare the two on small, medium and large files and let us know.

ScottBass
Rhodochrosite | Level 12

Here's my/yet another approach:

* problem as stated ;

data have;

  input a b c $ Zip d e f;

  cards;

1 2 x 12345 6 7 8

2 3 y 54321 7 8 9

;

run;

%let vars=a b c d e zip;

data want1;

   format &vars;

   set have (keep=&vars);

run;

* more realistic example ;

data have;

  input b c $ f Zip d a e;

  cards;

2 x 8 12345 6 1 7

3 y 9 54321 7 2 8

;

run;

%let vars=a b c d e zip;

data want2;

   format &vars;

   set have (keep=&vars);

run;

proc compare base=want1 compare=want2;

run;

* last approach, use a "skeleton" dataset as metadata ;

data skel;

  length a b 8 c $8 d e Zip 8* these must be in the desired order for the target dataset ;

  call missing(of _all_);       * could also add formats, labels, etc. ;       

  stop;

run;

proc contents data=skel out=columns (keep=name varnum) noprint;

run;

proc sql noprint;

  select name into :vars separated by " " from columns order by varnum;

quit;

data want3;

* if any of your metadata columns are derived columns (i.e. not in your source dataset) ;

* remember the implied retain on dataset variables, or this can come back to bite you! ;

* IOW if skel contains a variable "foo" in order to set its PDV order, ;

* and "foo" is calculated in the data step, remember that it is retained in the data step. ;

  if 0 then set skel;

  set have (keep=&vars);

run;

proc compare base=want1 compare=want3;

run;

For the problem as stated, where your source data has the PDV in almost the exact order you want in the target dataset, approaches such as variable lists may use less code.  However, they are prone to undesired results if the source data structure changes.

In a perhaps more realistic example where the source data has its PDV in an order unrelated to the desired target dataset, you're gonna have to list the variables out.  If I had 500 variables, I'd proc print the data from proc contents output, reorder the variables in an editor, and create my program.

The last example is yet another approach.  Not the least amount of code, but an approach I've sometimes used in the past.

HTH,

Scott

Added comments about implied retain of dataset variables


Please post your question as a self-contained data step in the form of "have" (source) and "want" (desired results).
I won't contribute to your post if I can't cut-and-paste your syntactically correct code into SAS.

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 10 replies
  • 1082 views
  • 9 likes
  • 5 in conversation