Solved: Re: Copy dataset and keep/drop variables without reading data into mem...

left · Posted 02-07-2023 09:59 AM

Is there a way to keep or drop variables from a dataset without using a datastep?

Let's assume I know I have a dataset (might be huge) and want to copy it to a new location with just a few variables.

How would I do this most efficiently, i.e. using minimal computing ressources?

Example:

proc copy in      = sashelp
          out     = work
          memtype = data;
    select cars;
run;

That will copy all variables, so in this case

Variable list as text:

Make	Origin	Invoice	Horsepower	Weight
Model	DriveTrain	EngineSize	MPG_City	Wheelbase
Type	MSRP	Cylinders	MPG_Highway	Length

Usually I would do this:

data cars_few_vars;
    set cars (keep = make model mpg_city);
run;

Without creating a new dataset I can think of this:

proc sort data = cars (keep = make model mpg_city);
    by make;
run;

I think it should be possible to work on a higher level than reading the dataset into memory (datastep, proc sort).

Kurt_Bremser · Posted 02-07-2023 12:29 PM

Since datasets are stored one observation after another, with each observation containing all values for all defined variables, you must always read the whole dataset physically, even when only some variables are kept.

To reduce the size of the target dataset, the data step is the most efficient tool; if you need to sort anyway, dataset options are the way to go.

Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
The macro for direct download as ZIP
How to post code
Please vote for Provide Sequential Search Capability for Hash Objects
How to deal with locked files on UNIX

View solution in original post

Astounding · Posted 02-07-2023 10:45 AM

I'm not sure you will see any savings, but you might. Test using a view:

libname somewhere 'path to destination';

data cars / view=cars;
   set sashelp.cars (keep=make model mpg_city);
run;
proc copy in      = work
          out     = somewhere;
    select cars;
run;

(Wish I could test it for you but I can't.)

left · Posted 02-07-2023 11:30 AM

Nice point - I haven't thought about views.

However I am hoping to find a binary answer like

You can (copy and) filter a dataset using to limit variables in a dataset without having to read the dataset into memory (and without using views) and here is how it goes...
You cannot do that, no way. If you want to change the variables in a dataset you must use a kind of datastep/ SQL statement/ etc. to accomplish this.

I look for this kind of confirmation/ correction in my question.

From proc datasets procedure I know that the copy option does not support data set options.

The same applies to the proc copy procedure.

As we have the metadata in the dictionaries I was thinking there should be a way of copying a dataset and use filtering methods (keep/ drop) on a higher level than actually reading the whole data.

Tom · Posted 02-07-2023 12:21 PM

Not sure what you mean about "reading data into memory". SAS does not store datasets in memory. It only stores the observations in memory.

To read a dataset you need to read the whole dataset. Unless you want to skip some observations by using the FIRSTOBS= and/or OBS= (really means lastobs) dataset options.

The KEEP= and/or DROP= dataset options are probably the easiest ways to select which variables to copy.

data outds;
  set inds(keep=id var1 var5);
run;

But you could also generate a simple PROC SQL select statement to select (and potentially re-order) the variables you want.

proc sql;
create table outds as
  select id,var5,var1
  from inds
;
quit;

Kurt_Bremser · Posted 02-07-2023 12:29 PM

Since datasets are stored one observation after another, with each observation containing all values for all defined variables, you must always read the whole dataset physically, even when only some variables are kept.

To reduce the size of the target dataset, the data step is the most efficient tool; if you need to sort anyway, dataset options are the way to go.

Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
The macro for direct download as ZIP
How to post code
Please vote for Provide Sequential Search Capability for Hash Objects
How to deal with locked files on UNIX

Copy dataset and keep/drop variables without reading data into memory (no datastep, no proc sort)

Re: Copy dataset and keep/drop variables without reading data into memory (no datastep, no proc sort

Re: Copy dataset and keep/drop variables without reading data into memory (no datastep, no proc sort

Re: Copy dataset and keep/drop variables without reading data into memory (no datastep, no proc sort

Re: Copy dataset and keep/drop variables without reading data into memory (no datastep, no proc sort

Re: Copy dataset and keep/drop variables without reading data into memory (no datastep, no proc sort

Catch up on SAS Innovate 2026

Copy dataset and keep/drop variables without reading data into memory (no datastep, no proc sort)

Re: Copy dataset and keep/drop variables without reading data into memory (no datastep, no proc sort

Re: Copy dataset and keep/drop variables without reading data into memory (no datastep, no proc sort

Re: Copy dataset and keep/drop variables without reading data into memory (no datastep, no proc sort

Re: Copy dataset and keep/drop variables without reading data into memory (no datastep, no proc sort

Re: Copy dataset and keep/drop variables without reading data into memory (no datastep, no proc sort

Catch up on SAS Innovate 2026

SAS Training: Just a Click Away