I want to drop a range of variables and observations from my data set to make it smaller. The data set has 1,200 variables and 80,000 observations. My goal is to create a small sample data set to write test code against; otherwise it takes too long to test every step. The observations part was easy, as I used the (obs=___) data set option.
I found several threads that explain how to drop variables conditionally, or when there is a pattern to the variable names, but nothing that explains how to keep just the first n variables or n random variables.
There is no pattern in my variable names as such. I just want to keep any 50 variables and remove the rest without having to type the variable names.
thanks,
Nikhil
You could use something like the following. It creates a file containing the first 4 variables and the first 10 records from the file sashelp.class:
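/* Collect the names of the first 4 variables into the macro variable KEEPS */
proc sql noprint;
  select name
    into :keeps separated by ' '
    from dictionary.columns
    where libname='SASHELP' and
          memname='CLASS' and
          varnum le 4
  ;
quit;

/* Read only those variables and the first 10 observations */
data want;
  set sashelp.class (keep=&keeps. obs=10);
run;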
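For the 50-of-1,200 case in the question, the same idea can be wrapped in a small macro so that the library, data set, and counts become parameters. This is only a sketch: the macro name first_n and its parameters (lib, mem, n, out, obs) are made up for illustration and do not come from any SAS library.

```sas
/* Hypothetical helper: keep the first N variables of LIB.MEM.       */
/* All names here (first_n, lib, mem, n, out, obs) are illustrative. */
%macro first_n(lib=, mem=, n=, out=want, obs=max);
  %local keeps;

  /* dictionary.columns stores LIBNAME and MEMNAME in uppercase */
  proc sql noprint;
    select name
      into :keeps separated by ' '
      from dictionary.columns
      where libname=upcase("&lib") and
            memname=upcase("&mem") and
            varnum le &n
    ;
  quit;

  /* Subset both the variables and the observations */
  data &out;
    set &lib..&mem (keep=&keeps obs=&obs);
  run;
%mend first_n;

/* Same result as the code above: first 4 variables, first 10 records */
%first_n(lib=sashelp, mem=class, n=4, obs=10)
```

For the original problem the call would look something like %first_n(lib=work, mem=have, n=50, obs=1000), assuming the big table is work.have. If the library or member name does not match anything, KEEPS stays empty and the DATA step will fail, so it is worth checking the log.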
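Or you could do this if you know the names of the first variable and the 50th variable. The double dash (--) selects every variable positioned between the two names in the data set's variable order.
data want;
  /* first_var and fiftieth_var are placeholders for the actual names of the 1st and 50th variables */
  set have (keep=first_var -- fiftieth_var);
run;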
You have got excellent answers regarding the first n; here is one way to do 'random n'. The example gets 10 observations and 7 random variables from sashelp.cars. (Note: it runs faster if you use PROC SQL on dictionary.columns; sashelp.vcolumn is much slower and can take a couple of minutes on your first run.)
PROC SURVEYSELECT DATA=SASHELP.VCOLUMN (WHERE=(LIBNAME='SASHELP' AND MEMNAME='CARS'))
    OUT=VARNUM(KEEP=VARNUM NAME)
    METHOD=SRS
    N=7;
RUN;

PROC SQL NOPRINT;
  SELECT NAME INTO :VNAME SEPARATED BY ' ' FROM VARNUM;
QUIT;

DATA WANT;
  SET SASHELP.CARS (OBS=10 KEEP=&VNAME);
RUN;
Haikuo
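As a sketch of the faster route mentioned in the note above, the column list can first be pulled from dictionary.columns with PROC SQL and then sampled. The table names COLS and PICKED and the seed value 12345 are arbitrary choices for illustration; SEED= is only there so that the same 7 variables come back on every run.

```sas
/* Step 1: pull the column list from dictionary.columns        */
/* (avoids the slow sashelp.vcolumn view). COLS is an arbitrary */
/* table name chosen for this sketch.                           */
proc sql;
  create table cols as
    select name, varnum
    from dictionary.columns
    where libname='SASHELP' and memname='CARS';
quit;

/* Step 2: simple random sample of 7 variable names.           */
/* SEED= makes the pick repeatable; 12345 is an arbitrary seed. */
proc surveyselect data=cols out=picked
    method=srs n=7 seed=12345 noprint;
run;

/* Step 3: put the sampled names into a macro variable */
proc sql noprint;
  select name into :vname separated by ' ' from picked;
quit;

/* Step 4: keep those variables and the first 10 observations */
data want;
  set sashelp.cars (obs=10 keep=&vname);
run;
```

Drop the SEED= option if a different random set of variables is wanted on each run.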
Hi Hai.Kuo, can you please tell me how to format the code in the SAS scheme in the editor window? Is there a meta-post that explains how to do this?
thanks,
nikhil
EG does it for me. Right-click inside the program window and you will see a "format code" option. I don't think the Program Editor in Base SAS can do that.
No, I meant that when I copy and paste it into the text editor here in SAS Communities, it loses all formatting.
Oh, that is where it gets complicated. It depends heavily on your web browser and its settings. Since all you lost is the formatting, you could use Microsoft Word as a middleman. It works for some.
Good luck,
Haikuo
First copy and paste into Word, then copy and paste again into the forum.
Thanks, yeah, that sounds like the thing to do!
If it turns out that you really do want the variable selection to be random, the following will produce a warning, but will (I believe) correctly accomplish the task fairly easily:
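/* outobs=4 keeps only the first 4 rows of the randomly ordered list; */
/* stopping early is what triggers the warning in the log             */
proc sql noprint outobs=4;
  select name
    into :keeps separated by ' '
    from dictionary.columns
    where libname='SASHELP' and
          memname='CLASS'
    order by ranuni(0)
  ;
quit;

/* Read the 4 randomly chosen variables and the first 10 observations */
data want;
  set sashelp.class (keep=&keeps. obs=10);
run;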