Help using Base SAS procedures

Import and export 200,000 variables TXT files

Reply
Contributor
Posts: 36

Import and export 200,000 variables TXT files

Hi.

I got stuck on TAB separated TXT files with 200,000 variables each. I need to import them and then process them and export them back. I can import the first row of variable names like this:

data varnames;

infile 'x'  dsd delimiter = '09'x recfm = v lrecl = 5000000 obs = 1;

input varname : $32. @@;

run;

However, the datalayout should be that the variable names are on the columns and the values on the rows. I need to import all 200,000 variables, and then export them as well.

How to do this?

Super User
Posts: 10,466

Re: Import and export 200,000 variables TXT files

I would start with PROC IMPORT. You have most of the important bits already.
Proc import datafile='X' /* or what ever the path/name to your data*/

       out = dataout

       dbms=dlm

       replace; /* in case you need to run this again*/

       delimiter = '09'x;

       datarow = 2; /*this will assume that the first row has the variable names */

       guessingrows = 32767; /* sas will "guess" whether your variables are numeric (or date/time) or string by looking at up to this many rows*/

run;

Respected Advisor
Posts: 3,124

Re: Import and export 200,000 variables TXT files

If my memory serves me right, Proc import has the 'lrecl' limit as 32,767. While OP needs to import 200,000 variables. Not sure how to accommodate using proc import.

One thing you could try is to put all of the variable names into macro variables (you probably need more than one, as the length of macro variable also has the same limit) if they are of the same type (character or numeric) or ideally same informat, and then using data step to input them.

Regards,

Haikuo

Trusted Advisor
Posts: 1,300

Re: Import and export 200,000 variables TXT files

filename x temp;

*build fake datafile;

data _null_;

file x;

put '6509660967'x;

put '4809490950'x;

run;

*collect variable names and number of variables;

data _null_;

infile x dlm='09'x obs=1;

input varname : $32. @@;

call symputx(cats('varname',_n_),varname);

call symputx('nobs',_n_);

run;

*input data, assuming all columns have the same informat $32.;

options mprint;

data foo;

infile x dlm='09'x firstobs=2;

input %macro vloop; ( %do i=1 %to &nobs; &&varname&i %end; ) (:$32.) %mend; %vloop ;

run;

Contributor
Posts: 44

Re: Import and export 200,000 variables TXT files

Can you provide us with a snapshot of the data? Maybe the first 5 rows of columns 1 to 80 to give us an idea of what you mean?

Contributor
Posts: 36

Re: Import and export 200,000 variables TXT files

Many thanks. I noted it was much faster with the following:

data varnames;
infile "x" dsd delimiter = '09'x recfm = v lrecl = 5000000 pad obs = 1;
input varname : $32. @@;

call symput('nvars', _n_);
run;

data varnames (drop = i);
set varnames end = eof ;
if _n_ = 1 then delete;
do i = 1 to 1000; if _n_ >= 1000 * (i - 1) then mvar_cat = i; end;
if eof then call symput('ncats', mvar_cat);
run;

%macro mvarcat_loop(i); proc sql noprint; %global var&i; select varname into :var&i separated by ' ' from varnames where mvar_cat = &i; quit; %mend mvarcat_loop;
%macro mvarcat; %do i = 1 %to &ncats; %mvarcat_loop(&i); %end; %mend mvarcat; %mvarcat;

%macro mvarcat2;

data prework;
infile "x" dsd delimiter = '09'x recfm = v lrecl = 5000000 pad firstobs = 2 obs = 10;
input (%do i = 1 %to &ncats; &&var&i %endSmiley Wink (:$2.);
run;
%mend mvarcat2;
%mvarcat2;

Super User
Super User
Posts: 6,495

Re: Import and export 200,000 variables TXT files

How about letting PROC TRANSPOSE do it for you.

* 1) Read the names into a dataset.  I used _NAME_ for the variable name as this is the default for PROC TRANSPOSE. ;

* Generate NCOL macro variable to see how many columns to read for each row. ;

data varnames;

  length _name_ $32 ;

  infile tmpfile1 dlm='09'x dsd obs=1 lrecl=5000000 end=eof;

  input _name_ @@;

  if eof then call symputx('ncol',_n_);

run;

sasfile varnames open;

* 2) Create a data step view to read the actual values.  Generate a ROW number  ;

* Use POINT= option to get the _NAME_ for the column. ;

data vertical / view=vertical;

  infile tmpfile1 dlm='09'x dsd firstobs=2 truncover lrecl=5000000;

  row+1;

  do col=1 to &ncol;

    input value $ @;

    set varnames point=col;

    output;

  end;

run;

* 3) PROC TRANSPOSE ;

proc transpose data=vertical out=horizontal(drop=_name_);

  by row;

  var value;

run;

sasfile varnames close;

Trusted Advisor
Posts: 1,300

Re: Import and export 200,000 variables TXT files

In my first response I did not think through the quantity of macro variables you would be creating.  It would not be a good approach.  Instead I would generate an include file, the following could be cleaned up:

filename x temp;

data _null_;

file x;

put '6509660967'x;

put '4809490950'x;

run;

filename codegen temp;

data _null_;

file codegen;

put '%macro vars;';

run;

data _null_;

file codegen mod;

infile x dlm='09'x obs=1 end=last;

input varname : $32. @@;

put varname;

run;

data _null_;

file codegen mod;

put '%mend;';

run;

%include "%sysfunc(pathname(codegen))" /source2;

data foo;

infile x dlm='09'x firstobs=2;

input ( %vars ) (:$32.);

run;

Contributor
Posts: 36

Re: Import and export 200,000 variables TXT files

It works just fine, and I will something similar for the TXT-export as well.

Ask a Question
Discussion stats
  • 8 replies
  • 425 views
  • 0 likes
  • 6 in conversation