BookmarkSubscribeRSS Feed
nswaroop
Calcite | Level 5

 

Is a reverse of PROC CONTENTS possible to create an EMPTY DATASET.

 

I have created a proc contents dataset as follows:

 

proc contents data=EMP out=EMP_OUT noprint;
run;

 

Now I updated the LENGTH of few columns in EMP_OUT using proc sql.

 

Then I like to create the EMPTY DATASET having this updated LENGTHs and the rest of the attributes remain the same.

After that I like to merge the EMPTY DATASET back to the original EMP.

 

Basically, I am trying to update lengths of each of the variables accoring to its maximum value length in that variable. To reduce the size of the dataset.

11 REPLIES 11
Reeza
Super User

Have you used or tried the DESCRIBE statement in PROC SQL. 

nswaroop
Calcite | Level 5

The DESCRIBE is printing a CREATE TABLE statement with all the columns in the proc contents file. What I need is a CREATE TABLE of the table EMP with new lengths.

RW9
Diamond | Level 26 RW9
Diamond | Level 26

In your SQL which fixes the length, put a condition which can't be true:

where 1=0;

Then you will have a dataset with 0 observations.

 

However it sounds a bit of a pointless task, your not going to save anything in your dataset by reducing the length (unless you talking about 10s of millions of rows - in which case you should be looknig at proper tech for that).  Its probably also a bit risky, if your column width is dynamic, then all code after that and based on that data has to be dynamic.  Me I prefer nice fixed structures, simpler coding, straight forward documentation, no faffing about.

 

 

 

 

nswaroop
Calcite | Level 5
Not sure I understood the solution. But, I am tending to think the requirement is different.
You are correct that better coding practices but I am beginner to SAS and so...
Kurt_Bremser
Super User

If you are concerned with overlong character variable lengths and the resulting use of disk space, just use the compress=yes option to minimize the space used by empty character variables.

Keep in mind that the defined lengths might be needed in the future.

 

nswaroop
Calcite | Level 5
Compress is okay. This is a requirement of the FDA (clinical domain) to set minimum lengths possible before sending it to them.
Astounding
PROC Star

You have three steps now:  PROC CONTENTS, PROC SQL, and a DATA step to merge.  You could do this much more simply in a single DATA step, since you know the names of the variables you want to hange and the lengths you want to assign:

 

data want;

length name $ 20 address $ 30;

set have;

run;

 

Just make sure the LENGTH statement appears before the SET statement.

rogerjdeangelis
Barite | Level 11
Hits #41 Creating an empty sas datasets (from SAS-L)

Not exactly a reversal of proc contents, but related

/* T000180 CREATING AN EMPTY SAS DATASETS
                              */
   data shell;
       length x 8 name sex $16;
       call missing(of _all_);
       stop;
   run;quit;  /* call missing elims warniing */


   data shell(drop=age);
        stop;
        set sashelp.class sashelp.shoes;
        AgeChar=put(age,z2.);
   run;

   proc sql;
        create table t1(
             id int
            ,ic varchar(10)
            ,icd varchar(500)
            ,Idca varchar(500)
        ) ;
   quit;

   proc sql;
        create
           table new
        like
           sashelp.class;
   quit;

   Also you can create a SAS dataset with data
   (bug introduced after 9.2 - char lengths missing)

   filename tagset http "http://support.sas.com/rnd/base/ods/odsmarkup/sql.sas";
   %include tagset;
   ods tagsets.sql file="class.sql";
   proc print data=sashelp.class ;
   run;
   ods _all_ close;
   ods listing;

   Create table CLASS
   (Name varchar(7), Sex varchar(1), Age float, Height float, Weight float);
   Insert into CLASS(Name, Sex, Age, Height, Weight)
   Values ('Alfred', 'M', 14, 69.0, 112.5);
   Insert into CLASS(Name, Sex, Age, Height, Weight)
   Values ('Alice', 'F', 13, 56.5, 84.0);
   /* Can be simplified */
   Create table CLASS
   (Name varchar(7), Sex varchar(1), Age float, Height float, Weight float);
   Insert into CLASS(Name, Sex, Age, Height, Weight)
   Values ('Alfred', 'M', 14, 69.0, 112.5)
   Values ('Alice', 'F', 13, 56.5, 84.0);


nswaroop
Calcite | Level 5
I am trying it automate it in a macro by sending the name of the dataset and getting a return dataset with reduced variable lengths.
nswaroop
Calcite | Level 5

Thank you all for your time.

 

From the replies I understand there is no easy way to create a dataset reverse of PROC CONTENTS. So, I am stating my actual purpose in asking.

 

Just send a dataset to a macro and get a return dataset that has reduced varaible lengths according to the maximum length of data in each character column (leaving numeric columns lengths as original).

Kurt_Bremser
Super User

Try this:

%macro reduce_length(dataset);
%let lib=%upcase(%sysfunc(scan(&dataset,1,.)));
%let ds=%upcase(%sysfunc(scan(&dataset,2,.)));

proc sql noprint;
select name into :names separated by ' '
from dictionary.columns
where libname = "&lib" and memname = "&ds" and type = 'char'
;
select length into :lengths separated by ' '
from dictionary.columns
where libname = "&lib" and memname = "&ds" and type = 'char'
;
quit;

%let num_vars=%sysfunc(countw(&names));

data lengths;
set &dataset (keep=&names) end=done;
array vars {*} &names;
array lens {&num_vars} _temporary_
;
do i = 1 to dim(vars);
  lens{i} = max(lens{i},length(vars{i}));
end;
if done
then do;
  do i = 1 to dim(vars);
    __name = vname(vars{i});
    __length = lens{i};
    output;
  end;
end;
keep __name __length;
run;

data _null_;
set lengths end=done;
if _n_ = 1
then do;
  call execute("data work.new_&ds.; length ");
end;
call execute(trim(__name) !! " $" !! trim(put(__length,5.)) !! " ");
if done then call execute("; set &dataset.; run;");
run;

%mend;

It retrieves the defined lengths from dictionary.columns, then runs a data step with array processing to get the maximum actual lengths, writes them to a dataset, and then creates a data step that sets the new lengths.

- the order of variables in the resulting dataset will change, as all character variables will move to the left (front).

- you will get warnings about multiple lengths specified.

I tried it with sashelp.cars, and it reduced the length of type from 8 to 6.

 

You might consider adding additional macro parameters to specify the output library or similar.

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 11 replies
  • 1404 views
  • 1 like
  • 6 in conversation