RobF
Quartz | Level 8

I'm attempting to output observations to a SAS dataset on my company's server in SAS Enterprise Guide from within a data _null_ step.

Here's the basic idea using a test dataset:

data test_data;
  input y x1 x2;
cards;
1 2 3
10 20 30
100 200 300
;
run;

data _NULL_;
  set test_data;
  file 'E:\SAS Temporary Files\_TD21860_VO-DCA-VSAS01_\Prc2\report.sas7bdat';
  put y;
run;

The file statement includes the address of the WORK folder on the company server.

The code successfully creates a file named "report"; however, when I attempt to open the file I receive the following error message, even though I specified the SAS data set extension ".sas7bdat" in the FILE statement:

The open data operation failed. The following error occurred.

[Error] File WORK.REPORT.DATA is not a SAS data set.

Neither can I successfully run:

proc print data=report;
run;

What am I doing wrong?

Thanks in advance

Robert

14 REPLIES
ballardw
Super User

You created a TEXT file, not a SAS data set. PUT generates character output, and FILE generally only creates text files (unless you are doing extra work to mimic a specific file format).

If you want a SAS data set, then it should be referenced with a library and named either 1) on the DATA statement (data mylib.report, with the library pointing to the location; though it looks like you MIGHT have been attempting to write to the default WORK library, which wouldn't need a library reference), or

2) in the output option used by a given procedure, often OUT= or OUTPUT, though ODS adds some options.
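A minimal sketch of option 1, with an illustrative folder path (mylib and C:\mydata are assumptions, not from the thread):

```sas
/* Point a library at a permanent folder (path is illustrative) */
libname mylib 'C:\mydata';

/* DATA with a two-level name writes a real SAS data set, not a text file */
data mylib.report;
  set test_data;
  keep y;
run;
```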

RobF
Quartz | Level 8

Thanks ballardw - I was hoping there would be a quick fix while sticking with the data _null_ step.

Tom
Super User Tom
Super User

Did you mean something like this:

data 'E:\SAS Temporary Files\_TD21860_VO-DCA-VSAS01_\Prc2\report.sas7bdat';
  set test_data;
  keep y;
run;

RobF
Quartz | Level 8

That may work . . . but I'm trying to keep overhead memory consumption to a minimum, especially when working with a big dataset. Hence the use of the data _null_ statement instead of simply doing what you suggested.

Or are my worries unjustified?

ballardw
Super User

Maybe you are looking for Proc Copy or Proc Datasets to move the dataset around. Data _null_ in any form processes every record in the dataset, and your overhead may well go down significantly using a procedure designed to move entire datasets.
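For example, a PROC COPY sketch (library name and path are illustrative assumptions):

```sas
/* Point a library at the destination folder (path is illustrative) */
libname mylib 'C:\mydata';

/* Copy the dataset between libraries without writing a data step */
proc copy in=work out=mylib;
  select test_data;
run;
```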

RobF
Quartz | Level 8


Well, I'm actually doing a lot of data processing inside the data _null_ step by reading my data into an array and then keeping just the final output, so I'm not sure if a proc copy or proc datasets would work inside the data _null_. (The example I offered in my question may be a bit misleading, since it's grossly simplified.)

I'll try Tom's idea and see how it works. I suppose the other option is to just output the final results to the SAS log with a put statement, then copy and paste into Excel or whatever.

TomKari
Onyx | Level 15

Could it be that what you need is a DROP statement to prevent your array elements from being saved in your dataset? Your comment about saving memory isn't in line with what you're trying to do.
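A sketch of the idea (the array and the total variable are illustrative, not from the original program):

```sas
/* Process with working variables in an array, but keep only the result */
data report;
  set test_data;
  array vals{3} x1 x2 y;        /* view existing variables as an array */
  total = sum(of vals{*});      /* hypothetical computation */
  drop x1 x2;                   /* working variables never reach the output */
run;
```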

Tom

ballardw
Super User

Put to the log, especially if you are talking about large datasets, is likely even more inefficient, with the added possibility of exceeding the number of lines allowed in the log, which then adds yet another layer of complexity to the project.

I have a hard time seeing why DATA lib.datasetname is unacceptable as it will execute in basically the same time as data _null_.

libname mylib "E:\SAS Temporary Files\_TD21860_VO-DCA-VSAS01_\Prc2";

data mylib.report;
  set test_data;
  /* other calculations */
run;

Though I would be VERY hesitant to write anything I wanted later to a folder subordinate to a SAS temporary folder as those will get deleted at the end of the session.

You may not be aware that you can specify when data is written to the dataset. So you could retain values across iterations of the data step and then output when the desired summary has been completed for groups of records.
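A sketch of that retain-and-output pattern, assuming an illustrative grouping variable x1 and a summary of y per group:

```sas
/* By-group processing requires sorted input */
proc sort data=test_data;
  by x1;
run;

/* Accumulate across iterations, output only when each group is done */
data summary;
  set test_data;
  by x1;
  total + y;                  /* sum statement: total is implicitly RETAINed */
  if last.x1 then do;
    output;                   /* write one row per x1 group */
    total = 0;                /* reset the accumulator for the next group */
  end;
  keep x1 total;
run;
```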

RobF
Quartz | Level 8

Yeah, I think I'll scratch the "data _null_;" idea and just use a regular "data report (keep=...);" statement to output the end results of my program computations into a separate dataset.

That works fine - I just want SAS to avoid creating a duplicate dataset in memory, then dropping the excess columns after reading the keep= line in my data statement.

Kurt_Bremser
Super User

When you use the keep= or drop= options in the set statement of the data step, you already reduce the size of the PDV and the required memory.

But:

If you don't have gazillions of variables in your input data set (or create them in the data step), the memory consumption of a data step is negligible.

1000 numerical variables will "eat" 8K and the space needed for the metadata (name to location table), which is peanuts compared to what the SAS system itself needs to simply run.
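For example, a keep= data set option on the SET statement (as opposed to a KEEP statement) drops variables before they ever enter the PDV:

```sas
/* keep= on SET means only y occupies the PDV during the step,
   so memory is saved while running, not just in the output */
data report;
  set test_data(keep=y);
run;
```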

RobF
Quartz | Level 8

Will do, thanks Tom. So far my program is running fine with the keep statement. Thank you, all, for the suggestions.

At the moment my worries about working with really big data that consumes most of the memory on my machine are mostly theoretical, but I'd like to write my code with that contingency in mind for efficiency's sake.

Kurt_Bremser
Super User

Big data is just disk space. In a data step, SAS takes only the memory needed for 1 record at a time, unless you make excessive use of functions like lag(). So the memory consumption of a data step depends mainly on the record size of the dataset(s).

This is different from software like R that loads a complete dataset into memory and treats it more like a spreadsheet.

For dealing with big data, the data step is (IMO) the #1 solution, efficiency-wise. Once it works for 1 record, it works for any number of records. The only things that increase are disk space and execution time.

TomKari
Onyx | Level 15

I don't see any reason why you should consume excessive memory, but if you see any indications that you are, please post back. It would indicate that you are doing something unusual, and there might be a less memory-intensive way to do it.

Tom

jwillis
Quartz | Level 8

RobF,

A Proc Printto, to write your put values to a text file, might be a better option when you intend to use information that is normally "put" to your log. 

PROC PRINTTO LOG='Your\folder\and file\location\data null output file name.TXT' NEW;
RUN;

data _null_;
run;

PROC PRINTTO;
RUN;

