Re: Same name ?

edasdfasdfasdfa · Posted 05-14-2020 08:03 AM

Hello,

Can we use the same data set name throughout our code (in different blocks)? Will the program know we are referring to the same data set, or do we need a new name for each new block?

Kurt_Bremser · Posted 05-14-2020 08:11 AM

A dataset is a dataset is a dataset and will have the same name for the time of its existence.

What exactly is your (perceived) issue? Show code and point out where you have doubts or questions.

Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
The macro for direct download as ZIP
How to post code
Please vote for Provide Sequential Search Capability for Hash Objects
How to deal with locked files on UNIX

edasdfasdfasdfa · Posted 05-14-2020 08:28 AM

Sorry for any confusion.

I just mean that I tend to give a data set a new name each time I start a new block (assuming there are significant additions in that new block). I have had the habit of reading the old data set with SET and giving the new block a new name. I'm just asking whether I can use the same name throughout, even if I start new blocks.

Kurt_Bremser · Posted 05-14-2020 08:31 AM

Define "block". In SAS parlance, a "block" is a set of statements that together constitute a logical entity, usually a do/end block in a data step.

SHOW CODE AND POINT OUT YOUR QUESTION POINTS.

edasdfasdfasdfa · Posted 05-14-2020 08:36 AM

data banktransactions;

infile

input

run;

can i just now refer to it again after the run statement like data banktransactions; ??

do i need a set statement?

Kurt_Bremser · Posted 05-14-2020 08:40 AM

This very rudimentary piece of pseudo-code appears to read data from an external file into a dataset, so a SET statement is not needed. If you want to read the resulting dataset in a follow-up DATA step, you need a SET, MERGE or MODIFY statement.

ChrisHemedinger · Posted 05-14-2020 08:42 AM

I think you want to know whether it's possible (and maybe good practice) to operate on the same data set over many different steps.

Yes, it's possible.

data mydata;
 infile...;
 input ...;
 /* do work */
run;

data mydata;
 set mydata; /* bring in existing data file */
 /* do more work */
run;

data mydata;
 set mydata; /* bring in existing data file */
 /* do EVEN MORE work */
run;

As to whether it's a good idea... well, it depends. Each step reads the data again, so it may be more efficient to combine steps. And during development you might make mistakes and alter your data in ways you didn't intend. If that happens, you don't have an intermediate version to go back to, so you have to start over by rerunning the first step.

Reeza · Posted 05-14-2020 12:45 PM

@edasdfasdfasdfa wrote:

data banktransactions;

infile

input

run;

can i just now refer to it again after the run statement like data banktransactions; ??

do i need a set statement?

You do not need a SET statement, but you cannot just refer to your data set as 'data banktransactions'.

When you want to use the data set, you refer to it using the libraryName.dataSetName notation. If the library is WORK, you can omit the library name. You can use that name in a DATA statement, a SET statement, or any other statement that accepts a data set name.
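As a minimal sketch of that notation, assuming no USER library is assigned (so one-level names resolve to WORK) and using sashelp.class purely as stand-in data:

```sas
/* Two-level name: library WORK, data set HAVE */
data work.have;
    set sashelp.class;
run;

/* One-level name: with no USER library assigned, this is also WORK.HAVE */
data have;
    set sashelp.class;
run;

/* The same name works in other statements that take a data set name */
proc print data=work.have;
run;

proc print data=have;
run;
```

Both DATA steps write to the same data set, and both PROC PRINT steps read it.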

Using the same name over and over is not recommended because it makes your code harder to debug. My development process is to write my code in separate steps and then, once a step is working, fold it back into my main data step.

*import or make fake data;
data have;
    set sashelp.class;
run;

*summarize data set;
proc means data=have;
run;

proc freq data=have;
run;

*add a new variable BMI;
data have2;
    set have;

    *calculate BMI;
    BMI = weight / (height**2) * 703;
run;

*categorize that variable;
data have3;
    set have2;
    length category $20;

    if BMI < 18 then
        category = 'Under Weight';
    else if 18 <= BMI < 25 then
        category = 'Normal';
    else if 25 <= BMI < 30 then
        category = 'Over Weight';
    else if BMI >= 30 then
        category = 'Obese';
run;

That would be my first stab. Once I was sure the calculation and the categories were working, I would then modify my code to be:

*add a new variable BMI and categorize it;
data have2;
    set have;
    length category $20;

    *calculate BMI;
    BMI = weight / (height**2) * 703;

    if BMI < 18 then
        category = 'Under Weight';
    else if 18 <= BMI < 25 then
        category = 'Normal';
    else if 25 <= BMI < 30 then
        category = 'Over Weight';
    else if BMI >= 30 then
        category = 'Obese';
run;

ballardw · Posted 05-14-2020 05:08 PM

If you mean that you generate code similar to this:

data one;
   infile filename <fileoptions>;
   input <variables>;
run;

data two;
   set one;
   newvar = <some calculation>;
run;

data three;
   set two;
   othervar = <some calculation>;
run;

data four;
   set three;
   nowthatvar = <more calculations>;
run;

What I would suggest is that, once you have tested the result of the step that creates data set two, you move its calculations into the data step that creates data set one. Then, as you test each bit of code, move it to the earliest practical data step.

It may be that the changes should all be moved into the data two step instead, especially if you are replacing values in existing variables (data cleaning, standardizing or recoding, perhaps). That way, if data step two does have a problem, data set one is still available to restart from the same point.

If any of these extra data steps only assign formats or labels, then those assignments really belong in an earlier data step.
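A minimal sketch of that last point, assuming sashelp.class as stand-in data; the data set name class_bmi and the label text are made up for illustration:

```sas
/* Sketch only: sashelp.class stands in for real data, and the
   data set name class_bmi and the label text are hypothetical. */
data class_bmi;
    set sashelp.class;

    /* calculation that might otherwise sit in its own step */
    bmi = weight / (height**2) * 703;

    /* label and format assignments folded into the same step */
    label  bmi = 'Body mass index';
    format bmi 5.1;
run;
```

Because LABEL and FORMAT are declarative statements, they add no extra pass over the data when placed in a step that already exists.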
