jimmychoi
Obsidian | Level 7

Hi, I'm running a program with the data shown below:

 

data two (xy.txt):

5 2
3 1
5 6

the program:

 

data two;
	infile '/folders/myfolders/sasuser.v94/xy.txt';
	input x y;
run;

data one two other;
  set two;
  if x = 5 then output one;
  if y < 5 then output two;
  output;
run;

 

Since the OUTPUT statement is placed at the end of the program, without any condition (IF),

shouldn't the data set two have 2 identical observations?

 

But what I see in the SAS output window is:

- a table with 5 observations.

 

Please help.

 

Below is what I copied and pasted from the output window.

 

Work.two   Total rows: 5   Total columns: 2

Row  x  y
1    5  2
2    5  2
3    3  1
4    3  1
5    5  6

 

Accepted Solution
Cynthia_sas
SAS Super FREQ

Hi:
The issue you're going to run into is that the last OUTPUT statement, which names no data set, writes every observation to all of the data sets on the DATA statement, including one and two. So you'll end up with MORE observations in ONE and TWO than you might intend. It depends on what your intention is. Consider the output and the debugging version of the program shown below. In the second output and in the code, the HOW_OUT variable shows you exactly HOW each observation was written to each output file.
Cynthia

 

When I run a version of your program (to eliminate the confusion of having data two and set two, I started with a data set called 'fakedata'), this is what I get:

(screenshot: use_IF_out.png)

 

This is how each obs got into the output files -- notice the new variable called "HOW_OUT" which shows exactly which statement wrote the obs to the file:

(screenshot: how_out.png)

 

using this code

data fakedata;
input x y;
datalines;
5 2
3 1
5 6
;
run;

data one two other;
  length x 8 y 8 how_out $14;
  set fakedata;
  if x = 5 then do; how_out='if x = 5'; output one; end;
  if y < 5 then do; how_out='if y < 5'; output two; end;
  how_out='final output';
  output;
run;

proc print data=fakedata noobs;
  title '0) starting with work.fakedata';
  run;
 
proc print data=one noobs;
  title '1) what is in work.one';
  run;
 
proc print data=two noobs;
  title '2) what is in work.two';
  run;

proc print data=other noobs;
  title '3) what is in work.other';
  run;
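
If the intent is for the final statement to feed only work.other, one possible variation (a sketch, not part of the original reply) is to name the data set on the last OUTPUT statement, so it no longer writes to one and two as well. The input rows are the same three used above.

```sas
/* Same three rows as the fakedata example */
data fakedata;
  input x y;
  datalines;
5 2
3 1
5 6
;
run;

data one two other;
  set fakedata;
  if x = 5 then output one;   /* rows with x = 5 go to work.one */
  if y < 5 then output two;   /* rows with y < 5 go to work.two */
  output other;               /* every row goes to work.other only */
run;
```

With this input, work.one and work.two each receive 2 rows and work.other receives all 3, because each OUTPUT statement now targets exactly one data set.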


PeterClemmensen
Tourmaline | Level 20

Yes, the data set two should have two identical observations. And it does.

 

I suspect you are mistaking the observation number for an actual variable. See the code below:

 

data two;
input x y;
datalines;
5 2
3 1
5 6
;

data one two other;
  set two;
  if x = 5 then output one;
  if y < 5 then output two;
  output;
run;

proc print data=two;
run;
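
One way to check that the leading digit in each printed row is only the observation number is to suppress it with the NOOBS option of PROC PRINT. The sketch below assumes the same three input rows; the input data set name xy is a hypothetical choice made here to keep it distinct from the output data set two.

```sas
/* xy is a hypothetical name for the input data set */
data xy;
  input x y;
  datalines;
5 2
3 1
5 6
;
run;

data one two other;
  set xy;
  if x = 5 then output one;
  if y < 5 then output two;
  output;                     /* no name: writes to one, two and other */
run;

/* NOOBS drops the Obs column, leaving only the variables x and y */
proc print data=two noobs;
  var x y;
run;
```

The listing should then show only the x and y columns for the five rows of work.two, with no row counter in front of them.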

jimmychoi
Obsidian | Level 7
Your idea of renaming data set two to fakedata really helped me understand. Thanks!
Cynthia_sas
SAS Super FREQ

Hi,
That really was something I consider a best practice. In my world, it is not good to do this:
data mydata;
   set mydata;
... more code ...;
run;

Because that makes it impossible to separate the INPUT data (on the SET statement) from the OUTPUT data (on the DATA statement) and could result in the loss of the INPUT data if you have any fatal errors in your code.

I ALWAYS recommend to my students that they avoid the temptation to use the same name on both their DATA and SET statements.

Cynthia
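
A minimal sketch of the separate-names practice described above, assuming a small example data set. The names mydata and mydata_clean and the variable total are hypothetical and used only for illustration.

```sas
/* mydata and mydata_clean are hypothetical names used for illustration */
data mydata;
  input x y;
  datalines;
5 2
3 1
5 6
;
run;

data mydata_clean;      /* OUTPUT data set: a new name */
  set mydata;           /* INPUT data set: not replaced by this step */
  total = x + y;        /* placeholder for "... more code ..." */
run;
```

Because the DATA and SET statements use different names, work.mydata is still available unchanged after the step runs, whatever happens to work.mydata_clean.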
