shengnian
Fluorite | Level 6

data aaa;
aa = 1; b = 3; output;
run;

data ccc;
aa = 2; output;
aa = 3; output;
aa = 3; output;
aa = 3; output;
aa = 4; output;
aa = 5; output;
run;

data aab;
    put _all_;
    set aaa ccc;
    /* by aa; */
    if aa = 3 then do;
        b = 1;
        b1 = 2;
    end;
    put _all_;
run;

 

As shown above, when I comment out the BY statement, b is retained as 1 when aa = 4 and 5. However, when I uncomment the BY statement, the value of b becomes missing. What happens when a BY statement is used with the SET statement?

 

By the way, if the SET statement is replaced by a MERGE statement, then whether or not the BY statement is commented out, the value of b is never 1 when aa = 4 or 5.

4 REPLIES
Astounding
PROC Star

You're looking at the effects of a few features.

 

Variables that come from a SAS data set are automatically retained. That includes B, since it comes from AAA. Without a BY statement, once you set B to 1, nothing replaces it for the rest of the DATA step, so it remains 1 from that point forward. The software still has to decide when to set variables brought in from a SAS data set to missing, and it does so whenever it switches from one data set to another.

 

You might be interested to compare that to what happens if you make a slight change to your program:

 

if aa=4 then do;
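
The reply gives only the changed line. A sketch of the full step with that change might look like the following, assuming the rest of the program stays as in the original question and that AAA and CCC already exist. Whether the BY statement is active is left as something to toggle and compare in the log.

```sas
data aab;
    set aaa ccc;
    by aa;               /* comment this out to compare the two cases */
    if aa = 4 then do;   /* the one-line change suggested above */
        b = 1;
        b1 = 2;
    end;
    put _all_;           /* write every variable to the log on each iteration */
run;
```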

 

With a BY statement, the software has an additional function to perform.  Should it ever re-set retained variables to a missing value?  The answer depends on whether you use SET or MERGE.  With SET + BY, the software re-sets retained variables to missing when it begins reading observations from a new data set.  With MERGE + BY, the software re-sets retained variables to missing when it begins a new value of a BY variable.
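
One way to check the SET versus MERGE distinction described above is to run the same logic with each statement and compare the log. This is a minimal sketch, assuming AAA and CCC as defined in the original question; the output data set names set_by and merge_by are made up for illustration.

```sas
/* SET + BY: observations are interleaved in order of aa */
data set_by;
    set aaa ccc;
    by aa;
    if aa = 3 then b = 1;
    put 'SET:   ' aa= b=;
run;

/* MERGE + BY: observations are match-merged on aa */
data merge_by;
    merge aaa ccc;
    by aa;
    if aa = 3 then b = 1;
    put 'MERGE: ' aa= b=;
run;
```

Comparing the b= values for aa = 4 and aa = 5 in the two logs shows where each step carries B forward and where it sets it to missing.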

shengnian
Fluorite | Level 6

With SET + BY, the software re-sets retained variables to missing when it begins reading observations from a new data set.

 

Could you explain more about what "a new data set" means here? I would be very grateful.

Astounding
PROC Star

Consider an abbreviated version of your example:

 

data combined;

set aaa ccc;

by aa;

run;

 

AAA contains both AA and B. CCC contains AA only.

 

As the DATA step processes the observations, it alternates reading observations from AAA and CCC.  As part of that process, whenever it switches from one data set to the other, it reinitializes B to missing.  After that, if the next observation comes from AAA, it replaces B.  If the next observation comes from CCC, it does not replace B.
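
To watch this happen, it may help to print which data set each observation came from. The sketch below assumes the AAA and CCC data sets from the original question; from_aaa and from_ccc are hypothetical names chosen here for the IN= flags, and the IF statement is carried over from the original program.

```sas
data combined;
    set aaa(in=from_aaa) ccc(in=from_ccc);
    by aa;
    /* value of B right after SET reads the observation */
    put 'After SET: ' aa= b= from_aaa= from_ccc= first.aa=;
    if aa = 3 then b = 1;
    /* value of B at the end of the iteration */
    put 'End:       ' aa= b=;
run;
```

Reading the "After SET" lines in order shows, for each observation, its source data set, whether it starts a new BY group, and whether B arrived as missing or carried over from the previous iteration.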

shengnian
Fluorite | Level 6

You answered my question perfectly! Thanks, Astounding.
