Hello,
I want to shifts up the var2 this data.
For that, I use the option next.
I have two problems,
I can not define the beginning of the missing values and their end.
here is an example
data test;
input var1 var2;
cards;
1 .
2 .
3 .
4 .
5 4
3 9
;
run;
data test1;
set test end=eof;
next=_n_+4;
if not eof then set test (keep=var2 rename=(var2=next_var2 )) point=next;
run;
Thank you
When you have a statement like:
SET A B;
you are creating a single stream of data by concatenating observations in A , followed by observations in B. It doesn't matter that A and B don't have the same variables, you will still get Na+Nb observations (where Na is number of obs in A, and Nb is defined similarly).
But
SET A;
SET B;
generates two synchronized streams, so the number of obs will be the minimum of Na vs Nb. (The data step stops when either SET statement attempts to read beyond end of data. Of course any variable in both B and A will get the value from B overwriting the value from A.
Finally,
MERGE A B;
makes a (let's call it) a single merged stream. It will produce the same results as "SET A; SET B;", EXCEPT anytime Na^=Nb, there will be additional observations generating a total number of observations equal to the MAXIMUM of Na vs Nb. And those extra observations will have missing values assigned to variables belonging only to the smaller dataset.
These are very powerful capabilities of the SAS data step, which is often able to trivially generate results attainable in proc sql only by very tortured coding.
As to the question on the "end=" option of the SET statement, that is a question (unlike the above) which can easily be answered by the sas documentation, or a google search for sas+set+end.
What is your expected output ?
+1 mate!
data test;
input var1 var2;
cards;
1 4
2 9
3 .
4 .
5 .
3 .
;
run;
data test;
input var1 var2;
cards;
1 .
2 .
3 .
4 .
5 4
3 9
;
run;
data test1;
merge test(drop=var2) test(keep=var2 where=(var2 is not missing)) ;
run;
And what if the OP wanted:
data want;
input var1 var2;
cards;
1 4
2 9
3 9
4 9
5 9
3 9
run;
data test1;
merge test(drop=var2) test(keep=var2 where=(var2 is not missing)) ;
retain _iorc_;
if not missing(var2) then _iorc_=var2;
else var2=_iorc_;
run;
An easy way to retain a SET (or MERGE) variable across multiple observations is to do conditional SET statements:
data want;
input var1 var2;
cards;
1 .
2 .
3 .
4 .
5 4
3 9
run;
data want;
set have (drop=var2);
if eod2=0 then set have (keep=var2 where=(not missing(var2))) end=eod2;
run;
VAR2, like all variables read by SET or MERGE, is automatically retained until a subsequent data step iteration executes a SET or MERGE retrieving the same variable and overwrites the old value. Simply avoid any attempt to read var2 beyond end of data set HAVE.
While there's not much advantage in the case of a single VAR2 variables, consider the advantages when there are, say, a dozen variables that are simultaneously all missing or all non-missing and all need to be "shifted up".
Nice
Thank you,
in this example, why i have a 12 observations in t3 and only 6 in want.
data want;
input var1 var2;
cards;
1 .
2 .
3 .
4 .
5 4
3 9
run;
data t1;
set want(keep=var1);
run;
data t2;
set want(keep=var2);
run;
data t3;
set t1 t2;
run;
data want;
set have (drop=var2);
set have (keep=var2);
run;
another question please.
is eod2 own for this statement only or for data result (want)?
then set have (keep=var2 where=(not missing(var2))) end=eod2
if so, could you, please, explain to me how he had to go "shift up" the observations of var2
When you have a statement like:
SET A B;
you are creating a single stream of data by concatenating observations in A , followed by observations in B. It doesn't matter that A and B don't have the same variables, you will still get Na+Nb observations (where Na is number of obs in A, and Nb is defined similarly).
But
SET A;
SET B;
generates two synchronized streams, so the number of obs will be the minimum of Na vs Nb. (The data step stops when either SET statement attempts to read beyond end of data. Of course any variable in both B and A will get the value from B overwriting the value from A.
Finally,
MERGE A B;
makes a (let's call it) a single merged stream. It will produce the same results as "SET A; SET B;", EXCEPT anytime Na^=Nb, there will be additional observations generating a total number of observations equal to the MAXIMUM of Na vs Nb. And those extra observations will have missing values assigned to variables belonging only to the smaller dataset.
These are very powerful capabilities of the SAS data step, which is often able to trivially generate results attainable in proc sql only by very tortured coding.
As to the question on the "end=" option of the SET statement, that is a question (unlike the above) which can easily be answered by the sas documentation, or a google search for sas+set+end.
data test;
input var1 var2;
cards;
1 .
2 .
3 .
4 .
5 4
3 9
;
run;
data test1;
merge test(drop=var2) test(keep=var2 firstobs=5) ;
run;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.