BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.

Hello,

I want to shifts up the var2 this data.

For that, I use the option next.

I have two problems,

I can not define the beginning of the missing values ​​and their end.

here is an example

 

 

data test;
input var1 var2;
cards;
1 .
2 .
3 .
4 .
5 4
3 9
; 
run;


data test1;
set test end=eof;
next=_n_+4;
if not eof then set test (keep=var2 rename=(var2=next_var2 )) point=next;
run;

Thank you

1 ACCEPTED SOLUTION

Accepted Solutions
mkeintz
Jade | Level 19

When you have a statement like:

     SET A B;

 

you are creating a single stream of data by concatenating observations in A , followed by observations in B.  It doesn't matter that A and B don't have the  same variables, you will still get Na+Nb observations (where Na is number of obs in A, and Nb is defined similarly).

 

But

    SET A;
    SET B;

 

generates two synchronized streams, so the number of obs will be the minimum of Na vs Nb.   (The data step stops when either SET statement attempts to read beyond end of data.  Of  course any  variable in both B and  A will get the value from B overwriting the value from A.

 

Finally,

   MERGE A B;

makes a (let's call it) a single merged stream.  It will produce the same results  as "SET A; SET B;",  EXCEPT anytime  Na^=Nb, there will be additional observations generating a  total number of observations equal to the  MAXIMUM of Na vs Nb.   And those extra observations will have missing values assigned to variables belonging only to the smaller dataset.

 

These are very powerful capabilities of the SAS data step, which is often able to trivially generate results attainable in proc sql only by very tortured coding.

 

 

As to the question on the "end=" option of the SET statement, that is a question (unlike the above) which can easily be answered by the sas documentation, or a google search for   sas+set+end.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

View solution in original post

11 REPLIES 11
r_behata
Barite | Level 11

What is your expected output ?

novinosrin
Tourmaline | Level 20

+1 mate!

mansour_ib_sas
Pyrite | Level 9
data test;
input var1 var2;
cards;
1 4
2 9
3 .
4 .
5 .
3 .
; 
run;
novinosrin
Tourmaline | Level 20

data test;
input var1 var2;
cards;
1 .
2 .
3 .
4 .
5 4
3 9
; 
run;


data test1;
merge test(drop=var2)  test(keep=var2 where=(var2 is not missing)) ;
run;
mkeintz
Jade | Level 19

And what if the OP wanted:

 

data want;
input var1 var2;
cards;
1 4
2 9
3 9
4 9
5 9
3 9
run;
--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
novinosrin
Tourmaline | Level 20
data test1;
merge test(drop=var2)  test(keep=var2 where=(var2 is not missing)) ;
retain _iorc_;
if not missing(var2) then _iorc_=var2;
else var2=_iorc_;
run;
mkeintz
Jade | Level 19

An easy way to retain a SET (or MERGE) variable across multiple observations is to do conditional SET statements:

 

data want;
input var1 var2;
cards;
1 .
2 .
3 .
4 .
5 4
3 9
run;

data want;
  set have (drop=var2);
  if eod2=0 then set have (keep=var2 where=(not missing(var2))) end=eod2;
run;

 

VAR2, like all variables read by SET or MERGE, is automatically retained until a subsequent data step iteration executes a SET or MERGE retrieving the same variable and overwrites the old value.  Simply avoid any attempt to read var2 beyond end of data set HAVE.

 

While there's not much advantage in the case of a single VAR2 variables, consider the advantages when there are, say, a dozen variables that are simultaneously all missing or all non-missing and all need to be "shifted up".

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
novinosrin
Tourmaline | Level 20

Nice

mansour_ib_sas
Pyrite | Level 9

Thank you,

in this example, why i have a 12 observations in t3 and only 6 in want.

 

data want;
input var1 var2;
cards;
1 .
2 .
3 .
4 .
5 4
3 9
run;

data t1;
set want(keep=var1);
run;

data t2;
set want(keep=var2);
run;

data t3;
set t1 t2;
run;

data want;
  set have (drop=var2);
  set have (keep=var2);
run;

another question please.

is  eod2 own for this statement only or for data result (want)?

then set have (keep=var2 where=(not missing(var2))) end=eod2

 if so, could you, please, explain to me how he had to go "shift up" the observations of var2

mkeintz
Jade | Level 19

When you have a statement like:

     SET A B;

 

you are creating a single stream of data by concatenating observations in A , followed by observations in B.  It doesn't matter that A and B don't have the  same variables, you will still get Na+Nb observations (where Na is number of obs in A, and Nb is defined similarly).

 

But

    SET A;
    SET B;

 

generates two synchronized streams, so the number of obs will be the minimum of Na vs Nb.   (The data step stops when either SET statement attempts to read beyond end of data.  Of  course any  variable in both B and  A will get the value from B overwriting the value from A.

 

Finally,

   MERGE A B;

makes a (let's call it) a single merged stream.  It will produce the same results  as "SET A; SET B;",  EXCEPT anytime  Na^=Nb, there will be additional observations generating a  total number of observations equal to the  MAXIMUM of Na vs Nb.   And those extra observations will have missing values assigned to variables belonging only to the smaller dataset.

 

These are very powerful capabilities of the SAS data step, which is often able to trivially generate results attainable in proc sql only by very tortured coding.

 

 

As to the question on the "end=" option of the SET statement, that is a question (unlike the above) which can easily be answered by the sas documentation, or a google search for   sas+set+end.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
Ksharp
Super User

data test;
input var1 var2;
cards;
1 .
2 .
3 .
4 .
5 4
3 9
; 
run;


data test1;
merge test(drop=var2)  test(keep=var2 firstobs=5) ;
run;

SAS INNOVATE 2024

Innovate_SAS_Blue.png

Registration is open! SAS is returning to Vegas for an AI and analytics experience like no other! Whether you're an executive, manager, end user or SAS partner, SAS Innovate is designed for everyone on your team. Register for just $495 by 12/31/2023.

If you are interested in speaking, there is still time to submit a session idea. More details are posted on the website. 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Get the $99 certification deal.jpg

 

 

Back in the Classroom!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 11 replies
  • 1045 views
  • 4 likes
  • 5 in conversation