
remove last 0 rows(All trailing zeroes)

Super Contributor
Posts: 267

data have;
input value;
cards;
  371
  0
  145
   75
   40
   41
   19
    0
   10
    2
    0
    1
    3
    999
   0
   0
   0
   0
   0
   0
   0
   0
   0
;
run;

 What SAS code logic can remove the trailing zero-value rows (in this example, all rows after the value 999)?

 

Thanks!

 

Here is my method; please advise:

data need1;
set have;
n=_n_;
run;
proc sort data=need1 out=need2;
by descending n;
run;
data need3;
retain flag 0;
set need2;
/* Rows are in reverse order here, so the trailing zeroes come first. */
/* Drop zeroes until the first nonzero value has been seen. */
if value=0 then do;
  if flag=0 then delete;
end;
else flag=1;
run;
proc sort data=need3 out=need(keep=value);
by n;
run;
Super User
Posts: 23,267

Re: remove last 0 rows

Posted in reply to GeorgeSAS

What's the logic/rule?

Trusted Advisor
Posts: 1,311

Re: remove last 0 rows

Posted in reply to GeorgeSAS
What is your criterion? All trailing zeroes? All zeroes after 999? After the last 999?
Super Contributor
Posts: 267

Re: remove last 0 rows

All trailing zeroes.

Super User
Posts: 23,267

Re: remove last 0 rows

Posted in reply to GeorgeSAS

Reverse your data, delete all the first zeros and then reverse it back.

Super User
Posts: 6,629

Re: remove last 0 rows

Posted in reply to GeorgeSAS

There's really no way to do this in one step, since you have to keep reading in all the data to see if there is another nonzero.

 

Here's one approach:

 

data have;
input value;
if value ne 0 then call symputx('good_obs', _n_);
cards;
...
;
data want;
set have (obs=&good_obs);
run;
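
For a data set that already exists, the same idea can be sketched without re-reading the raw data. This is only a minimal sketch, assuming the data set is named have as in the question. The %let default is an addition that is not part of the reply above; it is there so that a data set containing only zeroes yields an empty result instead of an unresolved macro reference.

```sas
/* Default for the all-zero case (an assumption, not part of the original reply) */
%let good_obs = 0;

/* Record the observation number of the last nonzero value */
data _null_;
  set have;
  if value ne 0 then call symputx('good_obs', _n_);
run;

/* Keep everything up to and including that observation */
data want;
  set have (obs=&good_obs);
run;
```

The data is still read twice, which matches the point made above that the last nonzero value cannot be known until all rows have been seen.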

 

Super Contributor
Posts: 267

Re: remove last 0 rows

Posted in reply to Astounding

This is a good, fancy solution.

 

Also, using SYMPUTX instead of SYMPUT takes the additional step of removing any leading and trailing blanks.

 

Thanks!
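
The difference between the two routines can be seen with a small test. This is a minimal sketch; the variable n and the macro variable names with_symput and with_symputx are made up for illustration.

```sas
data _null_;
  n = 14;
  call symput ('with_symput',  n);  /* numeric is converted with BEST12., which leaves leading blanks */
  call symputx('with_symputx', n);  /* leading and trailing blanks are stripped */
run;

/* The brackets make any stored blanks visible in the log */
%put [&with_symput] [&with_symputx];
```

With SYMPUT the log also shows a note about numeric-to-character conversion, which SYMPUTX avoids.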

Contributor
Posts: 23

Re: remove last 0 rows(All trailing zeroes)

Posted in reply to GeorgeSAS

You could use this technique; it seems to work in this case, and it is a single data step.

/* Merge the data set with itself, offset by one and two rows, to look ahead at the next two values. */
/* Output unless the current value and the next two are all zero or missing. */
/* Note: an interior run of three or more zeroes would also lose rows. */
data want (drop = post_value1 post_value2);
merge have
      have (firstobs = 2 rename = (value = post_value1))
      have (firstobs = 3 rename = (value = post_value2));

if sum (value, post_value1, post_value2) > 0 then output;
run;
Super Contributor
Posts: 267

Re: remove last 0 rows(All trailing zeroes)

I do not understand your code, and it produces an error when I run it.

Respected Advisor
Posts: 3,845

Re: remove last 0 rows(All trailing zeroes)

Posted in reply to GeorgeSAS

MODIFY.

 

data have;
   input value @@;
   cards;
  371
  0
  145
   75    40
   41   19
    0   10
    2    0
    1    3
    999
   0   0
   0   0
   0   0
   0   0
   0
;
run;
proc print;
   run;
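data have;
   /* NOBS= sets j at compile time, so the loop walks backward from the last observation */
   do i=j by -1 to 1;
      modify have point=i nobs=j;
      /* remove trailing zeroes in place and stop at the first nonzero value */
      if value eq 0 then remove;
      else stop;
      end;
   stop;
   run;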
proc print;
   run;
Super Contributor
Posts: 267

Re: remove last 0 rows(All trailing zeroes)

Posted in reply to data_null__

Great solution! Very fancy code!

 

I have never used 'point=' in a data step, and I want to learn it from this example.

 

May I ask what the "point=i" here does? 

 

Thanks!

 

 

By the way, there will be a problem if the 'have' dataset was created in a different environment from the one running the update program.

(That is, if I created 'have' on UNIX and then run this code on a PC to update the dataset, the error below occurs.

But that is fine; I can run the update code on UNIX too.)

 

ERROR: File have cannot be updated because
its encoding does not match the session encoding or the
file is in a format native to another host, such as
HP_UX_64, RS_6000_AIX_64, SOLARIS_64, HP_IA64
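
One possible workaround, sketched here under assumptions: a data set in a foreign format can generally still be read, so it can be copied into a new data set in the current session's native format, and that copy can then be updated with MODIFY. The libref unixlib and the path are hypothetical placeholders.

```sas
/* Hypothetical libref pointing at the folder that holds the UNIX-created file */
libname unixlib 'path-to-folder';

/* Reading the foreign-format data set is allowed; the copy is written in the native format */
data work.have;
  set unixlib.have;
run;
```

The MODIFY step would then target work.have rather than the original file.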

Respected Advisor
Posts: 3,845

Re: remove last 0 rows(All trailing zeroes)

Posted in reply to GeorgeSAS

POINT= is a MODIFY statement option to name the variable that points to the observation being modified.

 

See the documentation for complete details.
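
As a small illustration of direct access, the following sketch reads a single observation by number using POINT= on a SET statement. It assumes the have data set from the question; the data set name third and the variable p are made up for illustration.

```sas
data third;
  p = 3;                /* observation number to read */
  set have point=p;     /* jump straight to observation 3 */
  output;
  stop;                 /* required: with POINT= the step never reaches end of file on its own */
run;
```

The variable named in POINT= is temporary and is not written to the output data set.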

Trusted Advisor
Posts: 1,311

Re: remove last 0 rows(All trailing zeroes)

Posted in reply to GeorgeSAS

I like the use of the MODIFY statement for removing observations in place (i.e. don't copy the original data set, just update in place).

 

However, be aware that the data set attribute NOBS (number of observations) is unchanged by REMOVE statements.  I think this is because SAS needs to know how much physical space is used by the dataset.  Since it would be inefficient for SAS to delete internal records by moving all the subsequent records to new locations within the data set, the removed observations are just marked as deleted, and the total physical space is not reduced by REMOVE.  However, NLOBS (number of logical observations, i.e. those not marked as deleted) is adjusted.
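
The two attributes can be compared after the MODIFY step has run. This is a minimal sketch using the ATTRN function, assuming the data set is named have; the variable names dsid, nobs, nlobs and rc are arbitrary.

```sas
data _null_;
  dsid = open('have');                 /* open the data set to query its attributes */
  if dsid > 0 then do;
    nobs  = attrn(dsid, 'NOBS');       /* physical observations, including those marked as deleted */
    nlobs = attrn(dsid, 'NLOBS');      /* logical observations, excluding those marked as deleted */
    put nobs= nlobs=;                  /* write both counts to the log */
    rc = close(dsid);
  end;
run;
```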

 

If you need to make a new dataset, which will have NOBS=NLOBS, you can use this program, which uses the same "point=" logic as @data_null__:

 

data want;
  if _n_=1 then do p=nrecs to 1 by -1 until(value^=0);
    set have point=p nobs=nrecs;
  end;
  set have;
  if _n_>p then stop;
run;
Super User
Posts: 10,689

Re: remove last 0 rows(All trailing zeroes)

Posted in reply to GeorgeSAS
data have;
input value;
cards;
  371
  0
  145
   75
   40
   41
   19
    0
   10
    2
    0
    1
    3
    999
   0
   0
   0
   0
   0
   0
   0
   0
   0
;
run;
data have; 
 set have;
 if value=0 then group=0;
  else group=1;
run;
data have;
 set have end=last;
 by group notsorted;
 n+first.group;
 if last then call symputx('n',n);
run;
data want;
 set have;
 if n=&n and value=0 then delete;
 drop group n;
run;
