GeorgeSAS
Lapis Lazuli | Level 10
data have;
input value;
cards;
  371
  0
  145
   75
   40
   41
   19
    0
   10
    2
    0
    1
    3
    999
   0
   0
   0
   0
   0
   0
   0
   0
   0
;
run;

What SAS logic can remove the trailing rows whose value is zero? (In this example, remove all rows after the row where value is 999.)

 

Thanks!

 

Here is my method; please advise:

data need1;
  set have;
  n=_n_;
run;
proc sort data=need1 out=need2;
  by descending n;
run;
data need3;
  retain flag 0;
  set need2 nobs=obs;
  do i=1 to obs;
    if value=0 then do;
      if flag=0 then delete;
    end;
    else do;
      flag=1;
    end;
  end;
run;
proc sort data=need3 out=need(keep=value);
  by n;
run;
Reeza
Super User

What's the logic/rule?

mkeintz
PROC Star
What is your criterion? All trailing zeroes? All zeroes after 999? After the last 999?
--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
GeorgeSAS
Lapis Lazuli | Level 10

All trailing zeroes.

Reeza
Super User

Reverse your data, delete all the first zeros and then reverse it back.

Astounding
PROC Star

There's really no way to do this in one step, since you have to keep reading in all the data to see if there is another nonzero.

 

Here's one approach:

 

data have;
  input value;
  if value ne 0 then call symputx('good_obs', _n_);
cards;
...
;
data want;
  set have (obs=&good_obs);
run;

 

GeorgeSAS
Lapis Lazuli | Level 10

This is a good, fancy solution.

 

Also, using SYMPUTX instead of SYMPUT takes the additional step of removing leading and trailing blanks.

 

Thanks!

Rwon
Obsidian | Level 7

You could use this technique. It seems to work in this case, and it's one DATA step.

/* Merge the data set with itself, offset by one and two rows, to look ahead at the
   next two values. Output a row unless it and the next two values are all zero.
   Assumes no interior run of three or more zeros and no negative values. */
data want (drop = post_value1 post_value2);
merge have
      have (firstobs = 2 rename = (value = post_value1))
      have (firstobs = 3 rename = (value = post_value2));

if sum (value, post_value1, post_value2) > 0 then output;
run;
GeorgeSAS
Lapis Lazuli | Level 10

I don't understand your code, and it produces an error when I run it.

data_null__
Jade | Level 19

MODIFY.

 

data have;
   input value @@;
   cards;
  371
  0
  145
   75    40
   41   19
    0   10
    2    0
    1    3
    999
   0   0
   0   0
   0   0
   0   0
   0
;
run;
proc print;
   run;
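data have;
   /* j = number of observations (NOBS= is set at compile time); walk backward from the last one */
   do i=j by -1 to 1;
      modify have point=i nobs=j;   /* read observation i directly */
      if value eq 0 then remove;    /* trailing zero: delete it in place */
      else stop;                    /* first nonzero from the end: done */
      end;
   stop;                            /* needed when every value is zero */
   run;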
proc print;
   run;
GeorgeSAS
Lapis Lazuli | Level 10

Great solution! Very fancy code!

 

I have never used 'point=' in a data step, and I want to learn it from this example.

 

May I ask what "point=i" does here?

 

Thanks!

 

 

By the way, there is a problem if the 'have' data set was created in a different environment from the one running the update program. That is, if I create 'have' on UNIX and then run this code on a PC to update the data set, the error below occurs. (That is fine, though; I can run the update code on UNIX too.)

 

ERROR: File have cannot be updated because
its encoding does not match the session encoding or the
file is in a format native to another host, such as
HP_UX_64, RS_6000_AIX_64, SOLARIS_64, HP_IA64

data_null__
Jade | Level 19

POINT= is a MODIFY statement option to name the variable that points to the observation being modified.

 

See the documentation for complete details.
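
As a small illustration of direct access (a sketch only, not part of the solution above): POINT= is also available on the SET statement, where it works the same way. The data set DEMO and the variable names i and n below are made up for this example. The step reads DEMO from the last observation to the first, so REVERSED should hold the rows in reverse order.

```sas
/* Hypothetical three-row data set used only for this illustration. */
data demo;
   input value @@;
   cards;
10 20 30
;
run;

/* Read DEMO backward: i names the observation number to fetch. */
data reversed;
   do i = n to 1 by -1;
      set demo point=i nobs=n;   /* n is filled in at compile time */
      output;
   end;
   stop;   /* POINT= never reaches end-of-file, so stop explicitly */
run;
```

The variables named in POINT= and NOBS= are temporary and are not written to the output data set, which is why no DROP is needed here.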

mkeintz
PROC Star

I like the use of the MODIFY statement for removing observations in place (i.e. don't copy the original data set, just update in place).

 

However, be aware that the data set attribute NOBS (number of observations) is unchanged by REMOVE statements.  I think this is because SAS needs to know how much physical space is used by the data set.  Since it would be inefficient for SAS to delete internal records by rewriting all the subsequent records to new locations within the data set, the records are just marked as removed, and total physical space is not reduced by REMOVE.  However, NLOBS (number of logical observations) is adjusted.

 

If you need to make a new dataset, which will have NOBS=NLOBS, you can use this program, which uses the same "point=" logic as @data_null__:

 

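data want;
  /* First iteration only: scan backward to find p, the position of the last nonzero value */
  if _n_=1 then do p=nrecs to 1 by -1 until(value^=0);
    set have point=p nobs=nrecs;
  end;
  set have;               /* then read sequentially from the top */
  if _n_>p then stop;     /* stop before writing any trailing zero */
run;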
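
To see the NOBS/NLOBS difference described above for yourself, here is a minimal sketch. It assumes a Base SAS data set; DEMO and the variable names are made up for the illustration. It removes trailing zeros in place with MODIFY, then reads both attributes with the ATTRN function and writes them to the log. If REMOVE behaves as described, NOBS should stay at the original count while NLOBS is smaller.

```sas
/* Hypothetical five-row data set with two trailing zeros. */
data demo;
   input value @@;
   cards;
5 0 7 0 0
;
run;

/* Remove trailing zeros in place (same POINT= logic as above). */
data demo;
   do i = n to 1 by -1;
      modify demo point=i nobs=n;
      if value eq 0 then remove;
      else stop;
   end;
   stop;
run;

/* Report physical (NOBS) and logical (NLOBS) observation counts. */
data _null_;
   dsid  = open('demo');
   nobs  = attrn(dsid, 'NOBS');
   nlobs = attrn(dsid, 'NLOBS');
   rc    = close(dsid);
   put nobs= nlobs=;
run;
```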
Ksharp
Super User
data have;
input value;
cards;
  371
  0
  145
   75
   40
   41
   19
    0
   10
    2
    0
    1
    3
    999
   0
   0
   0
   0
   0
   0
   0
   0
   0
;
run;
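/* Flag each row: 0 for a zero value, 1 otherwise */
data have;
 set have;
 if value=0 then group=0;
  else group=1;
run;
/* Number each consecutive run of equal flags; save the last run's number in &n */
data have;
 set have end=last;
 by group notsorted;
 n+first.group;
 if last then call symputx('n',n);
run;
/* Drop the last run only if it is a run of zeros */
data want;
 set have;
 if n=&n and value=0 then delete;
 drop group n;
run;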
