10-27-2017 12:06 PM - edited 10-27-2017 12:29 PM
data have;
  input value @@;
  cards;
371 0 145 75 40 41 19 0 10 2 0 1 3 999 0 0 0 0 0 0 0 0 0
;
run;
What SAS logic can remove the trailing zero-value rows (in this example, all rows after the value 999)?
Here is my method; please advise:
data need1;
  set have;
  n=_n_;
run;
proc sort data=need1 out=need2;
  by descending n;
run;
data need3;
  retain flag 0;
  set need2 nobs=obs;
  do i=1 to obs;
    if value=0 then do;
      if flag=0 then delete;
    end;
    else do;
      flag=1;
    end;
  end;
run;
proc sort data=need3 out=need(keep=value);
  by n;
run;
10-27-2017 12:19 PM
There's really no way to do this in one step, since you have to keep reading in all the data to see if there is another nonzero.
Here's one approach:
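data have;
  input value;
  /* Remember the row number of the most recent nonzero value */
  if value ne 0 then call symputx('good_obs', _n_);
  cards;
...
;
run;
data want;
  /* Keep only the rows up to the last nonzero value */
  set have (obs=&good_obs);
run;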
10-27-2017 12:57 PM
You could use this technique. It seems to work in this case, and it's one data step.
* Merge dataset with itself, starting with subsequent values. This will get post values. Output if you do not have 2 trailing zeroes;
data want (drop = post_value1 post_value2);
  merge have
        have (firstobs = 2 rename = (value = post_value1))
        have (firstobs = 3 rename = (value = post_value2));
  if sum (value, post_value1, post_value2) > 0 then output;
run;
10-27-2017 01:24 PM
data have;
  input value @@;
  cards;
371 0 145 75 40 41 19 0 10 2 0 1 3 999 0 0 0 0 0 0 0 0 0
;
run;
proc print;
run;
data have;
  do i=j by -1 to 1;
    modify have point=i nobs=j;
    if value eq 0 then remove;
    else stop;
  end;
  stop;
run;
proc print;
run;
10-27-2017 03:28 PM - edited 10-27-2017 03:42 PM
Great solution! Very fancy code!
I have never used 'point=' in a data step, and I want to learn it from this example.
May I ask what the "point=i" here does?
By the way, there will be a problem if the 'have' dataset was created in a different environment from the one running the update program.
(That is, if I create 'have' on UNIX and then run this code on a PC to update the dataset, the error below occurs.
That is fine, though; I can run the update code on UNIX too.)
ERROR: File have cannot be updated because
its encoding does not match the session encoding or the
file is in a format native to another host, such as
HP_UX_64, RS_6000_AIX_64, SOLARIS_64, HP_IA64
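If running on the original host is not an option, one possible workaround is to copy the dataset into the current session first, since a dataset written by a DATA step is created in the session's own format. This is only a sketch: the libref src and its path are hypothetical placeholders for wherever the UNIX-created file lives.

```sas
/* Hypothetical libref pointing at the folder that holds the UNIX-created file */
libname src 'path-to-unix-created-library';

/* A DATA step copy writes WORK.HAVE in the current session's native format */
data work.have;
  set src.have;
run;
```

The in-place update can then be run against work.have instead of the foreign file.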
10-27-2017 03:38 PM
POINT= is a MODIFY statement option to name the variable that points to the observation being modified.
See the documentation for complete details.
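As a small illustration of direct access, here is a sketch that uses POINT= on a SET statement to read the rows of 'have' from last to first. It assumes 'have' is the dataset as originally created (no removed rows); the names reversed, p and nrecs are made up for this example.

```sas
data reversed;
  /* nrecs is set from NOBS= before the step starts executing */
  do p = nrecs to 1 by -1;
    set have point=p nobs=nrecs;   /* read observation number p directly */
    output;
  end;
  stop;   /* direct access never reaches end-of-file, so stop explicitly */
run;
```

The variables named in POINT= and NOBS= are temporary, so they are not written to the output dataset.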
10-27-2017 05:19 PM
I like the use of the MODIFY statement for removing observations in place (i.e. don't copy the original data set, just update in place).
However, be aware that the data set attribute NOBS (number of observations) is unchanged by REMOVE statements. I think this is because SAS needs to know how much physical space is used by the dataset. And since it would be inefficient for SAS to delete internal records by overwriting all the subsequent records to new locations within the data set, they are just marked as removed, and total physical space is not reduced by REMOVE. However, the NLOBS (number of logical records) is adjusted.
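To see the two counts for yourself, a sketch like the following could be run after the MODIFY step. It uses the ATTRN function with its NOBS and NLOBS attributes; the macro variable names dsid and rc are arbitrary.

```sas
/* Open the dataset and get a dataset identifier */
%let dsid = %sysfunc(open(have));

/* NOBS counts physical observations, NLOBS counts those not marked as removed */
%put NOBS=%sysfunc(attrn(&dsid, nobs)) NLOBS=%sysfunc(attrn(&dsid, nlobs));

/* Always close the dataset again */
%let rc = %sysfunc(close(&dsid));
```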
If you need to make a new dataset, which will have NOBS=NLOBS, you can use this program, which uses the same "point=" logic as @data_null__:
data want;
  if _n_=1 then do p=nrecs to 1 by -1 until(value^=0);
    set have point=p nobs=nrecs;
  end;
  set have;
  if _n_>p then stop;
run;
10-29-2017 08:03 AM
data have;
  input value @@;
  cards;
371 0 145 75 40 41 19 0 10 2 0 1 3 999 0 0 0 0 0 0 0 0 0
;
run;
data have;
  set have;
  if value=0 then group=0;
  else group=1;
run;
data have;
  set have end=last;
  by group notsorted;
  n+first.group;
  if last then call symputx('n',n);
run;
data want;
  set have;
  if n=&n and value=0 then delete;
  drop group n;
run;