BookmarkSubscribeRSS Feed
Shradha1
Obsidian | Level 7
I have a list of transaction amounts. From there I want to select only those observations where the transaction amount is in round figure, like 250, 500, 1000, 2000, etc.
If I take only cases where mod(x,10)=0 then it also keeps transactions like 1410, 1280, but I do not want these, I want round figures like 250, 300, 500,etc.
How do I proceed? I am using SAS 9.4.
8 REPLIES 8
mkeintz
PROC Star

Editted note:  Apologies:  I have misunderstood your problem.  You can ignore everything below.  I leave it for the curious.

 

 

 

 

You apparently want all values that are multiple of 250 times 2 to an integer power.  I.e.

  • x=250 =   250* (2**0)
  • x=500 =   250* (2**1)
  • x=1000 = 250* (2**2)
  • etc.

So, if the log2 of (x/250) is a non-negative integer, you would satisfy this requirement:

 

data have;
  do x=1 to 16000; output; end;
run;

data want;
  set have;
  where log2(x/250) = floor(log2(x/250));
  where also log2(x/250)>=0;
  put x=;
run;

Of course, at some point there may be a numeric precision issue, but it works correctly at least up to 250*(2**20).  When I replaced 16000 above by 250*(2**20), I got the log below, taking about 40 seconds to run: 

 

6890   data have;
6891     do x=1 to 250*(2**20); output; end;
6892   run;

NOTE: The data set WORK.HAVE has 262144000 observations and 1 variables.
NOTE: DATA statement used (Total process time):
      real time           5.01 seconds
      cpu time            5.00 seconds


6893
6894   data want;
6895     set have;
6896     where log2(x/250) = floor(log2(x/250));
6897     where also log2(x/250)>=0;
NOTE: WHERE clause has been augmented.
6898     put x=;
6899   run;

x=250
x=500
x=1000
x=2000
x=4000
x=8000
x=16000
x=32000
x=64000
x=128000
x=256000
x=512000
x=1024000
x=2048000
x=4096000
x=8192000
x=16384000
x=32768000
x=65536000
x=131072000
x=262144000
NOTE: There were 21 observations read from the data set WORK.HAVE.
      WHERE (LOG2((x/250))=FLOOR(LOG2((x/250)))) and (LOG2((x/250))>=0);
NOTE: The data set WORK.WANT has 21 observations and 1 variables.
NOTE: DATA statement used (Total process time):
      real time           35.17 seconds
      cpu time            35.15 seconds

And why do I know this is correct?  Because there are 21 resulting observations, representing powers of 2 from 0 through 20.

 

 

Here's a faster way: make a list of desired value in an array.  Then you don't have to bother with generating logs.  The data step below took 10 seconds (vs 35 seconds above):

 

data want;
  array testvalues {0:20} _temporary_;
  if _n_=1 then do p=0 to 20;
    testvalues{p}=250*(2**p);
  end;

  set have;
  if  whichn(x,of testvalues{*});
  put x=;
run;

 

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
PaigeMiller
Diamond | Level 26

@mkeintz wrote:

You apparently want all values that are multiple of 250 times 2 to an integer power.  I.e.

  • x=250 =   250* (2**0)
  • x=500 =   250* (2**1)
  • x=1000 = 250* (2**2)
  • etc.

Except that the OP @Shradha1 stated in his/her text (which doesn't agree with the title, and I always find that confusing) that he/she wants to select 300 as well.

 

Which brings me back to my comment. The OP needs to define "round figures".

--
Paige Miller
mkeintz
PROC Star
Yes, just realized that. I put a disclaimer at the top of my note. THX.
--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
PhilC
Rhodochrosite | Level 12

I agree the definition of "round" is not standard.  What I think is being described is a rounding, in terms of scientific notation, where the significand or mantissa is rounded to the nearest 0.5.  If that is so, I find the mantissa, round, and convert back to normal, non-scientific-notation form.

 

 

%macro RandBetween(min, max);
   (&min + floor((1+&max-&min)*rand("uniform")))
%mend;

data want ;
  do i = 1 to 10;
    p=%RandBetween(2,nobs);
    set SASHELP.GEOEXM(Keep=last) point=p nobs=nobs;
    last_power = 10**floor(LOG10(last));
      drop last_power ;
    last_new = round(last,0.5*last_power);
    output;
  end;
  stop;
proc sort;
  by last;
data want;
  set want;
  i=_N_;
run;

 

i	Last	last_new
1	6187	6000
2	9933	10000
3	15660	15000
4	36662	35000
5	45234	45000
6	47104	45000
7	64690	65000
8	72860	75000
9	76560	75000
10	124603	100000

 

PaigeMiller
Diamond | Level 26

What I think is being described is a rounding, in terms of scientific notation, where the significand or mantissa is rounded to the nearest 0.5.

How can you tell that from the contradictory problem explanation?

--
Paige Miller
PhilC
Rhodochrosite | Level 12

A problem explanation than can be contradictory implies a problem that was written in a mathematical logic.  Our friend gave us an example of what they are looking for.  I formed an opinion of what they wanted, and offered an my best educated guess.  I though it was a better answer than Mr. @mkeintz , before he retracted it.

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 8 replies
  • 5369 views
  • 1 like
  • 5 in conversation