Solved: how to find count of the following ?

rajeshalwayswel · Posted 05-21-2018 11:55 AM

0.00256 0.00288
0.00256 0.00381
0.00256 0.00475
0.00267 0.00568
0.00270 0.00662
0.00288 0.00756
0.00288 0.00849
0.00314 0.00943
0.00320 0.01037
0.00320 0.01130
0.00321 0.01224
0.00323 0.01317
0.00326 0.01411
0.00332 0.01505
0.00345 0.01598
0.00351 0.01692
0.00351 0.01785
0.00353 0.01879
0.00353 0.01973
0.00354 0.02066
0.00354 0.02160
0.00363 0.02254
0.00371 0.02347
0.00378 0.02441
0.00380 0.02534
0.00383 0.02628

Required output:

count

6 ---less then or equal to 0.00288

18----less then or equal to 0.00381

and so on.....

mkeintz · Posted 05-26-2018 07:41 PM

It appears that both of your columns are in ascending order. And you want column B to form boundaries for tabulating column A, correct? If so then you can interleave records sorted on the values in both a and b, keeping a count of records using A. Every time the value-in-hand comes from column B, subtract the prior count of A from the current count, and output.

This program uses the DIF functions (DIF(x)=x-lag(x)). Since the first DIF value will be missing, I use the expression

FREQ=min(counta,dif(counta)); so that the first FREQ is just a count, while subsequent FREQs are differences in counts:

data have;
  input a b;
datalines;
0.00256 0.00288
0.00256 0.00381
0.00256 0.00475
0.00267 0.00568
0.00270 0.00662
0.00288 0.00756
0.00288 0.00849
0.00314 0.00943
0.00320 0.01037
0.00320 0.01130
0.00321 0.01224
0.00323 0.01317
0.00326 0.01411
0.00332 0.01505
0.00345 0.01598
0.00351 0.01692
0.00351 0.01785
0.00353 0.01879
0.00353 0.01973
0.00354 0.02066
0.00354 0.02160
0.00363 0.02254
0.00371 0.02347
0.00378 0.02441
0.00380 0.02534
0.00383 0.02628
run;

data want (keep=a freq);
  set have (keep=a in=incolumna)
      have (keep=b rename=(b=a) in=boundary);
  by a;

  counta+incolumna;
  if boundary;
  freq=min(counta,dif(counta));
run;

If your data are not sorted as I suggested, you could split the dataset into two datasets (one with column A values, one with column B values), sort each, and interleave them as above.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

View solution in original post

jklaverstijn · Posted 05-21-2018 12:26 PM

I am assunming every numer will be a value in its own observation and we'll call the variable " val". Than proc format is your friend like this:

proc format;
value ranges
   0-0.00288='less than 0.00288'
   0.00288-<0.00381='less than 0.00381'
   other='other ranges can be added';
run;
proc freq data=have;
   table val;
   format val ranges.;
run;

Hope this helps,

-- Jan

rajeshalwayswel · Posted 05-26-2018 01:36 PM

Sorry, It not efficient code. I have lot of records. Any more suggestions.

novinosrin · Posted 05-21-2018 01:45 PM

This thread looks like a duplicate or extension of https://communities.sas.com/t5/Base-SAS-Programming/I-have-data-set-need-to-count-column-b-values-ho...

I used the sample you gave in that thread as floats gives me enormous eyestrain. If you can follow this code, you should be able to test on any sample:

data a;
input a b;
cards;
0.1 1
0.2 2
1   3
1   4
2   5
3   6
4	.
4	.
;
run;


data want;
if _n_=1 then do;
if 0 then set a(keep=a);
 dcl hash H (dataset:'a(keep=a)',multidata:'y') ;
   h.definekey  ('a') ;
   h.definedata ('a') ;
    h.definedone () ;
dcl hash H1 (multidata:'y') ;
   h1.definekey  ('a') ;
   h1.definedone () ;
 dcl hiter i('h');
end;
set a(keep=b);
array t(100);/*arbitrary subscript for small sample, make it larger*/
call missing(of t(*));
where not missing(b);
count=0;
rc = i.first();
do _n_=1 by 1 while (rc = 0);
if h.check(key:b)=0 and ((a<=b and a  in t) or (a<=b and h1.check() ne 0))  then count+1;
h1.add();
t(_n_)=a;
rc = i.next();
end;
drop t: a rc;
run;

mkeintz · Posted 05-26-2018 07:41 PM

It appears that both of your columns are in ascending order. And you want column B to form boundaries for tabulating column A, correct? If so then you can interleave records sorted on the values in both a and b, keeping a count of records using A. Every time the value-in-hand comes from column B, subtract the prior count of A from the current count, and output.

This program uses the DIF functions (DIF(x)=x-lag(x)). Since the first DIF value will be missing, I use the expression

FREQ=min(counta,dif(counta)); so that the first FREQ is just a count, while subsequent FREQs are differences in counts:

data have;
  input a b;
datalines;
0.00256 0.00288
0.00256 0.00381
0.00256 0.00475
0.00267 0.00568
0.00270 0.00662
0.00288 0.00756
0.00288 0.00849
0.00314 0.00943
0.00320 0.01037
0.00320 0.01130
0.00321 0.01224
0.00323 0.01317
0.00326 0.01411
0.00332 0.01505
0.00345 0.01598
0.00351 0.01692
0.00351 0.01785
0.00353 0.01879
0.00353 0.01973
0.00354 0.02066
0.00354 0.02160
0.00363 0.02254
0.00371 0.02347
0.00378 0.02441
0.00380 0.02534
0.00383 0.02628
run;

data want (keep=a freq);
  set have (keep=a in=incolumna)
      have (keep=b rename=(b=a) in=boundary);
  by a;

  counta+incolumna;
  if boundary;
  freq=min(counta,dif(counta));
run;

If your data are not sorted as I suggested, you could split the dataset into two datasets (one with column A values, one with column B values), sort each, and interleave them as above.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

rajeshalwayswel · Posted 05-27-2018 02:49 AM

Thanks alot.. I have being try this for two weeks thanks alot.. It really helpful me...

how to find count of the following ?

Re: how to find count of the following ?

Re: how to find count of the following ?

Re: how to find count of the following ?

Re: how to find count of the following ?

Re: how to find count of the following ?

Re: how to find count of the following ?

Catch up on SAS Innovate 2026

how to find count of the following ?

Re: how to find count of the following ?

Re: how to find count of the following ?

Re: how to find count of the following ?

Re: how to find count of the following ?

Re: how to find count of the following ?

Re: how to find count of the following ?

Catch up on SAS Innovate 2026

SAS Training: Just a Click Away