0.00256 0.00288
0.00256 0.00381
0.00256 0.00475
0.00267 0.00568
0.00270 0.00662
0.00288 0.00756
0.00288 0.00849
0.00314 0.00943
0.00320 0.01037
0.00320 0.01130
0.00321 0.01224
0.00323 0.01317
0.00326 0.01411
0.00332 0.01505
0.00345 0.01598
0.00351 0.01692
0.00351 0.01785
0.00353 0.01879
0.00353 0.01973
0.00354 0.02066
0.00354 0.02160
0.00363 0.02254
0.00371 0.02347
0.00378 0.02441
0.00380 0.02534
0.00383 0.02628
Required output:
count
6 ---less then or equal to 0.00288
18----less then or equal to 0.00381
and so on.....
It appears that both of your columns are in ascending order. And you want column B to form boundaries for tabulating column A, correct? If so then you can interleave records sorted on the values in both a and b, keeping a count of records using A. Every time the value-in-hand comes from column B, subtract the prior count of A from the current count, and output.
This program uses the DIF functions (DIF(x)=x-lag(x)). Since the first DIF value will be missing, I use the expression
FREQ=min(counta,dif(counta)); so that the first FREQ is just a count, while subsequent FREQs are differences in counts:
data have;
input a b;
datalines;
0.00256 0.00288
0.00256 0.00381
0.00256 0.00475
0.00267 0.00568
0.00270 0.00662
0.00288 0.00756
0.00288 0.00849
0.00314 0.00943
0.00320 0.01037
0.00320 0.01130
0.00321 0.01224
0.00323 0.01317
0.00326 0.01411
0.00332 0.01505
0.00345 0.01598
0.00351 0.01692
0.00351 0.01785
0.00353 0.01879
0.00353 0.01973
0.00354 0.02066
0.00354 0.02160
0.00363 0.02254
0.00371 0.02347
0.00378 0.02441
0.00380 0.02534
0.00383 0.02628
run;
data want (keep=a freq);
set have (keep=a in=incolumna)
have (keep=b rename=(b=a) in=boundary);
by a;
counta+incolumna;
if boundary;
freq=min(counta,dif(counta));
run;
If your data are not sorted as I suggested, you could split the dataset into two datasets (one with column A values, one with column B values), sort each, and interleave them as above.
I am assunming every numer will be a value in its own observation and we'll call the variable " val". Than proc format is your friend like this:
proc format;
value ranges
0-0.00288='less than 0.00288'
0.00288-<0.00381='less than 0.00381'
other='other ranges can be added';
run;
proc freq data=have;
table val;
format val ranges.;
run;
Hope this helps,
-- Jan
Sorry, It not efficient code. I have lot of records. Any more suggestions.
This thread looks like a duplicate or extension of https://communities.sas.com/t5/Base-SAS-Programming/I-have-data-set-need-to-count-column-b-values-ho...
I used the sample you gave in that thread as floats gives me enormous eyestrain. If you can follow this code, you should be able to test on any sample:
data a;
input a b;
cards;
0.1 1
0.2 2
1 3
1 4
2 5
3 6
4 .
4 .
;
run;
data want;
if _n_=1 then do;
if 0 then set a(keep=a);
dcl hash H (dataset:'a(keep=a)',multidata:'y') ;
h.definekey ('a') ;
h.definedata ('a') ;
h.definedone () ;
dcl hash H1 (multidata:'y') ;
h1.definekey ('a') ;
h1.definedone () ;
dcl hiter i('h');
end;
set a(keep=b);
array t(100);/*arbitrary subscript for small sample, make it larger*/
call missing(of t(*));
where not missing(b);
count=0;
rc = i.first();
do _n_=1 by 1 while (rc = 0);
if h.check(key:b)=0 and ((a<=b and a in t) or (a<=b and h1.check() ne 0)) then count+1;
h1.add();
t(_n_)=a;
rc = i.next();
end;
drop t: a rc;
run;
It appears that both of your columns are in ascending order. And you want column B to form boundaries for tabulating column A, correct? If so then you can interleave records sorted on the values in both a and b, keeping a count of records using A. Every time the value-in-hand comes from column B, subtract the prior count of A from the current count, and output.
This program uses the DIF functions (DIF(x)=x-lag(x)). Since the first DIF value will be missing, I use the expression
FREQ=min(counta,dif(counta)); so that the first FREQ is just a count, while subsequent FREQs are differences in counts:
data have;
input a b;
datalines;
0.00256 0.00288
0.00256 0.00381
0.00256 0.00475
0.00267 0.00568
0.00270 0.00662
0.00288 0.00756
0.00288 0.00849
0.00314 0.00943
0.00320 0.01037
0.00320 0.01130
0.00321 0.01224
0.00323 0.01317
0.00326 0.01411
0.00332 0.01505
0.00345 0.01598
0.00351 0.01692
0.00351 0.01785
0.00353 0.01879
0.00353 0.01973
0.00354 0.02066
0.00354 0.02160
0.00363 0.02254
0.00371 0.02347
0.00378 0.02441
0.00380 0.02534
0.00383 0.02628
run;
data want (keep=a freq);
set have (keep=a in=incolumna)
have (keep=b rename=(b=a) in=boundary);
by a;
counta+incolumna;
if boundary;
freq=min(counta,dif(counta));
run;
If your data are not sorted as I suggested, you could split the dataset into two datasets (one with column A values, one with column B values), sort each, and interleave them as above.
Thanks alot.. I have being try this for two weeks thanks alot.. It really helpful me...
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.