PaigeMiller
Diamond | Level 26

@Cruise wrote:

@Reeza @PaigeMiller 

 

I ended up using Reeza's approach, since it takes a percentile of the rows. Paige's approach marked only 1 to 100 rows in total out of the 27,440 rows.

 

REEZA.png


@Cruise, you never mentioned that you had 27,440 observations. First you showed 12 observations, and later you showed 100. We could have reached the correct answer much more quickly had we known from the start that you wanted to do this on an arbitrarily sized data set.

 

In particular, I point out that pct32 has 31 percent of the observations, not 32 percent. You seemed to indicate earlier that you want 32 percent of the observations in pct32, so again it's not clear to me exactly what result you want. But if you can accept pct32 covering 31 percent of the observations, then my very first solution provides that answer as well.

--
Paige Miller
Cruise
Ammonite | Level 13

@PaigeMiller 

I truly appreciate your pointing out that pct31 and pct32 are both associated with 31% of the data. I apologize for not being clear. In reality my sample sizes differ depending on the anatomical site of the body; N=27,440 is for the pancreas. Your rank and rand approach yielded exactly matching results. Is it possible to convert your code to percentiles so that, for example, pct31 covers exactly 31% of the data and pct32 covers exactly 32%? Again, sorry for not being clear. Big lesson learned here.

Cruise
Ammonite | Level 13

@PaigeMiller @Reeza 

As Paige pointed out, I checked for all the discordant cases, and the Bernoulli distribution doesn't provide the exact % of the data on the following 5 occasions. I have varying sample sizes for the different anatomical sites of the disease.

 

DISCORDANT.png
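
One plausible reading of the discordant cases is that independent Bernoulli draws hit the target percent only on average, so the realized count varies from run to run. Below is a minimal sketch of that effect, assuming the flags were generated with something like rand('bernoulli', p). The exact code used earlier in the thread is not shown here, and the data set name bern, the variable flag and the seed are hypothetical.

```sas
/* sketch: Bernoulli flags give about 32% of rows, not exactly 32% */
data bern;
    call streaminit(12345);            /* arbitrary seed, for reproducibility only */
    do id=1 to 27440;
        flag=rand('bernoulli', 0.32);  /* each row is marked independently */
        output;
    end;
run;
/* SUM is the number of rows marked, MEAN is the realized fraction */
proc means data=bern n sum mean;
    var flag;
run;
```

With N=27,440 and p=0.32 the number of marked rows is binomial, with an expected value of 8780.8 and a standard deviation of roughly 77 rows, so an exact match would be the exception rather than the rule.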

PaigeMiller
Diamond | Level 26


A slight modification to my earlier solution will work for N=27,440 (or any other value of N), and gives every percent variable the exact number of 1s that you want, to within round-off error. (Note that for an arbitrary N, 32% of N may not be an integer; these methods can't overcome that.)

 

/* example data: 27,440 rows, each with a uniform random number */
data have;
    do id=1 to 27440;
        rand=rand('uniform');
        output;
    end;
run;
/* rank the random numbers from 1 to n */
proc rank data=have out=ranked;
    var rand;
    ranks ranks;
run;
/* this next PROC SUMMARY is used if you want SAS to determine n for any data set, even though */
/* for this example we know that n = 27440 */
proc summary data=ranked;
    var id;
    output out=n n=n;
run;
data want;
    if _n_=1 then set n;
    set ranked;
    array mark mark1-mark99;
    do i=1 to dim(mark);
        if ranks<=i*n/100 then mark(i)=1; /* here we use the computed value of n */
        else mark(i)=0;
    end;
run;
/* the mean of each 0/1 variable is the fraction of rows marked */
proc summary data=want;
    var mark1-mark99;
    output out=stats mean=;
    format mark1-mark99 percent7.1;
run;
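
To make the round-off caveat concrete, here is a small sketch of the arithmetic behind the cutoff ranks<=i*n/100, assuming the ranks are the distinct integers 1 to n (no ties among the random numbers). The data set name cutoffs and the variables cutoff, count and achieved are made up for this illustration and are not part of the solution itself.

```sas
/* sketch: rows marked for a given percent i, assuming ranks run 1..n with no ties */
data cutoffs;
    n=27440;                  /* pancreas sample size from this thread */
    do i=31, 32;
        cutoff=i*n/100;       /* same threshold as in the DATA step above */
        count=floor(cutoff);  /* number of integer ranks at or below the threshold */
        achieved=count/n;     /* fraction of rows actually marked */
        output;
    end;
    format achieved percent9.4;
run;
proc print data=cutoffs noobs;
run;
```

For N=27,440 the cutoffs are 8506.4 and 8780.8, so 8,506 rows are marked for 31% and 8,780 rows for 32%. That is about 30.9985% and 31.9971% of the rows, which the PERCENT7.1 format displays as 31.0% and 32.0%.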
--
Paige Miller
Cruise
Ammonite | Level 13
I tried this method on three different datasets. In all of the outputs, every percent variable had the exact number of values. Thank you!
