Equivalent SAS code

Contributor
Posts: 37


What would be the equivalent SAS code for this Stata code?

 

xtile patient_xtile = totalpatients if  surveymiss !=1, nquantiles(4)

ta patient_xtile
bysort patient_xtile: sum totalpatients

 

Thanking you in advance.

Contributor
Posts: 37

Re: Equivalent SAS code

Posted in reply to pmpradhan

I'm getting slightly different results with the following code. Is the code below equivalent?

 

proc sort data=dataname;
by totalpatients;
run;

 

data test;

set dataname;

/* num is assumed to hold the total number of observations */
n_group=floor(_n_/(num/4));

if n_group=4 then n_group=3;

run;

 

proc means data=test;
class n_group;
var totalpatients;
run;

 

Please confirm!

Super User
Posts: 6,785

Re: Equivalent SAS code

Posted in reply to pmpradhan

Your calculations for N_GROUP look like they compute uneven group sizes:

 

n_group=floor(_n_/(num/4));

if n_group=4 then n_group=3;

 

Since I'm not a STATA user, I can't really tell the intent of the code.  But if you want 4 equal size groups, you could use:

 

data test;

set dataname nobs=_total_obs_;

n_group=ceil (4 * _n_ / _total_obs_);

run;

 

If you were looking for 5 equal size groups, just change "4" to "5".
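To see how the two formulas differ, both can be run side by side on a small data set and the group sizes tabulated. This is only a minimal sketch: the data set demo, its 20 made-up rows, and the variable names grp_floor and grp_ceil are assumptions for illustration, not part of the original data.

```sas
/* Stand-in data: 20 rows, already sorted by totalpatients (made up) */
data demo;
  do id = 1 to 20;
    totalpatients = 10 * id;
    output;
  end;
run;

data groups;
  set demo nobs=total_obs;
  /* floor-based formula from the question: values 0 to 3 */
  grp_floor = floor(_n_ / (total_obs / 4));
  if grp_floor = 4 then grp_floor = 3;
  /* ceil-based formula: values 1 to 4 */
  grp_ceil = ceil(4 * _n_ / total_obs);
run;

/* Compare the group sizes produced by each formula */
proc freq data=groups;
  tables grp_floor grp_ceil;
run;
```

With 20 rows, the floor version puts 4, 5, 5 and 6 observations into groups 0 to 3, while the ceil version puts 5 into each of groups 1 to 4. Note that both formulas assign groups by row position, so tied values can end up in different groups.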

Super User
Posts: 23,776

Re: Equivalent SAS code

Posted in reply to Astounding

nquantiles(4) makes me think you want quartiles, so PROC RANK with GROUPS=4 should be used.

EDIT: was going to link to a similar question from last week, but it was your question. I'm assuming they're related?

https://communities.sas.com/t5/Base-SAS-Programming/Quartiling-and-finding-the-average-in-each-quart...

 

The answer is the same.

 

proc rank data=sashelp.cars out=ranked groups=4;
var mpg_city;
ranks rank_mpg_city;
run;

proc means data=ranked noprint nway;
class rank_mpg_city;
var mpg_city;
output out=want mean(mpg_city)=avg_mpg_city_Quartiled;
run;

proc print data=ranked;
run;

Contributor
Posts: 37

Re: Equivalent SAS code

Yes, the post you referenced was mine. This time I wanted to reproduce results that another person produced in Stata. My results were close but not identical; a few numbers were off. I will try reorganizing the data. Thanks again!

Super User
Posts: 23,776

Re: Equivalent SAS code

Posted in reply to pmpradhan

Quantile/percentile calculations likely differ, especially if you have ties or not much data. In that case, compare the definition PROC RANK uses to form its groups with the one Stata uses, and see which definition you should be using.

 

There is no 'standard' method to calculate the percentiles - Excel will do it differently as well.
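As a rough illustration of how much the definition can matter, the quartile cut points can be computed under each of the five percentile definitions SAS offers (PCTLDEF=1 to 5 in PROC UNIVARIATE). This is a sketch on sashelp.cars as stand-in data; the macro name quartile_cuts and the output names are hypothetical, and which definition (if any) lines up with Stata's xtile should be checked against the Stata documentation. Keep in mind that PROC RANK with GROUPS= does not use these cut points at all; it assigns groups as floor(rank*k/(n+1)).

```sas
/* Hypothetical helper: quartile cut points under one percentile definition */
%macro quartile_cuts(def);
  proc univariate data=sashelp.cars pctldef=&def noprint;
    var mpg_city;
    /* creates variables p25, p50 and p75 */
    output out=cuts&def pctlpts=25 50 75 pctlpre=p;
  run;
%mend quartile_cuts;

%quartile_cuts(1)
%quartile_cuts(2)
%quartile_cuts(3)
%quartile_cuts(4)
%quartile_cuts(5)

/* Stack the five results; source_table records which definition each row came from */
data all_cuts;
  set cuts1 cuts2 cuts3 cuts4 cuts5 indsname=source;
  source_table = source;
run;

proc print data=all_cuts;
run;
```

If the cut points differ across definitions for your data, the quartile membership of observations near those cut points will differ too.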

 




 

Contributor
Posts: 37

Re: Equivalent SAS code

Posted in reply to Astounding

Thank you, Astounding. Since the number of observations in the dataset is not even, I had to do it that way. But I like your use of ceil too. Thank you!

Super User
Posts: 23,776

Re: Equivalent SAS code

Posted in reply to pmpradhan

Explain the logic and we can answer your question faster. Otherwise you need to wait for someone who understands Stata code.

Contributor
Posts: 37

Re: Equivalent SAS code

The logic in this particular case is to translate the Stata code so that I get the same results. As I replied above, I'm a few numbers away from an exact match. I will try sorting the data again and update this thread. I appreciate the community support here. Thanks, team!

Super User
Posts: 13,583

Re: Equivalent SAS code

Posted in reply to pmpradhan



Quite often with non-trivial cases the "same results" may not be possible due to things like internal rounding of values, precision of the hardware/software used or just plain differences in algorithms for approximations.

 

If you search this forum you will find a few questions about people getting different results between SAS version X.X and Y.Y where algorithms are tweaked between versions, or moving from one OS to another, especially the 32 bit vs. 64 bit versions of the same OS.

 

You may have to set a target for "close enough".
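One way to make "close enough" concrete is PROC COMPARE with an explicit tolerance. The sketch below assumes the per-quartile means from each package have been saved as two data sets; the names stata_result and sas_result, the numbers in them, and the 0.01 tolerance are all made up purely for illustration.

```sas
/* Made-up per-quartile means as they might come from Stata */
data stata_result;
  input quartile mean_patients;
  datalines;
1 12.500
2 25.750
3 40.100
4 88.300
;
run;

/* Made-up per-quartile means as they might come from SAS */
data sas_result;
  input quartile mean_patients;
  datalines;
1 12.504
2 25.750
3 40.600
4 88.300
;
run;

/* Values are judged unequal only if they differ by more than 0.01 */
proc compare base=stata_result compare=sas_result
             method=absolute criterion=0.01;
  id quartile;
run;
```

With METHOD=ABSOLUTE, only differences larger than the CRITERION= value are reported as unequal, so the tolerance you pick becomes your working definition of a match.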

PROC Star
Posts: 8,167

Re: Equivalent SAS code

Posted in reply to pmpradhan

I'm not familiar with Stata, but I would try a couple of changes to @Reeza's suggested code. If I'm reading your Stata code correctly, it appears that you're excluding cases where surveymiss is equal to 1. Also, it appears that the xtile command produces groups numbered 1 to 4, while PROC RANK with GROUPS=4 produces 0 to 3. As such, I'd try something like:

data cars;
  set sashelp.cars;
  if _n_ in (5,7,15,40) then surveymiss=1;
  else surveymiss=2;
run;

proc rank data=cars (where=(surveymiss ne 1)) out=ranked groups=4;
  var mpg_city;
  ranks rank_mpg_city;
run;

data ranked;
  set ranked;
  rank_mpg_city=rank_mpg_city+1;
run;

proc means data=ranked noprint nway;
  class rank_mpg_city;
  var mpg_city;
  output out=want mean(mpg_city)=avg_mpg_city_Quartiled;
run;

proc print data=ranked;
run;
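For the last two lines of the Stata code (ta and bysort ...: sum), PROC FREQ and PROC MEANS with a CLASS statement look like the closest counterparts. The following is a sketch on sashelp.cars as stand-in data; the variable name mpg_xtile is hypothetical, and the statistics listed are my reading of what Stata's sum prints by default.

```sas
/* Quartile groups, shifted from 0-3 to 1-4 to match xtile's numbering */
proc rank data=sashelp.cars out=ranked groups=4;
  var mpg_city;
  ranks mpg_xtile;
run;

data ranked;
  set ranked;
  mpg_xtile = mpg_xtile + 1;
run;

/* Rough counterpart of: ta patient_xtile */
proc freq data=ranked;
  tables mpg_xtile;
run;

/* Rough counterpart of: bysort patient_xtile: sum totalpatients */
proc means data=ranked n mean std min max;
  class mpg_xtile;
  var mpg_city;
run;
```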

Art, CEO, AnalystFinder.com

 
