BookmarkSubscribeRSS Feed
Flyer2345
Calcite | Level 5

Hi, 

 

I have a financial dataset which is split by 2018Q1 and 2019Q1 and I need to find the difference of Median between two datasets. I can use proc means and by YearQtr however it brings PROC MEANS results for 2018Q1 and 2019Q1 and I have to do manually difference between two datasets. I am looking for a Macro that basically takes PROC Means results and put them in Array and then take difference and spit out MEDIAN for first dataset (i.e. 2018Q1), MEDIAN for second dataset (i.e. 2019Q1) and associated difference.

 

Thanks much in advance.

7 REPLIES 7
Rick_SAS
SAS Super FREQ

 

I suggest you concatenate the data into a single data set that has an indicator variable Quarter that has values "2018Q1" and "2019Q1".

You can then use the CLASS statement to analyze both quarters in the same procedure call. For example, the following gives medians for males and females in the same data set:

 

proc means data=sashelp.class median;
class sex;
var height;
output out=Medians median=;
run;

data diff;
set Medians;
where _TYPE_ = 1;
diff = dif(Height);
run;

proc print data=diff; 
run;

If you are a statistician, this approach also permits more sophisticated analyses. For example, instead of merely computing the difference between the group medians, you can ask whether the difference is statistically significant and get confidence intervals for the difference. This might be more than you need, but if you are interested, read about "The difference of medians in SAS"

 

and adapt the following code:

 

proc quantreg data=Sashelp.Class;
   class sex;
   model height = sex / quantile=0.5;
   estimate 'Diff in Medians' sex 1 -1 / CL;
run;

 

 

 

Flyer2345
Calcite | Level 5
You brought a very good point. Actually I am looking for calculating confidence level between variables in datasets. I do have joined the data so my dataset does contain Year & Quarter together. In order to find the overlap of Mean in two datasets, I was thinking to take difference automatically thru code instead of visually looking at it. Also, I have more than 15 variables to compare for Null Hypothesis & Statistically signficance. What do you recommend regarding checking confidence level and whether Null Hypothesis is correct or not?
Rick_SAS
SAS Super FREQ

The article I linked to has a discussion. It also provides links to the doc for PROC NPAR1WAY, which contains many tests for medians, including the WILCOXON option requests the Wilcoxon test with the MEDIAN option, which requests the median test for difference in location.

 

Be sure to read the section of the NPAR1WAY doc about multiple comparisons and the DSCF option. You need to adjust for multiple comparisons if you have 15 class levels.

Levi_M
Fluorite | Level 6

would you characterize proc quantreg as "median difference was determined by using quantile regression"? Would you expect to see similar significance values as mannU? Is there much a difference between the two?

Rick_SAS
SAS Super FREQ

> would you characterize proc quantreg as "median difference was determined by using quantile regression"?

Yes.

 

> Would you expect to see similar significance values as mannU?

In many cases, yes. But for some data, one MannU might be significant and QUANTREG be no significant, or vice versa. For an example that uses normality tests, see "How often do different statistical tests agree?"

 

> Is there much a difference between the two?

Mathematically? Yes. In terms of testing the median difference, both should be usable in practice and help guide to you make valid inferences.

 

Is there much a difference between the two?

Levi_M
Fluorite | Level 6

Thank you for taking time to reply. I just posted a new question regrading this topic, with my data, code, and results. If you do have a chance, I would appreciate your thoughts. 

 

https://communities.sas.com/t5/Statistical-Procedures/Quantile-Regression-and-MannU-test-for-median-...

Reeza
Super User

You need to provide more details but taking the difference is a trivial operation that may not need a macro.

 

  • What makes you think you need a macro? Most of the time it's simpler to not use macros, they're harder to debug and maintain over time. 
  • How many variables are you dealing with?If one variable this is easy, if for 200, slightly different approach needed.
  • Do the variables have any naming conventions? This can make it easier to code.
  • How many observations? Larger data may need a different approach for efficiency. 
  • Do you need to do this dynamically in the future, ie auto calculate the latest quarter and difference those?
  • What version of SAS are you using? In SAS 9.4+ the Median function is valid in SQL, it's not available in prior versions. 

@Flyer2345 wrote:

Hi, 

 

I have a financial dataset which is split by 2018Q1 and 2019Q1 and I need to find the difference of Median between two datasets. I can use proc means and by YearQtr however it brings PROC MEANS results for 2018Q1 and 2019Q1 and I have to do manually difference between two datasets. I am looking for a Macro that basically takes PROC Means results and put them in Array and then take difference and spit out MEDIAN for first dataset (i.e. 2018Q1), MEDIAN for second dataset (i.e. 2019Q1) and associated difference.

 

Thanks much in advance.


 

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 7 replies
  • 2975 views
  • 3 likes
  • 4 in conversation