
What Is the Easiest Way to Understand Whether Two Populations' Ranges Are Consistent?

Super Contributor
Posts: 395
Hello everyone,

 

I am trying to analyze the PSI (Population Stability Index) and SSI (Stability Statistic Index) between two data sets. One of them is the Large Small & Medium Business population; the other is the Commercial & Corporate population. I am doing this analysis to understand whether the ranges of the two data sets are consistent.
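
For reference, PSI is commonly written as below. The symbols are notation assumed here, not taken from the post: B is the number of bins, and a_i and e_i are the shares of observations falling in bin i in the two populations being compared.

```latex
\mathrm{PSI} = \sum_{i=1}^{B} \left(a_i - e_i\right) \ln\!\left(\frac{a_i}{e_i}\right)
```

Every term is non-negative, so PSI is zero only when the two binned distributions match exactly, and it is undefined when a bin is empty in either population.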

 

First, is this a correct approach, or is there a better one?

 

Second, I have two sample data sets, shown below. The real data sets also contain the model variables, but I have not included all of them here. My question is: how can I pull just one YearMonth for every Year and then compare the two data sets?

 

/* Sample Small & Medium Business data: one row per customer per month. */
Data DataSmall;
Length CustomerID 8 YearMonth $ 10 Year $ 10 Turnover 8;
Infile Datalines Missover;
Input CustomerID YearMonth Year Turnover;
Datalines;
001 201001 2010 70000
001 201002 2010 70000
001 201003 2010 70000
001 201004 2010 70000
001 201005 2010 70000 
001 201006 2010 70000
001 201007 2010 70000
001 201008 2010 70000
001 201009 2010 70000
001 201010 2010 70000
001 201011 2010 70000
001 201012 2010 70000
001 201101 2011 80000
001 201102 2011 80000
001 201103 2011 80000
001 201104 2011 80000
001 201105 2011 80000
001 201106 2011 80000
001 201107 2011 80000
001 201108 2011 80000
001 201109 2011 80000
001 201110 2011 80000
001 201111 2011 80000
001 201112 2011 80000
;
Run;

/* Sample Commercial & Corporate data: one row per customer per month. */
Data DataCommercial;
Length CustomerID 8 YearMonth $ 10 Year $ 10 Turnover 8;
Infile Datalines Missover;
Input CustomerID YearMonth Year Turnover;
Datalines;
003 201001 2010 9000000
003 201002 2010 9000000
003 201003 2010 9000000
003 201004 2010 9000000
003 201005 2010 9000000 
003 201006 2010 9000000
003 201007 2010 9000000
003 201008 2010 9000000
003 201009 2010 9000000
003 201010 2010 9000000
003 201011 2010 9000000
003 201012 2010 9000000
003 201101 2011 10000000
003 201102 2011 10000000
003 201103 2011 10000000
003 201104 2011 10000000
003 201105 2011 10000000
003 201106 2011 10000000
003 201107 2011 10000000
003 201108 2011 10000000
003 201109 2011 10000000
003 201110 2011 10000000
003 201111 2011 10000000
003 201112 2011 10000000
;
Run;

 

I just want to keep one YearMonth for every Year and then run the PSI and SSI analysis. It is like taking a random sample stratified by CustomerID and Year.
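
One way to read "a random sample by CustomerID and Year" is to attach a random number to every row and keep the row with the smallest number in each group. The sketch below assumes that reading; the seed 12345 and the names tmp, u, and pick are made up for illustration.

```sas
/* Attach a uniform random number to every row (seed is an arbitrary choice). */
data tmp;
  set DataSmall;
  if _n_ = 1 then call streaminit(12345);
  u = rand('uniform');
run;

/* Order rows randomly within each CustomerID-Year group. */
proc sort data=tmp;
  by CustomerID Year u;
run;

/* Keep the first row of each group, i.e. one random YearMonth per Year. */
data pick;
  set tmp;
  by CustomerID Year;
  if first.Year;
  drop u;
run;
```

The same steps would apply to DataCommercial.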

 

Here are my desired outputs:

[Image: Sme.png]

[Image: Commercial.png]

 

After that, I will run the analysis. Is it possible to do this?

 

Thank you


Accepted Solutions
Solution
‎12-24-2016 11:10 AM
Super User
Posts: 10,028

Re: What Is the Easiest Way to Understand Whether Two Populations' Ranges Are Consistent?

Do you want to randomly pick one observation from each year?

 

 

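/* Draw one simple random observation per CustomerID-Year stratum. */
/* The input data must already be sorted by the STRATA variables.  */
proc surveyselect data=DataSmall out=want1 sampsize=1 method=srs;
strata CustomerID Year;
run;

proc surveyselect data=DataCommercial out=want2 sampsize=1 method=srs;
strata CustomerID Year;
run;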
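
Once want1 and want2 exist, a PSI comparison on Turnover could be sketched roughly as follows. This is not part of the original answer: the bin edges, the format name turnbin, and the data set names both and shares are illustrative assumptions, and bins that are empty in either sample are simply skipped, which understates the index.

```sas
/* Hypothetical Turnover bins; pick edges that suit the real data. */
proc format;
  value turnbin
    low      -< 100000   = '1: < 100K'
    100000   -< 1000000  = '2: 100K - < 1M'
    1000000  -< 10000000 = '3: 1M - < 10M'
    10000000 -  high     = '4: >= 10M';
run;

/* Stack the two samples, tag their origin, and assign each row to a bin. */
data both;
  set want1(in=a) want2;
  length grp $ 10;
  grp = ifc(a, 'Small', 'Commercial');
  bin = put(Turnover, turnbin.);
run;

proc sql;
  /* Share of each sample that falls in each bin. */
  create table shares as
  select bin,
         sum(grp = 'Small')
           / (select count(*) from both where grp = 'Small')      as p_small,
         sum(grp = 'Commercial')
           / (select count(*) from both where grp = 'Commercial') as p_comm
  from both
  group by bin;

  /* PSI = sum over bins of (difference in shares) * log(ratio of shares). */
  select sum((p_small - p_comm) * log(p_small / p_comm)) as PSI
  from shares
  where p_small > 0 and p_comm > 0;
quit;
```

With the sample data in the question, the two groups fall into entirely different bins, so no bin has both shares positive and the PSI comes out missing. Adding seed= to each PROC SURVEYSELECT call makes the sampled rows reproducible between runs.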


