medanesh
Calcite | Level 5

Hi,

I have been stuck with this question for quite a while now. I'm new to SAS and the question I'm asking might be quite easy to answer.

Let's say I have two datasets, for example American Used Cars and Canadian Used Cars. Each dataset contains a variable called price. I would like to find the percentile of each price in the Canadian dataset relative to all the prices in the American dataset. In other words, for each used car in Canada, I would like to know what percentage of prices in the American market its price exceeds.

Any help is very much appreciated.

Thanks in advance,

Erfan

1 ACCEPTED SOLUTION
PGStats
Opal | Level 21

Consider this method:

/* Simulate data for American and Canadian Car Prices */
data auc cuc;
    call streaminit(876878);
    do i = 1 to 10;
        /* Two American prices per iteration (20 in total) */
        price = 10**(4+0.3*rand("NORMAL"));
        output auc;
        price = 10**(3.5+0.3*rand("NORMAL"));
        output auc;
        /* One Canadian price per iteration (10 in total) */
        price = 10**(4+0.4*rand("NORMAL"));
        output cuc;
    end;
run;

/* Generate the percentiles within each dataset */
proc rank data=auc out=aucr percent; var price; ranks percent; run;
proc rank data=cuc out=cucr percent; var price; ranks percent; run;

/* Stack the two datasets, recording which one each row came from */
data ucr;
    set aucr cucr INDSNAME=_s;
    source = _s;
run;

/* Sort by price; on tied prices, WORK.AUCR rows come before WORK.CUCR rows */
proc sort data=ucr; by price source; run;

/* Scan the prices, remember the American percentiles as we go */
data want(keep=price percent USpercent);
    set ucr;
    /* Start at 0 so Canadian prices below every American price get 0, not missing */
    retain USpercent 0;
    if source = "WORK.AUCR"
        then USpercent = percent;
        else output;
run;

proc print data=want; run;

PG

4 REPLIES
ballardw
Super User

Are you trying to do this by make, model, model year and such?

medanesh
Calcite | Level 5

For now just the prices. My main problem is how to find the percentile for a variable based on observations in another dataset.

Data Set 1    Data Set 2
10            11
15            13
20            21
25

What I would like my program to return for Data Set 1 is the percentage of observations in Data Set 2 that are less than or equal to each observation in Data Set 1:

10->0%

15->66.6%

20->66.6%

25->100%
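
The request above can be sketched directly in PROC SQL. This is only a minimal illustration, assuming the two example lists are stored in datasets called ds1 and ds2 with a numeric variable called value; those names, and the output column pct_le, are placeholders and do not come from the original post. The comparison b.value <= a.value evaluates to 1 or 0 in SAS, so its mean within each group is the fraction of Data Set 2 values at or below the Data Set 1 value.

```sas
/* The two small example lists from the post above (dataset and variable names are assumed) */
data ds1;
    input value @@;
    datalines;
10 15 20 25
;
run;

data ds2;
    input value @@;
    datalines;
11 13 21
;
run;

/* Pair every ds1 value with every ds2 value, then average the 0/1 comparison */
proc sql;
    create table want as
    select a.value,
           100 * mean(b.value <= a.value) as pct_le
    from ds1 as a, ds2 as b
    group by a.value
    order by a.value;
quit;

proc print data=want noobs; run;
```

For the example values this should give 0, 66.7, 66.7 and 100 (after rounding). Because the query compares every pair of rows, it is best suited to small datasets; for larger ones, the PROC RANK and sorted scan approach in the accepted solution avoids the pairwise comparison. Note also that SAS treats a missing value as smaller than any number, so missing values in ds2 would be counted as "less than or equal" unless they are filtered out first, and that repeated values in ds1 collapse into one output row because of the GROUP BY.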

medanesh
Calcite | Level 5

Thanks a lot PG. That was really helpful.
