DATA Step, Macro, Functions and more

Compare a column with multiple row to a colum with a single row

Accepted Solution Solved
Reply
Occasional Contributor
Posts: 5
Accepted Solution

Compare a column with multiple row to a colum with a single row

[ Edited ]

Hello SAS community, I hope someone can help me this. I'm currently working with the SAS Studio free version and I'm just starting (about 2 weeks). I'm a former MATLAB user but SAS is way more complicated ^^.

 

+ Firstly, I created a new table where I calculated the 25th, 50th and 75th percentiles of 2 variables (MSRP and Invoice) from the library SASHELP.cars. I got a new table with a single row and 6 columns (3 percentiles for 2 variables). I named the percentiles as Invoice_P25, etc... Here's the code if it may help you, I also put the code and what I would like to generate in the attachments:

 

------------------------------------------------

proc univariate data=work.Regress noprint;

  var MSRP Invoice;

  output out=work.temp1 pctlpts=25 50 75 pctlpre=MSRP_ Invoice_

  pctlname=P25 P50 P75; /* Compute 25th, 50th and 75th quantiles */

run;

------------------------------------------------

 

Here's my problem:

With these percentiles, I want to create a table where a column tells where's the MSRP of the car is, compared of the rest of MSRPs distribution. Same for the "Invoice". I tried with a structure with a "if".

But when I compare a column (428 rows of MSRP) with a percentile, it doesn't work with its name but if I replace by its value, it works. I think SAS assumed the percentile is a column with 428 rows too but just the first row has a value.

 

Can you tell how I can fix that please? Here's my code:

 

------------------------------------------------

data work.temp;

set work.Regress work.temp1;

/* (keep=var1 var2)

(drop=var1 var2) */

 

length QMSRP $6; /* longueur de nouvelle colonne = 6 caractères */

if MSRP < vvalue(MSRP_P25) then QMSRP = 'QMSRP1';

else if MSRP >= vvalue(MSRP_P25) and MSRP < vvalue(MSRP_P50) then QMSRP = 'QMSRP2';

else if MSRP >= vvalue(MSRP_P50) and MSRP < vvalue(MSRP_P75) then QMSRP = 'QMSRP3';

else QMSRP = 'QMSRP4';

 

length QInvoice $9;

if Invoice < Invoice_P25 then QInvoice = 'QInvoice1';

else if Invoice >= Invoice_P25 and Invoice < Invoice_P50 then QInvoice = 'QInvoice2';

else if Invoice >= Invoice_P50 and Invoice < Invoice_P75 then QInvoice = 'QInvoice3';

else QInvoice = 'QInvoice4';

 

run;

------------------------------------------------

 

Thanks in advance.

 

PS: sorry if some words are in French in the pictures I couldn't change the language.


SASHELP.png

Accepted Solutions
Solution
‎07-14-2017 08:32 PM
PROC Star
Posts: 7,363

Re: Compare a column with multiple row to a colum with a single row

I think that the following does what you wanted:

proc univariate data=sashelp.cars noprint;
  var MSRP Invoice;
  output out=work.temp1 pctlpts=25 50 75 pctlpre=MSRP_ Invoice_
  pctlname=P25 P50 P75; /* Compute 25th, 50th and 75th quantiles */
run;

data work.temp2;
  set sashelp.cars;
  if _n_ eq 1 then set work.temp1;
/* (keep=var1 var2)
(drop=var1 var2) */
 
length QMSRP $6; /* longueur de nouvelle colonne = 6 caractères */
if MSRP < MSRP_P25 then QMSRP = 'QMSRP1';
else if MSRP >= MSRP_P25 and MSRP < MSRP_P50 then QMSRP = 'QMSRP2';
else if MSRP >= MSRP_P50 and MSRP < MSRP_P75 then QMSRP = 'QMSRP3';
else QMSRP = 'QMSRP4';
 
length QInvoice $9;
if Invoice < Invoice_P25 then QInvoice = 'QInvoice1';
else if Invoice >= Invoice_P25 and Invoice < Invoice_P50 then QInvoice = 'QInvoice2';
else if Invoice >= Invoice_P50 and Invoice < Invoice_P75 then QInvoice = 'QInvoice3';
else QInvoice = 'QInvoice4';
 
run;

Art, CEO, AnalystFinder.com

 

View solution in original post


All Replies
Solution
‎07-14-2017 08:32 PM
PROC Star
Posts: 7,363

Re: Compare a column with multiple row to a colum with a single row

I think that the following does what you wanted:

proc univariate data=sashelp.cars noprint;
  var MSRP Invoice;
  output out=work.temp1 pctlpts=25 50 75 pctlpre=MSRP_ Invoice_
  pctlname=P25 P50 P75; /* Compute 25th, 50th and 75th quantiles */
run;

data work.temp2;
  set sashelp.cars;
  if _n_ eq 1 then set work.temp1;
/* (keep=var1 var2)
(drop=var1 var2) */
 
length QMSRP $6; /* longueur de nouvelle colonne = 6 caractères */
if MSRP < MSRP_P25 then QMSRP = 'QMSRP1';
else if MSRP >= MSRP_P25 and MSRP < MSRP_P50 then QMSRP = 'QMSRP2';
else if MSRP >= MSRP_P50 and MSRP < MSRP_P75 then QMSRP = 'QMSRP3';
else QMSRP = 'QMSRP4';
 
length QInvoice $9;
if Invoice < Invoice_P25 then QInvoice = 'QInvoice1';
else if Invoice >= Invoice_P25 and Invoice < Invoice_P50 then QInvoice = 'QInvoice2';
else if Invoice >= Invoice_P50 and Invoice < Invoice_P75 then QInvoice = 'QInvoice3';
else QInvoice = 'QInvoice4';
 
run;

Art, CEO, AnalystFinder.com

 

Occasional Contributor
Posts: 5

Re: Compare a column with multiple row to a colum with a single row

Thank you, it looks good for me. 

 

I saw that you had this :   if _n_ eq 1 then set work.temp1; Would you please explain what it does?

 

I'm not really smart so thank you very much again. 

PROC Star
Posts: 7,363

Re: Compare a column with multiple row to a colum with a single row

_n_ is an automatic system variable, in SAS, that lets you know which record you're reading and, if desired, control if there is anything special that you might want to do.

 

In this case we were reading all of the records from sashelp.cars but, upon reading the first one, read the summary file that was created by proc univariate. As such, the file was only read once, but automatically retained so that it would be available when we processed all of the other records.

 

Yes, SAS has a steeper learning curve than MATLAP but, on the other hand, gives you far more capabilities and control once you follow the upward trend of that curve.

 

Art, CEO, AnalystFinder.com

Occasional Contributor
Posts: 5

Re: Compare a column with multiple row to a colum with a single row

Thank you, it's really appreciated. 

Super User
Posts: 17,840

Re: Compare a column with multiple row to a colum with a single row

Note that PROC RANK does this a lot easier. 

 

proc rank data=sashelp.cars out=grouped groups=4;
var msrp invoice;
ranks rank_msrp rank_invoice;
run;

 

 

☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 5 replies
  • 153 views
  • 3 likes
  • 3 in conversation