BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
NKormanik
Barite | Level 11

Please see the following partial data set:

 

Continuous      Rank
-2.28675         0
-2.21883         0
-1.96124         0
-1.81953         0
-1.28607         0
-0.65065         0
-0.37568         0
-0.33465         0
-0.21190         0
-0.14975         0
0.05879          1
0.10336          1
0.17450          1
0.46424          1
0.46587          1
0.48881          1
0.87138          1
0.91401          1
1.18948          1
1.26697          1


I'm attempting to calculate the diving line(s) of "Continuous" based on the different rank values.

 

In the above case, I could use:

 

(Lowest 1 Value - Highest 0 Value) / 2

 

Let's tentatively call the value:

 

CD_0_1 (CD standing for "Continuous Division")

 

Not shown would be:

 

CD_1_2
CD_2_3
CD_3_4
etc,

 

Can anyone here suggest a way of doing this using SAS.

 

Thoughts and help greatly appreciated.

 

Nicholas Kormanik

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
gamotte
Rhodochrosite | Level 12

Hello,

 

Assuming your dataset is sorted by Rank, Continuous :

 

data want;
    set have;
    by Rank;
    if last.Rank;
run;

Or, with a proc sql :

 

proc sql noprint;
    CREATE TABLE want AS
    SELECT DISTINCT max(Continuous) AS Continuous, Rank
    FROM have
    GROUP BY Rank
    ORDER BY Rank;
quit;

View solution in original post

8 REPLIES 8
Oligolas
Barite | Level 11

Hi,

 

could you please provide some more information:

- what's a 'diving line'?

- and how should the result look like?

 

________________________

- Cheers -

NKormanik
Barite | Level 11

In the above case:

 

(.05879 + (-.14975)) / 2 = -0.04548

 

Thus, CD_0_1 = -0.04548

 

That value might be thought of as the 'dividing line' separating the "Continuous Variable" based on the Rank Score.

 

This is one value of many more following.

 

In other words:

 

CD_0_1 = -0.04548

CD_1_2 =
CD_2_3 =
CD_3_4 =
etc.

 

 

Reeza
Super User

quickest way - calculate summary statistics for each group first.

Then do group calcuations.

 

@NKormanik can you show the full expected output for the provided sample data.

NKormanik
Barite | Level 11

Another possible solution might be to simply use the highest "Continuous" from each "Rank".

 

This, as opposed to trying to find an average in-between, as described above.

 

So, what the new idea would require is filtering a dataset, only keeping the highest for any rank.

 

In the above case:

 

 

Continuous      Rank

-0.14975         0
1.26697          1

etc.

 


The new dataset, after filtering, would hold the desired values (pretty closely, at least).

 

 

So, the question then becomes, how to filter, leaving only the highest "Continuous" for each ranking value??

 

Any help greatly appreciated.

 

 

gamotte
Rhodochrosite | Level 12

Hello,

 

Assuming your dataset is sorted by Rank, Continuous :

 

data want;
    set have;
    by Rank;
    if last.Rank;
run;

Or, with a proc sql :

 

proc sql noprint;
    CREATE TABLE want AS
    SELECT DISTINCT max(Continuous) AS Continuous, Rank
    FROM have
    GROUP BY Rank
    ORDER BY Rank;
quit;
gamotte
Rhodochrosite | Level 12

Your last post seems contradictory with what you first explained where the maximum value was taken for rank 0 and the minimum value for rank 1.

 

Here is a program that computes the value you indicate in your example :

data want;
	set have;
	by Rank;
	retain minC maxC prevMin prevMax;

	if first.Rank then do;
		maxC=Continuous;
		minC=Continuous;
	end;
	else do;
		maxC=max(Continuous, maxC);
		minC=min(Continuous, minC);
	end;

	if last.Rank then do;
		if Rank>0 then do;
			CD=(max(minC,prevMin)+min(maxC,prevMax))/2;
			output;
		end;
		prevMax=maxC;
		prevMin=minC;
	end;
run;
gamotte
Rhodochrosite | Level 12

If the dataset is sorted by Rank, Continuous, the program can be simplified :

data want;
	set have;
	by Rank;
	keep Continuous Rank CD;
	retain minC maxC prevMin prevMax;

	if first.Rank then do;
		minC=Continuous;
	end;

	if last.Rank then do;
		if Rank>0 then do;
			CD=(max(minC,prevMin)+min(Continuous,prevMax))/2;
			output;
		end;
		prevMax=Continuous;
		prevMin=minC;
	end;
run;

 

Rick_SAS
SAS Super FREQ

I'd use PROC MEANS to compute MIN and MAX, but there are many other ways:

 


proc means data=a min max;
class rank;
var continuous;
output out=out(where=(RANK^=.)) min=min max=max;
run;

data lines;
set out;
line = (min + lag(max)) / 2;
run;

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 8 replies
  • 935 views
  • 0 likes
  • 5 in conversation