DATA Step, Macro, Functions and more

Select the best model from multiple models using SAS Array

Reply
Occasional Contributor
Posts: 6

Select the best model from multiple models using SAS Array

I am stuck with below problem.

 

Each observation in my data set has 3 different valuation and there is a preferred sequence to select the best valuation out of these 3. I have to output the first non missing valuation based on the preferred sequence to the final data set.

 

I tried using array by transposing each observation into 3 separate observation and then used self join to find the selected valuation.

This is working fine on sample data set, but my data set is having 90 million rows and self join is taking long time. Please suggest better approach.

 

Sample data set:

 

ID Val 1Val2 val3 m1m2 m3
11101104105val1val2val3
12201210211val2val3val1
13155150 val3val1val2
14  201val2val1val3

 

Output:

 

IDValuationModel
11101val1
12210val2
13155val1
14201val3
Super Contributor
Posts: 354

Re: Select the best model from multiple models using SAS Array

[ Edited ]
Posted in reply to PRAVIN_JAIN

Hello,

 

data want;
    set have;
    array m(*) m:;
    array val(*) val:;
    format model $4. valuation 3.;
    keep Valuation model;
    do i=1 to dim(m);
        if vvaluex(m(i)) ne . then do;
            Valuation=vvaluex(m(i));
            model=m(i);
            leave;
        end;
    end;
run;
PROC Star
Posts: 1,400

Re: Select the best model from multiple models using SAS Array

Posted in reply to PRAVIN_JAIN

Like this?

 

data have;
input ID$ val1-val3  m1$ m2$ m3$;
datalines;
11 101 104 105 val1 val2 val3 
12 201 210 211 val2 val3 val1 
13 155 150 .   val3 val1 val2 
14 .   .   201 val2 val1 val3 
;

data want;
	set have;
	array m{*} m1-m3;

	do i=1 to dim(m);
		if vvaluex(m[i])<>. then do;
			Valuation=vvaluex(m[i]);
			Model=m[i];	
			output;return;
		end;
	end;

	keep ID Valuation Model;
run;
Occasional Contributor
Posts: 6

Re: Select the best model from multiple models using SAS Array

Thanks for the response.

 

My actual data set have few more columns associated to each valuation. Also the model and valuation column name is not exactly same. I am not able to use VVALUEX here. Modified data as below:

 

 

Please suggest.

 

Input :

 

ID ABVal ABScoreCDVal CDScoreDEVal DEScorem1m2 m3
11101810461058ABCDDE
12201921092115CDDEAB
13155101507  ABDECD
14    2019CD ABDE

 

Output:

 

IDValuationScoreModel
111018AB
122109CD
1315510AB
142019DE

 

 

PROC Star
Posts: 1,400

Re: Select the best model from multiple models using SAS Array

Posted in reply to PRAVIN_JAIN

Always a good idea to provide sample data that represents your actual data closely Smiley Happy

 

How many variables does your data set contain? Do you need a dynamic way of putting them into an array or are you able to type the variable names manually?

Occasional Contributor
Posts: 6

Re: Select the best model from multiple models using SAS Array

ID ABVal ABScoreCDVal CDScoreDEVal DEScorem1m2 m3
11101810461058ABCDDE
12201921092115CDDEAB
13155101507  ABDECD
14    2019CD ABDE

 

This table represents the actual data set closely. For each valuation I have associated score also. (so 2 columns for each valuation)

 

Models are : AB,CD,DE

Variables are : Val, Score

PROC Star
Posts: 1,400

Re: Select the best model from multiple models using SAS Array

[ Edited ]
Posted in reply to PRAVIN_JAIN

If your posted data resembles your actual data closely, you are still able to use the VVALUEX function like this

 

data have;
input ID ABVal ABScore CDVal CDScore DEVal DEScore m1$ m2$ m3$;
datalines; 
11 101 8  104 6 105 8 AB CD DE 
12 201 9  210 9 211 5 CD DE AB 
13 155 10 150 7 .   . AB DE CD 
14 .   .  .   . 201 9 CD AB DE
;

data want;
	set have;
	array m{*} m1-m3;

	do i=1 to dim(m);
		if vvaluex(cats(m[i], "Val")) ne . then do;
			Valuation=input(strip(vvaluex(cats(m[i], "Val"))), 8.);
			Score=input(strip(vvaluex(cats(m[i], "Score"))), 8.);
			Model=m[i];
			output;return;
		end;
	end;

	keep ID Valuation Score Model;
run;

 

Occasional Contributor
Posts: 6

Re: Select the best model from multiple models using SAS Array

Thanks for your help. 

This solution works for me.

PROC Star
Posts: 1,400

Re: Select the best model from multiple models using SAS Array

Posted in reply to PRAVIN_JAIN

Anytime, glad to help Smiley Happy

Occasional Contributor
Posts: 6

Re: Select the best model from multiple models using SAS Array

There are different scenarios in data as in below table, out of which how can we handle 4 th scenario using same code. 

Current code doesn't output 4th scenario. 

 

ID ABVal ABScoreCDVal CDScoreDEVal DEScorem1m2 m3Comments
11......ABCDDEIf all valuation missing then output row with all missing values
12201921092115CDDEABOutput first non missing based on model sequence
13155101507..ABDE Output first non missing based on model sequence
14....2019CDAB Model sequence given(CD,AB) doesn’t have valuation present,
but have value for model(DE) which is not present in model sequence - for this scenario need to output row with having missing values
Ask a Question
Discussion stats
  • 9 replies
  • 234 views
  • 0 likes
  • 3 in conversation