- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hello
Lets say I have a data set with Long structure with all possible combinations of 9 categorical variables.
Lets say that I have another data set with information of coefficients for each category (for each variable)
My target- I want to add to data set possible_combinations_t the following columns:
X1_Coef
X2_Coef
X3_Coef
X4_Coef
X5_Coef
X6_Coef
X7_Coef
X8_Coef
X9_Coef
data possible_combinations_t;
do x1 = 1 to 3;
do x2 = 1 to 2;
do x3 = 1 to 2;
do x4 = 1 to 3;
do x5 = 1 to 2;
do x6 = 1 to 3;
do x7 = 1 to 4;
do x8 = 1 to 2;
do x9 = 1 to 3;
output;
end;
end;
end;
end;
end;
end;
end;
end;
end;
run;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
I don't think you provided enough information to answer the question.
If you want to merge two datasets we need to see what the datasets look like. Do they have a common variable that be used in a BY statement? If not you might need to use PROC SQL so you can craft logic to find the matching observations.
Your use of the term COEFFIENT makes it sound like perhaps you want score data using a formula? Did you look into using PROC SCORE?
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Make a dataset with _TYPE_ variable set to SCORE. Use the same variable names ( X1 to X9) to hold the coefficients. Add a _NAME_ variable to hold the name you want for the NEW variable with predicted score.
data score;
retain _name_ 'MyModel' _type_ 'SCORE' ;
input x1-x9;
cards;
1 2 3 4 5 6 7 8 9
;
So this example SCORE dataset is the formula:
MyModel = X1 +2*X2 +3*X3 +4*x4+5*x5+6*X6+7*X7+8*X8+9*X9;
Then use PROC SCORE to score your dataset.
proc score data=possible_combinations_t score=score out=results;
run;
First 10 results:
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Can you please show how score data set is looking like?
Does it have only one row??
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Sorry I dont understand
You created a data set called score with columns X1-X9 (see input statement)
then you used retain statement and wrote there columns _name_ myModel _type_ score.
I dont see these columns in your daat set
data score;
retain _name_ 'MyModel' _type_ 'SCORE' ;
input x1-x9;
cards;
1 2 3 4 5 6 7 8 9
;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
@Ronein wrote:
Sorry I dont understand
You created a data set called score with columns X1-X9 (see input statement)
then you used retain statement and wrote there columns _name_ myModel _type_ score.
I dont see these columns in your daat set
data score; retain _name_ 'MyModel' _type_ 'SCORE' ; input x1-x9; cards; 1 2 3 4 5 6 7 8 9 ;
Did you trying running PROC PRINT on the SCORE dataset?
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Sorry
Here are 2 data sets I want to merge
data possible_combinations_t;
do x1 = 1 to 3;
do x2 = 1 to 2;
do x3 = 1 to 2;
do x4 = 1 to 3;
do x5 = 1 to 2;
do x6 = 1 to 3;
do x7 = 1 to 4;
do x8 = 1 to 2;
do x9 = 1 to 3;
output;
end;
end;
end;
end;
end;
end;
end;
end;
end;
run;
Data coef_t;
input var $ value coefficient;
cards;
X1 1 -0.1
X1 2 0.25
X1 3 0
X2 1 0.35
X2 2 0
X3 1 0.75
X3 2 0
X4 1 0.95
X4 2 1.45
X4 3 0
X5 1 -1.2
X5 2 0
X6 1 -1.1
X6 2 2.2
X6 3 0
X7 1 -0.7
X7 2 0.8
X7 3 1.6
X7 4 0
X8 1 -1.1
X8 2 0
X9 1 0.6
X9 2 0.9
X9 3 0
INTER 1 -2.5
;
Run;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
You are on this forum for so long that you could at least to try provide some "my own experiments" solutions in your questions.
Try this:
data possible_combinations_t;
do x1 = 1 to 3;
do x2 = 1 to 2;
do x3 = 1 to 2;
do x4 = 1 to 3;
do x5 = 1 to 2;
do x6 = 1 to 3;
do x7 = 1 to 4;
do x8 = 1 to 2;
do x9 = 1 to 3;
output;
end;
end;
end;
end;
end;
end;
end;
end;
end;
run;
Data coef_t;
input var $ value coefficient;
cards;
X1 1 -0.1
X1 2 0.25
X1 3 0
X2 1 0.35
X2 2 0
X3 1 0.75
X3 2 0
X4 1 0.95
X4 2 1.45
X4 3 0
X5 1 -1.2
X5 2 0
X6 1 -1.1
X6 2 2.2
X6 3 0
X7 1 -0.7
X7 2 0.8
X7 3 1.6
X7 4 0
X8 1 -1.1
X8 2 0
X9 1 0.6
X9 2 0.9
X9 3 0
INTER 1 -2.5
;
Run;
data want;
if 1=_N_ then
do;
if 0 then set coef_t;
declare hash H(dataset:"coef_t");
H.defineKey("var", "value");
H.defineData("coefficient");
H.defineDone();
drop i var value coefficient;
end;
set possible_combinations_t;
/* X1_Coef ... x9_Coef - stupid SAS manes,
that's why I'm caling them XCoef_1 = XCoef_9
*/
array XCoef_[9];
array x[9];
do i = 1 to dim(x);
if 0=H.find(key: strip(upcase(vname(x[i]))), key:x[i])
then XCoef_[i] = coefficient;
end;
/* INTER=-2.5; */ /* <-- uncomment if you need INTER */
run;
Bart
Polish SAS Users Group: www.polsug.com and communities.sas.com/polsug
"SAS Packages: the way to share" at SGF2020 Proceedings (the latest version), GitHub Repository, and YouTube Video.
Hands-on-Workshop: "Share your code with SAS Packages"
"My First SAS Package: A How-To" at SGF2021 Proceedings
SAS Ballot Ideas: one: SPF in SAS, two, and three
SAS Documentation
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Not clear where you are going with this, but to combine those two datasets I would convert the first one into a "tall" dataset so it has VAR and VALUE variables like the second one does. Might help to remember which observations the values came from.
data tall;
row+1;
set possible_combinations_t ;
length var $8 value 8;
array x[9] ;
do index=1 to dim(x);
var=upcase(vname(x[index]));
value=x[index];
output;
end;
drop x: ;
run;
Now you can sort and merge by VAR and VALUE. Or let PROC SQL do it for you.
proc sql ;
create table want as
select *
from tall natural join coef_t
;
quit;
Results:
...