bcRam
Calcite | Level 5
Hello,

I am trying to run firm-year specific regressions in SAS to get an expected value for each industry-year. However, each regression needs to exclude the i'th observation. For example, if there are 20 observations for 1995 for a specific industry (IND), the regression for obs 1 would be based on obs 2-20, and the regression for obs 2 would be based on obs 1 and obs 3-20. I can't seem to find a way to exclude each observation individually. I found a reweight function that can exclude a specific observation by changing its weight to zero, but I would have to do that for every observation and run thousands of individual regressions. If I didn't need to exclude the i'th observation, I know the code would be:

proc reg;
   by fyear IND;
   model y = x;   /* placeholder names; the variables are not given in this post */
run;

Any ideas on how to exclude the observation? Thanks for the help.

BC
18 REPLIES
art297
Opal | Level 21
As long as you have (or create) a variable that contains the obs number, you can simply exclude any particular obs with a where statement.
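
A minimal sketch of that idea, assuming a data set named have with a response y and a regressor x1 (hypothetical names; the question only gives fyear and IND). It leaves out observation 1 only, so dropping each observation in turn would still need a loop over the obs number.

```sas
/* Hypothetical names: have, y, x1. fyear and IND come from the question. */
proc sort data=have out=have_sorted;
   by fyear IND;            /* BY processing needs sorted data */
run;

data have_id;
   set have_sorted;
   obsnum = _n_;            /* running observation number */
run;

proc reg data=have_id;
   where obsnum ne 1;       /* exclude observation 1 from the fit */
   by fyear IND;
   model y = x1;
run;
quit;
```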

Art
Ksharp
Super User
It looks like a cross-validation test for cluster analysis.
You will need a macro to process it iteratively.
What does your original data look like?
I don't think it is a very difficult task.


Ksharp
lvm
Rhodochrosite | Level 12
Your question is related to the PRESS statistic (leave-one-out method) that is an option in REG. But I don't think there is an option to get a listing of the parameter estimates for each observation exclusion. Here is a quick macro that does what you want. It assumes you are modeling y as a function of x1. For this very simple macro, you must give the number of observations in the %do statement.

Each observation is then given, in turn, a weight of 0 (with 1 for the rest); the regression is run; the parameter estimates are stored (in parms); and these estimates are stacked in a data set called parms2. This is printed: there are two records for each observation (the estimates of intercept and slope when that observation is given a weight of 0). In this quick and dirty macro, the first record of parms2 has missing values. You can delete it. If you had four predictor variables, parms2 would have 5 records for each observation.

Right now, the individual regression output is suppressed. If you want to see the full results for each observation, comment out the ods listing statements at the start and end of the macro.

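data a;
   input x1 y;
   id = _n_;                  /* observation number, used to pick the left-out record */
   datalines;
0 2
1 4
2 3
3 6
4 6
5 5
6 8
7 10
8 9
9 9
;
run;

%macro jk;
   ods listing exclude all;   /* suppress the per-regression listing output */
   data parms2;               /* seed data set; it contributes the empty first record */
   run;
   %do i = 1 %to 10;          /* 10 = number of observations in a */
      data b;
         set a;
         if (id eq &i) then weight = 0;   /* weight 0 leaves observation &i out of the fit */
         else weight = 1;
      run;
      proc print data=b;
      run;
      proc reg data=b;
         ods output parameterestimates=parms;
         weight weight;
         model y = x1;
      run;
      quit;
      data parms2;            /* stack the estimates from this iteration */
         set parms2 parms;
      run;
   %end;
   ods listing;               /* restore listing output */
   proc print data=parms2;
   run;
%mend jk;

%jk;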
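
If only the expected value for each left-out observation is needed, rather than the full set of parameter estimates, the PRESS residual mentioned above may be enough and avoids the loop. A sketch, reusing data set a from this post: the PRESS= keyword on the OUTPUT statement of PROC REG stores the residual from the model refit without that observation, so subtracting it from y recovers the leave-one-out prediction. The names loo, yhat, press_resid, and yhat_loo are made up for this example.

```sas
proc reg data=a noprint;
   model y = x1;
   /* p= is the ordinary fitted value; press= is y minus the leave-one-out prediction */
   output out=loo p=yhat press=press_resid;
run;
quit;

data loo;
   set loo;
   yhat_loo = y - press_resid;   /* prediction from the fit that excludes this observation */
run;

proc print data=loo;
run;
```

For the industry-year setup in the original question, adding by fyear IND; to PROC REG (on data sorted by those variables) should give the same quantity within each group.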
Ksharp
Super User
Hello, lvm.
I am impressed by the code you posted.
You are clearly a seasoned statistician in general linear models, especially mixed models.
I hope to become a seasoned statistician like you someday, though that is probably ten years away. 🙂


Ksharp
lkjjl
Fluorite | Level 6

Hello!

Excellent macro. I would like to create a macro to validate a multivariate mixed model. I tried to adapt this macro to a mixed-model approach using the leave-one-out method, but I was not successful. Could you please help me?

Regards,

Julian

PaigeMiller
Diamond | Level 26

All you have to do is replace PROC REG in the macro with working code for some other PROC that performs a multivariate mixed model. Have you done that? Can you show us the code you tried?

--
Paige Miller
lkjjl
Fluorite | Level 6
Yes, I tried, but without success. For example, I would like to generate the predicted values for a mixed model. Using the data lines suggested earlier by lvm, with an additional "study" variable, I wrote the following code:

data a;
input x1 y study;
id = _n_;
datalines;
0 2 1
1 4 1
2 3 1
3 6 2
4 6 2
5 5 2
6 8 3
7 10 3
8 9 3
9 9 3
;
run;

proc print data=a;
run;


%macro jk;
ods listing exclude all;
data parms2;
%do i = 1 %to 10;
data b; set a;
if (id eq &i) then weight=0;
else weight=1;
run;
proc print data=b;run;

proc mixed data=b;
ods output parameterestimates=parms;
class study;
weight weight;
model y = x1;
RANDOM study / SOLUTION;
run;
data parms2; set parms2 parms;
%end;
ods listing ;
proc print data=parms2;
%mend jk;

%jk;
run;

However, I did not get the parameters of mixed equations.
In addition, I am trying to get a dataset with predicted and observed values, using the leave-one-out cross validation approach.

Regards


PaigeMiller
Diamond | Level 26

> However, I did not get the parameters of mixed equations.

What did you get?

> In addition, I am trying to get a dataset with predicted and observed values, using the leave-one-out cross validation approach.

You need to create code that works and does what you want on your actual data set (forget the leave-one-out part). Once you have that, and it works properly, you should be able to plug it into the macro.

--
Paige Miller
lkjjl
Fluorite | Level 6
OK. Initially, my objective is to produce a dataset with predicted values (derived from a leave-one-out process) and observed values.
To get there, I would like to begin with a simple univariate model like this: y = X1, using study as a random effect.
A simple version of the code would be:

proc mixed data=a;
class study;
model Y = X1/ solution outpred=preditos;
RANDOM study / SOLUTION;
run;

Assuming 10 observations in my dataset, I would like to create a macro that deletes one observation and fits a new model to the remaining observations (n=9). I then get the predicted value for the observation that was left out of the fit. I repeat this until every observation has been excluded once, which gives a data set with 10 observations and 10 predicted values (one from each fitting step).
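
One way to sketch this loop, assuming the data set a with id, study, x1, and y shown earlier in the thread: rather than deleting the observation, set its response to missing. PROC MIXED drops records with a missing response from the fit but still writes a predicted value for them to the OUTPRED= data set, as long as the covariates are present. The names loo_mixed, y_fit, pred_i, and loo_pred are invented for this example, and this is only a sketch, not a tested validation procedure.

```sas
%macro loo_mixed(n=10);                   /* n = number of observations in a */
   proc datasets lib=work nolist nowarn;  /* clear results from any earlier run */
      delete loo_pred;
   quit;
   %do i = 1 %to &n;
      data b;
         set a;
         y_fit = y;
         if id = &i then y_fit = .;       /* hide the response of observation &i */
      run;
      ods exclude all;                    /* suppress printed output for each fit */
      proc mixed data=b;
         class study;
         model y_fit = x1 / solution outpred=pred_i;
         random study / solution;
      run;
      ods exclude none;
      data pred_i;                        /* keep only the left-out observation */
         set pred_i;
         where id = &i;
         keep id study x1 y Pred;
      run;
      proc append base=loo_pred data=pred_i;
      run;
   %end;
   proc print data=loo_pred;              /* observed y next to leave-one-out Pred */
   run;
%mend loo_mixed;

%loo_mixed(n=10);
```

Because OUTPRED= predictions include the estimated random effect for the study, the left-out value is predicted using the other observations from the same study.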

Regarding your question, when I run the code posted in my previous message, I get:

ERROR: File WORK.PARMS.DATA does not exist.
PaigeMiller
Diamond | Level 26

Yes, this happens because your code with PROC MIXED has not been modified to obtain the parameter estimates in a SAS data set. You need to get that correct, and then it should work in the macro loop as well. That's step one ... no macros ... no loop, just get the results you want from PROC MIXED.

--
Paige Miller
lkjjl
Fluorite | Level 6
The main mistake may be that PROC MIXED does not produce the parameter estimates through the parameterestimates output table. What would be the equivalent in PROC MIXED?
lkjjl
Fluorite | Level 6
Could you please help me get the correct code for storing the parameter estimates in a SAS data set?
PaigeMiller
Diamond | Level 26

I think it's something like this:

ods output solutionF=parms;
--
Paige Miller
lkjjl
Fluorite | Level 6
I previously tried the suggested option, but problem remains:
WARNING: Output 'solutionF' was not created. Make sure that the output object name, label, or path is spelled correctly. Also,
verify that the appropriate procedure options are used to produce the requested output object. For example, verify that
the NOPRINT option is not used.
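
One likely cause, assuming the MODEL statement was still the one from the macro earlier in the thread (model y = x1; with no options): PROC MIXED only produces the SolutionF table when the SOLUTION option is given on the MODEL statement, and the SolutionR table when it is given on the RANDOM statement. A sketch on data set a (the version with the study variable); parms and rparms are arbitrary output names.

```sas
proc mixed data=a;
   class study;
   model y = x1 / solution;     /* SOLUTION here is what creates SolutionF */
   random study / solution;     /* SOLUTION here creates SolutionR */
   ods output SolutionF=parms SolutionR=rparms;
run;

proc print data=parms;          /* fixed-effect estimates: intercept and x1 */
run;
```

Inside the macro loop, the same ODS OUTPUT statement would take the place of ods output parameterestimates=parms;, since PROC MIXED has no table named ParameterEstimates.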
