A_Kh
Lapis Lazuli | Level 10

Dear Experts, 

While running PROC LOGISTIC on a deterioration table, I get multiple WARNING and ERROR messages due to a missing control parameter, or the same response value for all observations at certain timepoints, etc. However, the procedure still creates the OddsRatios and ParameterEstimates tables with the required statistics, from which I can get the CIs and p-values based on ConvergenceStatus.

 

To avoid the error and warning messages, I'm explicitly removing specific timepoints from the analysis, which requires running PROC LOGISTIC separately for each parameter of the BY statement (I have 3 variables in the BY statement, each with more than 5 categories). This makes the code very long, as the single PROC must run more than 20 times. It could be shortened with a macro, but each parameter has an issue with specific timepoints that need to be entered manually (I have around 33 timepoints, and the issue affects around 7-8 of them, different per PARAMCD).

 

The idea is to run a single PROC LOGISTIC with multiple variables in the BY statement and get a clean log without explicitly removing observations.

So my question is: is there any PROC LOGISTIC option that ignores useless observations, or creates extra dummy observations, similar to the SPARSE option in PROC FREQ, to avoid the error and warning messages?

 

I created dummy data (please ignore the data values) that produces the same error and warning messages in the log, for your information. The code I'm using is below (with fewer variables in the BY statement).


Any advice/help would be highly appreciated! 

 

data have;
input SUBJECT $ ARM $ PARAMCD $ BASELINE DETER VISITNUM;
cards;
1001 A ITEM1 1 0 1
1001 A ITEM2 2 0 1
1001 A ITEM3 3 0 1
1001 B ITEM1 2 1 1
1001 B ITEM2 3 1 1
1001 B ITEM3 4 1 1
1001 C ITEM1 3 0 1
1001 C ITEM2 4 0 1
1001 C ITEM3 5 0 1
1002 A ITEM1 1 1 1
1002 A ITEM2 2 1 1
1002 A ITEM3 3 1 1
1002 B ITEM1 2 0 1
1002 B ITEM2 3 0 1
1002 B ITEM3 4 0 1
1002 C ITEM1 3 1 1
1002 C ITEM2 4 1 1
1002 C ITEM3 5 1 1
1001 A ITEM1 2 0 2
1001 A ITEM2 3 0 2
1001 A ITEM3 4 0 2
1001 B ITEM1 3 1 2
1001 B ITEM2 4 1 2
1001 B ITEM3 5 1 2
1001 C ITEM1 4 0 2
1001 C ITEM2 5 0 2
1001 C ITEM3 5 0 2
1002 A ITEM1 3 1 2
1002 A ITEM2 4 1 2
1002 A ITEM3 5 0 2
1002 B ITEM1 4 1 2
1002 B ITEM2 5 0 2
1002 B ITEM3 5 1 2
1002 C ITEM1 5 0 2
1002 C ITEM2 5 0 2
1002 C ITEM3 5 0 2
1001 A ITEM1 3 1 3
1001 A ITEM2 4 1 3
1001 A ITEM3 5 0 3
1001 B ITEM2 4 0 3
1001 B ITEM3 5 0 3
1001 C ITEM1 5 0 3
1002 A ITEM1 4 0 3
1002 A ITEM2 5 0 3
1002 A ITEM3 5 0 3
1002 B ITEM1 5 0 3
1001 A ITEM1 5 0 4
1001 A ITEM2 5 0 4
1001 A ITEM3 5 0 4
1001 B ITEM2 5 0 4
1001 B ITEM3 5 0 4
1001 C ITEM1 5 0 4
1002 A ITEM1 5 0 4
1002 A ITEM2 5 0 4
1002 A ITEM3 5 0 4
;

proc sort data=have;
	by paramcd visitnum arm;
run;

proc print data=have;
run;

*Getting ODDs Ratio, 95% CI and P-values;
ods exclude all;
proc logistic data=have;
	by paramcd visitnum;
	class arm (ref='C')/ param=ref;
	model deter (desc) = baseline arm;
	ods output convergencestatus=conv OddsRatios=or parameterestimates=pvalue;
run;
ods exclude none;
Accepted solution: Quentin's reply, shown in thread order below.

PaigeMiller
Diamond | Level 26

Please show us the ENTIRE log for this PROC LOGISTIC so we can see the error messages and warnings.

 

In the future, please do not tell us you get WARNINGs and ERROR messages without showing us the log.

--
Paige Miller
A_Kh
Lapis Lazuli | Level 10

Hi @PaigeMiller , 

Thank you for a quick reply. I didn't post the entire log intentionally this time as it makes the post very long, which personally I don't like to see. Instead, running the provided code will generate the same messages that I get with my real data.  Hope this explanation makes sense to you too. 
Thank you!

PaigeMiller
Diamond | Level 26

We just need the log from PROC LOGISTIC. Please post it. If there are a lot of errors, just the first few will do.

--
Paige Miller
A_Kh
Lapis Lazuli | Level 10

Sure. Please see below.

[Attached image: Capture.PNG]

PaigeMiller
Diamond | Level 26

Your data is not compatible with logistic regression. Logistic regression will not tell you anything.

 

Complete separation and quasi-complete separation: https://stats.oarc.ucla.edu/other/mult-pkg/faq/general/faqwhat-is-complete-or-quasi-complete-separat....

 

All observations have the same response: well that should be self-explanatory, you cannot run logistic regression on this part of the data.

--
Paige Miller
A_Kh
Lapis Lazuli | Level 10

I know it is, and I asked not to pay attention to data since my concern was about the log caused by non-compatible part of data. But, thank you for taking time to look into it!

ballardw
Super User

It is preferable to copy log text from SAS and then on the forum open a text box using the </> icon that appears above the message window and paste the text.

With pasted text, not a picture, then we can copy/paste/edit to show changes to code or just include pieces for comments.

 

For example, your Invalid reference value for Arm.

YOU have specified a value that is not valid (repeatedly). YOU need to check on such and make sure your code makes sense in terms of the data.

Since you show code like:

proc logistic data=have;
	by paramcd visitnum;
	class arm (ref='C')/ param=ref;
	model deter (desc) = baseline arm;
	ods output convergencestatus=conv OddsRatios=or parameterestimates=pvalue;
run;

which shows a specific reference value of 'C', you would have to filter the data so that only BY groups that include at least one value of 'C' for ARM are used by the procedure. That means running some diagnostics prior to this step to identify the valid BY groups.

Something like this perhaps:

proc sql;
   create table need as
   select b.*
   from (select distinct paramcd, visitnum from have
         where arm='C') as a
         left join
         have as b
         on a.paramcd=b.paramcd
         and a.visitnum=b.visitnum
   ;
quit;

A second, similar filter would then need to be applied to the NEED set just created, to remove BY groups where all of the responses are the same (hint: since your response variable DETER is numeric, if the range of the variable is 0 then all the non-missing responses are the same).

Then use that reduced set. Or use it to add a filter variable to the working set, so that only the observations identified as valid are used in PROC LOGISTIC, by applying a WHERE on that filter variable.
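
A minimal sketch of that second filter, assuming the NEED table created by the PROC SQL step above and a 0/1 numeric DETER. The output name NEED2 is a hypothetical name chosen for illustration. RANGE is used here as a PROC SQL summary function over each BY group, so the log will show a NOTE about remerging summary statistics with the original data.

```sas
proc sql;
   /* Keep only BY groups in which DETER takes more than one value */
   create table need2 as
   select *
   from need
   group by paramcd, visitnum
   having range(deter) > 0
   order by paramcd, visitnum, arm;
quit;
```

NEED2 could then be passed to PROC LOGISTIC in place of HAVE. This does not screen for complete or quasi-complete separation, so convergence warnings may still appear.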

 

Quentin
Super User

So my question is: is there any PROC LOGISTIC option that ignores useless observations, or creates extra dummy observations, similar to the SPARSE option in PROC FREQ, to avoid the error and warning messages?

I don't think such an option exists.  Generally, a SAS PROC will fit a model, and if it encounters problems in the data it will throw warnings/errors as appropriate.  I can't think of a way to tell a PROC "please test if there will be a warning/error  when this model is fit and if so, then don't fit the model."  Of course you can test for those conditions in advance yourself.  And you could even run the model once to find all the problematic BY-groups, then exclude the problematic BY-groups from the data and run it again. If you're worried about getting a clean log, you could even use PROC PRINTTO to temporarily redirect the log when you run the step that you know will throw warnings/errors.
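
As a rough sketch of the PROC PRINTTO idea, assuming the HAVE data set and the PROC LOGISTIC step from the original post: the fileref NOISYLOG is a hypothetical name, and a temporary file is used so that nothing has to be cleaned up afterwards. This only hides the messages; the problematic BY groups are still processed.

```sas
/* Temporary file to receive the log of the noisy step (fileref name is illustrative) */
filename noisylog temp;

/* Redirect the SAS log to that file */
proc printto log=noisylog new;
run;

ods exclude all;
proc logistic data=have;
   by paramcd visitnum;
   class arm (ref='C') / param=ref;
   model deter (desc) = baseline arm;
   ods output convergencestatus=conv oddsratios=or parameterestimates=pvalue;
run;
ods exclude none;

/* Restore the default log destination */
proc printto;
run;
```

The redirected messages remain available in the temporary file for the rest of the session, so they can still be reviewed if needed.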

 

But 

PaigeMiller
Diamond | Level 26

I think the OP has asked the wrong question, about how to get rid of the warnings and errors. I think the real question here is NOT how to get rid of warnings or errors, but what the user can learn from these warnings or errors. The user should learn that there are problems in the data that prevent logistic regression from working.

--
Paige Miller
A_Kh
Lapis Lazuli | Level 10

Thank you, @Quentin, and yes, I'm currently identifying the problematic BY groups and excluding them from the analysis. At the same time I was looking for shortcuts (the laziest mode :)) to save time, since the PROC was producing the required report for the non-problematic observations. Your PROC PRINTTO suggestion makes sense to me in this specific situation, as my only concern is a clean log.

Ksharp
Super User

That is really not easy.

You need to pick out the valid observations for building the logistic model, to avoid these ERRORs and WARNINGs.

 

data have;
input SUBJECT $ ARM $ PARAMCD $ BASELINE DETER VISITNUM;
cards;
1001 A ITEM1 1 0 1
1001 A ITEM2 2 0 1
1001 A ITEM3 3 0 1
1001 B ITEM1 2 1 1
1001 B ITEM2 3 1 1
1001 B ITEM3 4 1 1
1001 C ITEM1 3 0 1
1001 C ITEM2 4 0 1
1001 C ITEM3 5 0 1
1002 A ITEM1 1 1 1
1002 A ITEM2 2 1 1
1002 A ITEM3 3 1 1
1002 B ITEM1 2 0 1
1002 B ITEM2 3 0 1
1002 B ITEM3 4 0 1
1002 C ITEM1 3 1 1
1002 C ITEM2 4 1 1
1002 C ITEM3 5 1 1
1001 A ITEM1 2 0 2
1001 A ITEM2 3 0 2
1001 A ITEM3 4 0 2
1001 B ITEM1 3 1 2
1001 B ITEM2 4 1 2
1001 B ITEM3 5 1 2
1001 C ITEM1 4 0 2
1001 C ITEM2 5 0 2
1001 C ITEM3 5 0 2
1002 A ITEM1 3 1 2
1002 A ITEM2 4 1 2
1002 A ITEM3 5 0 2
1002 B ITEM1 4 1 2
1002 B ITEM2 5 0 2
1002 B ITEM3 5 1 2
1002 C ITEM1 5 0 2
1002 C ITEM2 5 0 2
1002 C ITEM3 5 0 2
1001 A ITEM1 3 1 3
1001 A ITEM2 4 1 3
1001 A ITEM3 5 0 3
1001 B ITEM2 4 0 3
1001 B ITEM3 5 0 3
1001 C ITEM1 5 0 3
1002 A ITEM1 4 0 3
1002 A ITEM2 5 0 3
1002 A ITEM3 5 0 3
1002 B ITEM1 5 0 3
1001 A ITEM1 5 0 4
1001 A ITEM2 5 0 4
1001 A ITEM3 5 0 4
1001 B ITEM2 5 0 4
1001 B ITEM3 5 0 4
1001 C ITEM1 5 0 4
1002 A ITEM1 5 0 4
1002 A ITEM2 5 0 4
1002 A ITEM3 5 0 4
;

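proc sql;
create table have2 as
select * from have
 group by paramcd,visitnum
  /* keep a BY group only if every arm present in it has both DETER values (0 and 1) */
  /* and the reference arm 'C' occurs at least once */
  having count(distinct catx('|',deter,arm))=2*count(distinct arm)  and sum(arm='C')
   order by   paramcd ,visitnum ,arm;
quit;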



*Getting ODDs Ratio, 95% CI and P-values;
ods exclude all;
proc logistic data=have2;
	by paramcd visitnum;
	class arm (ref='C')/ param=ref;
	model deter (desc) = baseline arm;
	ods output convergencestatus=conv OddsRatios=or parameterestimates=pvalue;
run;
ods exclude none;

A_Kh
Lapis Lazuli | Level 10

Thank you, @Ksharp, this removes all observations causing ERRORs in the log, but I need another pass to remove the observations that cause WARNINGs (in the production data there are still some observations that do not satisfy the convergence criterion). Since I have a fairly large number of observations (about 162,000), it's a bit tedious for me to make sure that I excluded observations correctly during pre-processing; I hope I'll get there with more experience. For now, I prefer just using PROC LOGISTIC as a pre-processing step that tells me which combinations are problematic and then removing them from the analysis, or simply redirecting the log of the PROC LOGISTIC steps.
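
The "PROC LOGISTIC as a pre-processing step" approach could look roughly like the sketch below. It assumes the CONV data set produced by the ODS OUTPUT statement in a first run, and it assumes that CONV carries a numeric Status variable where 0 means the convergence criterion was satisfied; that should be checked against the actual CONV contents before relying on it. The output name HAVE_OK is hypothetical.

```sas
proc sql;
   /* Keep only observations from BY groups that converged in the first run. */
   /* BY groups with no row in CONV are dropped by the inner join as well.   */
   create table have_ok as
   select h.*
   from have as h
        inner join
        conv as c
        on  h.paramcd  = c.paramcd
        and h.visitnum = c.visitnum
   where c.status = 0
   order by h.paramcd, h.visitnum, h.arm;
quit;
```

A second PROC LOGISTIC run on HAVE_OK would then only see the BY groups that passed the first run.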

Ksharp
Super User
Post your real data and real SAS code.
If I have time, I will take a look.
A_Kh
Lapis Lazuli | Level 10

I'm not allowed to share study data. Thank you again for your support, I appreciate it!
