hagen85

Hi all,

For a university project I would like to simulate a changing environment. Basically, what I want to do is:

Build a prediction model (e.g. neural network) on a certain data set and simulate two things:

1.) How does the model performance change as new data instances arrive? (Performance will go down sooner or later.)

2.) How does the model performance change if the model is adapted after each new instance arrives? (Performance should remain constant.)

Can I simulate both scenarios in SAS? How can I simulate a stream of new data instances arriving? How do I model the adaptation loop mentioned under 2.) in SAS?

Thank you very much in advance for your ideas.

Best Regards

Hagen

1 REPLY
DougWielenga
SAS Employee

For number 1 above, you can simulate data using Base SAS, but you first need to decide how new observations should be generated. For example, you might draw random values from a chosen distribution for each of your variables and assemble them into new observations to score. If you believe there is a trend in the values (e.g. children's average height at a given age seems to increase each decade), you can build that shift into the randomization. SAS provides several probability distributions for simulating data: start with the RAND function, then (if desired) model how your inputs shift over time and build that estimated drift into the simulated data. You can then score the simulated data with your model. Performance is expected to decline over time because populations change, but by monitoring it you can see how far it has dropped at any given point. When it gets too low, you can refit.
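
To make the idea concrete, here is a minimal sketch of the simulate, score, and monitor loop. It is written in Python rather than SAS, purely as an illustration; in SAS the random draws would come from the RAND function in a DATA step. Everything in it is an assumption made for the example: the two-class Gaussian data, the drift rate, the simple threshold "model", and the names make_instance, fit_threshold and monitor.

```python
import random


def make_instance(rng, t, drift_per_step=0.002):
    """Draw one labelled instance; both class means shift upward with time t.

    The distributions and the drift rate are illustrative assumptions.
    """
    label = rng.randint(0, 1)
    mean = (0.0 if label == 0 else 2.0) + drift_per_step * t
    return rng.gauss(mean, 1.0), label


def fit_threshold(data):
    """Stand-in 'model': the midpoint between the two class means."""
    x0 = [x for x, y in data if y == 0]
    x1 = [x for x, y in data if y == 1]
    return (sum(x0) / len(x0) + sum(x1) / len(x1)) / 2.0


def predict(threshold, x):
    return 1 if x > threshold else 0


def monitor(n_steps=3000, window=500, seed=42):
    """Fit once on data from t = 0, then score a drifting stream.

    Returns the accuracy of the frozen model in each consecutive window.
    """
    rng = random.Random(seed)
    train = [make_instance(rng, 0) for _ in range(1000)]
    threshold = fit_threshold(train)  # the model is never refit below

    accuracies = []
    correct = 0
    for t in range(1, n_steps + 1):
        x, y = make_instance(rng, t)
        correct += int(predict(threshold, x) == y)
        if t % window == 0:
            accuracies.append(correct / window)
            correct = 0
    return accuracies


if __name__ == "__main__":
    for i, acc in enumerate(monitor(), start=1):
        print(f"window {i}: accuracy = {acc:.3f}")
```

With these settings the windowed accuracy should fall toward chance as the inputs drift away from the training data. A refit rule would compare each window's accuracy with a tolerance you choose and retrain when it is breached.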

 

For number 2 above, there is really no such thing as 'tweaking' a model. If you get new observations and run the model again, you are really just fitting a new model. The benefit of an existing model is that it lets you score observations for which you don't know the answer yet. If you continually update your model, you might get slightly better performance, or the difference might be negligible, which is why it is likely better to monitor model performance and refit as needed. An alternative might be a nearest neighbor modeling strategy, where you group similar observations and predict a new observation's value from its nearest neighbors. As your data grows, you have more information for making these assessments. There is no 'model' that is fit; you are just trying to identify the people who are most like your new observation based on the available data. Of course, once the new person's outcome is known, that person simply becomes part of the training data for the next new observation.
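
The nearest neighbor idea can be sketched as a "predict first, then add to the pool" loop. Again this is illustrative Python, not SAS, and the details are assumptions for the example: the drifting two-class data, k = 5, and the names make_instance, knn_predict and run_stream. One deliberate addition that goes beyond the description above is that the pool is capped at the most recent instances, so old observations drop out as new ones arrive; without some form of forgetting, outdated neighbors would keep voting.

```python
import random
from collections import deque


def make_instance(rng, t, drift_per_step=0.002):
    """Draw one labelled instance; class means drift upward with time t."""
    label = rng.randint(0, 1)
    mean = (0.0 if label == 0 else 2.0) + drift_per_step * t
    return rng.gauss(mean, 1.0), label


def knn_predict(pool, x, k=5):
    """Majority vote among the k pool members closest to x."""
    nearest = sorted(pool, key=lambda item: abs(item[0] - x))[:k]
    votes = sum(label for _, label in nearest)
    return 1 if 2 * votes > len(nearest) else 0


def run_stream(adapt, n_steps=3000, pool_size=200, seed=7):
    """Score a drifting stream with k-NN and return overall accuracy.

    adapt=False: the pool stays frozen at the initial data.
    adapt=True:  each instance joins the pool once its outcome is known,
                 and the oldest instance is dropped (assumed window).
    """
    rng = random.Random(seed)
    pool = deque((make_instance(rng, 0) for _ in range(pool_size)),
                 maxlen=pool_size)
    correct = 0
    for t in range(1, n_steps + 1):
        x, y = make_instance(rng, t)
        correct += int(knn_predict(pool, x) == y)  # predict before seeing y
        if adapt:
            pool.append((x, y))  # outcome now known: add to the pool
    return correct / n_steps


if __name__ == "__main__":
    print(f"frozen pool:   {run_stream(adapt=False):.3f}")
    print(f"adaptive pool: {run_stream(adapt=True):.3f}")
```

Both runs use the same seed and therefore see the same stream, so any difference in accuracy comes from the updating alone. How large the pool should be is a judgment call: a shorter window tracks drift faster but gives noisier votes.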

 

I hope this helps!

Doug 
