skumar46
Fluorite | Level 6
Dear All
I want to run a Monte Carlo simulation with 10 iterations, but I don't know why it gives me the same output for all 10 iterations (output attached). I have tried ChatGPT and DeepSeek, but they were no help. I have attached the dataset as well as the SAS code. I would appreciate it if you could paste complete code, or reply after first checking that it works in SAS OnDemand. Thank you.
 
/* Step 4: Monte Carlo Simulation - Run 10 Iterations for Random Loss Outcomes */
data montecarlo;
    set portfolio_sim;
    do sim_id = 1 to 10;
        /* Introduce random variation to PD and LGD */
        random_PD = rand("NORMAL", Sim_PD, 0.01);
        random_LGD = rand("NORMAL", Sim_LGD, 0.05);
        random_PD = max(0, min(random_PD, 1));
        random_LGD = max(0, min(random_LGD, 1));

        sim_loss = random_PD * random_LGD * EAD * Current_Balance;

        output;
    end;
run;
 
OR
 
data montecarlo;
    set portfolio_sim;
    /* Convert alphanumeric Borrower_ID to numeric seed */
    numeric_id = input(compress(Borrower_ID, , 'kd'), 8.);
    
    do sim_id = 1 to 10;
        /* Create unique seed for each borrower-simulation combination */
        seed = numeric_id * 100 + sim_id;
        call streaminit(seed);
        
        /* Generate random values */
        random_PD = rand("NORMAL", Sim_PD, 0.01);
        random_LGD = rand("NORMAL", Sim_LGD, 0.05);
        
        /* Constrain to valid ranges */
        random_PD = max(0, min(random_PD, 1));
        random_LGD = max(0, min(random_LGD, 1));
        
        /* Calculate loss */
        sim_loss = random_PD * random_LGD * EAD * Current_Balance;
        
        output;
    end;
    
    drop numeric_id seed; /* Clean up temporary variables */
run;

 

1 ACCEPTED SOLUTION

Accepted Solutions
FreelanceReinh
Jade | Level 19

You're welcome.

 

The (pseudo-)random values of random_PD and random_LGD created by the RAND function depend on the random seed used in the CALL STREAMINIT routine (the value 1 in our code), the random-number generator (MTHybrid is the default, but only since SAS 9.4M3), the order in which they are calculated and possibly even on the platform (hardware and operating system).

 

In our case the order of calculations might be the reason why we get different results. For example, if scenario "Adverse" came before "Baseline" in dataset PORTFOLIO_SIM, the random values for "Adverse" would be different from what they are with "Baseline" being the first scenario. The sort order of the borrower IDs is similarly important. Note that since PORTFOLIO_SIM was created by PROC SQL without an ORDER BY clause, the order of observations in PORTFOLIO_SIM can be different on different computers, even using the same program code!

 

I have just created one million pairs of random_PD and random_LGD values based on the Sim_PD and Sim_LGD values of the case in question and I found your values in observations 3901, 3902, ..., 3910 of this series. So, maybe 3900 other values had been created before due to sort order differences or other technical reasons (consider perhaps multi-threaded parallel computing on the server SAS OnDemand connects to).

 

As a matter of fact, you don't have to worry about those differences. Your values are valid and it is reassuring that I could reproduce them on my computer.
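
To illustrate the point about observation order, here is a minimal sketch (not the code from the original program) that sorts the input before simulating, so that the random stream is consumed in the same sequence wherever the program runs. It assumes PORTFOLIO_SIM contains Borrower_ID and Scenario; the dataset name portfolio_sorted is made up for this example. Results can still differ across SAS releases or random-number generators.

```sas
/* Sketch only: fix the observation order before drawing random numbers. */
/* Assumes PORTFOLIO_SIM has the variables Borrower_ID and Scenario.     */
proc sort data=portfolio_sim out=portfolio_sorted;  /* hypothetical name */
    by Borrower_ID Scenario;
run;

data montecarlo;
    call streaminit(1);              /* same seed value as in the thread */
    set portfolio_sorted;
    do sim_id = 1 to 10;
        random_PD  = rand("NORMAL", Sim_PD, 0.01);
        random_LGD = rand("NORMAL", Sim_LGD, 0.05);
        output;
    end;
run;
```

With a fixed sort order and a fixed seed, rerunning the step on the same installation should reproduce the same values.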


20 REPLIES
PaigeMiller
Diamond | Level 26

First, most of us will not download Microsoft Office attachments because they can be security threats. I personally will not download any attachment. If you want us to see something, paste it into your message rather than attaching a file. If you want to provide us with a SAS data set, please follow these instructions and examples.

 

Second, code should be pasted into the "code box", which is the window that appears when you click on the "little running man" icon.

 

The reason all of these random numbers produce a constant result is that you have these lines in your code (note that I have pasted the code into the code box):

 

random_PD = max(0, min(random_PD, 1));
random_LGD = max(0, min(random_LGD, 1));

 

On the lines before, you randomly generate random_PD and random_LGD. Whenever the generated random_PD is greater than 1, the minimum of random_PD and 1 is 1, so that's what you get. Similarly, whenever random_LGD is greater than 1, the minimum of random_LGD and 1 is 1.

 

Perhaps you want to use the uniform distribution, which keeps the randomly generated numbers in the range 0 to 1, so there is no need to adjust them further. That's a guess on my part. You didn't explain the problem you are working on (which you should always do); if you had, we would understand what you are trying to do and wouldn't have to guess (which is never a good thing for us to do).

--
Paige Miller
skumar46
Fluorite | Level 6

Sir, it's not working. I would appreciate it if you could paste the complete code here. I want 10 iterations with different results for PD, LGD and expected loss. Or how would you write the code for Monte Carlo simulations? I can connect on Zoom or Teams if you are willing to help.

PaigeMiller
Diamond | Level 26

There have been many suggestions in this thread, but you have provided no additional information about what you tried. What did you try? Show us the code. What happened? If there are errors in the log, please show us the log. If the results are wrong, show us the results and explain why you think these are incorrect. Please provide information! Remember, do not attach files.

--
Paige Miller
ballardw
Super User

You would need to provide the input data set Portfolio_sim for any of this to mean anything.

 

Since you are using SIM_id as the MEAN of the NORMAL distribution you will have most of your values of the result of the RAND calls close to 2, 3, 4 etc. So pretty much 9 out of 10 of those mean that min(random_PD, 1) are returning 1 so the max(0,1) are returning the 1. And only half of the Sim_id=1 even have a chance of returning a value less than 1.

 

And then you overwrite the values of the random results so you cannot see that your original values are mostly way larger than 1. Modify these to create new variables to see why you get mostly 1's as results.

        /* Constrain to valid ranges */
        random_PD = max(0, min(random_PD, 1));
        random_LGD = max(0, min(random_LGD, 1));
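
One way to follow this suggestion is sketched below. The raw draws are kept in separate variables and a flag records whether the constraint changed them. The names raw_PD, raw_LGD, PD_clamped, LGD_clamped and montecarlo_check are made up for this example, and the seed is an arbitrary choice.

```sas
/* Sketch only: keep the raw random draws so the effect of clamping is visible. */
data montecarlo_check;                 /* hypothetical dataset name */
    call streaminit(1);                /* arbitrary seed (assumption) */
    set portfolio_sim;
    do sim_id = 1 to 10;
        raw_PD  = rand("NORMAL", Sim_PD, 0.01);
        raw_LGD = rand("NORMAL", Sim_LGD, 0.05);

        /* Constrained copies; the raw values are not overwritten */
        random_PD  = max(0, min(raw_PD, 1));
        random_LGD = max(0, min(raw_LGD, 1));

        /* 1 if the constraint changed the value, 0 otherwise */
        PD_clamped  = (raw_PD  ne random_PD);
        LGD_clamped = (raw_LGD ne random_LGD);
        output;
    end;
run;

/* How often was each value changed by the constraint? */
proc freq data=montecarlo_check;
    tables PD_clamped LGD_clamped;
run;
```

If the flags are mostly 1, the constant results come from the constraint rather than from the RAND calls.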

Do you know what percentage of results of a call such as rand("NORMAL", Sim_PD, 0.01) will fall within +/- 0.02 of Sim_PD?

Until you understand what the standard deviation is doing to those calls to rand('Normal') (do note the default for Normal is an SD=1) you should be very careful specifying it.
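
For reference, that percentage can be worked out with the CDF function. The sketch below assumes only the standard deviation of 0.01 from the posted code; the mean cancels out, so Sim_PD is not needed. The variable names are made up for this example.

```sas
/* Sketch only: share of Normal(mean, 0.01) draws within +/- 0.02 of the mean. */
data _null_;
    sd = 0.01;                       /* standard deviation used in the posted code */
    half_width = 0.02;               /* +/- distance from the mean */
    z = half_width / sd;             /* distance expressed in standard deviations */
    p_within = cdf("NORMAL", z) - cdf("NORMAL", -z);
    put p_within= percent8.2;        /* written to the SAS log */
run;
```

A distance of 0.02 is two standard deviations here, so roughly 95% of the draws fall in that interval.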

 

Or better yet, describe what sort of result you expect to be returned, as I think @PaigeMiller is on track.

FreelanceReinh
Jade | Level 19

Also note that this


@skumar46 wrote:
(...)
    do sim_id = 1 to 10;
        /* Create unique seed for each borrower-simulation combination */
        seed = numeric_id * 100 + sim_id;
        call streaminit(seed);

(...)


does not achieve what you intend. Only the first call of the STREAMINIT routine in a DATA step takes effect. (See the "Tip" in the description of parameter seed in the documentation of CALL STREAMINIT.) Hence, only the first created seed value is actually used. If the first value of numeric_id is, say, 123, you will obtain the same result as with a single

call streaminit(12301);

at the beginning of the DATA step.
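
A small experiment along these lines is sketched below, assuming the documented behavior described above. The dataset names two_calls and one_call and the second seed 99999 are made up for this example.

```sas
/* Sketch only: a second CALL STREAMINIT in the same DATA step should be ignored. */
data two_calls;                      /* hypothetical dataset name */
    call streaminit(12301);
    x = rand("UNIFORM"); output;
    call streaminit(99999);          /* arbitrary second seed, expected to have no effect */
    x = rand("UNIFORM"); output;
run;

data one_call;                       /* hypothetical dataset name */
    call streaminit(12301);
    do i = 1 to 2;
        x = rand("UNIFORM"); output;
    end;
    drop i;
run;

/* Compare the two datasets value by value */
proc compare base=one_call compare=two_calls;
run;
```

If only the first seed is used, PROC COMPARE should report that the values of x are equal in both datasets.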

 

Look into the documentation of CALL STREAM if you really need more control over the random number streams used. For example, you could use separate random number streams for each borrower:

data want;
  call streaminit(27182818);
  set portfolio_sim;
  /* Convert alphanumeric Borrower_ID to numeric key */
  numeric_id = input(compress(Borrower_ID, , 'kd'), 8.);
  /* Use a separate random number stream for each borrower */
  call stream(numeric_id);    
  do sim_id = 1 to 10;
    /* Generate random values */
    random_PD = rand("NORMAL", Sim_PD, 0.01);
    ...

This would enable you to reproduce the simulated data for any given Borrower_ID in a later DATA step by just specifying the same combination of seed (here: 27182818) and key (here: numeric_id) values. It also includes the possibility of increasing the number of simulations (e.g., do sim_id = 1 to 15;) without changing the results for sim_id=1, ..., 10.

skumar46
Fluorite | Level 6

Sir, I appreciate you taking the time to respond. I am a basic SAS user trying to learn with the help of ChatGPT and DeepSeek, but I could not find a solution there, nor am I able to follow your explanation. I would request that you review the code in my initial enquiry and then provide your complete code, which should work when pasted into SAS OnDemand. Then I will be able to check it and share the logs with you, as I am currently unable to relate your points to my code. Please paste the code here, from the first line of code through the final run; statement.

ballardw
Super User

@skumar46 wrote:

Sir, I appreciate you taking the time to respond. I am a basic SAS user trying to learn with the help of ChatGPT and DeepSeek, but I could not find a solution there, nor am I able to follow your explanation. I would request that you review the code in my initial enquiry and then provide your complete code, which should work when pasted into SAS OnDemand. Then I will be able to check it and share the logs with you, as I am currently unable to relate your points to my code. Please paste the code here, from the first line of code through the final run; statement.


Strongly suggest you add a lot of description to what you are attempting. For example, why are you using the loop counter Sim_id as the MEAN of a normal distribution? Why are you trimming the values? Why did you overwrite the original values of the RAND calls with those min/max function results?

 

And I repeat: since you are using an input data set, provide an example of that data set as working DATA step code.

 

Do also note that SAS offers a free introductory programming course.

I cringe at the idea of trying to learn from ChatGPT-generated code, considering some of the code that has appeared on this forum. One example had something like 6 syntax violations in the first 8 lines.

PaigeMiller
Diamond | Level 26

@skumar46 wrote:

Sir, I appreciate you taking the time to respond. I am a basic SAS user trying to learn with the help of ChatGPT and DeepSeek, but I could not find a solution there, nor am I able to follow your explanation. I would request that you review the code in my initial enquiry and then provide your complete code, which should work when pasted into SAS OnDemand. Then I will be able to check it and share the logs with you, as I am currently unable to relate your points to my code. Please paste the code here, from the first line of code through the final run; statement.


We have asked you for additional information. You have not provided that information, so we can't help. (And it is unlikely that AI chatbots will be able to guide you when you have errors in your programming.)

 

People have provided suggestions about what you can try to fix the problem. Have you tried them?

 

By the way, our job here is to help you with SAS problems and help you learn certain aspects of SAS, but as stated we need information. Our job is not to do your assignments for you, which seems like what you are asking.

--
Paige Miller
FreelanceReinh
Jade | Level 19

@skumar46 wrote:
I want to run ... 10 iterations, but I don't know why it gives me the same output for all 10 iterations (output attached).

Your output shows 10 out of 15,000 observations and only 4 out of 33 variables, whose (possibly formatted) values are constant on those 10 observations. None of these variables (Scenario, Sim_PD, Sim_LGD and Expected_Loss) is created in either of the two DATA steps that you posted. Hence, these variables must be contained in the input dataset PORTFOLIO_SIM. Their values are read by the SET statement, left unchanged by the other DATA step statements and written to the output dataset MONTECARLO by the OUTPUT statement. As this occurs in a DO loop with 10 iterations (do sim_id = 1 to 10;), those four variable values are necessarily the same in all 10 observations created from one observation of PORTFOLIO_SIM. They are just copies of the original values from the input dataset.

 

If you observed variable values that are equal contrary to expectations (e.g., values created by the RAND function calls in your code), then please show us.

skumar46
Fluorite | Level 6

Thank you for the response. The issue is not getting resolved. Further, I am trying to learn, and this is not for an assignment. If anyone is willing to help, I can email the dataset and the code. My question is very simple: I want 10 random values for each row and scenario, but I am getting the same result. I have given the code above; if anybody can completely amend the code and paste it below, I will be thankful.

FreelanceReinh
Jade | Level 19

In your initial post you wrote

(...) I have attached dataset

But there is no dataset attached. So let me create test data using some of the values shown in the partial output that you did attach.

data portfolio_sim;
input Borrower_ID $ Sim_PD Sim_LGD EAD Current_Balance;
cards;
B0001 0.2905 0.638 2 500
B0002 2.905  6.38  2 500
B0003 0.2905 0.638 2 500
;

If you run either of your two DATA steps on the above test data, you will see

  • for Borrower_ID B0001 and B0003: trivially constant values of Sim_PD, Sim_LGD, EAD and Current_Balance (as explained in my previous post), but randomly varying values of random_PD, random_LGD and sim_loss which differ between B0001 and B0003 (although the values of Sim_PD, Sim_LGD, EAD and Current_Balance are the same for both borrower IDs)
  • for Borrower_ID B0002: constant values also of random_PD, random_LGD and sim_loss because all the initial random values of random_PD and random_LGD were greater than 1 and therefore replaced with 1 by the assignment statements with the comment "Constrain to valid ranges" (as pointed out in @PaigeMiller's first reply).
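
The behavior described in the two bullet points can be checked with a sketch like the following, which runs the simulation on the test data above and summarizes the results per borrower. The seed and the choice of PROC MEANS statistics are assumptions made for this example.

```sas
/* Sketch only: run the simulation on the test data and summarize per borrower. */
data montecarlo;
    call streaminit(1);              /* arbitrary seed (assumption) */
    set portfolio_sim;
    do sim_id = 1 to 10;
        /* Random draws, constrained to the range 0 to 1 as in the posted code */
        random_PD  = max(0, min(rand("NORMAL", Sim_PD, 0.01), 1));
        random_LGD = max(0, min(rand("NORMAL", Sim_LGD, 0.05), 1));
        sim_loss = random_PD * random_LGD * EAD * Current_Balance;
        output;
    end;
run;

/* A standard deviation of 0 indicates a value that never varies */
proc means data=montecarlo n min max std;
    class Borrower_ID;
    var Sim_PD random_PD random_LGD sim_loss;
run;
```

For B0002 the constrained values should all equal 1, which makes sim_loss constant at 1 * 1 * 2 * 500 = 1000, while B0001 and B0003 should show nonzero standard deviations for the random variables.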

 


@skumar46 wrote:

I am getting the same result.


Again, if you observed variable values that are "the same" contrary to expectations, then please show us.

skumar46
Fluorite | Level 6

Thank you for the reply. I have attached the dataset and the SAS code. There are 500 IDs, i.e. B00000 to B00499. Please help me with the Monte Carlo simulation, i.e. generating 10 different random values for PD and LGD, so that there are 10 different rows for each borrower ID (such as B00000) showing different PD and LGD values.

 

I am not doing an assignment but trying to learn. Alternatively, based on the attached dataset, how would you do a Monte Carlo simulation that loops over PD and LGD 10 times so that we get random values?

 

 

FreelanceReinh
Jade | Level 19

It looks like you work with Excel data on a regular basis so that you think of an .xlsx-file as a dataset. This is good because I can assume that you know how to read your attached Excel table into a SAS dataset named portfolio_sim. (I don't have Excel installed on my SAS workstation, nor is the SAS/ACCESS Interface to PC Files part of my SAS license, so I can't work with .xlsx data directly.)

 

Then you can slightly adapt the beginning of the first DATA step (called "Step 4" in a comment) from your initial post as follows:

data montecarlo(rename=(random_PD=PD random_LGD=LGD Sim_PD=original_PD Sim_LGD=original_LGD));
call streaminit(1);
set portfolio_sim(rename=(PD=Sim_PD LGD=Sim_LGD));
...

and run the DATA step on the newly created dataset portfolio_sim. The resulting output dataset montecarlo should contain 5000 observations: 10 per borrower ID, with randomly varying PD and LGD values, as desired. (The "constraining to valid ranges" will likely be applied to only very few values and should not be a problem for learning purposes.)
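
Put together with the remaining statements of the first DATA step from the initial post, the adapted step might look like the sketch below. It assumes that the imported dataset portfolio_sim contains the variables Borrower_ID, PD, LGD, EAD and Current_Balance; the PROC PRINT step is only an added check.

```sas
/* Sketch only: Step 4 with the adapted beginning and the original loop. */
/* Assumes portfolio_sim contains PD, LGD, EAD and Current_Balance.      */
data montecarlo(rename=(random_PD=PD random_LGD=LGD
                        Sim_PD=original_PD Sim_LGD=original_LGD));
    call streaminit(1);
    set portfolio_sim(rename=(PD=Sim_PD LGD=Sim_LGD));
    do sim_id = 1 to 10;
        /* Introduce random variation to PD and LGD */
        random_PD = rand("NORMAL", Sim_PD, 0.01);
        random_LGD = rand("NORMAL", Sim_LGD, 0.05);

        /* Constrain to valid ranges */
        random_PD = max(0, min(random_PD, 1));
        random_LGD = max(0, min(random_LGD, 1));

        sim_loss = random_PD * random_LGD * EAD * Current_Balance;
        output;
    end;
run;

/* Look at the 10 simulated rows of the first two borrowers */
proc print data=montecarlo(obs=20);
    var Borrower_ID sim_id original_PD PD original_LGD LGD sim_loss;
run;
```

The columns original_PD and original_LGD repeat within each borrower because they are copied from the input, while PD, LGD and sim_loss should vary from row to row.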

skumar46
Fluorite | Level 6

Thank you for taking the time to help me out, but it is not working. If you are from the SAS team, can we connect through Zoom or Teams?

Yes, the data is read correctly in SAS OnDemand. I understand that you wanted me to modify step 4 of the code, and I modified it in different ways, but it is not working. I believe this is wasting both our time, probably because I am a basic user, so I wanted to reach out one last time. Incomplete code creates confusion for me, as I am not good at connecting the pieces. This is a pretty advanced topic for me, but I am trying to learn.

 

I understand that you are unable to download the Excel dataset. So, for your convenience, I am pasting a data snapshot in MS Word, along with the code script details. I would like you to share the complete code for step 4 rather than 2-3 lines, so that I can run it by pasting it:

 

If I cannot get the complete code for step 4, or it does not work after this, then I will close this thread as unsolved.
