This topic is solved and locked.
PaigeMiller
Diamond | Level 26

@koyelghosh wrote:

@PaigeMiller Thank you
You definitely have a valid argument against the suggested approach. However, I was thinking that as the set of numbers to choose from grows, the probability that the excluded number appears decreases. Thus, only in very few cases will it actually enter the loop and spend time there.


When you use RAND('INTEGER',0,7), the probability of getting the undesired number 3 is 1/8, regardless of the size of the data set.

--
Paige Miller
FreelanceReinh
Jade | Level 19

@koyelghosh: I think this technique (known as acceptance-rejection method) can be particularly useful in situations where the condition for rejection (here: y=3) is more complex so that it can't be replaced by a simple definition like y=r+(r>=3); (with a suitable random number r) as in your example.
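
A minimal sketch of the two approaches side by side, for anyone who wants to compare them. The seed value, the data set name compare and the variable names y_reject and y_shift are only placeholders for illustration, and it assumes RAND('INTEGER', ...) is available in your SAS release, as in the rest of this thread:

```sas
data compare;
   call streaminit(2024);              /* hypothetical seed, for reproducibility */
   do i = 1 to 10000;
      /* Acceptance-rejection: redraw until the value is acceptable */
      do until (y_reject ne 3);
         y_reject = rand('integer', 0, 7);
      end;
      /* Direct definition: draw r from 0-6, then shift 3-6 up by one */
      r = rand('integer', 0, 6);
      y_shift = r + (r >= 3);
      output;
   end;
   drop i r;
run;

/* Both variables should take only the values 0, 1, 2, 4, 5, 6, 7 */
proc freq data=compare;
   tables y_reject y_shift;
run;
```

With a more complex rejection rule, only the condition in the DO UNTIL needs to change, whereas the direct definition may no longer have a simple closed form.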

FreelanceReinh
Jade | Level 19

@Anita_n wrote:

I am trying to assign the numbers 0, 1, 2, 4, 5, 6, 7 (please note that 3 is excluded) to a variable y (...)

      y=rand("integer" 0, 7);

Hello @Anita_n,

 

The missing comma after "integer" is a syntax error. Since y is a character variable, you may want to calculate the numeric value (implementing @PaigeMiller's idea) using a temporary numeric variable, say _n_, and then assign the result via PUT function as a character value to y:

     _n_=rand('integer', 0, 6);
     y=put(_n_+(_n_>2),1.);

 

Edit: Also, I recommend using the CALL STREAMINIT routine to define a seed value. Otherwise you can't replicate your results (from the RAND function).
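
A small sketch of what that could look like. The seed 12345, the data set name demo and the variable r are arbitrary choices for illustration, not something from the original code:

```sas
data demo;
   call streaminit(12345);          /* hypothetical seed; set it before the first RAND call */
   length y $1;
   do i = 1 to 5;
      r = rand('integer', 0, 6);    /* 0..6 */
      y = put(r + (r > 2), 1.);     /* maps to '0','1','2','4','5','6','7' */
      output;
   end;
   drop i r;
run;

proc print data=demo;
run;
```

Running this step twice with the same seed should print the same five values of y. Changing or removing the seed should give a different sequence.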

Astounding
PROC Star

Using character values for Y:

 

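data want;
set have;
/* the seven allowed values; '3' is deliberately left out */
array yvals {7} _temporary_ ('0' '1' '2' '4' '5' '6' '7');
if x= "8000" and a="0" and y= " " then
      /* ceil(ranuni(seed) * 7) yields an array index from 1 to 7 */
      y=yvals{ceil(ranuni(12345) * 7)};
run;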
Anita_n
Pyrite | Level 9

Hello all,

Thanks a lot for your contributions. I tested all your suggestions and found that all of them give me the desired output. It depends on which choice one prefers. I'm very grateful for the help.

 

Now my question is: is it possible to accept all these possibilities as solutions, since they all work?

FreelanceReinh
Jade | Level 19

Hello @Anita_n,

 

Glad to hear that the solutions worked for you. You could accept as the solution the suggestion that you eventually adopted, i.e., used in your code, and give likes to the other posts you found helpful.

Anita_n
Pyrite | Level 9

I still have a question relating to this topic: why does the dataset grow after executing the DO loop? Is there anything I can add to my code to stop that?

For example, I had 1000 observations in my file; after executing the DO loop it increases to 1060. Why?

Anita_n
Pyrite | Level 9

Yes, there is an OUTPUT statement in the code; otherwise the values wouldn't show.

Kurt_Bremser
Super User

Quote from my previous post:

"Please show the code you ran".

 

Keep in mind that you only need an output statement in a data step if you want to change the default behaviour of the data step, which is to do an implicit output at the end of each data step iteration. But we can only tell you what's the exact reason for your problem if we see the code.
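
To illustrate the difference, here is a minimal sketch with made-up data (the data set names and the variable id are hypothetical). Whether this is what happens in your program can only be told from the real code:

```sas
/* Three hypothetical input observations */
data have;
   do id = 1 to 3;
      output;
   end;
run;

/* No OUTPUT statement: one implicit output per iteration, so 3 observations */
data implicit;
   set have;
   flag = (id = 2);
run;

/* Explicit OUTPUT statements switch the implicit output off.
   An observation that reaches both statements is written twice,
   so this data set has 4 observations. */
data explicit;
   set have;
   if id >= 1 then output;   /* true for every observation here */
   if id = 2 then output;    /* second copy for id = 2 */
run;
```

The log should report 3 observations for implicit and 4 for explicit, the extra one being the second copy of id = 2.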

Anita_n
Pyrite | Level 9

Okay, here is some sample code:

data want;
set have;

a=.; b=.; c=.; d=;

if (typ_ill = :"cerv" or typ_ill= :"oval") then sex=2 ;
else if( typ_ill= :"pen" or typ_ill= :"test") then sex =1;
else if (typ_ill ^=:"cerv" or typ_ill ^= :"oval" or  typ_ill ^= :"pen" or typ_ill ^= :"test")
then do ;
         sex= rand("integer", 1, 2);
         put sex;
        end;

if ( a=.) and (b=.) then do;
    a=rand("integer", 1, 12);
    b=rand("integer", 1980, 2000);
    c= rand("integer", 1, 2);
   d= rand("integer", 1999, 2019);
output;
end;

if x= "8000" and a="0" and y= " " then
   do;
      y=rand("integer" 0, 7);
    do while( y=3);
       y=rand("integer" 0, 7);
    end;
      output;
    end;
run;
Kurt_Bremser
Super User

See my annotations:

data want;
set have;

a=.; b=.; c=.; d=;

if (typ_ill = :"cerv" or typ_ill= :"oval") then sex=2 ;
else if( typ_ill= :"pen" or typ_ill= :"test") then sex =1;
else if (typ_ill ^=:"cerv" or typ_ill ^= :"oval" or  typ_ill ^= :"pen" or typ_ill ^= :"test")
/* I guess that the condition immediately above will always be true and is not necessary */
then do ;
         sex= rand("integer", 1, 2);
         put sex;
        end;

if ( a=.) and (b=.) then do;
/* Since you did not change any of the variables a to d, they're still missing,
  so this condition will always be TRUE */
    a=rand("integer", 1, 12); /* a can only get values between 1 and 12, but will never be zero */
    b=rand("integer", 1980, 2000);
    c= rand("integer", 1, 2);
   d= rand("integer", 1999, 2019);
output;
end;

if x= "8000" and a="0" and y= " " then
/* because of the above, 'a="0"' will never be true, but will cause a NOTE
about the conversion character <> numeric */
/* y will therefore never be set */
   do;
      y=rand("integer" 0, 7);
    do while( y=3);
       y=rand("integer" 0, 7);
    end;
      output;
    end;
run;

Because of that, I seriously doubt that this is the complete code you ran, as each incoming observation will enter the first branch that contains an output statement, but never the second.

And you should REALLY start to visually format your code, with consistent indentation to easily identify functional blocks.

 

So I ask you to provide example data and real code that causes the "increasing observations effect"; just enough observations, in a data step with datalines.
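
For example, something along these lines would be enough. The variable names are taken from the code above, but the lengths and all values are invented placeholders, not real data:

```sas
data have;
   infile datalines dlm=',' dsd truncover;
   /* hypothetical lengths; y is read as character and may be blank */
   input typ_ill :$10. x :$4. y :$1.;
   datalines;
cervix,8000,
penis,8000,5
lung,7000,
;
run;

proc print data=have;
run;
```

With DSD, the trailing comma leaves y blank in the first and third observations, so the "y is empty" branch of the logic can be exercised without exposing any real records.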

Anita_n
Pyrite | Level 9

Sorry, I can see there was a mistake in the code I sent. It's just that I can't send the real code or data due to its sensitivity. My only concern is why the OUTPUT statement is producing duplicates.

Kurt_Bremser
Super User
  1. find the duplicates
  2. with at least one of them, identify the source observation
  3. create a subset with only that observation
  4. add put statements throughout your code to reveal how variable values change
  5. run the code against the subset
  6. read the log
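
The steps above can be sketched as follows. This is only an illustration: the key variable id, the data set names and the stand-in data step are hypothetical and need to be replaced with the real ones.

```sas
/* Hypothetical input and a stand-in for the step under suspicion */
data have;
   do id = 1 to 3;
      output;
   end;
run;

data want;
   set have;
   output;
   if id = 2 then output;
run;

/* Steps 1-2: find key values that occur more than once */
proc sql;
   create table dups as
   select id, count(*) as n_copies
   from want
   group by id
   having count(*) > 1;
quit;

/* Step 3: subset with only the source observation of one duplicate */
data subset;
   set have;
   where id = 2;
run;

/* Steps 4-6: rerun the logic with PUT statements, then read the log */
data check;
   set subset;
   put 'Before first OUTPUT: ' _all_;
   output;
   if id = 2 then do;
      put 'Before second OUTPUT: ' _all_;
      output;
   end;
run;
```

Each PUT line in the log marks one path to an OUTPUT statement, so two lines for the same observation show where the second copy comes from.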

