awesome_opossum
Obsidian | Level 7

Hi there, so I'm trying to use this little doo-wop to flag variables that have values within certain range(s) (>= |.35|): 

 

data flags (drop= i);
  set data;
  array factor {10} factor1-factor10;
  array place {10} place1-place10;
  do i = 1 to 10;
    place{i} = 0;
    /* >= |.35| threshold is somewhat arbitrary, may be changed */
    if factor{i} < .35 and factor{i} > -.35 then place{i} = 1;
  end;
  noload_ct = sum(of place1-place10);
  noload = 0;
  if noload_ct = 10 then noload = 1;
run;

Now, I've done this before with other data, and it appeared to work as expected. But in this particular instance, upon inspection of the values I'm trying to filter, it is flagging cases that are .346, which obviously are not >= |.35|. Such "close" cases perhaps simply did not exist in prior uses of this approach, hence I did not notice.

 

I suppose I can chalk this up to some sort of rounding that is going on behind the scenes.  For my present purposes, I imagine this is fine, but my concern is still that it would be technically inaccurate to say that I used a threshold of >= .35, when the output retains values that are technically, say, > .345 (or whatever it is that's happening behind the scenes).  That is, if someone replicated the analysis with the same data, they might observe this inconsistency, and say, awesome_opossum, "you're wrong, and a liar; how dare you.", which is principally silly enough that it is something I wish to avoid.

 

Does anyone have an explanation for this, that I can at least include in a footnote (e.g. is it actually > .345 -- values rounded to the hundredth?), or any recommendations? 

 

For clarification:  I'm making two lists: one is a list of all the items (rows) that are >= |.35| on at least one variable; the other is a list of items (rows) that are < |.35| on all the variables. Items (rows) with a highest value of |.346| on any variable end up on the list that is supposed to be >= |.35|.

1 ACCEPTED SOLUTION

Accepted Solutions
ballardw
Super User

Check what you think you have written. 0.346 is indeed < 0.35 and > -0.35, so the condition in your code is true for it and place{i} is set to 1.

Or provide some example data in the form of a data step and what you expect for results.

Is there some reason that you did not use the ABS function?

abs(factor[i]) ge 0.35

is equivalent to the test ">= |.35|".

Or if you want the negation

abs(factor[i]) lt 0.35
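
For illustration, here is a minimal sketch of how the ABS test might slot into the data step from the question. It assumes the same input dataset (data) and variables (factor1-factor10) as the original post; the variable name cutoff is a hypothetical addition so the threshold lives in one place. One difference worth knowing: ABS of a missing value is missing, and SAS treats a missing numeric value as smaller than any number, so abs(factor{i}) lt cutoff is true for a missing factor, whereas the original two-sided test (factor{i} < .35 and factor{i} > -.35) is false for it. Whether that accounts for the discrepancy reported here cannot be determined from the posted information.

```sas
data flags (drop= i cutoff);
  set data;
  array factor {10} factor1-factor10;
  array place  {10} place1-place10;

  cutoff = 0.35;  /* hypothetical name; threshold is arbitrary per the question */

  do i = 1 to 10;
    /* a comparison evaluates to 1 (true) or 0 (false) */
    /* note: a missing factor also yields 1 here       */
    place{i} = (abs(factor{i}) lt cutoff);
  end;

  noload_ct = sum(of place1-place10);
  noload    = (noload_ct = 10);  /* 1 only when all ten are below the cutoff */
run;
```

If missing factors should not count as "below the cutoff", the assignment could be written as place{i} = (not missing(factor{i})) and (abs(factor{i}) lt cutoff);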


8 REPLIES
yabwon
Onyx | Level 15

Your condition is wrong.

Should be:

 factor{i} >= .35 or factor{i} <= -.35

Bart

_______________
Polish SAS Users Group: www.polsug.com and communities.sas.com/polsug

"SAS Packages: the way to share" at SGF2020 Proceedings (the latest version), GitHub Repository, and YouTube Video.
Hands-on-Workshop: "Share your code with SAS Packages"
"My First SAS Package: A How-To" at SGF2021 Proceedings

SAS Ballot Ideas: one: SPF in SAS, two, and three
SAS Documentation



awesome_opossum
Obsidian | Level 7

I narrated it as the opposite; the inaccurate flagging remains the issue.

yabwon
Onyx | Level 15

Just to double-check, your condition is:

flag elements which are either greater than 0.35 or less than -0.35

right?

 

Bart


awesome_opossum
Obsidian | Level 7

I'm making two lists:  one is a list of all the items (rows) that are >= .35 on at least one variable; the other is a list of items (rows) that are < .35 on all the variables.  Items (rows) with a highest value of  .346 on any variable end up on the list that is supposed to be >= .35. 

Tom
Super User

Something else must be going on.

Example:

data test;
  do factor=.,0,.346,.36,1 ;
    *factor = round(factor,0.01);
    lt0_35 = factor < 0.35;
    ge0_35 = factor >= 0.35;
    output;
  end;
run;
proc print;
run;
Obs    factor    lt0_35    ge0_35

 1       .          1         0
 2      0.000       1         0
 3      0.346       1         0
 4      0.360       0         1
 5      1.000       0         1
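
The first row of the output above shows that a missing value satisfies factor < 0.35, because SAS orders missing numeric values below every number. As a sketch in the same style as the test step, the comparisons can be guarded with the MISSING function so that a missing value lands in neither group. The dataset name test_missing and the flag names lt_safe and ge_safe are illustrative, not from the thread.

```sas
data test_missing;
  do factor = ., 0, .346, .36, 1;
    /* both flags are 0 when factor is missing */
    lt_safe = (not missing(factor)) and (abs(factor) <  0.35);
    ge_safe = (not missing(factor)) and (abs(factor) >= 0.35);
    output;
  end;
run;

proc print data=test_missing;
run;
```

Whether missing values should count as "below the threshold" or be excluded is an analysis decision; the guard only makes the choice explicit.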

yabwon
Onyx | Level 15
data data; 
  array factor {10} factor1-factor10 (.344 .345 .346 .347 .348 .349 .350 .351 .352 .353); 
run;


data flags (drop= i);
  set data;
  array factor {10} factor1-factor10;
  array ge {10} (10*0); /* >= .35 */
  array ls {10} (10*0); /*  < .35 */

  do i = 1 to 10;
    ge{i} = (factor{i} >= .35);
    ls{i} = (factor{i} <  .35);
  end;
  put (_ALL_) (=/);
run;

Log:

255
256  data data;
257    array factor {10} factor1-factor10 (.344 .345 .346 .347 .348 .349 .350 .351 .352
257! .353);
258  run;

NOTE: The data set WORK.DATA has 1 observations and 10 variables.
NOTE: DATA statement used (Total process time):
      real time           0.00 seconds
      cpu time            0.00 seconds


259
260
261  data flags (drop= i);
262    set data;
263    array factor {10} factor1-factor10;
264    array ge {10} (10*0); /* >= .35 */
265    array ls {10} (10*0); /*  < .35 */
266
267
268      do i = 1 to 10;
269        ge{i} = (factor{i}  >= .35 );
270        ls{i} = (factor{i}   < .35 );
271      end;
272    put (_ALL_) (=/);
273  run;


factor1=0.344
factor2=0.345
factor3=0.346
factor4=0.347
factor5=0.348
factor6=0.349
factor7=0.35
factor8=0.351
factor9=0.352
factor10=0.353
ge1=0
ge2=0
ge3=0
ge4=0
ge5=0
ge6=0
ge7=1
ge8=1
ge9=1
ge10=1
ls1=1
ls2=1
ls3=1
ls4=1
ls5=1
ls6=1
ls7=0
ls8=0
ls9=0
ls10=0
i=11
NOTE: There were 1 observations read from the data set WORK.DATA.
NOTE: The data set WORK.FLAGS has 1 observations and 30 variables.
NOTE: DATA statement used (Total process time):
      real time           0.01 seconds
      cpu time            0.01 seconds
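
The log above checks each variable separately. The goal described earlier in the thread is a row-level split: one list of rows with at least one value >= |.35| and one list of rows below |.35| on every variable. A sketch of one way to do that follows, assuming the same input dataset data and variables factor1-factor10; the output dataset names loaded and noload_rows and the variable max_abs are hypothetical.

```sas
data loaded noload_rows;
  set data;
  array factor {10} factor1-factor10;

  max_abs = .;
  do i = 1 to 10;
    /* MAX ignores missing arguments, so missing factors are skipped */
    max_abs = max(max_abs, abs(factor{i}));
  end;
  drop i;

  if max_abs >= 0.35 then output loaded;   /* at least one |value| >= .35 */
  else output noload_rows;                 /* all |values| < .35, or all missing */
run;
```

Keeping max_abs in the output makes it easy to inspect borderline rows, such as those with a largest absolute value of .346, and see which list they landed in.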


awesome_opossum
Obsidian | Level 7
The abs() function actually did solve the problem! Still strange my method didn't work; but elegant solution; I appreciate it!
