Multiple Array in SAS

Regular Contributor
Posts: 199

I am trying to create a dataset containing a set of variables in which values greater than or equal to the third quartile, or less than or equal to the first quartile, are deleted. I'm inexperienced with arrays, so the code below is just my best guess. No observations are written to the new dataset when I run it. Any help would be appreciated!

 

data have;
input age weight height Q3_age Q1_age Q3_weight Q1_weight Q3_height Q1_height
;
datalines;
23 165 72 55 44 222 145 72 65
45 222 65 55 44 222 145 72 65
57 165 45 55 44 222 145 72 65
44 156 80 55 44 222 145 72 65
44 109 99 55 44 222 145 72 65
55 118 37 55 44 222 145 72 65
66 145 66 55 44 222 145 72 65
76 444 56 55 44 222 145 72 65
99 234 76 55 44 222 145 72 65
;
run;

DATA want;
   SET have;
   ARRAY array_one (*) age weight height;
   ARRAY array_two (*) Q3_age Q1_age Q3_weight Q1_weight Q3_height Q1_height;
   DO i=1 to DIM(array_one);
      IF array_one(i) GE array_two(i) THEN DELETE;
      IF array_one(i) LE array_two(i) THEN DELETE;
   END;
   DROP i;
RUN;

Super User
Posts: 10,539

Re: Multiple Array in SAS

Where are the values for the Q variables supposed to be? Is that what you are asking for help with?

Since they are missing in your data, consider this line of code:

IF array_one(i) GE array_two(i) THEN DELETE;

 

Array_one(i) is ALWAYS greater than a missing value (SAS treats a numeric missing value as smaller than any number), so every observation is deleted.
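A minimal sketch of that comparison, in case it helps to see it in isolation. The variable names x and q are made up for illustration and are not from your data:

```sas
data _null_;
   x = 23;   /* any nonmissing number */
   q = .;    /* numeric missing, standing in for a Q variable with no value */
   /* numeric missing sorts below every nonmissing number, so GE is true */
   if x ge q then put 'x GE q is true: the observation would be deleted';
   else put 'x GE q is false';
run;
```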

 

Since you may be looking to join every record in Have to a summary dataset, this might be close to what you want:

proc summary data=have;
   var _numeric_;
   output out=HaveSum q1= q3= /autoname;
run; 
	
proc sql;
   create table want as
   select a.*
   from have as a, havesum as b
   where (age between age_q1 and age_q3)
      and (weight between weight_q1 and weight_q3)
      and (height between height_q1 and height_q3)
   ;
quit;
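
Note that BETWEEN is inclusive, so a value exactly equal to Q1 or Q3 is kept. If those should be dropped too, as the question describes, a variant with strict inequalities might look like this. It is only a sketch: it assumes the HaveSum dataset created above, and the output name want_strict is just a placeholder.

```sas
proc sql;
   create table want_strict as
   select a.*
   from have as a, havesum as b
   /* strict inequalities: a value equal to Q1 or Q3 is excluded */
   where (a.age > b.age_q1 and a.age < b.age_q3)
      and (a.weight > b.weight_q1 and a.weight < b.weight_q3)
      and (a.height > b.height_q1 and a.height < b.height_q3)
   ;
quit;
```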
Regular Contributor
Posts: 199

Re: Multiple Array in SAS

@ballardw 

 

<where are the values for the Q variables supposed to be?>

 

Sorry. I just updated the data...Thanks for the help.

 

 

Super User
Posts: 10,539

Re: Multiple Array in SAS

Why do you want to compare Weight (the second element of Array_one) to Q1_age (the second element of Array_two)?

Unless you have a LARGE number of variables like this, arrays aren't really helpful and the logic is going to be abysmal.
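
If there really are many variables, a sketch with three parallel arrays might look like the following. It assumes the Q1_ and Q3_ variables are already on every row, as in the posted data, and that deleting the whole observation is what you want. The array names vals, q1s and q3s are made up for illustration.

```sas
data want;
   set have;
   /* the three arrays must list the variables in the same order */
   array vals (*) age    weight    height;
   array q1s  (*) Q1_age Q1_weight Q1_height;
   array q3s  (*) Q3_age Q3_weight Q3_height;
   do i = 1 to dim(vals);
      /* drop the observation if any value is at or outside its quartiles */
      if vals(i) le q1s(i) or vals(i) ge q3s(i) then delete;
   end;
   drop i;
run;
```

With the nine sample rows this still leaves no observations, because every row has at least one value at or outside its quartiles (for example, the second row has weight 222, which equals Q3_weight).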

 

Regular Contributor
Posts: 199

Re: Multiple Array in SAS

@ballardw 

 

<Why do you want to compare Weight (the second element of Array_one) to the Q1_age (the second element array_two)>

 

I don't. I just wasn't sure how the array would work. I changed the order of the elements. 

 

I have many more variables. The data included in the original post was just for example purposes.

Respected Advisor
Posts: 4,659

Re: Multiple Array in SAS

If your goal is to compute trimmed means, standard errors, or confidence intervals, consider using proc univariate with the trimmed= option.
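
A minimal sketch, assuming the dataset and variables from the original post; the 0.25 is only an illustrative proportion, not a recommendation:

```sas
/* request a table of trimmed means; 0.25 is the proportion trimmed from each tail */
proc univariate data=have trimmed=0.25;
   var age weight height;
run;
```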

PG
Regular Contributor
Posts: 199

Re: Multiple Array in SAS

@PGStats I'm not sure I understand the documentation on this option. If I wanted means without outliers (e.g. 1.5 x the Q3 and 1.5 x the Q1), how would I specify these parameters in the option?

 

TRIMMED=values <(<TYPE=keyword> <ALPHA=α>)>
TRIM=values <(<TYPE=keyword> <ALPHA=α>)>

requests a table of trimmed means, where value specifies the number or the proportion of observations that PROC UNIVARIATE trims. If the value is the number n of trimmed observations, n must be between 0 and half the number of nonmissing observations. If value is a proportion p between 0 and 1/2, the number of observations that PROC UNIVARIATE trims is the smallest integer that is greater than or equal to np, where n is the number of observations. To include confidence limits for the mean and the Student's t test in the table, you must use the default value of VARDEF=, which is DF. For details concerning the computation of trimmed means, see the section Trimmed Means. The TRIMMED= option does not apply if you use a WEIGHT statement.

Super User
Posts: 10,539

Re: Multiple Array in SAS

Trimmed= 0.05 would be a 5% trim; the Type controls which end (or both) to trim.

Trimmed= 0.25 type=twosided

is, I think, what you may be looking for.

Respected Advisor
Posts: 4,659

Re: Multiple Array in SAS

@ballardw, type= option controls the type of confidence interval that is computed, not the trimming. Trimming is always done symmetrically. I had to do a little test to confirm this, as the doc is ambiguous.

PG
Super User
Posts: 10,539

Re: Multiple Array in SAS


PGStats wrote:

@ballardw, type= option controls the type of confidence interval that is computed, not the trimming. Trimming is always done symmetrically. I had to do a little test to confirm this, as the doc is ambiguous.


Thanks for the heads up. I don't use arbitrarily trimmed data, and the documentation got me. I've always been skeptical of trimming, ever since the first stats program I used in class: the professor had also written the program (which we had to buy...), and that professor loved trimming data.
