SAS Programming

Koke · Posted 07-27-2017 01:00 PM

Hello expert

I want to trim my data by eliminate highest and lowest (G1-G13) for each observation from my data set .

data grades;
infile '/folders/myshortcuts/SASUniversityEdition/module 7/grades.txt';
input id 4. @6 G1 1. G2 1. G3 1. 
G4 1. G5 1. G6 1. G7 1. G8 1. 
G9 1. G10 1. G11 1. G12 1. G13 1.
;
total = G1+G2+G3+G4+G5+G6+G7
+G8+G9+G10+G11+G12+G13;
proc
print
data = grades;

here is my output

Obs id G1 G2 G3 G4 G5 G6 G7 G8 G9 G10 G11 G12 G13 total 1 2 3

1105	8	7	5	8	3	7	7	8	8	8	8	8	6	91
1294	8	8	8	7	5	7	8	8	8	8	5	8	2	90
2009	8	6	3	8	4	6	7	7	8	7	0	4	7	75

Astounding · Posted 07-27-2017 03:48 PM

It's not clear what "trim" your data means. You can't get rid of variables, because G1 might contain the largest value on one observation, but a middling value on another observation.

GIven what you wrote so far, you could conceivably use:

trimmed_total = total - ( max(of g1-g13) + min(of g1-g13) ) ;

View solution in original post

Reeza · Posted 07-27-2017 01:05 PM

What do you want as output?

HB · Posted 07-27-2017 02:43 PM

I think you have an unfortunate nonnormalized data structure that makes the task harder.

If I take

data grades;
   input id grade1 grade2 grade3 grade4;
datalines;
1105 1 2 3 4
1106 2 2 2 2
1107 3 3 4 6
;
run;

and flip it to a normalized structure like

proc sql;
    create table grades_normalized as 
    select id as id, 'assignment1' as assignment, grade1 as grade
    from grades
    union
    select id, 'assignment2', grade2
    from grades
    union
    select id, 'assignment3', grade3
    from grades
    union
    select id, 'assignment4', grade4
    from grades;
quit;

I get a table that looks like

                              Obs     id     assignment     grade

                                1    1105    assignment1      1
                                2    1105    assignment2      2
                                3    1105    assignment3      3
                                4    1105    assignment4      4
                                5    1106    assignment1      2
                                6    1106    assignment2      2
                                7    1106    assignment3      2
                                8    1106    assignment4      2
                                9    1107    assignment1      3
                               10    1107    assignment2      3
                               11    1107    assignment3      4
                               12    1107    assignment4      6

With that structure, I can sort (not needed for this data, but you never know) and trim.

proc sort data= grades_normalized;
 	by id grade;
run;

data trimmed_grades;
	set grades_normalized;
	by id;
	if first.id then delete;
	if last.id then delete;
run;

then summarize

proc sql;
	create table grade_report as 
	select id as id, sum(grade) as grade_total
	from trimmed_grades
	group by id
	order by id;
quit;

and end up with

                                                     grade_
                                     Obs     id      total

                                      1     1105       5
                                      2     1106       4
                                      3     1107       7

Now one of the experts around here will post a solution that does this all in the data step or with a macro or something but they haven't replied so you are stuck with me.

EDIT:

Did I not say somebody better than me would come along and do it in the data step?

data grades;
input id 4. g1-g13;
datalines;
1105 8 7 5 8 3 7 7 8 8 8 8 8 6 
1294 8 8 8 7 5 7 8 8 8 8 5 8 2 
2009 8 6 3 8 4 6 7 7 8 7 0 4 7 
;
run;

data want;
   set grades;
   total = sum( of g1-g13);
   trim_total = total - max(of g1-g13) - min(of g1-g13);
run;

ballardw · Posted 07-27-2017 06:38 PM

Simpler to read the data differently perhaps to get the normalized structure:

data grades;
   input id 4. @ ;
   do assignment=1 to 13;
      input grade @;
      output;
   end;
   input;
datalines;
1105 8 7 5 8 3 7 7 8 8 8 8 8 6 91 
1294 8 8 8 7 5 7 8 8 8 8 5 8 2 90 
2009 8 6 3 8 4 6 7 7 8 7 0 4 7 75 
;
run;

which ignores the "total" in the example data provided by OP.

ballardw · Posted 07-27-2017 03:41 PM

It really isn't obvious what your desired output is.

If the resulting order of the G variables is not critical here's a possible start;

data grades;
input id 4. g1-g13;
datalines;
1105 8 7 5 8 3 7 7 8 8 8 8 8 6 91 
1294 8 8 8 7 5 7 8 8 8 8 5 8 2 90 
2009 8 6 3 8 4 6 7 7 8 7 0 4 7 75 
;
run;

data want;
   set grades;
   array g g1-g13;
   call sortn(of g(*));
   total = sum( of g2-g12);
run;

Note that I have posted a way to duplicate your data (at least your example output) since we do not have your file. Note that your data step as written may yield surprising results if your data contains values like 10 or 100.

The second data step, which could actually be done in the step that reads the data, sorts the values of G1 - G13.

If you want a trimmed total that is what I show. If you want to actually remove the the largest and smallest values the you could drop the variables G1 and G13.

However this approach may only work if all of G1 to G13 exist.

Reeza · Posted 07-27-2017 03:48 PM

And how do you want to handle ties? If there are 4 8's are you removing all?

I would use LARGEST/SMALLEST to remove the values not needed but without what you want as output and what you have as input we can't really connect all the dots yet.

Astounding · Posted 07-27-2017 03:48 PM

It's not clear what "trim" your data means. You can't get rid of variables, because G1 might contain the largest value on one observation, but a middling value on another observation.

GIven what you wrote so far, you could conceivably use:

trimmed_total = total - ( max(of g1-g13) + min(of g1-g13) ) ;

SAS Programming

trimmed mean

Re: trimmed mean

Re: trimmed mean

Re: trimmed mean

Re: trimmed mean

Re: trimmed mean

Re: trimmed mean

Re: trimmed mean

Prevent input to trim values

Re: Difference between strip, compress and trim

proc sql and trim

proc sql trim

trim function

Follow Us

What is...

SAS Programming

Our biggest data and AI event of the year.

SAS Training: Just a Click Away

Follow Us

What is...