I'd like to create a new data set that sums the responsetime readings at trial_line 4 and 5. So, in the sample data below, nothing would change for the variables subject and item, but trial_line would go (1, 2, 3, 4, 6, 1, 2, 3, 4, 6) and responsetime would go (395, 409, 398, 766, 401, 343, 343, 343, 679, 409). I think this is fairly simple to do, but I am unsure how to go about it.
data reading2;
input subject $ item $ trial_line responsetime;
lines;
438 5 1 395
438 5 2 409
438 5 3 398
438 5 4 380
438 5 5 386
438 5 6 401
438 6 1 343
438 6 2 343
438 6 3 343
438 6 4 311
438 6 5 368
438 6 6 409
;
There are several approaches to this type of problem. Questions to answer first:
Do you have variables in the data set other than those shown? If so, what should happen to them?
Do you always have a 4 and a 5, or is it possible to have a 4 without a 5? Do you ever have values for trial_line other than 1 through 6?
Is the data actually sorted by subject, item and trial_line?
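One way to answer these questions before picking an approach is to look at the data directly. This is only a sketch: it assumes the data set is named reading2 as in the question, and reading2_sorted is a made-up name for the sorted copy.

```sas
/* Which trial_line values occur, and how often? */
proc freq data=reading2;
  tables trial_line / missing;
run;

/* Does every subject/item combination contain both a 4 and a 5? */
/* A comparison evaluates to 1 or 0, so SUM() counts the matches. */
proc sql;
  select subject, item,
         sum(trial_line = 4) as n_line4,
         sum(trial_line = 5) as n_line5
  from reading2
  group by subject, item;
quit;

/* Sort a copy so that any order-dependent DATA step logic is safe */
proc sort data=reading2 out=reading2_sorted;
  by subject item trial_line;
run;
```

If n_line4 or n_line5 is ever 0, or if the frequency table shows values outside 1 through 6, decide how those cases should be treated before summing.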
Depending on the answers above, there are ways that would use the Retain or Lag and Output statements, but I am fond of using a custom format to create groups when the rule is simple, and then a summary procedure to do the accumulation.
proc format;
   value trial_line
      4,5 = 4
   ;
run;

proc summary data=have nway;
   class subject item trial_line;
   format trial_line trial_line.;
   var responsetime;
   output out=want (drop=_type_ _freq_) sum=;
run;
The format trial_line groups the values 4 and 5 under a single display value of 4, and proc summary (or proc means) uses that formatted grouping when trial_line is a class variable. Class variables define the groups, and the variable(s) to sum (or average, or what have you) go on the var statement. The single statistic sum= says that the variable is summed and the result keeps the same name. If you request multiple statistics, you need to either specify the names of the results or use an option like /autoname to generate names with the statistic abbreviation added. The drop= data set option removes _type_ and _freq_, variables that proc summary adds to the output data set to indicate properties of the summary rows. Nway says to output only the combinations of all the class variables; without it, proc summary can generate other combinations of summaries as well.
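For comparison, here is a rough sketch of the data step route mentioned earlier, using a retained accumulator and an explicit Output. It is an illustration, not the only way to do it: recoded, total and want_ds are names I made up, and it assumes that a 5 should always be folded into 4 within the same subject and item.

```sas
/* Recode 5 to 4 so both readings fall into the same BY group */
data recoded;
  set have;
  if trial_line = 5 then trial_line = 4;
run;

proc sort data=recoded;
  by subject item trial_line;
run;

/* Accumulate responsetime within each subject/item/trial_line group */
data want_ds (drop=responsetime rename=(total=responsetime));
  set recoded;
  by subject item trial_line;
  if first.trial_line then total = 0;  /* reset at the start of each group */
  total + responsetime;                /* sum statement: total is retained automatically */
  if last.trial_line then output;      /* write one row per group */
run;
```

Like the format approach, this also sums any other repeated trial_line values within a subject and item, and a 5 that appears without a 4 ends up labeled as 4.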