Hello
I want to add two rows of data and merge into one row. This is a big dataset and I just want to add two specific rows based on an 'If' condition. Something like this;
Have
Course | Spring 2022 | Fall 2022 | Spring 2023 | Fall 2023 |
MAT156 | 1 | 1 | 7 | 559 |
MAT157 | 439 | 529 | 433 | 8 |
Want
Course | Spring 2022 | Fall 2022 | Spring 2023 | Fall 2023 |
MAT158 | 440 | 530 | 440 | 567 |
I am adding MAT156 and MAT157 values for each field and creating a new row and calling it MAT158. I searched the library but couldn't find anything of this sort. Any help is appreciated. Thanks.
IF the numbers for the seasons are to be summed and if you are likely to have many of these sorts of combinations then you may want to learn about custom formats. You can create all sorts of groups using the existing values.
data have; input Course $ Spring2022 Fall2022 Spring2023 Fall2023; cards; MAT156 1 1 7 559 MAT157 439 529 433 8 ; run; proc format; value $coursegrp 'MAT156', 'MAT157' = 'MAT158' ; run; proc summary data=have nway; class course; format course $coursegrp.; var Spring2022 Fall2022 Spring2023 Fall2023; output out=want (drop=_type_ _freq_) sum=; run;
/* or a report*/
Proc report data=have;
columns course Spring2022 Fall2022 Spring2023 Fall2023;
define course/group format=$coursegrp.;
run;
One advantage to a format is that it does not actually change values of the variable(s). So if you make a mistake you don't have to undo anything (which this sort of example would be pretty hard to separate). You can also assign MANY different values to different new displayed values. So one format could address changes in ENG and SCI (or what have you).
A third less obvious advantage to a format is that you can use different different formats at use time. I have about a dozen formats for different age groups such as pre-teen/teen/adult, or 5-year and 10-year age bands or the ages that a program uses for decision points such as a recommended medical procedure.
Data step.
data have;
input Course $ Spring2022 Fall2022 Spring2023 Fall2023;
cards;
MAT156 1 1 7 559
MAT157 439 529 433 8
;
run;
data temp1(keep= course spring: fall:) temp2(keep= course spring: fall:) temp3(keep= _: rename=(_course=course _Spring2022=Spring2022 _Fall2022=Fall2022 _Spring2023=Spring2023 _Fall2023=Fall2023));
set have;
retain temp1-temp4;
if course='MAT156' then do;
temp1=Spring2022;
temp2=Fall2022;
temp3=Spring2023;
temp4=Fall2023;
end;
if course='MAT157' then do;
_Course='MAT158';
_Spring2022= sum(temp1,Spring2022);
_Fall2022= sum(temp2, Fall2022);
_Spring2023= sum(temp3, Spring2023);
_Fall2023= sum(temp4,Fall2023);
end;
drop temp:;
if course='MAT156' then output temp1;
if course='MAT157' then output temp2 temp3;
run;
data want;
set temp1-temp3;
proc print; run;
Thank you for your response.
IF the numbers for the seasons are to be summed and if you are likely to have many of these sorts of combinations then you may want to learn about custom formats. You can create all sorts of groups using the existing values.
data have; input Course $ Spring2022 Fall2022 Spring2023 Fall2023; cards; MAT156 1 1 7 559 MAT157 439 529 433 8 ; run; proc format; value $coursegrp 'MAT156', 'MAT157' = 'MAT158' ; run; proc summary data=have nway; class course; format course $coursegrp.; var Spring2022 Fall2022 Spring2023 Fall2023; output out=want (drop=_type_ _freq_) sum=; run;
/* or a report*/
Proc report data=have;
columns course Spring2022 Fall2022 Spring2023 Fall2023;
define course/group format=$coursegrp.;
run;
One advantage to a format is that it does not actually change values of the variable(s). So if you make a mistake you don't have to undo anything (which this sort of example would be pretty hard to separate). You can also assign MANY different values to different new displayed values. So one format could address changes in ENG and SCI (or what have you).
A third less obvious advantage to a format is that you can use different different formats at use time. I have about a dozen formats for different age groups such as pre-teen/teen/adult, or 5-year and 10-year age bands or the ages that a program uses for decision points such as a recommended medical procedure.
@anandas wrote:
Hello
I want to add two rows of data and merge into one row. This is a big dataset and I just want to add two specific rows based on an 'If' condition. Something like this;
Have
Course Spring 2022 Fall 2022 Spring 2023 Fall 2023 MAT156 1 1 7 559 MAT157 439 529 433 8
Want
Course Spring 2022 Fall 2022 Spring 2023 Fall 2023 MAT158 440 530 440 567
I am adding MAT156 and MAT157 values for each field and creating a new row and calling it MAT158. I searched the library but couldn't find anything of this sort. Any help is appreciated. Thanks.
You haven't stated what IF condition you want to use.
The if condition would be to select MAT156 and MAT157 from a series of other courses. The course column has many more course acronyms and I would nee to change only certain courses and MAT156 and MAT157 is one example of the many where I need an if statement specifying the courses that I want changed.
@anandas wrote:
The if condition would be to select MAT156 and MAT157 from a series of other courses. The course column has many more course acronyms and I would nee to change only certain courses and MAT156 and MAT157 is one example of the many where I need an if statement specifying the courses that I want changed.
data have2;
set have;
weight=0;
if course in ('MAT156','MAT157') then weight=1;
run;
proc summary data=have2;
var spring2022--fall2023;
weight weight;
output out=want sum=;
run;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.