I have two datasets. One old and one new...if there are new increase in numbers for selected columns (i.e. Enrolled and Events), I would like to display "(+<Number>).
old export: Dataset_a
Group | Screened | Enrolled | Events |
ABC | 40 | 21 | 3 |
DEF | 24 | 21 | 4 |
GHI | 55 | 41 | 5 |
JKL | 24 | 2 | 6 |
MNO | 42 | 23 | 8 |
new export: Dataset_b
Group | Screened | Enrolled | Events |
ABC | 89 | 33 | 13 |
DEF | 56 | 43 | 4 |
GHI | 88 | 45 | 22 |
JKL | 67 | 23 | 34 |
MNO | 68 | 23 | 23 |
Want:
Group | Screened | Enrolled | Events |
ABC | 89 | 33 (+12) | 13 (+10) |
DEF | 56 | 43 (+22) | 4 |
GHI | 88 | 45 (+4) | 22 (+17) |
JKL | 67 | 23 (+21) | 34 (+28) |
MNO | 68 | 23 | 23 (+15) |
I have merged two datasets and used subtraction with concentrate character so was wondering if this is quick solution
Show us, code, how you "merged" the two data sets.
And why are these named "export" if that has any bearing on the actual problem.
A general comment: having two values in a single variable such as that 33 (+12) almost always means extra work for very little gain over having two variables, such as Enrolled with a value of 33 and EnrollmentChange or similar name with a value of 12.
Please provide example data in the form of data step code.
This is what I would do if this were my project:
data dataset_a; input Group $ Screened Enrolled Events; datalines; ABC 40 21 3 DEF 24 21 4 GHI 55 41 5 JKL 24 2 6 MNO 42 23 8 ; data dataset_b; input Group $ Screened Enrolled Events; datalines; ABC 89 33 13 DEF 56 43 4 GHI 88 45 22 JKL 67 23 34 MNO 68 23 23 ; /* sort both sets by Group variable if not in order*/ data want; merge dataset_a (rename=(enrolled=enr1 events=event1) ) dataset_b ; by group; enrollchange= enrolled-enr1; eventchange = events - event1; drop enr1 event1; run;
Why? Because someone may want statistics on the changes such a maximim, minimum, mean or similar that can't be done with values that would have to be character such as 33 (+12). Just finding which of groups had the largest/smallest change with that character value would be an exercise in parsing the data back to a change value to compare them.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.