Hi All,
I have to subset a large data set with multiple to parent to child item combination.
Of which I have to only subset the one which should belong to the first combination of the variables only.
Ex. One parent_item and multiple child_items and I just want to dynamically select the first of every combination. I TRIED SORT AND FIRST.VARIABLE AND LAST.VARIABLE BUT IN ALL THOSE CASES THE FIRST COMBINATION APPEARING INORIGINAL DATA SET MAY CHANGE BECAUSE OF SORTING. How can it be achieved.
parent_item | desc | child item | value |
A11233 | harry potter | a11233a | 1200 |
A11233 | harry potter | a11233a | 565 |
A11233 | harry potter | a11233a | 45458 |
A11233 | harry potter | a11233a | 648 |
A11233 | harry potter | a11233a | 9889 |
A11233 | harry potter1 | a11233b | 44 |
A11233 | harry potter1 | a11233b | 44848 |
A11233 | harry potter1 | a11233b | 10165 |
A11233 | harry potter1 | a11233b | 4547 |
A11233 | harry potter2 | a11233b | 7950 |
A11233 | harry potter2 | a11233c | 6608 |
A11233 | harry potter2 | a11233c | 54589 |
A11233 | harry potter2 | a11233c | 445 |
A11233 | harry potter2 | a11233c | 45 |
B11233 | batman | b11233a | 1200 |
B11233 | batman | b11233a | 565 |
B11233 | batman | b11233a | 45458 |
B11233 | batman | b11233a | 648 |
B11233 | batman | b11233a | 9889 |
B11233 | batman1 | b11233b | 44 |
B11233 | batman1 | b11233b | 44848 |
B11233 | batman1 | b11233b | 10165 |
B11233 | batman1 | b11233b | 4547 |
B11233 | batman1 | b11233b | 7950 |
B11233 | batman2 | b11233c | 6608 |
B11233 | batman2 | b11233c | 54589 |
B11233 | batman2 | b11233c | 445 |
B11233 | batman2 | b11233c | 45 |
If you want to be consistent in selecting an observation out of a group, you need to define a rule and force that during the sorting step.
Everything you leave to chance will happen by chance.
IE if you want that the original order to be preserved:
data inter;
set have;
obsno = _N_;
run;
proc sort data=inter;
by parent_item desc child_item obsno;
run;
data want (drop=obsno);
set inter;
by parent_item desc child_item;
if first.child_item;
run;
Here's an approach. If it doesn't do exactly what you want, it's probably close enough that you can adjust it.
data want;
set have;
by parent_item notsorted;
if first.parent_item then first_child = child_item;
retain first_child;
if child_item = first_child;
drop first_child;
run;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Need to connect to databases in SAS Viya? SAS’ David Ghan shows you two methods – via SAS/ACCESS LIBNAME and SAS Data Connector SASLIBS – in this video.
Find more tutorials on the SAS Users YouTube channel.