Hi All,
I have to subset a large data set with multiple to parent to child item combination.
Of which I have to only subset the one which should belong to the first combination of the variables only.
Ex. One parent_item and multiple child_items and I just want to dynamically select the first of every combination. I TRIED SORT AND FIRST.VARIABLE AND LAST.VARIABLE BUT IN ALL THOSE CASES THE FIRST COMBINATION APPEARING INORIGINAL DATA SET MAY CHANGE BECAUSE OF SORTING. How can it be achieved.
parent_item | desc | child item | value |
A11233 | harry potter | a11233a | 1200 |
A11233 | harry potter | a11233a | 565 |
A11233 | harry potter | a11233a | 45458 |
A11233 | harry potter | a11233a | 648 |
A11233 | harry potter | a11233a | 9889 |
A11233 | harry potter1 | a11233b | 44 |
A11233 | harry potter1 | a11233b | 44848 |
A11233 | harry potter1 | a11233b | 10165 |
A11233 | harry potter1 | a11233b | 4547 |
A11233 | harry potter2 | a11233b | 7950 |
A11233 | harry potter2 | a11233c | 6608 |
A11233 | harry potter2 | a11233c | 54589 |
A11233 | harry potter2 | a11233c | 445 |
A11233 | harry potter2 | a11233c | 45 |
B11233 | batman | b11233a | 1200 |
B11233 | batman | b11233a | 565 |
B11233 | batman | b11233a | 45458 |
B11233 | batman | b11233a | 648 |
B11233 | batman | b11233a | 9889 |
B11233 | batman1 | b11233b | 44 |
B11233 | batman1 | b11233b | 44848 |
B11233 | batman1 | b11233b | 10165 |
B11233 | batman1 | b11233b | 4547 |
B11233 | batman1 | b11233b | 7950 |
B11233 | batman2 | b11233c | 6608 |
B11233 | batman2 | b11233c | 54589 |
B11233 | batman2 | b11233c | 445 |
B11233 | batman2 | b11233c | 45 |
If you want to be consistent in selecting an observation out of a group, you need to define a rule and force that during the sorting step.
Everything you leave to chance will happen by chance.
IE if you want that the original order to be preserved:
data inter;
set have;
obsno = _N_;
run;
proc sort data=inter;
by parent_item desc child_item obsno;
run;
data want (drop=obsno);
set inter;
by parent_item desc child_item;
if first.child_item;
run;
Here's an approach. If it doesn't do exactly what you want, it's probably close enough that you can adjust it.
data want;
set have;
by parent_item notsorted;
if first.parent_item then first_child = child_item;
retain first_child;
if child_item = first_child;
drop first_child;
run;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Need to connect to databases in SAS Viya? SAS’ David Ghan shows you two methods – via SAS/ACCESS LIBNAME and SAS Data Connector SASLIBS – in this video.
Find more tutorials on the SAS Users YouTube channel.