Hello,
I am trying to pick up the specified data from the HAVE dataset.
By using the HAVE table as an example,
Obs | HRM_L2 | PERSON_CTRY_CODE | Type | GEOGRAPHIC_DESCR_ | degree |
1 | A | 0 | |||
2 | A | CN | 0 | ||
3 | A | TW | 31192X | TW | 9 |
4 | B | 3 | |||
5 | B | FR | 0 | ||
6 | B | IT | 0 | ||
7 | B | KR | 745666 | KR | 8 |
8 | B | US | 745666 | US | 9 |
9 | C | 745666 | TH | 3 | |
10 | C | GB | 0 | ||
11 | C | US | 0 |
1.For each 'HRM_L2' group, if one of the degrees is larger or equal than 6 (i.e.,6,7,8,9) then I would like to have that observation. and has a note as 1
for example, for 'A' group, I expect to only keep the 'Obs' 3 as the degree is 9.
2.in the 'B' group, both 'Obs' 7 and 'Obs' 8 are larger than 6 then I expect to keep both of them, and have a note as 2.
3. if all of degree is less than 6 in the 'HRM_L2' group, then I expect to keep all of them and with a note 3.
the result is expected to look like below.
Obs | HRM_L2 | PERSON_CTRY_CODE | Type | GEOGRAPHIC_DESCR_ | degree | Note |
3 | A | TW | 31192X | TW | 9 | 1 |
7 | B | KR | 745666 | KR | 8 | 2 |
8 | B | US | 745666 | US | 9 | 2 |
9 | C | 745666 | TH | 3 | 3 | |
10 | C | GB | 0 | 3 | ||
11 | C | US | 0 | 3 |
Could you please give me some suggestion about this?
thanks in advance
data have ;
infile cards dsd dlm=",";
input HRM_L2 $ PERSON_CTRY_CODE $ Type $ GEOGRAPHIC_DESCR_ $ degree ;
cards ;
A,,,,0
A,CN,,,0
A,TW,31192X,TW,9
B,,,,3
B,FR,,,0
B,IT,,,0
B,KR,745666,KR,8
B,US,745666,US,9
C,,745666,TH,3
C,GB,,,0
C,US,,,0
;
run;
.
proc sql;
create table want as
select *,
(case when (sum(degree lt 6) = n(HRM_L2)) then n(HRM_L2)
else sum(degree ge 6)
end) as note
from have
group by HRM_L2
having sum(degree ge 6)=0 | degree ge 6;
quit;
Result:
HRM_L2 PERSON_CTRY_CODE Type GEOGRAPHIC_DESCR_ degree note A TW 31192X TW 9 1 B KR 745666 KR 8 2 B US 745666 US 9 2 C US 0 3 C GB 0 3 C 745666 TH 3 3
data want;
set have;
if hrm_l2 = "A" then do;
if degree >= 6 then note = 1;
end;
else if hrm_l2 = "B" then do;
if degree >= 6 then note = 2;
end;
else do;
if degree < 6 then note = 3;
end;
if note > 0;
run;
proc sql;
create table want as
select *,
(case when (sum(degree lt 6) = n(HRM_L2)) then n(HRM_L2)
else sum(degree ge 6)
end) as note
from have
group by HRM_L2
having sum(degree ge 6)=0 | degree ge 6;
quit;
Result:
HRM_L2 PERSON_CTRY_CODE Type GEOGRAPHIC_DESCR_ degree note A TW 31192X TW 9 1 B KR 745666 KR 8 2 B US 745666 US 9 2 C US 0 3 C GB 0 3 C 745666 TH 3 3
Hi @Alexxxxxxx I'd think you need a double pass, 1st to identify and 2nd to select
data have ;
infile cards dsd dlm=",";
input HRM_L2 $ PERSON_CTRY_CODE $ Type $ GEOGRAPHIC_DESCR_ $ degree ;
cards ;
A,,,,0
A,CN,,,0
A,TW,31192X,TW,9
B,,,,3
B,FR,,,0
B,IT,,,0
B,KR,745666,KR,8
B,US,745666,US,9
C,,745666,TH,3
C,GB,,,0
C,US,,,0
;
run;
data want;
do _n_=1 by 1 until(last.HRM_L2);
set have;
by HRM_L2;
_k=max(degree,_k);
end;
_f=_k<6;
note+1;
do _n_=1 to _n_;
set have;
if _f or degree>=6 then output;
end;
drop _:;
run;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.