proc sql;
create table TEST_STATUS (MEMB_ID num, PAD_FLAG char(1), DECILE num, DOS_YEAR num, PROGRESSION char(6), STATUS char(8));
insert into TEST_STATUS
values(1231, 'N', 8, 2019,'STAYER',' DROPPED')
values(1231, 'N', 8, 2020,'STAYER',' DROPPED')
values(1241, 'Y', 9, 2019,'STAYER',' FOUND ')
values(1241, 'Y', 9, 2020,'STAYER',' DROPPED')
values(1251, 'N', 9, 2019,'STAYER',' FOUND ')
values(1251, 'N', 9, 2020,'STAYER',' FOUND ')
values(1261, 'N', 2, 2019,'STAYER',' DROPPED')
values(1271, 'N', 7, 2019,'STAYER',' DROPPED')
values(1271, 'N', 7, 2020,'STAYER',' DROPPED')
values(1281, 'N', 9, 2019,'STAYER',' DROPPED')
values(1281, 'N', 9, 2020,'STAYER',' DROPPED')
values(1291, 'Y', 1, 2019,'STAYER',' FOUND ')
values(1291, 'Y', 1, 2020,'STAYER',' FOUND ')
values(1211, 'Y', 8, 2019,'STAYER',' NEW ')
values(1211, 'Y', 8, 2020,'STAYER',' DROPPED')
values(1201, 'Y', 9, 2019,'STAYER',' FOUND ')
values(1201, 'Y', 9, 2020,'STAYER',' DROPPED')
values(1301, 'Y', 1, 2019,'STAYER',' FOUND ')
values(1301, 'Y', 1, 2020,'STAYER',' DROPPED')
values(1311, 'Y', 9, 2019,'STAYER',' FOUND ')
values(1311, 'Y', 9, 2020,'STAYER',' DROPPED')
values(1331, 'N', 1, 2019,'STAYER',' DROPPED')
values(1331, 'N', 1, 2020,'STAYER',' DROPPED')
;
run;
/* EXAMPLE OF LOGIC I NEED, this code does not work */
data WANT;
set TEST_STATUS;
if DOS_YEAR = 2019 and STATUS in ('NEW','FOUND') and DOS_YEAR = 2020 and STATUS in ('DROPPED') then TARGET=1; ELSE TARGET=0;
if DOS_YEAR = 2019 and STATUS = 'DROPPED' and DOS_YEAR = 2020 and STATUS = 'DROPPED' then TARGET=1; ELSE TARGET=0;
run;
I'm trying to find a way to weed out non targeted membership. I need to exclude members with a status of 'NEW' or 'FOUND' in 2020. I'm also trying to target members where 'DROPPED' status or 'NEW','FOUND' status was in 2019 and contained a 'DROPPED' status in 2020. When i add multiple conditions it does not work. Any help would be appreciated. It's also importan to note there are scenarios where a member only shows up for 1 year, in this case I need to target where if member status in 2019 = 'DROPPED' I need to target.
These are the results i'm looking for
MEMB_ID | PAD_FLAG | DECILE | DOS_YEAR | PROGRESSION | STATUS | TARGET |
1231 | N | 8 | 2019 | STAYER | DROPPED | 1 |
1231 | N | 8 | 2020 | STAYER | DROPPED | 1 |
1241 | Y | 9 | 2019 | STAYER | FOUND | 1 |
1241 | Y | 9 | 2020 | STAYER | DROPPED | 1 |
1251 | N | 9 | 2019 | STAYER | FOUND | 0 |
1251 | N | 9 | 2020 | STAYER | FOUND | 0 |
1261 | N | 2 | 2019 | STAYER | DROPPED | 1 |
1271 | N | 7 | 2019 | STAYER | DROPPED | 1 |
1271 | N | 7 | 2020 | STAYER | DROPPED | 1 |
1281 | N | 9 | 2019 | STAYER | DROPPED | 1 |
1281 | N | 9 | 2020 | STAYER | DROPPED | 1 |
1291 | Y | 1 | 2019 | STAYER | FOUND | 0 |
1291 | Y | 1 | 2020 | STAYER | FOUND | 0 |
1211 | Y | 8 | 2019 | STAYER | NEW | 1 |
1211 | Y | 8 | 2020 | STAYER | DROPPED | 1 |
1201 | Y | 9 | 2019 | STAYER | FOUND | 1 |
1201 | Y | 9 | 2020 | STAYER | DROPPED | 1 |
1301 | Y | 1 | 2019 | STAYER | FOUND | 1 |
1301 | Y | 1 | 2020 | STAYER | DROPPED | 1 |
1311 | Y | 9 | 2019 | STAYER | FOUND | 1 |
1311 | Y | 9 | 2020 | STAYER | DROPPED | 1 |
1331 | N | 1 | 2019 | STAYER | DROPPED | 1 |
1331 | N | 1 | 2020 | STAYER | DROPPED | 1 |
And the same in datastep in EG
copy ' DROPPED'
from
values(1231,'N
', 8, 2019,'STAYER',' DROPPED')
and paste it in the code
data want;
do _iorc_=1 by 1 until(last.memb_id);
set test_status;
by memb_id notsorted;
if strip(status)=" DROPPED" then _n_=0;
end;
do _iorc_=1 to _iorc_;
set test_status;
Target=not _n_;
output;
end;
run;
HI @bknitch To my mind it seems you are likely after this?
proc sql;
create table want(drop=rn) as
select *,max(strip(status)='DROPPED') as target
from (select *,monotonic() as rn from test_status)
group by memb_id
order by rn;
quit;
HI @novinosrin this produced the same results, is the montonic function standard in SAS EG? I've never used it before in SQL ?
Well the MONOTONIC() was used to get Record number(RN) for the dataset as I thought you probably wanted to the dataset in the order posted in the sample. You can do without record number in SQL by creating the same using a datastep. It's up to you.
The logic though isn't to do with MONOTONIC() /record number, rather to find if any of the record for a MEMB_ID is 'DROPPED'. Should this be true, the entire group has a Target value of 1, else 0. Basically a simple boolean expression. Makes sense?
HI @novinosrin , this makes sense but the target list I feel would not be all =1, as there are scenarios where there are founds or news for both 2019 and 2020. Maybe i'm misunderstanding you. The datastep I've attempted failed multiple times. If you look at my expected output there are scenarios where a target=0 .
@bknitch Yes There are some Target=0's in the results the SQL code that I gave you produced. Did you check?
Hi @novinosrin The results provided in my output are below, they are all 0's. That's why i was initially confused.
Hi again @bknitch Here is my test using the sample you gave me. Also your status value in the sample was truncating because of shorter length, so modified to STATUS char(15));
proc sql;
create table TEST_STATUS (MEMB_ID num, PAD_FLAG char(1), DECILE num, DOS_YEAR num, PROGRESSION char(6), STATUS char(15));
insert into TEST_STATUS
values(1231, 'N', 8, 2019,'STAYER',' DROPPED')
values(1231, 'N', 8, 2020,'STAYER',' DROPPED')
values(1241, 'Y', 9, 2019,'STAYER',' FOUND ')
values(1241, 'Y', 9, 2020,'STAYER',' DROPPED')
values(1251, 'N', 9, 2019,'STAYER',' FOUND ')
values(1251, 'N', 9, 2020,'STAYER',' FOUND ')
values(1261, 'N', 2, 2019,'STAYER',' DROPPED')
values(1271, 'N', 7, 2019,'STAYER',' DROPPED')
values(1271, 'N', 7, 2020,'STAYER',' DROPPED')
values(1281, 'N', 9, 2019,'STAYER',' DROPPED')
values(1281, 'N', 9, 2020,'STAYER',' DROPPED')
values(1291, 'Y', 1, 2019,'STAYER',' FOUND ')
values(1291, 'Y', 1, 2020,'STAYER',' FOUND ')
values(1211, 'Y', 8, 2019,'STAYER',' NEW ')
values(1211, 'Y', 8, 2020,'STAYER',' DROPPED')
values(1201, 'Y', 9, 2019,'STAYER',' FOUND ')
values(1201, 'Y', 9, 2020,'STAYER',' DROPPED')
values(1301, 'Y', 1, 2019,'STAYER',' FOUND ')
values(1301, 'Y', 1, 2020,'STAYER',' DROPPED')
values(1311, 'Y', 9, 2019,'STAYER',' FOUND ')
values(1311, 'Y', 9, 2020,'STAYER',' DROPPED')
values(1331, 'N', 1, 2019,'STAYER',' DROPPED')
values(1331, 'N', 1, 2020,'STAYER',' DROPPED')
;
quit;
proc sql;
create table want(drop=rn) as
select *,max(strip(status)='DROPPED') as target
from (select *,monotonic() as rn from test_status)
group by memb_id
order by rn;
quit;
proc print noobs;run;
RESULTS:
MEMB_ID | PAD_FLAG | DECILE | DOS_YEAR | PROGRESSION | STATUS | target |
---|---|---|---|---|---|---|
1231 | N | 8 | 2019 | STAYER | DROPPED | 1 |
1231 | N | 8 | 2020 | STAYER | DROPPED | 1 |
1241 | Y | 9 | 2019 | STAYER | FOUND | 1 |
1241 | Y | 9 | 2020 | STAYER | DROPPED | 1 |
1251 | N | 9 | 2019 | STAYER | FOUND | 0 |
1251 | N | 9 | 2020 | STAYER | FOUND | 0 |
1261 | N | 2 | 2019 | STAYER | DROPPED | 1 |
1271 | N | 7 | 2019 | STAYER | DROPPED | 1 |
1271 | N | 7 | 2020 | STAYER | DROPPED | 1 |
1281 | N | 9 | 2019 | STAYER | DROPPED | 1 |
1281 | N | 9 | 2020 | STAYER | DROPPED | 1 |
1291 | Y | 1 | 2019 | STAYER | FOUND | 0 |
1291 | Y | 1 | 2020 | STAYER | FOUND | 0 |
1211 | Y | 8 | 2019 | STAYER | NEW | 1 |
1211 | Y | 8 | 2020 | STAYER | DROPPED | 1 |
1201 | Y | 9 | 2019 | STAYER | FOUND | 1 |
1201 | Y | 9 | 2020 | STAYER | DROPPED | 1 |
1301 | Y | 1 | 2019 | STAYER | FOUND | 1 |
1301 | Y | 1 | 2020 | STAYER | DROPPED | 1 |
1311 | Y | 9 | 2019 | STAYER | FOUND | 1 |
1311 | Y | 9 | 2020 | STAYER | DROPPED | 1 |
1331 | N | 1 | 2019 | STAYER | DROPPED | 1 |
1331 | N | 1 | 2020 | STAYER | DROPPED | 1 |
HI @novinosrin ,
I think its an issue with the monotonic function, my SAS EG may not have this function? I'm still getting 0's in my target results. Even running your code.
@bknitch Let's get rid of the monotonic() and do the same in two steps to check if this works
data temp;
set test_status curobs=k;
n=k;
run;
proc sql;
create table want(drop=n) as
select *,max(strip(status)='DROPPED') as target
from temp
group by memb_id
order by n;
quit;
proc print noobs;run;
hi @novinosrin ,
Unfortunately it's still only producing 0's. The curobs added the observational numbers to the new field n, but on the output it's still only giving me 0's. I'm so sorry for the back and forth.
Hi @bknitch You have a point. It's the damn EG. Here is a fix that I did in EG and it works in EG.
What i did was copy and paste the value dropped from the SQL insert at the top the SQL solution below. Try the below
proc sql;
create table TEST_STATUS (MEMB_ID num, PAD_FLAG char(1), DECILE num, DOS_YEAR num, PROGRESSION char(6), STATUS char(15));
insert into TEST_STATUS
values(1231, 'N', 8, 2019,'STAYER',' DROPPED')
values(1231, 'N', 8, 2020,'STAYER',' DROPPED')
values(1241, 'Y', 9, 2019,'STAYER',' FOUND ')
values(1241, 'Y', 9, 2020,'STAYER',' DROPPED')
values(1251, 'N', 9, 2019,'STAYER',' FOUND ')
values(1251, 'N', 9, 2020,'STAYER',' FOUND ')
values(1261, 'N', 2, 2019,'STAYER',' DROPPED')
values(1271, 'N', 7, 2019,'STAYER',' DROPPED')
values(1271, 'N', 7, 2020,'STAYER',' DROPPED')
values(1281, 'N', 9, 2019,'STAYER',' DROPPED')
values(1281, 'N', 9, 2020,'STAYER',' DROPPED')
values(1291, 'Y', 1, 2019,'STAYER',' FOUND ')
values(1291, 'Y', 1, 2020,'STAYER',' FOUND ')
values(1211, 'Y', 8, 2019,'STAYER',' NEW ')
values(1211, 'Y', 8, 2020,'STAYER',' DROPPED')
values(1201, 'Y', 9, 2019,'STAYER',' FOUND ')
values(1201, 'Y', 9, 2020,'STAYER',' DROPPED')
values(1301, 'Y', 1, 2019,'STAYER',' FOUND ')
values(1301, 'Y', 1, 2020,'STAYER',' DROPPED')
values(1311, 'Y', 9, 2019,'STAYER',' FOUND ')
values(1311, 'Y', 9, 2020,'STAYER',' DROPPED')
values(1331, 'N', 1, 2019,'STAYER',' DROPPED')
values(1331, 'N', 1, 2020,'STAYER',' DROPPED')
;
quit;
data temp;
set test_status curobs=k;
n=k;
run;
proc sql;
create table want(drop=n) as
select *,max(strip(status)=' DROPPED') as target
from temp
group by memb_id
order by n;
quit;
proc print noobs;run;
And the same in datastep in EG
copy ' DROPPED'
from
values(1231,'N
', 8, 2019,'STAYER',' DROPPED')
and paste it in the code
data want;
do _iorc_=1 by 1 until(last.memb_id);
set test_status;
by memb_id notsorted;
if strip(status)=" DROPPED" then _n_=0;
end;
do _iorc_=1 to _iorc_;
set test_status;
Target=not _n_;
output;
end;
run;
Nothing to be sorry about. That was fun! Btw, I am one person who has enormous patience to deal with any complexity going back and forth no matter how long it takes. If something I can be proud of my own self , i guess this quality while for everything else I am not the person to know. lol Have a good day!
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.