Help using Base SAS procedures

flagging dups on multiple vars

Reply
N/A
Posts: 0

flagging dups on multiple vars

Hello,
I'm in the early days of converting from SPSS to SAS.

My file has the first four columns. I want to add the last two columns--LAST and COUNT.

LAST indicates if a record is unique based on id, v2, and flag.
COUNT would be a cumulative count based on id and v2 ordered by date where 0 means unique record.

id v2 flag date LAST COUNT
1, a, 0, 1/1/2008, 1, 1
1, a, 1, 1/7/2009, 1, 2
1, b, 1, 1/1/2009, 1, 0
1, c, 0, 1/1/2007, 0, 1
1, c, 0, 1/1/2008, 1, 2
1, c, 1, 1/1/2009, 1, 3
1, d, 0, 6/1/2007, 1, 0

I'm totally stumped on how to create last. I think I've figured out a really inelegant way to get the count var, but an easier way would be appreciated.

Thanks for any help!

Dana
Super Contributor
Super Contributor
Posts: 3,174

Re: flagging dups on multiple vars

Posted in reply to deleted_user
The SAS DATA step and using FIRST. and LAST. against your sorted file, and specifying the required BY statement with your variable list. The SAS support http://support.sas.com/ website has DOC and supplemental technical papers.

Scott Barry
SBBWorks, Inc.

How the DATA Step Identifies BY Groups
http://support.sas.com/documentation/cdl/en/lrcon/61722/HTML/default/a000761931.htm
N/A
Posts: 0

Re: flagging dups on multiple vars

Posted in reply to deleted_user
Hello lockwoo,

Could this help?

yoba

data T01_input;
length v2 $1;
infile cards delimiter=',' firstobs=2;
input id v2 flag date last count;
informat date ddmmyy10.; format date date9.;
cards;
id, v2, flag, date, LAST, COUNT
1, a, 0, 1/1/2008, 1, 1
1, a, 1, 1/7/2009, 1, 2
1, b, 1, 1/1/2009, 1, 0
1, c, 0, 1/1/2007, 0, 1
1, c, 0, 1/1/2008, 1, 2
1, c, 1, 1/1/2009, 1, 3
1, d, 0, 6/1/2007, 1, 0
;
run;

proc sort data=T01_input;
by id v2 flag;
run;

data T02_temp;
set T01_input;
by id v2 flag;
if first.flag=last.flag=1 then LAST2=1; else LAST2=0;
run;

proc sort data=T02_temp;
by id v2 date;
run;

data T03_output;
set T02_temp;
by id v2 date;
if first.V2 then COUNT2=0;
if not (first.V2=last.V2=1) then COUNT2+1;
run;

proc print data=T03_output;
run;
N/A
Posts: 0

Re: flagging dups on multiple vars

Posted in reply to deleted_user
Terrific! I knew it had to be easier than I was making it.
Thank you thank you

Dana
Ask a Question
Discussion stats
  • 3 replies
  • 125 views
  • 0 likes
  • 2 in conversation