Hi All,
I need a help with a creation of a variable. I have the following information.
Company Date IC ACCT ID
A 01/01/2006 YES 100
A 01/05/2006 YES 100
A 01/10/2006 100
B 07/15/2006 YES 110
B 07/21/2006 YES 110
B 07/25/2006 110
B 07/30/2006 110
C 10/01/2006 115
C 10/21/2006 115
IC and ACCT are character variables. I want a dataset where for first.ID if IC = "YES" or ACCT ="YES" then select all the observations for that ID. In other words, I want my output to look as follows:
Company Date IC ACCT ID
A 01/01/2006 YES 100
A 01/05/2006 YES 100
A 01/10/2006 100
B 07/15/2006 YES 110
B 07/21/2006 YES 110
B 07/25/2006 110
B 07/30/2006 110
Can someone please provide me with the code?
Thank you in Advance
presuming your data is sorted by id
data want;
set have;
by id;
retain output_YN='N';
if first.id and (ic='YES' or ACCT='YES' then Output_YN='Y';
else if first.id then output_yn='N';
if Output_YN+'Y' then output;
run;
Data step maybe more efficient, but syntax wise, Proc SQL is simpler:
data have;
input (Company Date IC ACCT ID) (:$20.);
cards;
A 01/01/2006 YES . 100
A 01/05/2006 YES . 100
A 01/10/2006 . . 100
B 07/15/2006 . YES 110
B 07/21/2006 . YES 110
B 07/25/2006 . . 110
B 07/30/2006 . . 110
C 10/01/2006 . . 115
C 10/21/2006 . . 115
;
proc sql;
create table want as
select * from have
group by company
having sum(ic="YES")>0 or sum(acct='YES')>0
;
quit;
UPdate: Please ignore this reply. it is faulty. The one from DBailey is the correct one.
Haikuo
presuming your data is sorted by id
data want;
set have;
by id;
retain output_YN='N';
if first.id and (ic='YES' or ACCT='YES' then Output_YN='Y';
else if first.id then output_yn='N';
if Output_YN+'Y' then output;
run;
Borrowed Haikuo's data:
data have;
input (Company Date IC ACCT ID) (:$20.);
cards;
A 01/01/2006 YES . 100
A 01/05/2006 YES . 100
A 01/10/2006 . . 100
B 07/15/2006 . YES 110
B 07/21/2006 . YES 110
B 07/25/2006 . . 110
B 07/30/2006 . . 110
C 10/01/2006 . . 115
C 10/21/2006 . . 115
;
data want;
set have;
by company notsorted;
if first.company then group=0;
group+(ic='YES' or acct='YES') ;
if group>0;
proc print;run;
Hey there,
I have an SQL approach if you're so inclined:
data data_have;
infile datalines dsd dlm="," missover;
input Company : $char1.
Date : MMDDYY10.
IC : $char4.
ACCT : $char4.
ID : 8.
;
datalines;
A,01/01/2006,YES,,100
A,01/05/2006,YES,,100
A,01/10/2006,,,100
B,07/15/2006,,YES,110
B,07/21/2006,,YES,110
B,07/25/2006,,,110
B,07/30/2006,,,110
C,10/01/2006,,,115
C,10/21/2006,,,115
;
run;
proc sql;
create table data_want as
select a.*
from data_have as a,
(select distinct Company
from data_have
where IC='YES' or ACCT='YES') as tmp
where a.Company = tmp.Company;
quit;
Thanks,
Huey
Don't miss out on SAS Innovate - Register now for the FREE Livestream!
Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.