I need to know how to check and output data where observation 1=observation 2=observation 3 for variable X by variable Y (IE variable Y has customer IDs, so I want to check whether the codes in variable X equal each other. Some customers have from 2 to 5 observations.)
so for example:
cust ID code
1 5
1 6
1 10
2 3
2 3
2 3
SO i would want cust ID 1 outputted to one particular data set and cust ID 2 outputted to a different data set.
Is this something I would need an array for?
Hi,
Are you trying to subset your data set by 'ID's?
Haikuo
I want all the data in which the codes are equal to be subsetting into a dataset and all the codes that dont match to go into a different dataset. I hope that makes sense. I do not want each ID to have its own dataset, but I am examining the codes by ID.
Easy to do with PROC SQL code.
proc sql noprint;
create table one as select * from have where count(distinct code) = 1 group by custid;
create table two as select * from have where count(distinct code) > 1 group by custid;
quit;
If you want to do it with a data step here is a way do it using a couple of do loops to check if the given id value has multiple code values and then output the records to right database. If you only want the id values output then you can eliminate the second do loop.
data one two ;
do until (last.custid);
set have;
by custid ;
if not first.custid and _previous ne code then different=1;
_previous=code;
end;
do until (last.custid);
set have ;
by custid;
if different=1 then output two;
else output one;
end;
run;
Hmm I do get what you are saying for the SQL step,
proc sql noprint;
create table team3.testing1 as select * from team3.multi_reward
where count(distinct reward_r) = 1 group by cust_id;
/*create table two as select * from have where count(distinct code) > 1 group by custid;*/
quit;
When I put that in, it says
ERROR: Summary functions are restricted to the SELECT and HAVING clauses only.
What am I doing wrong?
That's what I get for posting untested code.
create table team3.testing1 as
select *
from team3.multi_reward
group by cust_id
having count(distinct reward_r) = 1
;
Haha I feel dumb now Thanks for the help.
data have ;
input custID code ; cards ;
1 5
1 6
1 10
2 3
2 3
2 3
;
proc sql ;
create view all as
select range(code) as not_same , *
from have
group by custID
order by custID , code ;
quit ;
data same not_same ;
set all ;
if not_same then output not_same ;
else output same ;
drop not_same ;
run ;
The more, the merrier: (original idea was first seen from Mike zdeb's post, I think)
data have ;
input custID code ; cards ;
1 5
1 6
1 10
2 3
2 3
2 3
;
data same notsame;
set have (in=up) have;
by custid;
retain _t;
if up and first.custid then do; call missing(_t,_f);_t=code;end;
if code ne _t then _f+1;
if not up then do;
if _f ge 1 then output notsame;
else output same;
end;
drop _:;
run;
Haikuo
Available on demand!
Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.