Hello SAS community!
I love and appreciate all the help from this community, and I'm seeking help with the following. I have a dataset in which individuals submit information during 6 different periods (terms).
1) I want to create a new 24-byte string variable that combines the data from all 6 periods (4 bytes each), perhaps by concatenation. A period may be blank, in which case I would like to fill it in as '9999'. How can I do this?
2) I would also like to identify the individuals who submitted the same receipt during different terms. Examples are the last two individuals below (Term1 and Term5; Term3 and Term6). I would like to create a separate dataset of those individuals for further analysis.
Can anyone help me? Thank you so much!!
ID1 | Term1 | Term2 | Term3 | Term4 | Term5 | Term6 |
a_78076 | 7600 | | 9875 | 4388 | 3709 | |
er_866455 | 5436 | 9468 | 4009 | 4388 | 7600 | 3288 |
ik_33442 | 9865 | 4009 | 6783 | | 9865 | 3288 |
ev_965747 | 5436 | 9468 | 3288 | 4388 | 7600 | 3288 |
Wanted output, plus a separate dataset for the last two individuals, who submitted the same receipt in more than one term:
a_78076 | 760099999875438837099999 |
er_866455 | 543694684009438876003288 |
ik_33442 | 986540096783999998653288 |
ev_965747 | 543694683288438876003288 |
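data want;
  set have;
  length long_string $ 24;
  /* Term1-Term6 are assumed to be character variables */
  array term $ term1-term6;
  do i=1 to dim(term);
    /* replace a blank period with the filler value */
    if missing(term(i)) then term(i)='9999';
  end;
  /* CATS strips blanks and concatenates the six 4-byte values */
  long_string = cats(of term1-term6);
  drop term1-term6 i;
run;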
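To try the step above, a small test dataset can be built from the table in the question. This is only a sketch: it assumes Term1-Term6 are character variables of length 4 and that ID1 fits in 12 characters, neither of which is stated in the original post.

```sas
/* Sample data from the question; lengths are assumptions */
data have;
  /* DSD treats two consecutive delimiters as a missing value */
  infile datalines dlm='|' dsd truncover;
  length ID1 $ 12 term1-term6 $ 4;
  input ID1 term1-term6;
datalines;
a_78076|7600||9875|4388|3709|
er_866455|5436|9468|4009|4388|7600|3288
ik_33442|9865|4009|6783||9865|3288
ev_965747|5436|9468|3288|4388|7600|3288
;
```

Reading with DSD keeps the blank periods in their correct columns, which matters because the position of each term inside the 24-byte string depends on it.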
Next, duplicate receipts:
data dup_receipts;
  set have;
  array term $ term1-term6;
  length long_string $ 24;
  flag=0;
  do i=1 to dim(term);
    if missing(term(i)) then term(i)='9999';
    /* compare term i with every later term */
    if i<dim(term) then do j=(i+1) to dim(term);
      if term(i)=term(j) and term(i)^='9999' then do;
        flag=1;
        leave; /* stop at the first match, not after the first comparison */
      end;
    end;
  end;
  long_string = cats(of term1-term6);
  /* keep only individuals with a repeated receipt */
  if flag=1 then output;
  drop term1-term6 i j;
run;
Which brings up the question, why do you need a string of 24 digits?
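If the 24-character string is not strictly required, one alternative is to reshape the data to one row per individual and term, where repeated receipts can be found with a GROUP BY. The sketch below assumes Term1-Term6 are character variables in `have`; the dataset names `long` and `dup_receipts_long` and the variables `term_no`, `receipt`, and `n_terms` are made up for illustration.

```sas
/* One row per ID and non-blank term */
data long;
  set have;
  array t {*} $ term1-term6;      /* assumes character terms */
  length receipt $ 4;
  do term_no = 1 to dim(t);
    if not missing(t(term_no)) then do;
      receipt = t(term_no);
      output;                     /* blank periods are skipped */
    end;
  end;
  keep ID1 term_no receipt;
run;

/* Receipts that appear in more than one term for the same ID */
proc sql;
  create table dup_receipts_long as
  select ID1, receipt, count(*) as n_terms
  from long
  group by ID1, receipt
  having count(*) > 1;
quit;
```

For the four sample rows in the question, this should return receipt 9865 for ik_33442 and receipt 3288 for ev_965747. Because blank periods never reach the long table, no '9999' filler is needed for the duplicate check.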
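You may try the following code:
data want;
  set have;
  length concat $ 24;
  keep ID1 concat;
  /* this version assumes Term1-Term6 are numeric */
  array tm {6} term1-term6;
  do i=1 to dim(tm);
    /* reference the array element, not the array name */
    if missing(tm(i)) then tm(i)=9999;
  end;
  /* CATS strips the blanks left by numeric-to-character conversion */
  concat = cats(of term:);
run;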
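One caveat if Term1-Term6 are numeric, as the code above assumes: a value with fewer than four digits (for example 42) would contribute only two characters to the concatenation, so later terms would no longer sit at fixed 4-byte positions. A possible workaround, sketched here with a hypothetical output dataset `want_fixed`, is to write each term through the Z4. format into its own slot:

```sas
/* Fixed-width version: each term occupies exactly 4 bytes */
data want_fixed;
  set have;
  length concat $ 24;
  array tm {6} term1-term6;       /* assumes numeric terms */
  do i = 1 to dim(tm);
    /* COALESCE supplies 9999 for a missing term; Z4. pads with leading zeros */
    substr(concat, 4*i - 3, 4) = put(coalesce(tm(i), 9999), z4.);
  end;
  keep ID1 concat;
run;
```

With four-digit receipts throughout, as in the sample data, this gives the same string as plain concatenation; the difference only shows up for shorter values.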