Hello SAS community!
I love and appreciate all the help from this community, and I'm seeking help with the following. I have a database containing individuals who submitted information during 6 different periods.
1) I am looking to create a new string variable (24 bytes) that combines the data from all 6 periods (4 bytes each), perhaps by concatenation. However, a period may be blank, and I would like to fill those blanks in as '9999'. How can I do this?
2) I would also like to identify the individuals who submitted the same receipt during different terms. Examples are the last two individuals (Term1 and Term5 for one, Term3 and Term6 for the other). I would like to create a separate dataset of those individuals for further analysis.
Can anyone help me? Thank you so much!!
ID1 | Term1 | Term2 | Term3 | Term4 | Term5 | Term6 |
a_78076 | 7600 | | 9875 | 4388 | 3709 | |
er_866455 | 5436 | 9468 | 4009 | 4388 | 7600 | 3288 |
ik_33442 | 9865 | 4009 | 6783 | | 9865 | 3288 |
ev_965747 | 5436 | 9468 | 3288 | 4388 | 7600 | 3288 |
Desired output, plus a separate dataset containing the last two individuals, who submitted the same receipts in different terms:
a_78076 | 760099999875438837099999 |
er_866455 | 543694684009438876003288 |
ik_33442 | 986540096783999998653288 |
ev_965747 | 543694683288438876003288 |
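data want;
  set have;
  length long_string $ 24;
  /* term1-term6 are treated as character variables here */
  array term $ term1-term6;
  do i=1 to dim(term);
    /* fill blank periods with the placeholder */
    if missing(term(i)) then term(i)='9999';
  end;
  /* CATS strips blanks and concatenates the six 4-byte values */
  long_string = cats(of term1-term6);
  drop term1-term6 i;
run;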
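To try the step above, the sample table from the question can be read in with a DATA step. This is only a sketch: it assumes Term1-Term6 are character variables of length 4 (the question does not state the type), the length of 12 for ID1 is a guess, and the positions of the blank terms are inferred from the desired output in the question.

```sas
/* Sample data from the question.                                  */
/* ASSUMPTIONS: Term1-Term6 are character ($4); ID1 length is a    */
/* guess; blank positions are taken from the desired output.       */
data have;
  infile datalines dsd dlm='|' truncover;
  length ID1 $ 12 Term1-Term6 $ 4;
  input ID1 Term1-Term6;
  datalines;
a_78076|7600||9875|4388|3709|
er_866455|5436|9468|4009|4388|7600|3288
ik_33442|9865|4009|6783||9865|3288
ev_965747|5436|9468|3288|4388|7600|3288
;
```

The DSD option makes two consecutive delimiters read as a missing value, which is what keeps the blank terms in the right columns.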
Next, duplicate receipts:
data dup_receipts;
  set have;
  array term $ term1-term6;
  length long_string $ 24;
  flag=0;
  do i=1 to dim(term);
    if missing(term(i)) then term(i)='9999';
    /* compare term i with every later term; the '9999' filler never counts as a duplicate */
    if i<dim(term) then do j=(i+1) to dim(term);
      if term(i)=term(j) and term(i)^='9999' then do;
        flag=1;
        leave; /* stop scanning only once a match is found */
      end;
    end;
  end;
  long_string = cats(of term1-term6);
  if flag=1 then output;
  drop term1-term6 i j;
run;
Which brings up the question, why do you need a string of 24 digits?
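If the 24-byte string is not strictly required, the duplicate check may be easier on a long layout with one row per individual and term. The following is a minimal sketch, not a definitive solution: the names long, dups, term_no, receipt and n_terms are made up for illustration, and it assumes term1-term6 are character variables in the dataset have.

```sas
/* Reshape to one row per ID and non-blank term.                  */
/* Dataset and variable names here are illustrative only.         */
data long;
  set have;
  array term {*} $ term1-term6;
  length receipt $ 4;
  do term_no = 1 to dim(term);
    receipt = term{term_no};
    if not missing(receipt) then output; /* skip blank periods */
  end;
  keep ID1 term_no receipt;
run;

/* Individuals who submitted the same receipt in more than one term */
proc sql;
  create table dups as
  select ID1, receipt, count(*) as n_terms
  from long
  group by ID1, receipt
  having count(*) > 1;
quit;
```

Because blank periods are dropped before grouping, no '9999' placeholder is needed for this check, and the result also shows which receipt was repeated.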
You may try the following code:
data want;
  set have;
  keep ID1 concat;
  length concat $ 24;
  /* this version assumes term1-term6 are numeric */
  array tm {6} term1-term6;
  do i=1 to dim(tm);
    if missing(tm{i}) then tm{i}=9999;
  end;
  concat = cats(of term1-term6);
run;