Hello SAS community!
I love and appreciate all the help from this community, and I'm seeking help with the following. I have a dataset in which individuals submit information during 6 different periods.
1) I would like to create a new string variable (24 bytes) that combines the data from all 6 periods (4 bytes each), perhaps by concatenation. A period may be blank, in which case I would like to fill it in as '9999'. How can I do this?
2) I would also like to identify the individuals who submit the same receipt during different terms. Examples are the last two individuals below (Term1 and Term5 for one, Term3 and Term6 for the other). I would like to create a separate dataset of those individuals for further analysis.
Can anyone help me? Thank you so much!!
ID1 | Term1 | Term2 | Term3 | Term4 | Term5 | Term6 |
a_78076 | 7600 | | 9875 | 4388 | 3709 | |
er_866455 | 5436 | 9468 | 4009 | 4388 | 7600 | 3288 |
ik_33442 | 9865 | 4009 | 6783 | | 9865 | 3288 |
ev_965747 | 5436 | 9468 | 3288 | 4388 | 7600 | 3288 |
Wanted output, plus a separate dataset for the last two individuals, who submitted the same receipt in more than one term:
a_78076 | 760099999875438837099999 |
er_866455 | 543694684009438876003288 |
ik_33442 | 986540096783999998653288 |
ev_965747 | 543694683288438876003288 |
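For anyone who wants to test the replies, the sample can be read into a dataset named have with a DATA step along these lines. This is only a sketch: it assumes the Term values are stored as 4-character strings, and it places the blanks in Term2 and Term6 for a_78076 and in Term4 for ik_33442, which is what the wanted output implies. The informat widths are my own choice, not something stated in the question.

```sas
/* Sketch: build the sample data as character variables (assumption). */
data have;
  /* DSD with DLM='|' treats two consecutive delimiters as a missing value */
  infile datalines dsd dlm='|' truncover;
  input ID1 :$12. (Term1-Term6) (:$4.);
datalines;
a_78076|7600||9875|4388|3709|
er_866455|5436|9468|4009|4388|7600|3288
ik_33442|9865|4009|6783||9865|3288
ev_965747|5436|9468|3288|4388|7600|3288
;
run;
```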
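data want;
  set have;
  length long_string $ 24;
  array term $ term1-term6;   /* assumes Term1-Term6 are character */
  /* replace blank terms with the filler value */
  do i=1 to dim(term);
    if missing(term(i)) then term(i)='9999';
  end;
  /* CATS strips blanks from each piece before concatenating */
  long_string = cats(of term1-term6);
  drop term1-term6 i;
run;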
Next, the duplicate receipts:
data dup_receipts;
  set have;
  array term $ term1-term6;
  length long_string $ 24;
  flag=0;
  do i=1 to dim(term);
    if missing(term(i)) then term(i)='9999';
    /* compare term i with every later term, ignoring the '9999' filler */
    if i<dim(term) then do j=(i+1) to dim(term);
      if term(i)=term(j) and term(i)^='9999' then do;
        flag=1;
        leave;   /* stop scanning only once a match is found */
      end;
    end;
  end;
  long_string = cats(of term1-term6);
  /* keep only individuals with a repeated receipt */
  if flag=1 then output;
  drop term1-term6 i j;
run;
Which brings up the question, why do you need a string of 24 digits?
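If the 24-character string is not strictly required, one alternative is to reshape the data to one row per ID and term and count repeated receipts there. The following is a sketch, not a tested solution for your real data: it assumes each ID1 appears on exactly one row of have (so PROC TRANSPOSE produces a single COL1 column), and the names long, dup_ids, receipt and n_terms are made up for illustration.

```sas
/* Sketch: one row per ID1 and term; assumes one row per ID1 in HAVE. */
proc transpose data=have out=long name=term;
  by ID1 notsorted;        /* NOTSORTED avoids needing a prior sort */
  var term1-term6;
run;

/* Receipts that occur in more than one term for the same individual. */
proc sql;
  create table dup_ids as
  select ID1, col1 as receipt, count(*) as n_terms
  from long
  where not missing(col1)  /* blank terms are not counted as receipts */
  group by ID1, col1
  having count(*) > 1;
quit;
```

With the sample data in the question, this should pick out ik_33442 and ev_965747, and it also reports which receipt was repeated.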
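You may try the following code:
data want;
  set have;
  length concat $ 24;
  array tm {6} term1-term6;   /* assumes Term1-Term6 are numeric */
  /* replace missing terms with the filler value */
  do i=1 to dim(tm);
    if missing(tm(i)) then tm(i)=9999;
  end;
  /* CATS strips blanks from each formatted number before concatenating */
  concat = cats(of term:);
  keep ID1 concat;
run;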
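One caveat if the terms really are numeric: a value below 1000 would be written with fewer than 4 digits, so the result could come out shorter than 24 characters. A possible workaround, sketched here on the assumption that Term1-Term6 are numeric and that each receipt should occupy exactly 4 digits, is to format every term with Z4. and write it into its own 4-byte slot. The dataset name want_z4 is just a placeholder.

```sas
/* Sketch: assumes Term1-Term6 are numeric; pads each receipt to 4 digits. */
data want_z4;
  set have;
  length concat $ 24;
  array tm {6} term1-term6;
  do i = 1 to dim(tm);
    /* COALESCE supplies 9999 for a missing term; Z4. adds leading zeros */
    substr(concat, 4*i - 3, 4) = put(coalesce(tm(i), 9999), z4.);
  end;
  keep ID1 concat;
run;
```

This version leaves the original term variables unchanged, which may be handy if they are needed later in the same step.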