BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Merdock
Quartz | Level 8

I have a dataset with many sites (like in excerpt below) where the 3 participants at the end with XXX are additional dummy participants. What I want is to modify these XXX patids so that they are in the format I have in the "want" dataset below, as opposed to having XXX at the end, while keeping the group assignment from have dataset unchanged.

data have;
 input site $ patid $ group $;
 datalines;
ABC ABCPROT01001 A
ABC ABCPROT01002 B
ABC ABCPROT01003 A
ABC ABCPROT01005 A
ABC ABCPROT01006 A
ABC ABCPROT01XXX B
ABC ABCPROT01XXX A
ABC ABCPROT01XXX B
CDF CDFPROT01004 A
CDF CDFPROT01005 A
CDF CDFPROT01006 A
CDF CDFPROT01007 B
CDF CDFPROT01008 A
CDF CDFPROT01009 B
CDF CDFPROT01010 A
CDF CDFPROT01012 A
CDF CDFPROT01013 B
CDF CDFPROT01XXX B
CDF CDFPROT01XXX B
CDF CDFPROT01XXX A
AMD AMDPROT01001 A
AMD AMDPROT01002 B
AMD AMDPROT01003 A
AMD AMDPROT01XXX B
AMD AMDPROT01XXX A
AMD AMDPROT01XXX A
;
run;

data want;
input site $ patid $ group $;
datalines;
ABC ABCPROT01001 A
ABC ABCPROT01002 B
ABC ABCPROT01003 A
ABC ABCPROT01005 A
ABC ABCPROT01006 A
ABC ABCPROT01007 B
ABC ABCPROT01008 A
ABC ABCPROT01009 B
CDF CDFPROT01004 A
CDF CDFPROT01005 A
CDF CDFPROT01006 A
CDF CDFPROT01007 B
CDF CDFPROT01008 A
CDF CDFPROT01009 B
CDF CDFPROT01010 A
CDF CDFPROT01012 A
CDF CDFPROT01013 B
CDF CDFPROT01014 B
CDF CDFPROT01015 B
CDF CDFPROT01016 A
AMD AMDPROT01001 A
AMD AMDPROT01002 B
AMD AMDPROT01003 A
AMD AMDPROT01004 B
AMD AMDPROT01005 A
AMD AMDPROT01006 A
;
run;
1 ACCEPTED SOLUTION

Accepted Solutions
PeterClemmensen
Tourmaline | Level 20

Try this

 

data want (drop = n l);
   set have;
   by site notsorted;
   if first.site then n = 0;

   l = substr(patid, length(patid) - 2);
   if not find(l, 'x', 'i') then n = input(l, 3.);
   else n = sum(n, 1);

   substr(patid, 10, 3) = put(n, z3.);

   retain n;
run;

 

Result:

 

Spoiler
site patid        group 
ABC  ABCPROT01001 A 
ABC  ABCPROT01002 B 
ABC  ABCPROT01003 A 
ABC  ABCPROT01005 A 
ABC  ABCPROT01006 A 
ABC  ABCPROT01007 B 
ABC  ABCPROT01008 A 
ABC  ABCPROT01009 B 
CDF  CDFPROT01004 A 
CDF  CDFPROT01005 A 
CDF  CDFPROT01006 A 
CDF  CDFPROT01007 B 
CDF  CDFPROT01008 A 
CDF  CDFPROT01009 B 
CDF  CDFPROT01010 A 
CDF  CDFPROT01012 A 
CDF  CDFPROT01013 B 
CDF  CDFPROT01014 B 
CDF  CDFPROT01015 B 
CDF  CDFPROT01016 A 
QQL  QQLPROT01001 A 
QQL  QQLPROT01002 B 
QQL  QQLPROT01003 A 
AMD  AMDPROT01001 A 
AMD  AMDPROT01002 B 
AMD  AMDPROT01003 A 
AMD  AMDPROT01004 B 
AMD  AMDPROT01005 A 
AMD  AMDPROT01006 A 

View solution in original post

4 REPLIES 4
ballardw
Super User

Your data step code cuts of the last characters of the PATID field.

I think this does what you are asking for.

data have;
 input site $ patid :$12. group $;
 datalines;
ABC ABCPROT01001 A
ABC ABCPROT01002 B
ABC ABCPROT01003 A
ABC ABCPROT01005 A
ABC ABCPROT01006 A
ABC ABCPROT01XXX B
ABC ABCPROT01XXX A
ABC ABCPROT01XXX B
CDF CDFPROT01004 A
CDF CDFPROT01005 A
CDF CDFPROT01006 A
CDF CDFPROT01007 B
CDF CDFPROT01008 A
CDF CDFPROT01009 B
CDF CDFPROT01010 A
CDF CDFPROT01012 A
CDF CDFPROT01013 B
CDF CDFPROT01XXX B
CDF CDFPROT01XXX B
CDF CDFPROT01XXX A
AMD AMDPROT01001 A
AMD AMDPROT01002 B
AMD AMDPROT01003 A
AMD AMDPROT01XXX B
AMD AMDPROT01XXX A
AMD AMDPROT01XXX A
;
run;

data want;
   set have;
   retain lastno ;
   if substr(patid,10) ne 'XXX' then lastno=input( substr(patid,10),f3.);
   else do;
      lastno+1;
      patid= cats(substr(patid,1,9),put(lastno,z3.));
   end;
   drop lastno;
run;
Merdock
Quartz | Level 8

Thank you, ballardw. This does work great but what about if I have a site that only has the 3 patids with XXX? I edited my initial dataset so for example, for site QQL, there's no existing patids yet but I'd like to add those 3 dummy patids in there and I wanted them to be assigned as QQLPROT01001, QQLPROT01002, QQLPROT01003. The code now instead assigns them as QQLPROT01017, QQLPROT01018, QQLPROT01019 (I guess since there's no previous QQL patid before them, it takes the last patid from CDF, which is CDFPROT01016) 

data have;
 input site $ patid :$12. group $;
 datalines;
ABC ABCPROT01001 A
ABC ABCPROT01002 B
ABC ABCPROT01003 A
ABC ABCPROT01005 A
ABC ABCPROT01006 A
ABC ABCPROT01XXX B
ABC ABCPROT01XXX A
ABC ABCPROT01XXX B
CDF CDFPROT01004 A
CDF CDFPROT01005 A
CDF CDFPROT01006 A
CDF CDFPROT01007 B
CDF CDFPROT01008 A
CDF CDFPROT01009 B
CDF CDFPROT01010 A
CDF CDFPROT01012 A
CDF CDFPROT01013 B
CDF CDFPROT01XXX B
CDF CDFPROT01XXX B
CDF CDFPROT01XXX A
QQL QQLPROT01XXX A
QQL QQLPROT01XXX B
QQL QQLPROT01XXX A
AMD AMDPROT01001 A
AMD AMDPROT01002 B
AMD AMDPROT01003 A
AMD AMDPROT01XXX B
AMD AMDPROT01XXX A
AMD AMDPROT01XXX A
;
run;

data want;
input site $ patid :$12. group $;
datalines;
ABC ABCPROT01001 A
ABC ABCPROT01002 B
ABC ABCPROT01003 A
ABC ABCPROT01005 A
ABC ABCPROT01006 A
ABC ABCPROT01007 B
ABC ABCPROT01008 A
ABC ABCPROT01009 B
CDF CDFPROT01004 A
CDF CDFPROT01005 A
CDF CDFPROT01006 A
CDF CDFPROT01007 B
CDF CDFPROT01008 A
CDF CDFPROT01009 B
CDF CDFPROT01010 A
CDF CDFPROT01012 A
CDF CDFPROT01013 B
CDF CDFPROT01014 B
CDF CDFPROT01015 B
CDF CDFPROT01016 A
QQL QQLPROT01001 A
QQL QQLPROT01002 B
QQL QQLPROT01003 A
AMD AMDPROT01001 A
AMD AMDPROT01002 B
AMD AMDPROT01003 A
AMD AMDPROT01004 B
AMD AMDPROT01005 A
AMD AMDPROT01006 A
;
run;
PeterClemmensen
Tourmaline | Level 20

Try this

 

data want (drop = n l);
   set have;
   by site notsorted;
   if first.site then n = 0;

   l = substr(patid, length(patid) - 2);
   if not find(l, 'x', 'i') then n = input(l, 3.);
   else n = sum(n, 1);

   substr(patid, 10, 3) = put(n, z3.);

   retain n;
run;

 

Result:

 

Spoiler
site patid        group 
ABC  ABCPROT01001 A 
ABC  ABCPROT01002 B 
ABC  ABCPROT01003 A 
ABC  ABCPROT01005 A 
ABC  ABCPROT01006 A 
ABC  ABCPROT01007 B 
ABC  ABCPROT01008 A 
ABC  ABCPROT01009 B 
CDF  CDFPROT01004 A 
CDF  CDFPROT01005 A 
CDF  CDFPROT01006 A 
CDF  CDFPROT01007 B 
CDF  CDFPROT01008 A 
CDF  CDFPROT01009 B 
CDF  CDFPROT01010 A 
CDF  CDFPROT01012 A 
CDF  CDFPROT01013 B 
CDF  CDFPROT01014 B 
CDF  CDFPROT01015 B 
CDF  CDFPROT01016 A 
QQL  QQLPROT01001 A 
QQL  QQLPROT01002 B 
QQL  QQLPROT01003 A 
AMD  AMDPROT01001 A 
AMD  AMDPROT01002 B 
AMD  AMDPROT01003 A 
AMD  AMDPROT01004 B 
AMD  AMDPROT01005 A 
AMD  AMDPROT01006 A 
Merdock
Quartz | Level 8

@PeterClemmensen, thank you so much, this did the trick!

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 4 replies
  • 1141 views
  • 1 like
  • 3 in conversation