I have this issue with the data set I'm basically want to get this result from the data set I posted below.
the way the variable converted came within subjid R101=TL01 and R102=TL02 .....ETC.
But the confusing part is R20X take the next number available after R10X conversion for example if TL02 already reserved the R201 will convert to TL03, R202 converts to TL04
the aim
SUBJID | LESNID | Converted |
C12-123 | R101 | TL01 |
C12-123 | R102 | TL02 |
C12-123 | R201 | TL03 |
C12-123 | R101 | TL01 |
C12-123 | R102 | TL02 |
C12-123 | R201 | TL03 |
C14-062 | R101 | TL01 |
C14-062 | R201 | TL02 |
C14-062 | R101 | TL01 |
C14-062 | R201 | TL02 |
C14-062 | R101 | TL01 |
C14-062 | R201 | TL02 |
C14-062 | R101 | TL01 |
C14-062 | R201 | TL02 |
this is a sample of the data I have
length SUbJID $7 LESNID $5;
input SUbJID $ LESNID $;
datalines;
C12-123 R101
C12-123 R102
C12-123 R201
C12-123 R101
C12-123 R102
C12-123 R201
C14-062 R101
C14-062 R201
C14-062 R101
C14-062 R201
C14-062 R101
C14-062 R201
C14-062 R101
C14-062 R201
;
Are there any cases where you have the R2 value within a subjid group without any of the R1 values?
For your example data this works but does not maintain sort order of the original "have". If you need to keep the original order of have then you need to provide one or more variables to put in an order by clause in the last proc sql step
data have; length SUbJID $7 LESNID $5; input SUbJID $ LESNID $; datalines; C12-123 R101 C12-123 R102 C12-123 R201 C12-123 R101 C12-123 R102 C12-123 R201 C14-062 R101 C14-062 R201 C14-062 R101 C14-062 R201 C14-062 R101 C14-062 R201 C14-062 R101 C14-062 R201 ; run; Proc sql; create table temp as select distinct subjid, LESNID from have order by subjid, LESNID ; quit; data match; set temp; by subjid LESNID; retain counter; length convert $4.; if first.subjid then counter=1; else counter+1; convert= catt('TL',put(counter,z2.)); run; proc sql; create table want as select a.*, b.convert from have as a left join match as b on a.subjid=b.subjid and a.lesnid=b.lesnid ; quit;
The two soln are really good. I have to figure them out bc I have to convert R301 and 401 too.
@mona4u wrote:
The two soln are really good. I have to figure them out bc I have to convert R301 and 401 too.
Here's an example where I added a couple of additional entries with different values of LESNID. See if the results look as desired. This approach is only concerned with the order of the values of the variable LESNID, not the actual values. It will likely fall apart a tad as is if you have any lesnid values of R1101 as R1101 will come before R201 in normal sort order. A separate proc sort before the Data MATCH using the SORTSEQ=linguistic option would address that.
data have; length SUbJID $7 LESNID $5; input SUbJID $ LESNID $; datalines; C12-123 R101 C12-123 R102 C12-123 R201 C12-123 R101 C12-123 R102 C12-123 R201 C12-123 R401 C12-123 R301 C14-062 R101 C14-062 R201 C14-062 R101 C14-062 R201 C14-062 R101 C14-062 R401 C14-062 R201 C14-062 R101 C14-062 R201 C14-062 R1101 ; run; Proc sql; create table temp as select distinct subjid, LESNID from have order by subjid, LESNID ; quit; proc sort data=temp sortseq=linguistic (numeric_collation=on); by subjid lesnid; run; data match; set temp; by subjid LESNID; retain counter; length convert $4.; if first.subjid then counter=1; else counter+1; convert= catt('TL',put(counter,z2.)); run; proc sql; create table want as select a.*, b.convert from have as a left join match as b on a.subjid=b.subjid and a.lesnid=b.lesnid ; quit;
If the dataset is large, and already sorted by subjid, then it's worth considering a hash solution, which permits a single step approach:
data have;
length SUbJID $7 LESNID $5;
input SUbJID $ LESNID $;
datalines;
C12-123 R101
C12-123 R102
C12-123 R201
C12-123 R101
C12-123 R102
C12-123 R201
C14-062 R101
C14-062 R201
C14-062 R101
C14-062 R201
C14-062 R101
C14-062 R201
C14-062 R101
C14-062 R201
run;
data want;
set have;
by subjid;
length converted $4;
if _n_=1 then do;
declare hash h ();
h.definekey('lesnid');
h.definedata('converted');
h.definedone();
end;
if h.find() ^=0 then do;
converted=cats('TL',put(h.num_items+1,z2.));
h.add();
end;
if last.subjid then h.clear();
run;
The program looks to see if LESNID is already in the hash object (h.find), and if so, retrieves the corresponding vale of converted. If not (i.e. if h.find()^=0), it builds a new value for converted concatenating TL with 1+number of items already in the hash object (formatted as a two digit numeric value), and adds that new value to the hash object. When a subject id is complete, the hash object to emptied, so the next id is ready to start at converted=TL01.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.