Solved: Map numbered variable values on a huge set of variables

NewUsrStat · Posted 02-04-2026 02:11 PM

Dear all,

suppose to have the following dataset:

data DB;
  input ID :$20.  Marker1 :$20. Marker2 :$20. Marker3 :$20. Marker4 :$20. Marker5 :$20.;
cards;
0001  CD34      .        .   CD33     Other,specify 
0001    .  CD303 (BDCA2) .     .             .
0002    .       .        .     .           CD14
0002   CD3      .       CD14  CD3           .
 
; 
run;

It contains measurements (i.e., numbers not shown for simplicity) for each marker.

For each patient different markers were measured. It means not only different types like CD33 or CD14 but also different choices (Marker1-Marker5). I don't know a lot on this data. I only know that they result from a survey.

You can see also, that IDs (patients) are repeated and this is because measurements were repeated during the same day or at different dates. Dates are not shown for simplicity.

The Marker* variables are Char but I can only see "Other, specify" when I read the input file. The remining values like for example CD33 are numbers like 11 etc. I have the list of markers and I can link the number to the specific marker, for example: 1 = CD3.

The problem is that although the numbered markers are 18, the variables Marker* are 50 (Marker1-Marker50).

Is there an efficient way to map all the numbers to the corresponding marker name on such a huge number of variables (Marker*)?

Thank you in advance.

Tom · Posted 02-04-2026 03:09 PM

Sounds like you just want to transpose the data. Perhaps keeping track of where the values appeared originally.

First let's make a working data set that includes your other series of variables.

data DB;
  input (ID Marker1-Marker5) (&:$20.) Value1-Value5 ;
cards;
0001  CD34              .     .  CD33  Other,specify  
      1 . . 2 3
0001     .  CD303 (BDCA2)     .     .              .  
      . 4 . . .
0002     .              .     .     .           CD14  
      . . . . 5
0002   CD3              .  CD14   CD3              .
      6 . 7 8 .
;

Now we can just use a couple of ARRAYs to transpose the marker/value pairs into separate observations.

data tall;
  set db;
  by id ;
  if first.id then repeat=0;
  repeat+1;
  array mark Marker1-Marker5 ;
  array val  Value1-Value5 ;
  do location=1 to dim(mark);
    Marker=mark[location];
    Value=val[location];
    if cmiss(marker,value)<2 then output;
  end;
  drop Marker1-Marker5 Value1-Value5;
run;

Result

View solution in original post

quickbluefish · Posted 02-04-2026 03:02 PM

Can you provide the list of markers?

Tom · Posted 02-04-2026 03:09 PM

Sounds like you just want to transpose the data. Perhaps keeping track of where the values appeared originally.

First let's make a working data set that includes your other series of variables.

data DB;
  input (ID Marker1-Marker5) (&:$20.) Value1-Value5 ;
cards;
0001  CD34              .     .  CD33  Other,specify  
      1 . . 2 3
0001     .  CD303 (BDCA2)     .     .              .  
      . 4 . . .
0002     .              .     .     .           CD14  
      . . . . 5
0002   CD3              .  CD14   CD3              .
      6 . 7 8 .
;

Now we can just use a couple of ARRAYs to transpose the marker/value pairs into separate observations.

data tall;
  set db;
  by id ;
  if first.id then repeat=0;
  repeat+1;
  array mark Marker1-Marker5 ;
  array val  Value1-Value5 ;
  do location=1 to dim(mark);
    Marker=mark[location];
    Value=val[location];
    if cmiss(marker,value)<2 then output;
  end;
  drop Marker1-Marker5 Value1-Value5;
run;

Result

Map numbered variable values on a huge set of variables

Re: Map numbered variable values on a huge set of variables

Re: Map numbered variable values on a huge set of variables

Re: Map numbered variable values on a huge set of variables

Catch up on SAS Innovate 2026