Dear all,
Suppose I have the following dataset:
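data DB;
  /* DSD with a space delimiter keeps the quoted "CD303 (BDCA2)" together as one value */
  infile cards dsd dlm=' ';
  input ID :$20. Marker1 :$20. Marker2 :$20. Marker3 :$20. Marker4 :$20. Marker5 :$20.;
cards;
0001 CD34 . . CD33 Other,specify
0001 . "CD303 (BDCA2)" . . .
0002 . . . . CD14
0002 CD3 . CD14 CD3 .
;
run;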
It contains measurements for each marker (the numeric values are not shown, for simplicity).
Different markers were measured for each patient. This means not only different marker types, such as CD33 or CD14, but also different positions among the choices (Marker1-Marker5). I don't know much about this data; I only know that it comes from a survey.
You can also see that IDs (patients) are repeated. This is because measurements were repeated during the same day or on different dates. Dates are not shown, for simplicity.
The Marker* variables are character, but when I read the input file the only value that appears as text is "Other, specify". The remaining values (CD33, for example) appear as numeric codes such as 11. I have the list of markers, so I can link each code to its marker name, for example: 1 = CD3.
The problem is that although there are only 18 coded markers, there are 50 Marker* variables (Marker1-Marker50).
Is there an efficient way to map all the codes to the corresponding marker names across such a large number of variables (Marker*)?
Thank you in advance.