Hi,
After transposing my long dataset to wide, I realised some variables were the same but titled differently (one is serum, and the other plasma). So now I would like to clean my data further by merging the variables but having trouble with the coding.
For example (see sample data below):
For subject E01, I would like to put the data for 'var_Alanine_Aminotransferase__P' under the first variable 'var_Alanine_Aminotransferase__Se' and delete 'var_Alanine_Aminotransferase__P' . Similarly, I would like to put the data under 'var_Alkaline_Phosphatase__Plasma' under the 3rd variable 'var_Alkaline_Phosphatase__Serum_'.
What is the best way to go about doing this? Thanks in advance.
SUBJECT | var_Alanine_Aminotransferase__Se | var_Albumin__Serum_ | var_Alkaline_Phosphatase__Serum_ | var_Aspartate_Aminotransferase__ | var_Bilirubin__Total__Serum_ | var_Calcium__Serum_ | var_Carbohydrate_Antigen_125__Se | var_Creatinine__Serum_ | var_Gamma_Glutamyltransferase__S | var_Hemoglobin__Blood_ | var_Lactate_Dehydrogenase__Serum | var_Leucocytes__Particle_Concent | var_Lymphocytes__Particle_Concen | var_Neutrophils__Particle_Concen | var_Platelets__Particle_Concentr | var_Protein__Total__Serum_ | var_Alanine_Aminotransferase__Pl | var_Alkaline_Phosphatase__Plasma | var_Bilirubin__Total__Plasma_ | var_Calcium__Plasma_ | var_Creatinine__Plasma_ | var_Gamma_Glutamyltransferase__P | var_Lactate_Dehydrogenase__Plasm | var_Lymphocytes__Blood_ | var_Protein__Total__Plasma_ | var_Albumin__Plasma_ |
E02 | 27 | 43 | 129 | 34 | 6 | 2.54 | 5 | 68 | 29 | 122 | 241 | 7.3 | 2.5 | 3.8 | 175 | 75 | ||||||||||
E01 | 26 | 39 | 115 | 5.11 | 2.5 | 2.2 | 187 | 37 | 72 | 3.249 | 2.36 | 83.98 | 18 | 231 | 49.7 | 74 |
Use next code to do the work:
data want
set have;
array x_serum var_Alanine_Aminotransferase__Se var_Albumin__Serum_ ..... <up to the last ..._srum>;
array x_plasma var_Alanine_Aminotransferase__Pl var_Alkaline_Phosphatase__Plasma ..... <up to last ..._plasma>;
**** Be careful to order both arrays in compatible order ! ***;
do i=1 to dim(x_serum);
if x_plasma(i) ne . /* missing value */
then x_serum(i) = x_plasma(i);
end;
drop i x_plasma var_Alanine_Aminotransferase__Pl var_Alkaline_Phosphatase__Plasma ..... <up to last ..._plasma>;
run;
If I understand it correctly you have an array of _serum compatible to array of _plasma
and you want to move the last into the first then drop the last array variables.
Is it correct ?
Can both variables xxx_serum and xxx_plasma be non empty ? - If positive, what do you want to do?
replace the value or save both in separate oservations ?
If I understand it correctly you have an array of _serum compatible to array of _plasma
and you want to move the last into the first then drop the last array variables.
Is it correct ?
Yes that is the correct understanding
Can both variables xxx_serum and xxx_plasma be non empty ? - If positive, what do you want to do?
replace the value or save both in separate oservations ?
I would replace the value
Thanks
Use next code to do the work:
data want
set have;
array x_serum var_Alanine_Aminotransferase__Se var_Albumin__Serum_ ..... <up to the last ..._srum>;
array x_plasma var_Alanine_Aminotransferase__Pl var_Alkaline_Phosphatase__Plasma ..... <up to last ..._plasma>;
**** Be careful to order both arrays in compatible order ! ***;
do i=1 to dim(x_serum);
if x_plasma(i) ne . /* missing value */
then x_serum(i) = x_plasma(i);
end;
drop i x_plasma var_Alanine_Aminotransferase__Pl var_Alkaline_Phosphatase__Plasma ..... <up to last ..._plasma>;
run;
Thank you Shmuel - that worked
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.