BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
SAS_Novice22
Quartz | Level 8

**EDITED FOR CLARITY**

 

Hi SAS Experts,

 

I would like to analyze lab data from a small number of samples. My data looks like this:

SUBIDLBTESTCDLBTESTCD_DECLBORRESLBORRESU_DECLBORNRLOLBORNRHI
00014C46mg/dL1444
00061C1-INH antigenic levels10 mg/dL1937
00041C1-INH antigenic levels4 mg/dL2139
00084C4<2g/L1040
00022C1-INH functional levels.08% of normal67100
00023C1q28mg/dL1222
.....................

 

I would like that data to look like this:

  Test Results -->-->-->-->-->
Test NameNumber of TestsMean (SD)MedianMinMaxKurtSkew
 C1-INH antigenic levels (units)N=      
C1-INH functional levels (units)N=      
C4 (units)N=      
C1q (units)N=      

 

Each row in the dataset represents a lab test and as such subjects can have multiple rows. This does not matter to me because I am looking for basic descriptive on the number of tests performed for each test type and the associated descriptive values of test results for each test type.

 

I am looking at the raw dataset and I see that within the same test type there are a variety of units for the test result displayed (e.g., mg/dL, g/L, %. % of normal), I am assuming I would first have to start off with cleaning the data to make the units all the same?

 

So would the steps I take look like:

1. clean the raw dataset and convert all values for each result to be consistent for each test type

2. group each lab test type into its own category - would I generate a new variable or use if and then statements?

3. run the following for each lab test?

4. Would this also include the low and high reference ranges?

 

 

 

 

PROC IMPORT OUT = LB
    DATAFILE = 'M:\SAS\TAKEDA EDR/LB.xlsx'
    DBMS = xlsx;
RUN;

DATA LB2;
    SET LB;
    OVERALL = "Overall";
RUN;

TITLE "Type of Lab Test Name - how to clean data, convert units, therafter use this data to calculate means etc. and include ref ranges??"; 
PROC FREQ DATA=WORK.LB2;
    TABLE LBTESTCD_DEC * OVERALL;
RUN;

DATA WORK.LB2;
SET LB;
IF LBTESTCD_DEC="C1-INH functional levels" THEN C1INH_F=1;
ELSE C1INH_F = .;
RUN;

TITLE "Lab Results by Lab Test Type"; 
PROC MEANS DATA=WORK.LB2 MEAN STDDEV MEDIAN MIN MAX KURT SKEW;
	CLASS OVERALL;
	VAR insert test name here;
RUN;

 

 

 

Thank you in advance for your assistance.

 

Best wishes,

T.

 

1 ACCEPTED SOLUTION

Accepted Solutions
ballardw
Super User

@SAS_Novice22 wrote:

 

So would the steps I take look like:

1. clean the raw dataset and convert all values for each result to be consistent for each test type

2. group each lab test type into its own category - would I generate a new variable or use if and then statements?

3. run the following for each lab test?

4. Would this also include the low and high reference ranges?

 

Basically it looks like you have the right idea. First, would be the make sure that you have numeric variables for anything that you want to do statistics like Mean, Median, Max etc. When you show a value such as <2 that variable as it currently exists is very likely to not be numeric and you need to decide what value will be recorded. For statistics like Mean to be useful the units must be the same. Typically if variables represent different units of measure the approach is to make a variable for each.

 

If you just want a REPORT, once you have consistent numeric values you may be able to use the structure you have with some thing like:

proc tabulate data=have;
   class LBTESTCD_DEC;
   var LBORRES;
   table  LBTESTCD_DEC,
         lborres*( n mean std min max skew kurt)
   ;
run;

For other analysis you may be able to use your lab test variable as a CLASS variable (Proc Means for example) or a BY variable (sort the data by the variable first).

SAS makes it fairly easy to do many tasks by values of grouping variables (By group processing for instance with variables on a BY statement)

 

4. Would this also include the low and high reference ranges?

Depends on what you mean by "include" and what you want to do with other variables. Also, with questions please use VARIABLE names to reference things. When we have to guess what a "reference range" would be suggestions may well be quite different than you want. You show 2 variables but no "range" I guess are high and low somethings. So those might be limits but how they could be used you need to provide examples or descriptions.

You can request statistics for any numeric variable.

 

 

 

 

 

 

 

View solution in original post

5 REPLIES 5
PaigeMiller
Diamond | Level 26

I am looking at the raw dataset and I see that within the same test type there are a variety of units for the test result displayed (e.g., mg/dL, g/L, %. % of normal), I am assuming I would first have to start off with cleaning the data to make the units all the same?

 

I don't understand why you say this, as your desired output table does not have any column for units. But even more confusingly, you want a mean in your output table, but you don't indicate what variable you want to use to calculate the mean.

 

So I feel there are major disconnects between your input data, your description in words of what you want to do, and the desired output table.

--
Paige Miller
SAS_Novice22
Quartz | Level 8

@PaigeMiller - thank you, sorry I should have been more clear, the descriptive stats (mean, median, min, max, skew, kurtosis) were to showcase the test results. I will try now to edit my post.

 

Thanks for pointing this out.

T.

PaigeMiller
Diamond | Level 26

Typing "Test results" above the word Mean still does not indicate which of the many variables you are computing a mean for.

 

2. group each lab test type into its own category - would I generate a new variable or use if and then statements?

3. run the following for each lab test?

 

PROC SUMMARY or PROC MEANS seems to be all you would need to do, once all the other issues are cleared up.

--
Paige Miller
ballardw
Super User

@SAS_Novice22 wrote:

 

So would the steps I take look like:

1. clean the raw dataset and convert all values for each result to be consistent for each test type

2. group each lab test type into its own category - would I generate a new variable or use if and then statements?

3. run the following for each lab test?

4. Would this also include the low and high reference ranges?

 

Basically it looks like you have the right idea. First, would be the make sure that you have numeric variables for anything that you want to do statistics like Mean, Median, Max etc. When you show a value such as <2 that variable as it currently exists is very likely to not be numeric and you need to decide what value will be recorded. For statistics like Mean to be useful the units must be the same. Typically if variables represent different units of measure the approach is to make a variable for each.

 

If you just want a REPORT, once you have consistent numeric values you may be able to use the structure you have with some thing like:

proc tabulate data=have;
   class LBTESTCD_DEC;
   var LBORRES;
   table  LBTESTCD_DEC,
         lborres*( n mean std min max skew kurt)
   ;
run;

For other analysis you may be able to use your lab test variable as a CLASS variable (Proc Means for example) or a BY variable (sort the data by the variable first).

SAS makes it fairly easy to do many tasks by values of grouping variables (By group processing for instance with variables on a BY statement)

 

4. Would this also include the low and high reference ranges?

Depends on what you mean by "include" and what you want to do with other variables. Also, with questions please use VARIABLE names to reference things. When we have to guess what a "reference range" would be suggestions may well be quite different than you want. You show 2 variables but no "range" I guess are high and low somethings. So those might be limits but how they could be used you need to provide examples or descriptions.

You can request statistics for any numeric variable.

 

 

 

 

 

 

 

SAS_Novice22
Quartz | Level 8
Thanks ballardw. This is useful that is exactly how I will proceed. Thank you for the suggestion. I will clean the data first to make sure all measures of units are the same for each of the 4 different lab tests.

Thank you again for the solution to my problem.

T. 🙂

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 5 replies
  • 684 views
  • 1 like
  • 3 in conversation