I'm sure this will be an easy one for someone but I can't figure out how to force the new variable 'misfolding activity' to stay numeric as intended.
proc univariate data=rfu2 noprint ;
by status dilution replicate ;
var rfu ;
output out=max_RFU max=maxRFU ;
run;
data max_RFU ;
set max_RFU ;
attrib misfolding_activity format=8. informat=8. label='misfolding activity';
if maxRFU>26000 then misfolding_activity = 1 ;
else misfolding_activity = 0;
run;
In theory, that's true. Each time UNIVARIATE runs, it replaces max_RFU.
In practice, could this combination have occurred?
Show us the entire SAS log. Click on the {i} icon and paste the log into the window that appears. DO NOT SKIP THIS STEP.
Interestingly, I am not having the problem now after coming back to my computer and restarting SAS to produce the log you wanted. I am just in the beginning of putting this code together and thus was revising and resubmitting code. The code I posted was not working and I still am not sure why. For what it might be worth, here is the log as requested but which is now working.
NOTE: Copyright (c) 2016 by SAS Institute Inc., Cary, NC, USA. NOTE: SAS (r) Proprietary Software 9.4 (TS1M5) NOTE: This session is executing on the X64_7PRO platform. NOTE: Updated analytical products: SAS/STAT 14.3 SAS/ETS 14.3 SAS/OR 14.3 SAS/IML 14.3 SAS/QC 14.3 NOTE: Additional host information: X64_7PRO WIN 6.1.7601 Service Pack 1 Workstation NOTE: SAS initialization used: real time 1.30 seconds cpu time 0.59 seconds 1 ods graphics / antialiasmax=500000 ; 2 3 footnote'TID6164'; 4 footnote2'Protocol restarted after 21 hours due to power outage.'; 5 footnote3'Data before 21 hours not included in this analysis'; 6 **** 1a) Import, transpose and rename the raw data ; 7 PROC IMPORT OUT= WORK.RAWWide 8 DATAFILE= "J:\SchneiderLab\Edith Orozco\QuIC assay\RT-QuIC m 9 erged runs in excel\TID6164.xlsx" 10 DBMS=EXCELCS REPLACE; 11 RANGE='SAS$B3:CT119'n; 12 SCANTEXT=YES; 13 USEDATE=YES; 14 SCANTIME=YES; 15 RUN; NOTE: WORK.RAWWIDE data set was successfully created. NOTE: The data set WORK.RAWWIDE has 116 observations and 97 variables. NOTE: PROCEDURE IMPORT used (Total process time): real time 0.76 seconds cpu time 0.09 seconds 16 17 data RAWWide; 18 set RAWWide ; 19 attrib time format=time8. informat=8. ; 20 time = (Time__h_ * '01:00:00't) + '21:00:00't ; /* Convert decimal minutes to SAS time 20 ! */ 21 drop Time__h_ ; 22 run; NOTE: There were 116 observations read from the data set WORK.RAWWIDE. NOTE: The data set WORK.RAWWIDE has 116 observations and 97 variables. NOTE: DATA statement used (Total process time): real time 0.06 seconds cpu time 0.01 seconds 23 proc sort data=RAWWide ; 24 by time ; 25 run; NOTE: There were 116 observations read from the data set WORK.RAWWIDE. NOTE: The data set WORK.RAWWIDE has 116 observations and 97 variables. NOTE: PROCEDURE SORT used (Total process time): real time 0.04 seconds cpu time 0.03 seconds 26 proc transpose data=RAWWide out=RAWLong; 27 var Sample_X1-Sample_X96 ; 28 by time ; 29 run; NOTE: There were 116 observations read from the data set WORK.RAWWIDE. NOTE: The data set WORK.RAWLONG has 11136 observations and 4 variables. NOTE: PROCEDURE TRANSPOSE used (Total process time): real time 0.01 seconds cpu time 0.00 seconds 30 data RAWLong ; 31 set RAWLong (rename=(COL1=RFU)); 32 attrib sample_ID format=8. label='sample_ID X##'; 33 Sample_ID=substr(_NAME_,9); 34 drop _LABEL_ _NAME_ ; 35 run; NOTE: Character values have been converted to numeric values at the places given by: (Line):(Column). 33:15 NOTE: There were 11136 observations read from the data set WORK.RAWLONG. NOTE: The data set WORK.RAWLONG has 11136 observations and 3 variables. NOTE: DATA statement used (Total process time): real time 0.01 seconds cpu time 0.00 seconds 36 37 **** 1b) Import, transpose and rename the layout data ; 38 PROC IMPORT OUT= WORK.Layout 39 DATAFILE= "J:\SchneiderLab\Edith Orozco\QuIC assay\RT-QuIC m 40 erged runs in excel\TID6164.xlsx" 41 DBMS=EXCELCS REPLACE; 42 RANGE="Layout$"; 43 SCANTEXT=YES; 44 USEDATE=YES; 45 SCANTIME=YES; 46 RUN; NOTE: WORK.LAYOUT data set was successfully created. NOTE: The data set WORK.LAYOUT has 96 observations and 11 variables. NOTE: PROCEDURE IMPORT used (Total process time): real time 0.56 seconds cpu time 0.06 seconds 47 48 **** 1c) Put 1a and 1b together; 49 proc sort data=RAWLong ; by sample_ID time ; run; NOTE: There were 11136 observations read from the data set WORK.RAWLONG. NOTE: The data set WORK.RAWLONG has 11136 observations and 3 variables. NOTE: PROCEDURE SORT used (Total process time): real time 0.06 seconds cpu time 0.01 seconds 50 proc sort data=Layout ; by sample_ID ; run; NOTE: There were 96 observations read from the data set WORK.LAYOUT. NOTE: The data set WORK.LAYOUT has 96 observations and 11 variables. NOTE: PROCEDURE SORT used (Total process time): real time 0.04 seconds cpu time 0.00 seconds 51 data rfu ; 52 merge Layout RAWLong ; 53 by sample_ID ; 54 attrib dilution format=E7. ; 55 run; NOTE: There were 96 observations read from the data set WORK.LAYOUT. NOTE: There were 11136 observations read from the data set WORK.RAWLONG. NOTE: The data set WORK.RFU has 11136 observations and 13 variables. NOTE: DATA statement used (Total process time): real time 0.01 seconds cpu time 0.00 seconds 56 57 proc sort data=rfu ; 58 by status dilution replicate time ; 59 run; NOTE: There were 11136 observations read from the data set WORK.RFU. NOTE: The data set WORK.RFU has 11136 observations and 13 variables. NOTE: PROCEDURE SORT used (Total process time): real time 0.03 seconds cpu time 0.01 seconds 60 proc freq data=rfu ; 61 by status dilution ; 62 table replicate / nopercent nocol norow nocum ; 63 run; NOTE: Writing HTML Body file: sashtml.htm NOTE: There were 11136 observations read from the data set WORK.RFU. NOTE: PROCEDURE FREQ used (Total process time): real time 0.56 seconds cpu time 0.10 seconds 64 proc sgpanel data=rfu ; 65 title'PLOT 1: Raw data (n=12 or 16 t reps)'; 66 by status; 67 panelby dilution / novarname columns=1 uniscale=all; 68 series x=time y=rfu / group=replicate ; 69 colaxis type=time tickvalueformat=time8. grid display=(novalues) ; 70 rowaxis values=(0 to 260000 by 25000) grid display=(novalues) ; 71 run; NOTE: PROCEDURE SGPANEL used (Total process time): real time 3.28 seconds cpu time 0.28 seconds NOTE: There were 11136 observations read from the data set WORK.RFU. 72 title; 73 proc freq data=rfu ; 74 table sample_id / nocum norow nocol nopercent ; 75 table row*well / nocum norow nocol nopercent ; 76 run; ERROR: Permanent copy of file WORK.'SASTMP-000000018'n.UTILITY was deleted. ERROR: Permanent copy of file WORK.'SASTMP-000000018'n.UTILITY was deleted. NOTE: The SAS System stopped processing this step because of errors. NOTE: PROCEDURE FREQ used (Total process time): real time 0.01 seconds cpu time 0.01 seconds 77 /*/*/*/*/* 1d) check for time line glitch */*/*/*/*/; 78 data lag ; 79 set rfu ; 80 attrib lagdif format=time8. informat=8.; 81 lagdif = time - lag(time) ; 82 if time='21:00:00't then lagdif=. ; 83 run; NOTE: Missing values were generated as a result of performing an operation on missing values. Each place is given by: (Number of times) at (Line):(Column). 1 at 81:19 NOTE: There were 11136 observations read from the data set WORK.RFU. NOTE: The data set WORK.LAG has 11136 observations and 14 variables. NOTE: DATA statement used (Total process time): real time 0.01 seconds cpu time 0.01 seconds 84 proc sort data=lag ; 85 by sample_ID time ; 86 run; NOTE: There were 11136 observations read from the data set WORK.LAG. NOTE: The data set WORK.LAG has 11136 observations and 14 variables. NOTE: PROCEDURE SORT used (Total process time): real time 0.18 seconds cpu time 0.03 seconds 87 proc sgplot data=lag ; 88 series x=time y=lagdif / group=sample_id ; 89 yaxis values=('00:00:00't to '01:30:00't by '00:10:00't); 90 run; NOTE: PROCEDURE SGPLOT used (Total process time): real time 1.10 seconds cpu time 0.06 seconds NOTE: The column format TIME8 is replaced by an auto-generated format on the axis. NOTE: Some graph legends have been dropped due to size constraints. Try adjusting the MAXLEGENDAREA=, WIDTH= and HEIGHT= options in the ODS GRAPHICS statement. NOTE: There were 11136 observations read from the data set WORK.LAG. 91 92 /*/*/*/*/* 1e) fix time line glitch */*/*/*/*/; 93 data rfu1 ; 94 set lag ; 95 by sample_ID; 96 attrib lag_time format=time8. informat=8. ; 97 lag_time=lag(time); 98 if lagdif>'00:45:00't then time=(lag_time+'00:43:12't) ; 99 else time=time; 100 drop lagdif lag_time ; 101 run; NOTE: There were 11136 observations read from the data set WORK.LAG. NOTE: The data set WORK.RFU1 has 11136 observations and 13 variables. NOTE: DATA statement used (Total process time): real time 0.03 seconds cpu time 0.01 seconds 102 proc sort data=rfu1 ; 103 by sample_ID time ; 104 run; NOTE: There were 11136 observations read from the data set WORK.RFU1. NOTE: The data set WORK.RFU1 has 11136 observations and 13 variables. NOTE: PROCEDURE SORT used (Total process time): real time 0.01 seconds cpu time 0.00 seconds 105 data lag1 ; 106 set rfu1 ; 107 attrib lagdif format=time8. informat=8.; 108 lagdif = time - lag(time) ; 109 if time='00:00:00't then lagdif=. ; 110 run; NOTE: Missing values were generated as a result of performing an operation on missing values. Each place is given by: (Number of times) at (Line):(Column). 1 at 108:19 NOTE: There were 11136 observations read from the data set WORK.RFU1. NOTE: The data set WORK.LAG1 has 11136 observations and 14 variables. NOTE: DATA statement used (Total process time): real time 0.01 seconds cpu time 0.01 seconds 111 proc sort data=lag1 ; 112 by sample_ID time ; 113 run; NOTE: There were 11136 observations read from the data set WORK.LAG1. NOTE: The data set WORK.LAG1 has 11136 observations and 14 variables. NOTE: PROCEDURE SORT used (Total process time): real time 0.01 seconds cpu time 0.00 seconds 114 proc sgplot data=lag1 ; 115 series x=time y=lagdif / group=sample_id ; 116 yaxis values=('00:00:00't to '01:30:00't by '00:10:00't); 117 run; NOTE: PROCEDURE SGPLOT used (Total process time): real time 0.88 seconds cpu time 0.04 seconds NOTE: The column format TIME8 is replaced by an auto-generated format on the axis. NOTE: Some graph legends have been dropped due to size constraints. Try adjusting the MAXLEGENDAREA=, WIDTH= and HEIGHT= options in the ODS GRAPHICS statement. NOTE: There were 11136 observations read from the data set WORK.LAG1. 118 119 data rfu2 ; 120 set lag1 ; 121 by sample_ID; 122 attrib lag_time format=time8. informat=8. ; 123 lag_time=lag(time); 124 if lagdif>'00:45:00't then time=(lag_time+'00:43:12't) ; 125 else time=time; 126 drop lagdif lag_time ; 127 run; NOTE: There were 11136 observations read from the data set WORK.LAG1. NOTE: The data set WORK.RFU2 has 11136 observations and 13 variables. NOTE: DATA statement used (Total process time): real time 0.03 seconds cpu time 0.03 seconds 128 proc sort data=rfu2 ; 129 by sample_ID time ; 130 run; NOTE: There were 11136 observations read from the data set WORK.RFU2. NOTE: The data set WORK.RFU2 has 11136 observations and 13 variables. NOTE: PROCEDURE SORT used (Total process time): real time 0.01 seconds cpu time 0.00 seconds 131 data lag2 ; 132 set rfu2 ; 133 attrib lagdif format=time8. informat=8.; 134 lagdif = time - lag(time) ; 135 if time='00:00:00't then lagdif=. ; 136 run; NOTE: Missing values were generated as a result of performing an operation on missing values. Each place is given by: (Number of times) at (Line):(Column). 1 at 134:19 NOTE: There were 11136 observations read from the data set WORK.RFU2. NOTE: The data set WORK.LAG2 has 11136 observations and 14 variables. NOTE: DATA statement used (Total process time): real time 0.04 seconds cpu time 0.00 seconds 137 proc sort data=lag2 ; 138 by sample_ID time ; 139 run; NOTE: There were 11136 observations read from the data set WORK.LAG2. NOTE: The data set WORK.LAG2 has 11136 observations and 14 variables. NOTE: PROCEDURE SORT used (Total process time): real time 0.04 seconds cpu time 0.01 seconds 140 proc sgplot data=lag2 ; 141 series x=time y=lagdif / group=sample_id ; 142 yaxis values=('00:00:00't to '01:30:00't by '00:10:00't); 143 run; NOTE: PROCEDURE SGPLOT used (Total process time): real time 0.92 seconds cpu time 0.06 seconds NOTE: The column format TIME8 is replaced by an auto-generated format on the axis. NOTE: Some graph legends have been dropped due to size constraints. Try adjusting the MAXLEGENDAREA=, WIDTH= and HEIGHT= options in the ODS GRAPHICS statement. NOTE: There were 11136 observations read from the data set WORK.LAG2. 144 145 proc freq data=rfu2 ; 146 table sample_id / nocum norow nocol nopercent ; 147 table row*well / nocum norow nocol nopercent ; 148 run; NOTE: There were 11136 observations read from the data set WORK.RFU2. NOTE: PROCEDURE FREQ used (Total process time): real time 0.31 seconds cpu time 0.04 seconds 149 150 **** 2) Visually determine time frame of baseline after rapid ThT decay and before earliest 150! rxn; 151 proc sort data=rfu2 ; 152 by status row well time ; 153 run; NOTE: There were 11136 observations read from the data set WORK.RFU2. NOTE: The data set WORK.RFU2 has 11136 observations and 13 variables. NOTE: PROCEDURE SORT used (Total process time): real time 0.04 seconds cpu time 0.01 seconds 154 proc sgplot data=rfu2 noautolegend ; 155 title'PLOT 2a: Visually determined maximum RFU for NEG vs POS'; 156 by status; 157 refline 26000 / axis=y label="Max NEG-RFU reading"; 158 series x=time y=rfu / group=sample_ID ; 159 xaxis min='00:00:00't ; 160 run; NOTE: PROCEDURE SGPLOT used (Total process time): real time 1.70 seconds cpu time 0.15 seconds NOTE: The column format TIME8 is replaced by an auto-generated format on the axis. NOTE: The column format TIME8 is replaced by an auto-generated format on the axis. NOTE: The column format TIME8 is replaced by an auto-generated format on the axis. NOTE: There were 11136 observations read from the data set WORK.RFU2. 161 title;title2; 162 163 proc sort data=rfu2 ; 164 by status dilution replicate ; 165 run; NOTE: There were 11136 observations read from the data set WORK.RFU2. NOTE: The data set WORK.RFU2 has 11136 observations and 13 variables. NOTE: PROCEDURE SORT used (Total process time): real time 0.06 seconds cpu time 0.01 seconds 166 proc univariate data=rfu2 noprint ; 167 by status dilution replicate ; 168 var rfu ; 169 output out=max_RFU max=maxRFU ; 170 run; NOTE: The data set WORK.MAX_RFU has 96 observations and 4 variables. NOTE: PROCEDURE UNIVARIATE used (Total process time): real time 0.03 seconds cpu time 0.00 seconds 171 172 data max_RFU ; 173 set max_RFU ; 174 attrib misfolding_activity format=8. informat=8. label='misfolding activity'; 175 if maxRFU>26000 then misfolding_activity = 1 ; 176 else misfolding_activity = 0; 177 run; NOTE: There were 96 observations read from the data set WORK.MAX_RFU. NOTE: The data set WORK.MAX_RFU has 96 observations and 5 variables. NOTE: DATA statement used (Total process time): real time 0.03 seconds cpu time 0.03 seconds 178 proc sort data=max_RFU ; 179 by status dilution ; 180 run; NOTE: There were 96 observations read from the data set WORK.MAX_RFU. NOTE: The data set WORK.MAX_RFU has 96 observations and 5 variables. NOTE: PROCEDURE SORT used (Total process time): real time 0.01 seconds cpu time 0.00 seconds 181 proc freq data=max_RFU ; 182 by status dilution ; 183 tables misfolding_activity / bin(level=1 cl=exact); 184 output out=freq bin ; 185 run; NOTE: There were 96 observations read from the data set WORK.MAX_RFU. NOTE: The data set WORK.FREQ has 7 observations and 12 variables. NOTE: PROCEDURE FREQ used (Total process time): real time 0.34 seconds cpu time 0.06 seconds 186 quit; 187 188 proc genmod data=max_rfu ; 189 class status ; 190 model misfolding_activity = status dilution / dist=poisson ; 191 run; NOTE: Algorithm converged. NOTE: The scale parameter was held fixed. NOTE: PROCEDURE GENMOD used (Total process time): real time 0.45 seconds cpu time 0.01 seconds
These results are consistent with these conditions:
To overcome that, get rid of the original version:
data max_RFU;
set max_RFU (drop=misfolding_activity);
attrib ...
if ....
else ....
run;
Most of the DATA step stays the same. Just add DROP= when bringing in the data.
Thank you. The data step I provided was the first instance of this variable (misfolding activity).
The point of this part of the code was to take the output of proc univariate (max_RFU; which contains the max value for each RFU curve over time) and then add a new variable (misfolding activity, numeric, 0 or 1) as an indicator if the curve ever exceeds the threshold value. I'm wondering if I might have avoided this problem if I had first deleted the max_RFU dataset before resubmitting edited code. Then again, isn't the output dataset just overwritten each time I submit the proc univariate?
Puzzled, but glad it is now working.
Dave
In theory, that's true. Each time UNIVARIATE runs, it replaces max_RFU.
In practice, could this combination have occurred?
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.