I'm sure this will be an easy one for someone but I can't figure out how to force the new variable 'misfolding activity' to stay numeric as intended.
proc univariate data=rfu2 noprint ;
by status dilution replicate ;
var rfu ;
output out=max_RFU max=maxRFU ;
run;
data max_RFU ;
set max_RFU ;
attrib misfolding_activity format=8. informat=8. label='misfolding activity';
if maxRFU>26000 then misfolding_activity = 1 ;
else misfolding_activity = 0;
run;
Show us the entire SAS log. Click on the {i} icon and paste the log into the window that appears. DO NOT SKIP THIS STEP.
Interestingly, I am no longer having the problem after coming back to my computer and restarting SAS to produce the log you asked for. I am just beginning to put this code together, so I was repeatedly revising and resubmitting it. The code I posted was not working, and I am still not sure why. For what it is worth, here is the requested log, from a session in which the code now works.
NOTE: Copyright (c) 2016 by SAS Institute Inc., Cary, NC, USA.
NOTE: SAS (r) Proprietary Software 9.4 (TS1M5)
NOTE: This session is executing on the X64_7PRO platform.
NOTE: Updated analytical products:
      SAS/STAT 14.3
      SAS/ETS 14.3
      SAS/OR 14.3
      SAS/IML 14.3
      SAS/QC 14.3
NOTE: Additional host information:
      X64_7PRO WIN 6.1.7601 Service Pack 1 Workstation
NOTE: SAS initialization used:
      real time 1.30 seconds
      cpu time 0.59 seconds
1     ods graphics / antialiasmax=500000 ;
2
3     footnote'TID6164';
4     footnote2'Protocol restarted after 21 hours due to power outage.';
5     footnote3'Data before 21 hours not included in this analysis';
6     **** 1a) Import, transpose and rename the raw data ;
7     PROC IMPORT OUT= WORK.RAWWide
8     DATAFILE= "J:\SchneiderLab\Edith Orozco\QuIC assay\RT-QuIC m
9     erged runs in excel\TID6164.xlsx"
10    DBMS=EXCELCS REPLACE;
11    RANGE='SAS$B3:CT119'n;
12    SCANTEXT=YES;
13    USEDATE=YES;
14    SCANTIME=YES;
15    RUN;
NOTE: WORK.RAWWIDE data set was successfully created.
NOTE: The data set WORK.RAWWIDE has 116 observations and 97 variables.
NOTE: PROCEDURE IMPORT used (Total process time):
      real time 0.76 seconds
      cpu time 0.09 seconds
16
17    data RAWWide;
18    set RAWWide ;
19    attrib time format=time8. informat=8. ;
20    time = (Time__h_ * '01:00:00't) + '21:00:00't ; /* Convert decimal minutes to SAS time
20  ! */
21    drop Time__h_ ;
22    run;
NOTE: There were 116 observations read from the data set WORK.RAWWIDE.
NOTE: The data set WORK.RAWWIDE has 116 observations and 97 variables.
NOTE: DATA statement used (Total process time):
      real time 0.06 seconds
      cpu time 0.01 seconds
23    proc sort data=RAWWide ;
24    by time ;
25    run;
NOTE: There were 116 observations read from the data set WORK.RAWWIDE.
NOTE: The data set WORK.RAWWIDE has 116 observations and 97 variables.
NOTE: PROCEDURE SORT used (Total process time):
      real time 0.04 seconds
      cpu time 0.03 seconds
26    proc transpose data=RAWWide out=RAWLong;
27    var Sample_X1-Sample_X96 ;
28    by time ;
29    run;
NOTE: There were 116 observations read from the data set WORK.RAWWIDE.
NOTE: The data set WORK.RAWLONG has 11136 observations and 4 variables.
NOTE: PROCEDURE TRANSPOSE used (Total process time):
      real time 0.01 seconds
      cpu time 0.00 seconds
30    data RAWLong ;
31    set RAWLong (rename=(COL1=RFU));
32    attrib sample_ID format=8. label='sample_ID X##';
33    Sample_ID=substr(_NAME_,9);
34    drop _LABEL_ _NAME_ ;
35    run;
NOTE: Character values have been converted to numeric values at the places given by:
      (Line):(Column).
      33:15
NOTE: There were 11136 observations read from the data set WORK.RAWLONG.
NOTE: The data set WORK.RAWLONG has 11136 observations and 3 variables.
NOTE: DATA statement used (Total process time):
      real time 0.01 seconds
      cpu time 0.00 seconds
36
37    **** 1b) Import, transpose and rename the layout data ;
38    PROC IMPORT OUT= WORK.Layout
39    DATAFILE= "J:\SchneiderLab\Edith Orozco\QuIC assay\RT-QuIC m
40    erged runs in excel\TID6164.xlsx"
41    DBMS=EXCELCS REPLACE;
42    RANGE="Layout$";
43    SCANTEXT=YES;
44    USEDATE=YES;
45    SCANTIME=YES;
46    RUN;
NOTE: WORK.LAYOUT data set was successfully created.
NOTE: The data set WORK.LAYOUT has 96 observations and 11 variables.
NOTE: PROCEDURE IMPORT used (Total process time):
      real time 0.56 seconds
      cpu time 0.06 seconds
47
48    **** 1c) Put 1a and 1b together;
49    proc sort data=RAWLong ; by sample_ID time ; run;
NOTE: There were 11136 observations read from the data set WORK.RAWLONG.
NOTE: The data set WORK.RAWLONG has 11136 observations and 3 variables.
NOTE: PROCEDURE SORT used (Total process time):
      real time 0.06 seconds
      cpu time 0.01 seconds
50    proc sort data=Layout ; by sample_ID ; run;
NOTE: There were 96 observations read from the data set WORK.LAYOUT.
NOTE: The data set WORK.LAYOUT has 96 observations and 11 variables.
NOTE: PROCEDURE SORT used (Total process time):
      real time 0.04 seconds
      cpu time 0.00 seconds
51    data rfu ;
52    merge Layout RAWLong ;
53    by sample_ID ;
54    attrib dilution format=E7. ;
55    run;
NOTE: There were 96 observations read from the data set WORK.LAYOUT.
NOTE: There were 11136 observations read from the data set WORK.RAWLONG.
NOTE: The data set WORK.RFU has 11136 observations and 13 variables.
NOTE: DATA statement used (Total process time):
      real time 0.01 seconds
      cpu time 0.00 seconds
56
57    proc sort data=rfu ;
58    by status dilution replicate time ;
59    run;
NOTE: There were 11136 observations read from the data set WORK.RFU.
NOTE: The data set WORK.RFU has 11136 observations and 13 variables.
NOTE: PROCEDURE SORT used (Total process time):
      real time 0.03 seconds
      cpu time 0.01 seconds
60    proc freq data=rfu ;
61    by status dilution ;
62    table replicate / nopercent nocol norow nocum ;
63    run;
NOTE: Writing HTML Body file: sashtml.htm
NOTE: There were 11136 observations read from the data set WORK.RFU.
NOTE: PROCEDURE FREQ used (Total process time):
      real time 0.56 seconds
      cpu time 0.10 seconds
64    proc sgpanel data=rfu ;
65    title'PLOT 1: Raw data (n=12 or 16 t reps)';
66    by status;
67    panelby dilution / novarname columns=1 uniscale=all;
68    series x=time y=rfu / group=replicate ;
69    colaxis type=time tickvalueformat=time8. grid display=(novalues) ;
70    rowaxis values=(0 to 260000 by 25000) grid display=(novalues) ;
71    run;
NOTE: PROCEDURE SGPANEL used (Total process time):
      real time 3.28 seconds
      cpu time 0.28 seconds
NOTE: There were 11136 observations read from the data set WORK.RFU.
72    title;
73    proc freq data=rfu ;
74    table sample_id / nocum norow nocol nopercent ;
75    table row*well / nocum norow nocol nopercent ;
76    run;
ERROR: Permanent copy of file WORK.'SASTMP-000000018'n.UTILITY was deleted.
ERROR: Permanent copy of file WORK.'SASTMP-000000018'n.UTILITY was deleted.
NOTE: The SAS System stopped processing this step because of errors.
NOTE: PROCEDURE FREQ used (Total process time):
      real time 0.01 seconds
      cpu time 0.01 seconds
77    /*/*/*/*/* 1d) check for time line glitch */*/*/*/*/;
78    data lag ;
79    set rfu ;
80    attrib lagdif format=time8. informat=8.;
81    lagdif = time - lag(time) ;
82    if time='21:00:00't then lagdif=. ;
83    run;
NOTE: Missing values were generated as a result of performing an operation on missing values.
      Each place is given by: (Number of times) at (Line):(Column).
      1 at 81:19
NOTE: There were 11136 observations read from the data set WORK.RFU.
NOTE: The data set WORK.LAG has 11136 observations and 14 variables.
NOTE: DATA statement used (Total process time):
      real time 0.01 seconds
      cpu time 0.01 seconds
84    proc sort data=lag ;
85    by sample_ID time ;
86    run;
NOTE: There were 11136 observations read from the data set WORK.LAG.
NOTE: The data set WORK.LAG has 11136 observations and 14 variables.
NOTE: PROCEDURE SORT used (Total process time):
      real time 0.18 seconds
      cpu time 0.03 seconds
87    proc sgplot data=lag ;
88    series x=time y=lagdif / group=sample_id ;
89    yaxis values=('00:00:00't to '01:30:00't by '00:10:00't);
90    run;
NOTE: PROCEDURE SGPLOT used (Total process time):
      real time 1.10 seconds
      cpu time 0.06 seconds
NOTE: The column format TIME8 is replaced by an auto-generated format on the axis.
NOTE: Some graph legends have been dropped due to size constraints. Try adjusting the
      MAXLEGENDAREA=, WIDTH= and HEIGHT= options in the ODS GRAPHICS statement.
NOTE: There were 11136 observations read from the data set WORK.LAG.
91
92    /*/*/*/*/* 1e) fix time line glitch */*/*/*/*/;
93    data rfu1 ;
94    set lag ;
95    by sample_ID;
96    attrib lag_time format=time8. informat=8. ;
97    lag_time=lag(time);
98    if lagdif>'00:45:00't then time=(lag_time+'00:43:12't) ;
99    else time=time;
100   drop lagdif lag_time ;
101   run;
NOTE: There were 11136 observations read from the data set WORK.LAG.
NOTE: The data set WORK.RFU1 has 11136 observations and 13 variables.
NOTE: DATA statement used (Total process time):
      real time 0.03 seconds
      cpu time 0.01 seconds
102   proc sort data=rfu1 ;
103   by sample_ID time ;
104   run;
NOTE: There were 11136 observations read from the data set WORK.RFU1.
NOTE: The data set WORK.RFU1 has 11136 observations and 13 variables.
NOTE: PROCEDURE SORT used (Total process time):
      real time 0.01 seconds
      cpu time 0.00 seconds
105   data lag1 ;
106   set rfu1 ;
107   attrib lagdif format=time8. informat=8.;
108   lagdif = time - lag(time) ;
109   if time='00:00:00't then lagdif=. ;
110   run;
NOTE: Missing values were generated as a result of performing an operation on missing values.
      Each place is given by: (Number of times) at (Line):(Column).
      1 at 108:19
NOTE: There were 11136 observations read from the data set WORK.RFU1.
NOTE: The data set WORK.LAG1 has 11136 observations and 14 variables.
NOTE: DATA statement used (Total process time):
      real time 0.01 seconds
      cpu time 0.01 seconds
111   proc sort data=lag1 ;
112   by sample_ID time ;
113   run;
NOTE: There were 11136 observations read from the data set WORK.LAG1.
NOTE: The data set WORK.LAG1 has 11136 observations and 14 variables.
NOTE: PROCEDURE SORT used (Total process time):
      real time 0.01 seconds
      cpu time 0.00 seconds
114   proc sgplot data=lag1 ;
115   series x=time y=lagdif / group=sample_id ;
116   yaxis values=('00:00:00't to '01:30:00't by '00:10:00't);
117   run;
NOTE: PROCEDURE SGPLOT used (Total process time):
      real time 0.88 seconds
      cpu time 0.04 seconds
NOTE: The column format TIME8 is replaced by an auto-generated format on the axis.
NOTE: Some graph legends have been dropped due to size constraints. Try adjusting the
      MAXLEGENDAREA=, WIDTH= and HEIGHT= options in the ODS GRAPHICS statement.
NOTE: There were 11136 observations read from the data set WORK.LAG1.
118
119   data rfu2 ;
120   set lag1 ;
121   by sample_ID;
122   attrib lag_time format=time8. informat=8. ;
123   lag_time=lag(time);
124   if lagdif>'00:45:00't then time=(lag_time+'00:43:12't) ;
125   else time=time;
126   drop lagdif lag_time ;
127   run;
NOTE: There were 11136 observations read from the data set WORK.LAG1.
NOTE: The data set WORK.RFU2 has 11136 observations and 13 variables.
NOTE: DATA statement used (Total process time):
      real time 0.03 seconds
      cpu time 0.03 seconds
128   proc sort data=rfu2 ;
129   by sample_ID time ;
130   run;
NOTE: There were 11136 observations read from the data set WORK.RFU2.
NOTE: The data set WORK.RFU2 has 11136 observations and 13 variables.
NOTE: PROCEDURE SORT used (Total process time):
      real time 0.01 seconds
      cpu time 0.00 seconds
131   data lag2 ;
132   set rfu2 ;
133   attrib lagdif format=time8. informat=8.;
134   lagdif = time - lag(time) ;
135   if time='00:00:00't then lagdif=. ;
136   run;
NOTE: Missing values were generated as a result of performing an operation on missing values.
      Each place is given by: (Number of times) at (Line):(Column).
      1 at 134:19
NOTE: There were 11136 observations read from the data set WORK.RFU2.
NOTE: The data set WORK.LAG2 has 11136 observations and 14 variables.
NOTE: DATA statement used (Total process time):
      real time 0.04 seconds
      cpu time 0.00 seconds
137   proc sort data=lag2 ;
138   by sample_ID time ;
139   run;
NOTE: There were 11136 observations read from the data set WORK.LAG2.
NOTE: The data set WORK.LAG2 has 11136 observations and 14 variables.
NOTE: PROCEDURE SORT used (Total process time):
      real time 0.04 seconds
      cpu time 0.01 seconds
140   proc sgplot data=lag2 ;
141   series x=time y=lagdif / group=sample_id ;
142   yaxis values=('00:00:00't to '01:30:00't by '00:10:00't);
143   run;
NOTE: PROCEDURE SGPLOT used (Total process time):
      real time 0.92 seconds
      cpu time 0.06 seconds
NOTE: The column format TIME8 is replaced by an auto-generated format on the axis.
NOTE: Some graph legends have been dropped due to size constraints. Try adjusting the
      MAXLEGENDAREA=, WIDTH= and HEIGHT= options in the ODS GRAPHICS statement.
NOTE: There were 11136 observations read from the data set WORK.LAG2.
144
145   proc freq data=rfu2 ;
146   table sample_id / nocum norow nocol nopercent ;
147   table row*well / nocum norow nocol nopercent ;
148   run;
NOTE: There were 11136 observations read from the data set WORK.RFU2.
NOTE: PROCEDURE FREQ used (Total process time):
      real time 0.31 seconds
      cpu time 0.04 seconds
149
150   **** 2) Visually determine time frame of baseline after rapid ThT decay and before earliest
150 ! rxn;
151   proc sort data=rfu2 ;
152   by status row well time ;
153   run;
NOTE: There were 11136 observations read from the data set WORK.RFU2.
NOTE: The data set WORK.RFU2 has 11136 observations and 13 variables.
NOTE: PROCEDURE SORT used (Total process time):
      real time 0.04 seconds
      cpu time 0.01 seconds
154   proc sgplot data=rfu2 noautolegend ;
155   title'PLOT 2a: Visually determined maximum RFU for NEG vs POS';
156   by status;
157   refline 26000 / axis=y label="Max NEG-RFU reading";
158   series x=time y=rfu / group=sample_ID ;
159   xaxis min='00:00:00't ;
160   run;
NOTE: PROCEDURE SGPLOT used (Total process time):
      real time 1.70 seconds
      cpu time 0.15 seconds
NOTE: The column format TIME8 is replaced by an auto-generated format on the axis.
NOTE: The column format TIME8 is replaced by an auto-generated format on the axis.
NOTE: The column format TIME8 is replaced by an auto-generated format on the axis.
NOTE: There were 11136 observations read from the data set WORK.RFU2.
161   title;title2;
162
163   proc sort data=rfu2 ;
164   by status dilution replicate ;
165   run;
NOTE: There were 11136 observations read from the data set WORK.RFU2.
NOTE: The data set WORK.RFU2 has 11136 observations and 13 variables.
NOTE: PROCEDURE SORT used (Total process time):
      real time 0.06 seconds
      cpu time 0.01 seconds
166   proc univariate data=rfu2 noprint ;
167   by status dilution replicate ;
168   var rfu ;
169   output out=max_RFU max=maxRFU ;
170   run;
NOTE: The data set WORK.MAX_RFU has 96 observations and 4 variables.
NOTE: PROCEDURE UNIVARIATE used (Total process time):
      real time 0.03 seconds
      cpu time 0.00 seconds
171
172   data max_RFU ;
173   set max_RFU ;
174   attrib misfolding_activity format=8. informat=8. label='misfolding activity';
175   if maxRFU>26000 then misfolding_activity = 1 ;
176   else misfolding_activity = 0;
177   run;
NOTE: There were 96 observations read from the data set WORK.MAX_RFU.
NOTE: The data set WORK.MAX_RFU has 96 observations and 5 variables.
NOTE: DATA statement used (Total process time):
      real time 0.03 seconds
      cpu time 0.03 seconds
178   proc sort data=max_RFU ;
179   by status dilution ;
180   run;
NOTE: There were 96 observations read from the data set WORK.MAX_RFU.
NOTE: The data set WORK.MAX_RFU has 96 observations and 5 variables.
NOTE: PROCEDURE SORT used (Total process time):
      real time 0.01 seconds
      cpu time 0.00 seconds
181   proc freq data=max_RFU ;
182   by status dilution ;
183   tables misfolding_activity / bin(level=1 cl=exact);
184   output out=freq bin ;
185   run;
NOTE: There were 96 observations read from the data set WORK.MAX_RFU.
NOTE: The data set WORK.FREQ has 7 observations and 12 variables.
NOTE: PROCEDURE FREQ used (Total process time):
      real time 0.34 seconds
      cpu time 0.06 seconds
186   quit;
187
188   proc genmod data=max_rfu ;
189   class status ;
190   model misfolding_activity = status dilution / dist=poisson ;
191   run;
NOTE: Algorithm converged.
NOTE: The scale parameter was held fixed.
NOTE: PROCEDURE GENMOD used (Total process time):
      real time 0.45 seconds
      cpu time 0.01 seconds
These results are consistent with these conditions:
To overcome that, get rid of the original version:
data max_RFU;
set max_RFU (drop=misfolding_activity);
attrib ...
if ....
else ....
run;
Most of the DATA step stays the same. Just add DROP= when bringing in the data.
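Written out in full with the statements from the original post, the suggestion might look like the sketch below. One caveat, which is my addition and not part of the advice above: if max_RFU has just been rebuilt by PROC UNIVARIATE, it does not contain misfolding_activity yet, and with the default DKRICOND=ERROR setting the DROP= option then stops the step with an error. The sketch assumes it is acceptable to relax that system option temporarily.

```sas
/* Sketch only: assumes max_RFU comes from the PROC UNIVARIATE step shown earlier. */

/* Assumption: tolerate DROP= naming a variable that is not in the input data set. */
options dkricond=nowarn;

data max_RFU;
    /* Discard any misfolding_activity left over from an earlier submission, */
    /* so the ATTRIB statement below defines it fresh as a numeric variable. */
    set max_RFU (drop=misfolding_activity);
    attrib misfolding_activity format=8. informat=8. label='misfolding activity';
    if maxRFU > 26000 then misfolding_activity = 1;
    else misfolding_activity = 0;
run;

/* Restore the default checking for later steps. */
options dkricond=error;
```

Writing the result to a differently named data set instead of overwriting max_RFU would be another way to keep a leftover variable from being read back in.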
Thank you. The data step I provided was the first instance of this variable (misfolding activity).
The point of this part of the code was to take the output of proc univariate (max_RFU; which contains the max value for each RFU curve over time) and then add a new variable (misfolding activity, numeric, 0 or 1) as an indicator if the curve ever exceeds the threshold value. I'm wondering if I might have avoided this problem if I had first deleted the max_RFU dataset before resubmitting edited code. Then again, isn't the output dataset just overwritten each time I submit the proc univariate?
Puzzled, but glad it is now working.
Dave
In theory, that's true. Each time UNIVARIATE runs, it replaces max_RFU.
In practice, could this combination have occurred?
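To rule out a stale copy of max_RFU while revising and resubmitting code, one option is to delete it explicitly before PROC UNIVARIATE runs and then check the variable types. The following is only a sketch using the data set and variable names from the original post. If UNIVARIATE fails for any reason, max_RFU will then simply not exist, instead of silently being an older version.

```sas
/* Delete any earlier copy of max_RFU; NOWARN avoids an error if it does not exist. */
proc datasets library=work nolist nowarn;
    delete max_RFU;
quit;

/* Rebuild the per-curve maximum exactly as in the original post. */
proc univariate data=rfu2 noprint;
    by status dilution replicate;
    var rfu;
    output out=max_RFU max=maxRFU;
run;

/* List the variables and their types (Num or Char) in the rebuilt data set. */
proc contents data=max_RFU;
run;
```

Running the same PROC CONTENTS step again after the DATA step that adds misfolding_activity shows whether the new variable was stored as numeric.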