BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
das
Obsidian | Level 7 das
Obsidian | Level 7

I'm sure this will be an easy one for someone but I can't figure out how to force the new variable 'misfolding activity' to stay numeric as intended.

 

proc univariate data=rfu2 noprint ;
	by status dilution replicate ;
	var rfu ;
	output out=max_RFU max=maxRFU ;
run;

data max_RFU ;
	set max_RFU ;
	attrib misfolding_activity format=8. informat=8. label='misfolding activity';
	if maxRFU>26000 then misfolding_activity = 1 ;
	else misfolding_activity = 0;
run;
1 ACCEPTED SOLUTION

Accepted Solutions
Astounding
PROC Star

In theory, that's true.  Each time UNIVARIATE runs, it replaces max_RFU.

 

In practice, could this combination have occurred?

 

  • The program strings together several iterations of the logic into a single program.
  • One of the UNIVARIATEs fails, thus failing to replace max_RFU.
  • So the DATA step is actually bringing in max_RFU from the previous iteration, one that has already added MISFOLDING_ACTIVITY.

View solution in original post

5 REPLIES 5
PaigeMiller
Diamond | Level 26

Show us the entire SAS log. Click on the {i} icon and paste the log into the window that appears. DO NOT SKIP THIS STEP.

--
Paige Miller
das
Obsidian | Level 7 das
Obsidian | Level 7

Interestingly, I am not having the problem now after coming back to my computer and restarting SAS to produce the log you wanted. I am just in the beginning of putting this code together and thus was revising and resubmitting code. The code I posted was not working and I still am not sure why. For what it might be worth, here is the log as requested but which is now working.

NOTE: Copyright (c) 2016 by SAS Institute Inc., Cary, NC, USA.
NOTE: SAS (r) Proprietary Software 9.4 (TS1M5)

NOTE: This session is executing on the X64_7PRO  platform.



NOTE: Updated analytical products:

      SAS/STAT 14.3
      SAS/ETS 14.3
      SAS/OR 14.3
      SAS/IML 14.3
      SAS/QC 14.3

NOTE: Additional host information:

 X64_7PRO WIN 6.1.7601 Service Pack 1 Workstation

NOTE: SAS initialization used:
      real time           1.30 seconds
      cpu time            0.59 seconds

1    ods graphics / antialiasmax=500000 ;
2
3    footnote'TID6164';
4    footnote2'Protocol restarted after 21 hours due to power outage.';
5    footnote3'Data before 21 hours not included in this analysis';
6    **** 1a) Import, transpose and rename the raw data ;
7    PROC IMPORT OUT= WORK.RAWWide
8                DATAFILE= "J:\SchneiderLab\Edith Orozco\QuIC assay\RT-QuIC m
9    erged runs in excel\TID6164.xlsx"
10               DBMS=EXCELCS REPLACE;
11        RANGE='SAS$B3:CT119'n;
12        SCANTEXT=YES;
13        USEDATE=YES;
14        SCANTIME=YES;
15   RUN;

NOTE: WORK.RAWWIDE data set was successfully created.
NOTE: The data set WORK.RAWWIDE has 116 observations and 97 variables.
NOTE: PROCEDURE IMPORT used (Total process time):
      real time           0.76 seconds
      cpu time            0.09 seconds


16
17   data RAWWide;
18       set RAWWide ;
19       attrib time format=time8. informat=8. ;
20       time = (Time__h_ * '01:00:00't) + '21:00:00't ; /* Convert decimal minutes to SAS time
20 ! */
21       drop Time__h_ ;
22   run;

NOTE: There were 116 observations read from the data set WORK.RAWWIDE.
NOTE: The data set WORK.RAWWIDE has 116 observations and 97 variables.
NOTE: DATA statement used (Total process time):
      real time           0.06 seconds
      cpu time            0.01 seconds


23   proc sort data=RAWWide ;
24       by time ;
25   run;

NOTE: There were 116 observations read from the data set WORK.RAWWIDE.
NOTE: The data set WORK.RAWWIDE has 116 observations and 97 variables.
NOTE: PROCEDURE SORT used (Total process time):
      real time           0.04 seconds
      cpu time            0.03 seconds


26   proc transpose data=RAWWide out=RAWLong;
27       var Sample_X1-Sample_X96 ;
28       by time ;
29   run;

NOTE: There were 116 observations read from the data set WORK.RAWWIDE.
NOTE: The data set WORK.RAWLONG has 11136 observations and 4 variables.
NOTE: PROCEDURE TRANSPOSE used (Total process time):
      real time           0.01 seconds
      cpu time            0.00 seconds


30   data RAWLong ;
31       set RAWLong (rename=(COL1=RFU));
32       attrib sample_ID format=8. label='sample_ID X##';
33       Sample_ID=substr(_NAME_,9);
34       drop _LABEL_ _NAME_ ;
35   run;

NOTE: Character values have been converted to numeric values at the places given by:
      (Line):(Column).
      33:15
NOTE: There were 11136 observations read from the data set WORK.RAWLONG.
NOTE: The data set WORK.RAWLONG has 11136 observations and 3 variables.
NOTE: DATA statement used (Total process time):
      real time           0.01 seconds
      cpu time            0.00 seconds


36
37   **** 1b) Import, transpose and rename the layout data ;
38   PROC IMPORT OUT= WORK.Layout
39               DATAFILE= "J:\SchneiderLab\Edith Orozco\QuIC assay\RT-QuIC m
40   erged runs in excel\TID6164.xlsx"
41               DBMS=EXCELCS REPLACE;
42        RANGE="Layout$";
43        SCANTEXT=YES;
44        USEDATE=YES;
45        SCANTIME=YES;
46   RUN;

NOTE: WORK.LAYOUT data set was successfully created.
NOTE: The data set WORK.LAYOUT has 96 observations and 11 variables.
NOTE: PROCEDURE IMPORT used (Total process time):
      real time           0.56 seconds
      cpu time            0.06 seconds


47
48   **** 1c) Put 1a and 1b together;
49   proc sort data=RAWLong ; by sample_ID time ; run;

NOTE: There were 11136 observations read from the data set WORK.RAWLONG.
NOTE: The data set WORK.RAWLONG has 11136 observations and 3 variables.
NOTE: PROCEDURE SORT used (Total process time):
      real time           0.06 seconds
      cpu time            0.01 seconds


50   proc sort data=Layout ; by sample_ID ; run;

NOTE: There were 96 observations read from the data set WORK.LAYOUT.
NOTE: The data set WORK.LAYOUT has 96 observations and 11 variables.
NOTE: PROCEDURE SORT used (Total process time):
      real time           0.04 seconds
      cpu time            0.00 seconds


51   data rfu ;
52       merge Layout RAWLong ;
53       by sample_ID ;
54       attrib dilution format=E7. ;
55   run;

NOTE: There were 96 observations read from the data set WORK.LAYOUT.
NOTE: There were 11136 observations read from the data set WORK.RAWLONG.
NOTE: The data set WORK.RFU has 11136 observations and 13 variables.
NOTE: DATA statement used (Total process time):
      real time           0.01 seconds
      cpu time            0.00 seconds


56
57   proc sort data=rfu ;
58       by status dilution replicate time ;
59   run;

NOTE: There were 11136 observations read from the data set WORK.RFU.
NOTE: The data set WORK.RFU has 11136 observations and 13 variables.
NOTE: PROCEDURE SORT used (Total process time):
      real time           0.03 seconds
      cpu time            0.01 seconds


60   proc freq data=rfu ;
61       by status dilution ;
62       table replicate / nopercent nocol norow nocum ;
63   run;

NOTE: Writing HTML Body file: sashtml.htm
NOTE: There were 11136 observations read from the data set WORK.RFU.
NOTE: PROCEDURE FREQ used (Total process time):
      real time           0.56 seconds
      cpu time            0.10 seconds


64   proc sgpanel data=rfu ;
65       title'PLOT 1: Raw data (n=12 or 16 t reps)';
66       by status;
67       panelby dilution / novarname columns=1 uniscale=all;
68       series x=time y=rfu / group=replicate ;
69       colaxis type=time tickvalueformat=time8. grid display=(novalues) ;
70       rowaxis values=(0 to 260000 by 25000) grid display=(novalues) ;
71   run;

NOTE: PROCEDURE SGPANEL used (Total process time):
      real time           3.28 seconds
      cpu time            0.28 seconds

NOTE: There were 11136 observations read from the data set WORK.RFU.

72   title;
73   proc freq data=rfu ;
74       table sample_id / nocum norow nocol nopercent ;
75       table row*well / nocum norow nocol nopercent ;
76   run;

ERROR: Permanent copy of file WORK.'SASTMP-000000018'n.UTILITY was deleted.
ERROR: Permanent copy of file WORK.'SASTMP-000000018'n.UTILITY was deleted.
NOTE: The SAS System stopped processing this step because of errors.
NOTE: PROCEDURE FREQ used (Total process time):
      real time           0.01 seconds
      cpu time            0.01 seconds

77   /*/*/*/*/*  1d) check for time line glitch */*/*/*/*/;


78   data lag ;
79       set rfu ;
80       attrib lagdif format=time8. informat=8.;
81       lagdif = time - lag(time) ;
82       if time='21:00:00't then lagdif=. ;
83   run;

NOTE: Missing values were generated as a result of performing an operation on missing values.
      Each place is given by: (Number of times) at (Line):(Column).
      1 at 81:19
NOTE: There were 11136 observations read from the data set WORK.RFU.
NOTE: The data set WORK.LAG has 11136 observations and 14 variables.
NOTE: DATA statement used (Total process time):
      real time           0.01 seconds
      cpu time            0.01 seconds


84   proc sort data=lag ;
85       by sample_ID time ;
86   run;

NOTE: There were 11136 observations read from the data set WORK.LAG.
NOTE: The data set WORK.LAG has 11136 observations and 14 variables.
NOTE: PROCEDURE SORT used (Total process time):
      real time           0.18 seconds
      cpu time            0.03 seconds


87   proc sgplot data=lag ;
88       series x=time y=lagdif / group=sample_id ;
89       yaxis values=('00:00:00't to '01:30:00't by '00:10:00't);
90   run;

NOTE: PROCEDURE SGPLOT used (Total process time):
      real time           1.10 seconds
      cpu time            0.06 seconds

NOTE: The column format TIME8 is replaced by an auto-generated format on the axis.
NOTE: Some graph legends have been dropped due to size constraints. Try adjusting the
      MAXLEGENDAREA=, WIDTH= and HEIGHT= options in the ODS GRAPHICS statement.
NOTE: There were 11136 observations read from the data set WORK.LAG.

91
92   /*/*/*/*/*  1e) fix time line glitch    */*/*/*/*/;
93   data rfu1 ;
94       set lag ;
95       by sample_ID;
96       attrib lag_time format=time8. informat=8. ;
97       lag_time=lag(time);
98       if lagdif>'00:45:00't then time=(lag_time+'00:43:12't) ;
99           else time=time;
100      drop lagdif lag_time ;
101  run;

NOTE: There were 11136 observations read from the data set WORK.LAG.
NOTE: The data set WORK.RFU1 has 11136 observations and 13 variables.
NOTE: DATA statement used (Total process time):
      real time           0.03 seconds
      cpu time            0.01 seconds


102  proc sort data=rfu1 ;
103      by sample_ID time ;
104  run;

NOTE: There were 11136 observations read from the data set WORK.RFU1.
NOTE: The data set WORK.RFU1 has 11136 observations and 13 variables.
NOTE: PROCEDURE SORT used (Total process time):
      real time           0.01 seconds
      cpu time            0.00 seconds


105  data lag1 ;
106      set rfu1 ;
107      attrib lagdif format=time8. informat=8.;
108      lagdif = time - lag(time) ;
109      if time='00:00:00't then lagdif=. ;
110  run;

NOTE: Missing values were generated as a result of performing an operation on missing values.
      Each place is given by: (Number of times) at (Line):(Column).
      1 at 108:19
NOTE: There were 11136 observations read from the data set WORK.RFU1.
NOTE: The data set WORK.LAG1 has 11136 observations and 14 variables.
NOTE: DATA statement used (Total process time):
      real time           0.01 seconds
      cpu time            0.01 seconds


111  proc sort data=lag1 ;
112      by sample_ID time ;
113  run;

NOTE: There were 11136 observations read from the data set WORK.LAG1.
NOTE: The data set WORK.LAG1 has 11136 observations and 14 variables.
NOTE: PROCEDURE SORT used (Total process time):
      real time           0.01 seconds
      cpu time            0.00 seconds


114  proc sgplot data=lag1 ;
115      series x=time y=lagdif / group=sample_id ;
116      yaxis values=('00:00:00't to '01:30:00't by '00:10:00't);
117  run;

NOTE: PROCEDURE SGPLOT used (Total process time):
      real time           0.88 seconds
      cpu time            0.04 seconds

NOTE: The column format TIME8 is replaced by an auto-generated format on the axis.
NOTE: Some graph legends have been dropped due to size constraints. Try adjusting the
      MAXLEGENDAREA=, WIDTH= and HEIGHT= options in the ODS GRAPHICS statement.
NOTE: There were 11136 observations read from the data set WORK.LAG1.

118
119  data rfu2 ;
120      set lag1 ;
121      by sample_ID;
122      attrib lag_time format=time8. informat=8. ;
123      lag_time=lag(time);
124      if lagdif>'00:45:00't then time=(lag_time+'00:43:12't) ;
125          else time=time;
126      drop lagdif lag_time ;
127  run;

NOTE: There were 11136 observations read from the data set WORK.LAG1.
NOTE: The data set WORK.RFU2 has 11136 observations and 13 variables.
NOTE: DATA statement used (Total process time):
      real time           0.03 seconds
      cpu time            0.03 seconds


128  proc sort data=rfu2 ;
129      by sample_ID time ;
130  run;

NOTE: There were 11136 observations read from the data set WORK.RFU2.
NOTE: The data set WORK.RFU2 has 11136 observations and 13 variables.
NOTE: PROCEDURE SORT used (Total process time):
      real time           0.01 seconds
      cpu time            0.00 seconds


131  data lag2 ;
132      set rfu2 ;
133      attrib lagdif format=time8. informat=8.;
134      lagdif = time - lag(time) ;
135      if time='00:00:00't then lagdif=. ;
136  run;

NOTE: Missing values were generated as a result of performing an operation on missing values.
      Each place is given by: (Number of times) at (Line):(Column).
      1 at 134:19
NOTE: There were 11136 observations read from the data set WORK.RFU2.
NOTE: The data set WORK.LAG2 has 11136 observations and 14 variables.
NOTE: DATA statement used (Total process time):
      real time           0.04 seconds
      cpu time            0.00 seconds


137  proc sort data=lag2 ;
138      by sample_ID time ;
139  run;

NOTE: There were 11136 observations read from the data set WORK.LAG2.
NOTE: The data set WORK.LAG2 has 11136 observations and 14 variables.
NOTE: PROCEDURE SORT used (Total process time):
      real time           0.04 seconds
      cpu time            0.01 seconds


140  proc sgplot data=lag2 ;
141      series x=time y=lagdif / group=sample_id ;
142      yaxis values=('00:00:00't to '01:30:00't by '00:10:00't);
143  run;

NOTE: PROCEDURE SGPLOT used (Total process time):
      real time           0.92 seconds
      cpu time            0.06 seconds

NOTE: The column format TIME8 is replaced by an auto-generated format on the axis.
NOTE: Some graph legends have been dropped due to size constraints. Try adjusting the
      MAXLEGENDAREA=, WIDTH= and HEIGHT= options in the ODS GRAPHICS statement.
NOTE: There were 11136 observations read from the data set WORK.LAG2.

144
145  proc freq data=rfu2 ;
146      table sample_id / nocum norow nocol nopercent ;
147      table row*well / nocum norow nocol nopercent ;
148  run;

NOTE: There were 11136 observations read from the data set WORK.RFU2.
NOTE: PROCEDURE FREQ used (Total process time):
      real time           0.31 seconds
      cpu time            0.04 seconds


149
150  **** 2) Visually determine time frame of baseline after rapid ThT decay and before earliest
150!  rxn;
151  proc sort data=rfu2 ;
152      by status row well time ;
153  run;

NOTE: There were 11136 observations read from the data set WORK.RFU2.
NOTE: The data set WORK.RFU2 has 11136 observations and 13 variables.
NOTE: PROCEDURE SORT used (Total process time):
      real time           0.04 seconds
      cpu time            0.01 seconds


154  proc sgplot data=rfu2 noautolegend ;
155      title'PLOT 2a: Visually determined maximum RFU for NEG vs POS';
156      by status;
157      refline 26000 / axis=y label="Max NEG-RFU reading";
158      series x=time y=rfu / group=sample_ID ;
159      xaxis min='00:00:00't ;
160  run;

NOTE: PROCEDURE SGPLOT used (Total process time):
      real time           1.70 seconds
      cpu time            0.15 seconds

NOTE: The column format TIME8 is replaced by an auto-generated format on the axis.
NOTE: The column format TIME8 is replaced by an auto-generated format on the axis.
NOTE: The column format TIME8 is replaced by an auto-generated format on the axis.
NOTE: There were 11136 observations read from the data set WORK.RFU2.

161  title;title2;
162
163  proc sort data=rfu2 ;
164      by status dilution replicate ;
165  run;

NOTE: There were 11136 observations read from the data set WORK.RFU2.
NOTE: The data set WORK.RFU2 has 11136 observations and 13 variables.
NOTE: PROCEDURE SORT used (Total process time):
      real time           0.06 seconds
      cpu time            0.01 seconds


166  proc univariate data=rfu2 noprint ;
167      by status dilution replicate ;
168      var rfu ;
169      output out=max_RFU max=maxRFU ;
170  run;

NOTE: The data set WORK.MAX_RFU has 96 observations and 4 variables.
NOTE: PROCEDURE UNIVARIATE used (Total process time):
      real time           0.03 seconds
      cpu time            0.00 seconds


171
172  data max_RFU ;
173      set max_RFU ;
174      attrib misfolding_activity format=8. informat=8. label='misfolding activity';
175      if maxRFU>26000 then misfolding_activity = 1 ;
176      else misfolding_activity = 0;
177  run;

NOTE: There were 96 observations read from the data set WORK.MAX_RFU.
NOTE: The data set WORK.MAX_RFU has 96 observations and 5 variables.
NOTE: DATA statement used (Total process time):
      real time           0.03 seconds
      cpu time            0.03 seconds


178  proc sort data=max_RFU ;
179      by status dilution ;
180  run;

NOTE: There were 96 observations read from the data set WORK.MAX_RFU.
NOTE: The data set WORK.MAX_RFU has 96 observations and 5 variables.
NOTE: PROCEDURE SORT used (Total process time):
      real time           0.01 seconds
      cpu time            0.00 seconds


181  proc freq data=max_RFU ;
182      by status dilution ;
183      tables misfolding_activity / bin(level=1 cl=exact);
184      output out=freq bin ;
185  run;

NOTE: There were 96 observations read from the data set WORK.MAX_RFU.
NOTE: The data set WORK.FREQ has 7 observations and 12 variables.
NOTE: PROCEDURE FREQ used (Total process time):
      real time           0.34 seconds
      cpu time            0.06 seconds


186  quit;
187
188  proc genmod data=max_rfu ;
189      class status ;
190      model misfolding_activity = status dilution / dist=poisson ;
191  run;

NOTE: Algorithm converged.
NOTE: The scale parameter was held fixed.
NOTE: PROCEDURE GENMOD used (Total process time):
      real time           0.45 seconds
      cpu time            0.01 seconds


Astounding
PROC Star

These results are consistent with these conditions:

 

  • misfolding_activity already exists in the incoming data, and
  • it is defined as being a character variable

To overcome that, get rid of the original version:

 

data max_RFU;
   set max_RFU (drop=misfolding_activity);
   attrib ...
   if ....
   else ....
run;

Most of the DATA step stays the same.  Just add DROP= when bringing in the data.

das
Obsidian | Level 7 das
Obsidian | Level 7

Thank you. The data step I provided was the first instance of this variable (misfolding activity).

 

The point of this part of the code was to take the output of proc univariate (max_RFU; which contains the max value for each RFU curve over time) and then add a new variable (misfolding activity, numeric, 0 or 1) as an indicator if the curve ever exceeds the threshold value. I'm wondering if I might have avoided this problem if I had first deleted the max_RFU dataset before resubmitting edited code. Then again, isn't the output dataset just overwritten each time I submit the proc univariate?

 

Puzzled, but glad it is now working.

 

Dave

Astounding
PROC Star

In theory, that's true.  Each time UNIVARIATE runs, it replaces max_RFU.

 

In practice, could this combination have occurred?

 

  • The program strings together several iterations of the logic into a single program.
  • One of the UNIVARIATEs fails, thus failing to replace max_RFU.
  • So the DATA step is actually bringing in max_RFU from the previous iteration, one that has already added MISFOLDING_ACTIVITY.

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 5 replies
  • 4398 views
  • 0 likes
  • 3 in conversation