Hi,
I would greatly appreciate if someone could help me with the SAS code to solve this problem in dataset 2. Eventually, I would like to have the same output for dataset 2 as in dataset 1 below: The main problem lies with the proc transpose in dataset 2. Please, output for datasets 1 and 2 are attached. Errors were obtained in the proc transpose for dataset 2; SAS produced only 3 obs for the output.
The ultimate aim is to find the frequencies of associations between the pollutants.No errors for dataset 1.
Thanks.
ak.
/*DATASET 1*/
data idnew1;
input id$ job idchem;
datalines;
os1 1 990005 .
os1 1 990021
os1 1 211700
os1 2 211700
os1 2 990021
os1 2 210701
os1 2 990005
os2 1 210701
os2 1 990005
os2 2 990021
os2 3 210701
os2 3 990005
os3 3 210701
os3 1 211700
os4 1 210701
os4 1 990005
os4 1 211700
;
1 OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
72
73 /*DATASET 1*/
74 data idnew1;
75 input id$ job idchem;
76 datalines;
NOTE: The data set WORK.IDNEW1 has 17 observations and 3 variables.
NOTE: DATA statement used (Total process time):
real time 0.01 seconds
cpu time 0.01 seconds
94 ;
95
96 proc format;
97 value idchem
98 990005 = "cla_exp"
99 990021 = "bio_exp"
100 210701 = "amo_exp"
101 211700 = "chl_exp";
NOTE: Format IDCHEM is already on the library WORK.FORMATS.
NOTE: Format IDCHEM has been output.
102 run;
NOTE: PROCEDURE FORMAT used (Total process time):
real time 0.00 seconds
cpu time 0.01 seconds
103
104 data temp;
105 set idnew1;
106 dum = 1;
107 format idchem idchem.;
108 id_job=catx('_', id, job);
109 put _all_;
110 run;
id=os1 job=1 idchem=cla_exp dum=1 id_job=os1_1 _ERROR_=0 _N_=1
id=os1 job=1 idchem=bio_exp dum=1 id_job=os1_1 _ERROR_=0 _N_=2
id=os1 job=1 idchem=chl_exp dum=1 id_job=os1_1 _ERROR_=0 _N_=3
id=os1 job=2 idchem=chl_exp dum=1 id_job=os1_2 _ERROR_=0 _N_=4
id=os1 job=2 idchem=bio_exp dum=1 id_job=os1_2 _ERROR_=0 _N_=5
id=os1 job=2 idchem=amo_exp dum=1 id_job=os1_2 _ERROR_=0 _N_=6
id=os1 job=2 idchem=cla_exp dum=1 id_job=os1_2 _ERROR_=0 _N_=7
id=os2 job=1 idchem=amo_exp dum=1 id_job=os2_1 _ERROR_=0 _N_=8
id=os2 job=1 idchem=cla_exp dum=1 id_job=os2_1 _ERROR_=0 _N_=9
id=os2 job=2 idchem=bio_exp dum=1 id_job=os2_2 _ERROR_=0 _N_=10
id=os2 job=3 idchem=amo_exp dum=1 id_job=os2_3 _ERROR_=0 _N_=11
id=os2 job=3 idchem=cla_exp dum=1 id_job=os2_3 _ERROR_=0 _N_=12
id=os3 job=3 idchem=amo_exp dum=1 id_job=os3_3 _ERROR_=0 _N_=13
id=os3 job=1 idchem=chl_exp dum=1 id_job=os3_1 _ERROR_=0 _N_=14
id=os4 job=1 idchem=amo_exp dum=1 id_job=os4_1 _ERROR_=0 _N_=15
id=os4 job=1 idchem=cla_exp dum=1 id_job=os4_1 _ERROR_=0 _N_=16
id=os4 job=1 idchem=chl_exp dum=1 id_job=os4_1 _ERROR_=0 _N_=17
NOTE: There were 17 observations read from the data set WORK.IDNEW1.
NOTE: The data set WORK.TEMP has 17 observations and 5 variables.
NOTE: DATA statement used (Total process time):
real time 0.01 seconds
cpu time 0.02 seconds
111
112 proc sort data=temp; by id job id_job; run;
NOTE: There were 17 observations read from the data set WORK.TEMP.
NOTE: The data set WORK.TEMP has 17 observations and 5 variables.
NOTE: PROCEDURE SORT used (Total process time):
real time 0.01 seconds
cpu time 0.01 seconds
113
114 proc transpose data=temp out=idnew2(drop=_name_);
115 by id job id_job;
116 id idchem;
117 var dum;
118 run;
NOTE: There were 17 observations read from the data set WORK.TEMP.
NOTE: The data set WORK.IDNEW2 has 8 observations and 7 variables.
NOTE: PROCEDURE TRANSPOSE used (Total process time):
real time 0.01 seconds
cpu time 0.02 seconds
119 proc print data=idnew2;
120 Title "Table 1: Merged exposure files for cla, bio, amo and chl pollutants";
121 run;
NOTE: There were 8 observations read from the data set WORK.IDNEW2.
NOTE: PROCEDURE PRINT used (Total process time):
real time 0.32 seconds
cpu time 0.31 seconds
122
123 /*Replacing missing values(.)with zeros(0) ie. unexposed*/
124 data t;
125 set idnew2;
NOTE: There were 8 observations read from the data set WORK.IDNEW2.
NOTE: The data set WORK.T has 8 observations and 7 variables.
NOTE: DATA statement used (Total process time):
real time 0.01 seconds
cpu time 0.01 seconds
126 proc stdize out=t2 reponly missing=0;
127 drop _type_;
NOTE: The DROP and KEEP statements are not supported in procedure steps in this release of the SAS System. Therefore, these
statements are ignored.
128 run;
NOTE: No VAR statement is given. All numerical variables not named elsewhere make up the first set of variables.
NOTE: There were 8 observations read from the data set WORK.T.
NOTE: The data set WORK.T2 has 8 observations and 7 variables.
NOTE: PROCEDURE STDIZE used (Total process time):
real time 0.01 seconds
cpu time 0.01 seconds
129 proc sort; by id job;
130 Title " Table 2: Merged exposure files for cla, bio, amo and chl pollutants: missing values replaced with zeros";
131
NOTE: There were 8 observations read from the data set WORK.T2.
NOTE: The data set WORK.T2 has 8 observations and 7 variables.
NOTE: PROCEDURE SORT used (Total process time):
real time 0.01 seconds
cpu time 0.02 seconds
132 proc print data=t2; run;
NOTE: There were 8 observations read from the data set WORK.T2.
NOTE: PROCEDURE PRINT used (Total process time):
real time 0.11 seconds
cpu time 0.12 seconds
133
134 /*FINDING FREQUENCIES OF ASSOCIATIONS*/
135 proc freq data=t2;
136 tables
137 cla_exp*bio_exp
138 cla_exp*amo_exp
139 cla_exp*chl_exp
140 bio_exp*amo_exp
141 bio_exp*chl_exp
142 bio_exp*chl_exp;
143
144 Title "Frequencies of associations between cla, bio, amo and chl pollutants";
145 run;
NOTE: There were 8 observations read from the data set WORK.T2.
NOTE: PROCEDURE FREQ used (Total process time):
real time 0.47 seconds
cpu time 0.46 seconds
146
147 OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
159
---------------- INFO ON DATASET 2 BELOW,please-------------------
/*DATASET 2*/
data env;
input id$ job idchem expyr;
datalines;
ps1 1 210701 82
ps1 1 210701 84
ps1 1 210701 74
ps1 1 430102 79
ps1 1 430102 62
ps1 4 210701 93
ps5 4 210701 78
ps5 4 460003 86
ps5 4 460003 77
ps5 4 520299 88
ps5 4 520299 76
ps5 5 520299 70
ps7 5 520299 93
ps7 1 430101 85
ps7 1 430101 74
ps7 1 430101 72
ps8 1 460003 74
ps8 1 460003 71
;
1 OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
72
73 /*DATASET 2*/
74 data env;
75 input id$ job idchem expyr;
76 datalines;
NOTE: The data set WORK.ENV has 18 observations and 4 variables.
NOTE: DATA statement used (Total process time):
real time 0.01 seconds
cpu time 0.01 seconds
95 ;
96
97 proc format;
98 value idchem
99 210701 = "amo_exp"
100 430102 = "tol_exp"
101 460003 = "sol_exp"
102 520299 = "ali_exp"
103 430101 = "ben_exp";
NOTE: Format IDCHEM is already on the library WORK.FORMATS.
NOTE: Format IDCHEM has been output.
104 run;
NOTE: PROCEDURE FORMAT used (Total process time):
real time 0.00 seconds
cpu time 0.00 seconds
105
106 data temp;
107 set env;
108 dum = 1;
109 format idchem idchem.;
110 id_job=catx('_', id, job);
111 put _all_;
112 run;
id=ps1 job=1 idchem=amo_exp expyr=82 dum=1 id_job=ps1_1 _ERROR_=0 _N_=1
id=ps1 job=1 idchem=amo_exp expyr=84 dum=1 id_job=ps1_1 _ERROR_=0 _N_=2
id=ps1 job=1 idchem=amo_exp expyr=74 dum=1 id_job=ps1_1 _ERROR_=0 _N_=3
id=ps1 job=1 idchem=tol_exp expyr=79 dum=1 id_job=ps1_1 _ERROR_=0 _N_=4
id=ps1 job=1 idchem=tol_exp expyr=62 dum=1 id_job=ps1_1 _ERROR_=0 _N_=5
id=ps1 job=4 idchem=amo_exp expyr=93 dum=1 id_job=ps1_4 _ERROR_=0 _N_=6
id=ps5 job=4 idchem=amo_exp expyr=78 dum=1 id_job=ps5_4 _ERROR_=0 _N_=7
id=ps5 job=4 idchem=sol_exp expyr=86 dum=1 id_job=ps5_4 _ERROR_=0 _N_=8
id=ps5 job=4 idchem=sol_exp expyr=77 dum=1 id_job=ps5_4 _ERROR_=0 _N_=9
id=ps5 job=4 idchem=ali_exp expyr=88 dum=1 id_job=ps5_4 _ERROR_=0 _N_=10
id=ps5 job=4 idchem=ali_exp expyr=76 dum=1 id_job=ps5_4 _ERROR_=0 _N_=11
id=ps5 job=5 idchem=ali_exp expyr=70 dum=1 id_job=ps5_5 _ERROR_=0 _N_=12
id=ps7 job=5 idchem=ali_exp expyr=93 dum=1 id_job=ps7_5 _ERROR_=0 _N_=13
id=ps7 job=1 idchem=ben_exp expyr=85 dum=1 id_job=ps7_1 _ERROR_=0 _N_=14
id=ps7 job=1 idchem=ben_exp expyr=74 dum=1 id_job=ps7_1 _ERROR_=0 _N_=15
id=ps7 job=1 idchem=ben_exp expyr=72 dum=1 id_job=ps7_1 _ERROR_=0 _N_=16
id=ps8 job=1 idchem=sol_exp expyr=74 dum=1 id_job=ps8_1 _ERROR_=0 _N_=17
id=ps8 job=1 idchem=sol_exp expyr=71 dum=1 id_job=ps8_1 _ERROR_=0 _N_=18
NOTE: There were 18 observations read from the data set WORK.ENV.
NOTE: The data set WORK.TEMP has 18 observations and 6 variables.
NOTE: DATA statement used (Total process time):
real time 0.01 seconds
cpu time 0.01 seconds
113
114 proc sort data=temp; by id job id_job; run;
NOTE: There were 18 observations read from the data set WORK.TEMP.
NOTE: The data set WORK.TEMP has 18 observations and 6 variables.
NOTE: PROCEDURE SORT used (Total process time):
real time 0.00 seconds
cpu time 0.00 seconds
115
116 proc transpose data=temp out=temp2(drop=_name_);
117 by id job id_job;
118 id idchem;
119 var dum;
120 run;
ERROR: The ID value "amo_exp" occurs twice in the same BY group.
ERROR: The ID value "amo_exp" occurs twice in the same BY group.
ERROR: The ID value "tol_exp" occurs twice in the same BY group.
NOTE: The above message was for the following BY group:
id=ps1 job=1 id_job=ps1_1
ERROR: The ID value "sol_exp" occurs twice in the same BY group.
ERROR: The ID value "ali_exp" occurs twice in the same BY group.
NOTE: The above message was for the following BY group:
id=ps5 job=4 id_job=ps5_4
ERROR: The ID value "ben_exp" occurs twice in the same BY group.
ERROR: The ID value "ben_exp" occurs twice in the same BY group.
NOTE: The above message was for the following BY group:
id=ps7 job=1 id_job=ps7_1
ERROR: The ID value "sol_exp" occurs twice in the same BY group.
NOTE: The above message was for the following BY group:
id=ps8 job=1 id_job=ps8_1
WARNING: 4 BY groups omitted due to earlier errors.
NOTE: There were 18 observations read from the data set WORK.TEMP.
NOTE: The data set WORK.TEMP2 has 3 observations and 8 variables.
NOTE: PROCEDURE TRANSPOSE used (Total process time):
real time 0.01 seconds
cpu time 0.02 seconds
121
122 proc print data=temp2;
123 Title "Dataset 2: Proc transpose results";
124 run;
NOTE: There were 3 observations read from the data set WORK.TEMP2.
NOTE: PROCEDURE PRINT used (Total process time):
real time 0.15 seconds
cpu time 0.15 seconds
125
126 OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
138