BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
ak2011
Fluorite | Level 6
Hi,
Could someone help me with the SAS code to find associations(ie. cross tabulations)
between exposed variables? Exposed(exp=1), unexposed=0. Actually, I have a dataset with 30 idchems (over 100,000 obs in total) but I am using only
4 of them as a test to see if the method will work. I used proc transpose to read the idchems as
idchem1=990005........... idchem4=211700 and finally found the cross tabulations using proc freq
shown in the SAS output. Proc transpose worked well with the test dataset and I was able to find
the associations(cross tabulations) as shown in output(attached). Proc transpose for the larger dataset
was a problem; SAS couldn't transpose and display all the dataset for the 30 idchems;
it transposed and displayed a dataset having only 4 idchems. Could someone help me with a SAS code
for another method without the proc transpose or even if with a proc transpose a shorter datastep ie.
one that incorporates the dataset with 30 idchems at a time.
In summary, a SAS code/method that will help me eventually to find the cross associations for the exposed(1)
and unexposed(0) other than the method I used, which of course will not be very efficient for the larger
data with 30 idchems. SAS code used is found below and results is attached.

Thanks in advance.

ak.

data idnew1;
input id$ job idchem;
datalines;
os1 1 990005
os1 1 990021
os1 1 211700
os1 2 211700
os1 2 990021
os1 2 210701
os1 2 990005
os2 1 210701
os2 1 990005
os2 2 990021
os2 3 210701
os2 3 990005
os3 3 210701
os3 1 211700
os4 1 210701
os4 1 990005
os4 1 211700
;
run;

/*TRANSPOSING VARIABLES*/
proc sort data=idnew1; by id job;
proc transpose data=idnew1 out=idnew1b prefix=idchem;
by id job;
/*id job;*/
var idchem;
run;

/*Cla exposure*/
data clat;
set idnew1b;
if idchem1='990005' or idchem2='990005' or idchem3='990005' or idchem4='990005' then cla_exp=1;
else cla_exp=0;
id_job=catx('_', id, job);
put _all_;
drop _name_;
run;
proc print data=clat;
Title "Cla exposure";
run;

/*Bio exposure*/
data biot;
set idnew1b;
if idchem1='990021' or idchem2='990021' or idchem3='990021' or idchem4='990021' then bio_exp=1;
else bio_exp=0;
id_job=catx('_', id, job);
put _all_;
drop _name_;
run;
Title "Bio exposure";
proc print data=biot;

run;

/*Amo exposure*/
data amot;
set idnew1b;
if idchem1='210701' or idchem2='210701' or idchem3='210701' or idchem4='210701' then amo_exp=1;
else amo_exp=0;
id_job=catx('_', id, job);
put _all_;
drop _name_;
run;

proc print data=amot;
Title "Amo exposure";
run;


/*Chl exposure*/
data chlt;
set idnew1b;
if idchem1='211700' or idchem2='211700' or idchem3='211700'or idchem4='211700' then chl_exp=1;
else chl_exp=0;
id_job=catx('_', id, job);
put _all_;
drop _name_;
run;

proc print data=chlt;
Title "chl exposure";
run;

/* Merging clat,biot and amot files*/
data mlt; merge clat biot amot chlt;

run;

proc print data=mlt;
Title "Merged exposure files for cla ,bio, amo and chl pollutants";
run;


/*CROSS ASSOCIATIONS:clat,biot,amot,chlt*/
proc freq data=mlt;
tables cla_exp*bio_exp;
tables cla_exp*amo_exp;
tables cla_exp*chl_exp;
tables bio_exp*cla_exp;
tables bio_exp*amo_exp;
tables bio_exp*chl_exp;
run;  


1 OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
72
73 data idnew1;
74 input id$ job idchem;
75 datalines;
 
NOTE: The data set WORK.IDNEW1 has 17 observations and 3 variables.
NOTE: DATA statement used (Total process time):
real time 0.01 seconds
cpu time 0.01 seconds
 
 
93 ;
94 run;
95
96 /*TRANSPOSING VARIABLES*/
97 proc sort data=idnew1; by id job;
 
NOTE: There were 17 observations read from the data set WORK.IDNEW1.
NOTE: The data set WORK.IDNEW1 has 17 observations and 3 variables.
NOTE: PROCEDURE SORT used (Total process time):
real time 0.03 seconds
cpu time 0.02 seconds
 
 
98 proc transpose data=idnew1 out=idnew1b prefix=idchem;
99 by id job;
100 /*id job;*/
101 var idchem;
102 run;
 
NOTE: There were 17 observations read from the data set WORK.IDNEW1.
NOTE: The data set WORK.IDNEW1B has 8 observations and 7 variables.
NOTE: PROCEDURE TRANSPOSE used (Total process time):
real time 0.06 seconds
cpu time 0.02 seconds
 
 
103
104 /*Cla exposure*/
105 data clat;
106 set idnew1b;
107 if idchem1='990005' or idchem2='990005' or idchem3='990005' or idchem4='990005' then cla_exp=1;
108 else cla_exp=0;
109 id_job=catx('_', id, job);
110 put _all_;
111 drop _name_;
112 run;
 
NOTE: Character values have been converted to numeric values at the places given by: (Line):(Column).
107:12 107:32 107:52 107:72
id=os1 job=1 _NAME_=idchem idchem1=990005 idchem2=990021 idchem3=211700 idchem4=. cla_exp=1 id_job=os1_1 _ERROR_=0 _N_=1
id=os1 job=2 _NAME_=idchem idchem1=211700 idchem2=990021 idchem3=210701 idchem4=990005 cla_exp=1 id_job=os1_2 _ERROR_=0 _N_=2
id=os2 job=1 _NAME_=idchem idchem1=210701 idchem2=990005 idchem3=. idchem4=. cla_exp=1 id_job=os2_1 _ERROR_=0 _N_=3
id=os2 job=2 _NAME_=idchem idchem1=990021 idchem2=. idchem3=. idchem4=. cla_exp=0 id_job=os2_2 _ERROR_=0 _N_=4
id=os2 job=3 _NAME_=idchem idchem1=210701 idchem2=990005 idchem3=. idchem4=. cla_exp=1 id_job=os2_3 _ERROR_=0 _N_=5
id=os3 job=1 _NAME_=idchem idchem1=211700 idchem2=. idchem3=. idchem4=. cla_exp=0 id_job=os3_1 _ERROR_=0 _N_=6
id=os3 job=3 _NAME_=idchem idchem1=210701 idchem2=. idchem3=. idchem4=. cla_exp=0 id_job=os3_3 _ERROR_=0 _N_=7
id=os4 job=1 _NAME_=idchem idchem1=210701 idchem2=990005 idchem3=211700 idchem4=. cla_exp=1 id_job=os4_1 _ERROR_=0 _N_=8
NOTE: There were 8 observations read from the data set WORK.IDNEW1B.
NOTE: The data set WORK.CLAT has 8 observations and 8 variables.
NOTE: DATA statement used (Total process time):
real time 0.03 seconds
cpu time 0.02 seconds
 
 
113 proc print data=clat;
114 Title "Cla exposure";
115 run;
 
NOTE: There were 8 observations read from the data set WORK.CLAT.
NOTE: PROCEDURE PRINT used (Total process time):
real time 0.20 seconds
cpu time 0.17 seconds
 
 
116
117 /*Bio exposure*/
118 data biot;
119 set idnew1b;
120 if idchem1='990021' or idchem2='990021' or idchem3='990021' or idchem4='990021' then bio_exp=1;
121 else bio_exp=0;
122 id_job=catx('_', id, job);
123 put _all_;
124 drop _name_;
125 run;
 
NOTE: Character values have been converted to numeric values at the places given by: (Line):(Column).
120:12 120:32 120:52 120:72
id=os1 job=1 _NAME_=idchem idchem1=990005 idchem2=990021 idchem3=211700 idchem4=. bio_exp=1 id_job=os1_1 _ERROR_=0 _N_=1
id=os1 job=2 _NAME_=idchem idchem1=211700 idchem2=990021 idchem3=210701 idchem4=990005 bio_exp=1 id_job=os1_2 _ERROR_=0 _N_=2
id=os2 job=1 _NAME_=idchem idchem1=210701 idchem2=990005 idchem3=. idchem4=. bio_exp=0 id_job=os2_1 _ERROR_=0 _N_=3
id=os2 job=2 _NAME_=idchem idchem1=990021 idchem2=. idchem3=. idchem4=. bio_exp=1 id_job=os2_2 _ERROR_=0 _N_=4
id=os2 job=3 _NAME_=idchem idchem1=210701 idchem2=990005 idchem3=. idchem4=. bio_exp=0 id_job=os2_3 _ERROR_=0 _N_=5
id=os3 job=1 _NAME_=idchem idchem1=211700 idchem2=. idchem3=. idchem4=. bio_exp=0 id_job=os3_1 _ERROR_=0 _N_=6
id=os3 job=3 _NAME_=idchem idchem1=210701 idchem2=. idchem3=. idchem4=. bio_exp=0 id_job=os3_3 _ERROR_=0 _N_=7
id=os4 job=1 _NAME_=idchem idchem1=210701 idchem2=990005 idchem3=211700 idchem4=. bio_exp=0 id_job=os4_1 _ERROR_=0 _N_=8
NOTE: There were 8 observations read from the data set WORK.IDNEW1B.
NOTE: The data set WORK.BIOT has 8 observations and 8 variables.
NOTE: DATA statement used (Total process time):
real time 0.01 seconds
cpu time 0.01 seconds
 
 
126 Title "Bio exposure";
127 proc print data=biot;
128
129 run;
 
NOTE: There were 8 observations read from the data set WORK.BIOT.
NOTE: PROCEDURE PRINT used (Total process time):
real time 0.24 seconds
cpu time 0.24 seconds
 
 
130
131 /*Amo exposure*/
132 data amot;
133 set idnew1b;
134 if idchem1='210701' or idchem2='210701' or idchem3='210701' or idchem4='210701' then amo_exp=1;
135 else amo_exp=0;
136 id_job=catx('_', id, job);
137 put _all_;
138 drop _name_;
139 run;
 
NOTE: Character values have been converted to numeric values at the places given by: (Line):(Column).
134:12 134:32 134:52 134:72
id=os1 job=1 _NAME_=idchem idchem1=990005 idchem2=990021 idchem3=211700 idchem4=. amo_exp=0 id_job=os1_1 _ERROR_=0 _N_=1
id=os1 job=2 _NAME_=idchem idchem1=211700 idchem2=990021 idchem3=210701 idchem4=990005 amo_exp=1 id_job=os1_2 _ERROR_=0 _N_=2
id=os2 job=1 _NAME_=idchem idchem1=210701 idchem2=990005 idchem3=. idchem4=. amo_exp=1 id_job=os2_1 _ERROR_=0 _N_=3
id=os2 job=2 _NAME_=idchem idchem1=990021 idchem2=. idchem3=. idchem4=. amo_exp=0 id_job=os2_2 _ERROR_=0 _N_=4
id=os2 job=3 _NAME_=idchem idchem1=210701 idchem2=990005 idchem3=. idchem4=. amo_exp=1 id_job=os2_3 _ERROR_=0 _N_=5
id=os3 job=1 _NAME_=idchem idchem1=211700 idchem2=. idchem3=. idchem4=. amo_exp=0 id_job=os3_1 _ERROR_=0 _N_=6
id=os3 job=3 _NAME_=idchem idchem1=210701 idchem2=. idchem3=. idchem4=. amo_exp=1 id_job=os3_3 _ERROR_=0 _N_=7
id=os4 job=1 _NAME_=idchem idchem1=210701 idchem2=990005 idchem3=211700 idchem4=. amo_exp=1 id_job=os4_1 _ERROR_=0 _N_=8
NOTE: There were 8 observations read from the data set WORK.IDNEW1B.
NOTE: The data set WORK.AMOT has 8 observations and 8 variables.
NOTE: DATA statement used (Total process time):
real time 0.01 seconds
cpu time 0.01 seconds
 
 
140
141 proc print data=amot;
142 Title "Amo exposure";
143 run;
 
NOTE: There were 8 observations read from the data set WORK.AMOT.
NOTE: PROCEDURE PRINT used (Total process time):
real time 0.12 seconds
cpu time 0.13 seconds
 
 
144
145
146 /*Chl exposure*/
147 data chlt;
148 set idnew1b;
149 if idchem1='211700' or idchem2='211700' or idchem3='211700'or idchem4='211700' then chl_exp=1;
________
49
NOTE 49-169: The meaning of an identifier after a quoted string might change in a future SAS release. Inserting white space
between a quoted string and the succeeding identifier is recommended.
 
150 else chl_exp=0;
151 id_job=catx('_', id, job);
152 put _all_;
153 drop _name_;
154 run;
 
NOTE: Character values have been converted to numeric values at the places given by: (Line):(Column).
149:12 149:32 149:52 149:71
id=os1 job=1 _NAME_=idchem idchem1=990005 idchem2=990021 idchem3=211700 idchem4=. chl_exp=1 id_job=os1_1 _ERROR_=0 _N_=1
id=os1 job=2 _NAME_=idchem idchem1=211700 idchem2=990021 idchem3=210701 idchem4=990005 chl_exp=1 id_job=os1_2 _ERROR_=0 _N_=2
id=os2 job=1 _NAME_=idchem idchem1=210701 idchem2=990005 idchem3=. idchem4=. chl_exp=0 id_job=os2_1 _ERROR_=0 _N_=3
id=os2 job=2 _NAME_=idchem idchem1=990021 idchem2=. idchem3=. idchem4=. chl_exp=0 id_job=os2_2 _ERROR_=0 _N_=4
id=os2 job=3 _NAME_=idchem idchem1=210701 idchem2=990005 idchem3=. idchem4=. chl_exp=0 id_job=os2_3 _ERROR_=0 _N_=5
id=os3 job=1 _NAME_=idchem idchem1=211700 idchem2=. idchem3=. idchem4=. chl_exp=1 id_job=os3_1 _ERROR_=0 _N_=6
id=os3 job=3 _NAME_=idchem idchem1=210701 idchem2=. idchem3=. idchem4=. chl_exp=0 id_job=os3_3 _ERROR_=0 _N_=7
id=os4 job=1 _NAME_=idchem idchem1=210701 idchem2=990005 idchem3=211700 idchem4=. chl_exp=1 id_job=os4_1 _ERROR_=0 _N_=8
NOTE: There were 8 observations read from the data set WORK.IDNEW1B.
NOTE: The data set WORK.CHLT has 8 observations and 8 variables.
NOTE: DATA statement used (Total process time):
real time 0.01 seconds
cpu time 0.02 seconds
 
 
155
156 proc print data=chlt;
157 Title "chl exposure";
158 run;
 
NOTE: There were 8 observations read from the data set WORK.CHLT.
NOTE: PROCEDURE PRINT used (Total process time):
real time 0.19 seconds
cpu time 0.20 seconds
 
 
159
160 /* Merging clat,biot and amot files*/
161 data mlt; merge clat biot amot chlt;
162
163 run;
 
NOTE: There were 8 observations read from the data set WORK.CLAT.
NOTE: There were 8 observations read from the data set WORK.BIOT.
NOTE: There were 8 observations read from the data set WORK.AMOT.
NOTE: There were 8 observations read from the data set WORK.CHLT.
NOTE: The data set WORK.MLT has 8 observations and 11 variables.
NOTE: DATA statement used (Total process time):
real time 0.02 seconds
cpu time 0.02 seconds
 
 
164
165 proc print data=mlt;
166 Title "Merged exposure files for cla ,bio, amo and chl pollutants";
167 run;
 
NOTE: There were 8 observations read from the data set WORK.MLT.
NOTE: PROCEDURE PRINT used (Total process time):
real time 0.16 seconds
cpu time 0.16 seconds
 
 
168
169
170 /*CROSS ASSOCIATIONS:clat,biot,amot,chlt*/
171 proc freq data=mlt;
172 tables cla_exp*bio_exp;
173 tables cla_exp*amo_exp;
174 tables cla_exp*chl_exp;
175 tables bio_exp*cla_exp;
176 tables bio_exp*amo_exp;
177 tables bio_exp*chl_exp;
178 run;
 
NOTE: There were 8 observations read from the data set WORK.MLT.
NOTE: PROCEDURE FREQ used (Total process time):
real time 0.66 seconds
cpu time 0.58 seconds
 
 
179
180
181 OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
193
 





 

1 ACCEPTED SOLUTION

Accepted Solutions
ed_sas_member
Meteorite | Level 14

Hi @ak2011 

 

You can try this. Hope this help!

The only thing to adapt is the decode of idchem variable at the beginning of the program in order to create the corresponding character variable with the flag name.

 

Best,

 

data idnew1;
	input id$ job idchem;
	id_job=catx('_', id, job);
	length idchem_char $7;
	if 		idchem = 990005 then idchem_char = 'cla_exp';
	else if idchem = 990021 then idchem_char = 'bio_exp';
	else if idchem = 210701 then idchem_char = 'amo_exp';
	else if idchem = 211700 then idchem_char = 'chl_exp'; /* to adapt */
	datalines;
os1 1 990005
os1 1 990021
os1 1 211700
os1 2 211700
os1 2 990021
os1 2 210701
os1 2 990005
os2 1 210701
os2 1 990005
os2 2 990021
os2 3 210701
os2 3 990005
os3 3 210701
os3 1 211700
os4 1 210701
os4 1 990005
os4 1 211700
;
run;

/* 1. Generate macrovariables */

	proc sql noprint;
		/* Retrieve the list of distinct idchems (num) in a macrovariable : &list_idchem */
		select distinct idchem into:list_idchem separated by " " from idnew1;
		/* Retrieve the list of distinct idchems (format) in a macrovariable : &list_idchem_char */
		select distinct idchem_char into:list_idchem_char separated by " " from idnew1;
		/* Retrieve the number of distinct idchems in a macrovariable : &count_idchem */
		select count(distinct idchem_char) into:count_idchem trimmed from idnew1;
	quit;

	data _null_;
		length temp $2000;
		temp=cat('"',tranwrd("&list_idchem_char"," ",'" "'),'"');
		call symput('list_idchem_char_q',temp);
	run;

	%put &list_idchem;
	%put &list_idchem_char;
	%put &list_idchem_char_q;
	%put &count_idchem;

/* 2. Transpose data and concatenate idchem */

proc sort data=idnew1;
	by id_job;
run;

data idnew_tr;
	set idnew1;
	
	array flag(4) $ _temporary_;
	by id_job;
	retain flag;

	if first.id_job then do;
		count=0;
		call missing(of flag(*));
	end;
	
	count+1;
	
	do i=1 to dim(flag);
		flag(count)=idchem_char;
	end;
	
	idchem_list = catx(" ", of flag(*));
	if last.id_job then output;
	
	drop count i idchem idchem_char id job;
run;

/* 3. Create flag variables */
data idnew_flag;
	set idnew_tr;
	
	array list(&count_idchem) $ (&list_idchem_char_q);
	array flag(*) &list_idchem_char;
	
	do i=1 to dim(list);
		if indexw(idchem_list, list(i)) > 0 then flag(i) = 1;
		else flag(i)=0;
	end;
	
	drop i idchem_list list1-list&count_idchem;
run;

proc print data=idnew_flag;
run;

/* 3. Generate as many 2x2 tables as combinations */

%macro combi (dataset);

proc sql;
	select count(*) into :nbvar from dictionary.columns where libname='WORK' and memname='IDNEW_FLAG' ;
	select name into :mv1- :mv%cmpres(&nbvar) from dictionary.columns where libname='WORK' and memname='IDNEW_FLAG' ;
quit;
 
%do i=2 %to %eval(&nbvar) ;
	%do j=%eval(&i+1) %to %eval(&nbvar);
 
proc freq data=idnew_flag;
	tables &&mv&i*&&mv&j ;
run;

	%end;
%end;
 
%mend;

%combi ();

 

View solution in original post

13 REPLIES 13
ed_sas_member
Meteorite | Level 14

Hi @ak2011 

 

You can try this. Hope this help!

The only thing to adapt is the decode of idchem variable at the beginning of the program in order to create the corresponding character variable with the flag name.

 

Best,

 

data idnew1;
	input id$ job idchem;
	id_job=catx('_', id, job);
	length idchem_char $7;
	if 		idchem = 990005 then idchem_char = 'cla_exp';
	else if idchem = 990021 then idchem_char = 'bio_exp';
	else if idchem = 210701 then idchem_char = 'amo_exp';
	else if idchem = 211700 then idchem_char = 'chl_exp'; /* to adapt */
	datalines;
os1 1 990005
os1 1 990021
os1 1 211700
os1 2 211700
os1 2 990021
os1 2 210701
os1 2 990005
os2 1 210701
os2 1 990005
os2 2 990021
os2 3 210701
os2 3 990005
os3 3 210701
os3 1 211700
os4 1 210701
os4 1 990005
os4 1 211700
;
run;

/* 1. Generate macrovariables */

	proc sql noprint;
		/* Retrieve the list of distinct idchems (num) in a macrovariable : &list_idchem */
		select distinct idchem into:list_idchem separated by " " from idnew1;
		/* Retrieve the list of distinct idchems (format) in a macrovariable : &list_idchem_char */
		select distinct idchem_char into:list_idchem_char separated by " " from idnew1;
		/* Retrieve the number of distinct idchems in a macrovariable : &count_idchem */
		select count(distinct idchem_char) into:count_idchem trimmed from idnew1;
	quit;

	data _null_;
		length temp $2000;
		temp=cat('"',tranwrd("&list_idchem_char"," ",'" "'),'"');
		call symput('list_idchem_char_q',temp);
	run;

	%put &list_idchem;
	%put &list_idchem_char;
	%put &list_idchem_char_q;
	%put &count_idchem;

/* 2. Transpose data and concatenate idchem */

proc sort data=idnew1;
	by id_job;
run;

data idnew_tr;
	set idnew1;
	
	array flag(4) $ _temporary_;
	by id_job;
	retain flag;

	if first.id_job then do;
		count=0;
		call missing(of flag(*));
	end;
	
	count+1;
	
	do i=1 to dim(flag);
		flag(count)=idchem_char;
	end;
	
	idchem_list = catx(" ", of flag(*));
	if last.id_job then output;
	
	drop count i idchem idchem_char id job;
run;

/* 3. Create flag variables */
data idnew_flag;
	set idnew_tr;
	
	array list(&count_idchem) $ (&list_idchem_char_q);
	array flag(*) &list_idchem_char;
	
	do i=1 to dim(list);
		if indexw(idchem_list, list(i)) > 0 then flag(i) = 1;
		else flag(i)=0;
	end;
	
	drop i idchem_list list1-list&count_idchem;
run;

proc print data=idnew_flag;
run;

/* 3. Generate as many 2x2 tables as combinations */

%macro combi (dataset);

proc sql;
	select count(*) into :nbvar from dictionary.columns where libname='WORK' and memname='IDNEW_FLAG' ;
	select name into :mv1- :mv%cmpres(&nbvar) from dictionary.columns where libname='WORK' and memname='IDNEW_FLAG' ;
quit;
 
%do i=2 %to %eval(&nbvar) ;
	%do j=%eval(&i+1) %to %eval(&nbvar);
 
proc freq data=idnew_flag;
	tables &&mv&i*&&mv&j ;
run;

	%end;
%end;
 
%mend;

%combi ();

 

mkeintz
PROC Star

I think what you want is one observations per ID/JOB,   with N dummy variables (where N is the number of distinct IDCHEM values) over the entire original datal.

 

You can do the following:

 

  1. proc freq the data set to get a file with N sorted obs, one per IDCHEM.in dataset FREQS
  2. transpose that (very small) FREQS dataset twice
    1. Make an empty data set, (data set DUMMY_VARS) but with the needed varnames DUMMY_210701  DUMMY_211700 ...)
    2. Make a single obs data (REF_VALUES) set with the array of observed IDCHEM values.
  3. Then rerun through the (sorted) original dataset, setting realized dummy vars to 1.
data idnew1;
input id$ job idchem;
datalines;
os1 1 990005
os1 1 990021
os1 1 211700
os1 2 211700
os1 2 990021
os1 2 210701
os1 2 990005
os2 1 210701
os2 1 990005
os2 2 990021
os2 3 210701
os2 3 990005
os3 3 210701
os3 1 211700
os4 1 210701
os4 1 990005
os4 1 211700
run;

proc sort data=idnew1 out=have;
  by id job;
run;

/* Generate a dataset FREQS of all realised IDCHEM values, in sorted order */
proc freq data=idnew1 noprint;
 tables idchem / out=freqs;
run;


/* Make a template dataset (0 obs) of sorted varnames (dummy_210701 dummy211700 ...)
   corresponding to the IDCHEM values */
proc transpose data=freqs (keep=idchem) out=dummy_vars (drop=_name_ )  prefix=dummy_;
  id idchem;
run;

/* Make a single transposed list of sorted IDCHEM values, for value checking */
proc transpose data=freqs (keep=idchem) out=ref_values (drop=_name_) prefix=_value_;
  var idchem;
run;


data want (drop=_:);
  if _n_=1 then set ref_values;   /* Get array of reference CHEM values */

  do until (last.job);
    set have;
    by id job;
	if 0 then set dummy_vars;      /* Just to establish the DUMMY var names */
	array dummy{*} dummy_: ;
	if first.job then do _i=1 to dim(dummy); dummy{_i}=0; end;
	dummy{whichn(idchem,of _value_:)}=1;
  end;
run;

 

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
ak2011
Fluorite | Level 6
Thanks. Maybe I did not clarify things well.
The dummies have been created alright but now I need to find id_job variables and find the associations between say cla_exp * bio_exp, etc using the proc frequency. Now the exposed(1) and unexposed(0) variables have been created but I need to find the associations between them.
Thanks.
ak.
ak2011
Fluorite | Level 6
Thanks very much! It works for the smaller dataset. I am trying it on the larger dataset but the only trouble is the idchem_char. Sas generated an error message. Actually it is not your fault, for the smaller dataset I am used datalines but in the larger one, I am reading directly from a file(not datalines). Can you please help me to resolve the idchem_char problem? The code is founld below:
1 OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
72
73 /* MACRO APPROACH*/
74
75 data chem1(keep=id job idchem);
76 length idchem_char $7;
77 set multi.multiworkchempostauto;
NOTE: Data file MULTI.MULTIWORKCHEMPOSTAUTO.DATA is in a format that is native to another host, or the file encoding does not match
the session encoding. Cross Environment Data Access will be used, which might require additional CPU resources and might
reduce performance.
78 /*input id$ job idchem;*/
79 /*id_job=catx('_', id, job);*/
80 length idchem_char $7;
81 if idchem = 990005 then idchem_char = 'cla_exp';
82 else if idchem = 990021 then idchem_char = 'bio_exp';
83 else if idchem = 520299 then idchem_char = 'ali_exp';
84 else if idchem = 420204 then idchem_char = 'iso_exp'; /* to adapt */
85
86 else if idchem = 430101 then idchem_char = 'ben_exp';
87 else if idchem = 430103 then idchem_char = 'xyl_exp';
88 else if idchem = 430102 then idchem_char = 'tol_exp';
89 else if idchem = 430104 then idchem_char = 'sty_exp'; /* to adapt */
90
91 else if idchem = 460003 then idchem_char = 'sol_exp';
92 else if idchem = 220501 then idchem_char = 'for_exp';
93 else if idchem = 510004 then idchem_char = 'hyp_exp';
94 else if idchem = 210701 then idchem_char = 'amo_exp'; /* to adapt */
95
96
97 else if idchem = 211700 then idchem_char = 'chl_exp';
98 else if idchem = 410002 then idchem_char = 'cau_exp';
99 else if idchem = 411501 then idchem_char = 'pho_exp';
100 else if idchem = 420401 then idchem_char = 'ace_exp'; /* to adapt */
101
102 else if idchem = 211701 then idchem_char = 'hcl_exp';
103 else if idchem = 460004 then idchem_char = 'wpo_exp'; /* to adapt */
104 id_job=catx('_', id, job);
105 run;

NOTE: There were 124695 observations read from the data set MULTI.MULTIWORKCHEMPOSTAUTO.
NOTE: The data set WORK.CHEM1 has 124695 observations and 3 variables.
NOTE: DATA statement used (Total process time):
real time 1.89 seconds
cpu time 1.11 seconds


106
107 /* 1. Generate macrovariables */
108
109
109 ! proc sql noprint;
110 /* Retrieve the list of distinct idchems (num) in a macrovariable : &list_idchem */
111 select distinct idchem into:list_idchem separated by " " from chem1;
112 /* Retrieve the list of distinct idchems (format) in a macrovariable : &list_idchem_char */
113 select distinct idchem_char into:list_idchem_char separated by " " from chem1;
ERROR: The following columns were not found in the contributing tables: idchem_char.
NOTE: PROC SQL set option NOEXEC and will continue to check the syntax of statements.
114 /* Retrieve the number of distinct idchems in a macrovariable : &count_idchem */
115 select count(distinct idchem_char) into:count_idchem trimmed from chem1;
ERROR: The following columns were not found in the contributing tables: idchem_char.
116 quit;
NOTE: The SAS System stopped processing this step because of errors.
NOTE: PROCEDURE SQL used (Total process time):
real time 0.17 seconds
cpu time 0.23 seconds

117
118
119 OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
131
Thanks. ak.
ed_sas_member
Meteorite | Level 14

Hi @ak2011 

 

It seems that you have kept only the following variables on line 75 -> data chem1(keep=id job idchem)

You should add also data idchem_char

Best,

 

 

ak2011
Fluorite | Level 6
Thanks. A little problem again on flag(count) line 88:
Can you please check for me?

1 OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
72
73 data idnew_tr;
74 set idnew1;
75
76 array flag(4) $ _temporary_;
77 by id_job;
78 retain flag;
79
80 if first.id_job then do;
81 count=0;
82 call missing(of flag(*));
83 end;
84
85 count+1;
86
87 do i=1 to dim(flag);
88 flag(count)=idchem_char;
89 end;
90
91 idchem_list = catx(" ", of flag(*));
92 if last.id_job then output;
93
94 drop count i idchem idchem_char id job;
95 run;

ERROR: Array subscript out of range at line 88 column 3.
id=OSaa01 job=2 idchem=530196 idchem_char= id_job=OSaa01_2 FIRST.id_job=0 LAST.id_job=1 count=5 i=1 idchem_list= _ERROR_=1 _N_=8
NOTE: The SAS System stopped processing this step because of errors.
NOTE: There were 9 observations read from the data set WORK.IDNEW1.
WARNING: The data set WORK.IDNEW_TR may be incomplete. When this step was stopped there were 1 observations and 2 variables.
WARNING: Data set WORK.IDNEW_TR was not replaced because this step was stopped.
NOTE: DATA statement used (Total process time):
real time 0.01 seconds
cpu time 0.01 seconds

Thanks.
ak.
ed_sas_member
Meteorite | Level 14

Hi @ak2011 

It seems to be a little mistake in the code.

Please put 

 

	flag(count)=idchem_char;

instead of 

 

	do i=1 to dim(flag);
		flag(count)=idchem_char;
	end;

Best,

ak2011
Fluorite | Level 6
Hi,
Your code is right since it is working well with the smaller dataset but when I apply it to the larger one is when the error generates. I think it has to do with flag(count). Please recheck this code: Thanks.:

OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
72
73 /* 2. Transpose data and concatenate idchem */
74
75 proc sort data=idnew1;
76 by id_job;
77 run;

NOTE: PROCEDURE SORT used (Total process time):
real time 0.36 seconds
cpu time 0.37 seconds

NOTE: There were 124695 observations read from the data set WORK.IDNEW1.
NOTE: The data set WORK.IDNEW1 has 124695 observations and 5 variables.

78
79 data idnew_tr;
80 set idnew1;
81
82 array flag(4) $ _temporary_;
83 by id_job;
84 retain flag;
85
86 if first.id_job then do;
87 count=0;
88 call missing(of flag(*));
89 end;
90
91 count+1;
92
93 do i=1 to dim(flag);
94 flag(count)=idchem_char;
95 end;
96
97 idchem_list = catx(" ", of flag(*));
98 if last.id_job then output;
99
100 drop count i idchem idchem_char id job;
101 run;

ERROR: Array subscript out of range at line 94 column 3.
id=OSaa01 job=2 idchem=530196 idchem_char= id_job=OSaa01_2 FIRST.id_job=0 LAST.id_job=1 count=5 i=1 idchem_list= _ERROR_=1 _N_=8
NOTE: The SAS System stopped processing this step because of errors.
NOTE: There were 9 observations read from the data set WORK.IDNEW1.
WARNING: The data set WORK.IDNEW_TR may be incomplete. When this step was stopped there were 1 observations and 2 variables.
WARNING: Data set WORK.IDNEW_TR was not replaced because this step was stopped.
NOTE: DATA statement used (Total process time):
real time 0.02 seconds
cpu time 0.02 seconds


102
103 OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
115

Thanks.
ak.
ak2011
Fluorite | Level 6
Sorry for back and forth communication:
the flag(count) =... did not work. I expect the "count" to be in "blue" like what you sent me but it wasn't like that:
Please recheck the code below:


OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
72
73 /* 2. Transpose data and concatenate idchem */
74
75 proc sort data=idnew1;
76 by id_job;
77 run;

NOTE: Input data set is already sorted, no sorting done.
NOTE: PROCEDURE SORT used (Total process time):
real time 0.00 seconds
cpu time 0.01 seconds


78
79 data idnew_tr;
80 set idnew1;
81
82 array flag(4) $ _temporary_;
83 by id_job;
84 retain flag;
85
86 if first.id_job then do;
87 count=0;
88 call missing(of flag(*));
89 end;
90
91 count+1;
92 flag(count)=idchem_char;
93
94 idchem_list = catx(" ", of flag(*));
95 if last.id_job then output;
96
97 drop count i idchem idchem_char id job;
98 run;

WARNING: The variable i in the DROP, KEEP, or RENAME list has never been referenced.
ERROR: Array subscript out of range at line 92 column 2.
id=OSaa01 job=2 idchem=530196 idchem_char= id_job=OSaa01_2 FIRST.id_job=0 LAST.id_job=1 count=5 idchem_list= _ERROR_=1 _N_=8
NOTE: The SAS System stopped processing this step because of errors.
NOTE: There were 9 observations read from the data set WORK.IDNEW1.
WARNING: The data set WORK.IDNEW_TR may be incomplete. When this step was stopped there were 1 observations and 2 variables.
WARNING: Data set WORK.IDNEW_TR was not replaced because this step was stopped.
NOTE: DATA statement used (Total process time):
real time 0.01 seconds
cpu time 0.02 seconds


99
100
101 OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
113

Thanks.
ak.
ed_sas_member
Meteorite | Level 14

HI @ak2011 

 

No problem Smiley Happy

 

You can drop the I variable which is no more used in the drop statement as follows:

drop idchem_list list1-list&count_idchem; /*line 97*/

and also put  &count_idchem line 82 instead of 4. This macro variable is created during the first step (proc sql)

 array flag(&count_idchem) $ _temporary_; /*line 82*/

 

ak2011
Fluorite | Level 6
I really appreciate your expert input! I am almost there:Comments(steps) #1 and 2 ok; afew things to resolve, however.: Warning: Not all variables are in list-list18. Is there a way I can correct the warning?Also steps 3 and 4 are difficult to process by SAS. It took quite a long time for #3 and for # 4,
SAS has not been able to process it.
I think the entire data with 18 idchems should be split into 2 or 3 for this macro approah for easy processing by SAS. I would be pleased if you could help me again (if possible) to split the data into 2 or 3.
Thanks in advance for your help. ak.

1 OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
72
73 /* MACRO APPROACH*/
74
75 libname multi'/folders/myfolders/multisitedata';
NOTE: Libref MULTI refers to the same physical library as _TEMP3.
NOTE: Libref MULTI was successfully assigned as follows:
Engine: V9
Physical Name: /folders/myfolders/multisitedata
76
77 data idnew1(keep=id job idchem idchem_char id_job);
78 set multi.multiworkchempostauto;
NOTE: Data file MULTI.MULTIWORKCHEMPOSTAUTO.DATA is in a format that is native to another host, or the file encoding does not match
the session encoding. Cross Environment Data Access will be used, which might require additional CPU resources and might
reduce performance.
79 /*length idchem_char $7;*/
80 id_job=catx('_', id, job);
81
82 if idchem = 990005 then idchem_char = 'cla_exp';
83 else if idchem = 990021 then idchem_char = 'bio_exp';
84 else if idchem = 520299 then idchem_char = 'ali_exp';
85 else if idchem = 420204 then idchem_char = 'iso_exp'; /* to adapt */
86
87 else if idchem = 430101 then idchem_char = 'ben_exp';
88 else if idchem = 430103 then idchem_char = 'xyl_exp';
89 else if idchem = 430102 then idchem_char = 'tol_exp';
90 else if idchem = 430104 then idchem_char = 'sty_exp'; /* to adapt */
91
92 else if idchem = 460003 then idchem_char = 'sol_exp';
93 else if idchem = 220501 then idchem_char = 'for_exp';
94 else if idchem = 510004 then idchem_char = 'hyp_exp';
95 else if idchem = 210701 then idchem_char = 'amo_exp'; /* to adapt */
96
97
98 else if idchem = 211700 then idchem_char = 'chl_exp';
99 else if idchem = 410002 then idchem_char = 'cau_exp';
100 else if idchem = 411501 then idchem_char = 'pho_exp';
101 else if idchem = 420401 then idchem_char = 'ace_exp'; /* to adapt */
102
103 else if idchem = 211701 then idchem_char = 'hcl_exp';
104 else if idchem = 460004 then idchem_char = 'wpo_exp'; /* to adapt */
105 run;

NOTE: There were 124695 observations read from the data set MULTI.MULTIWORKCHEMPOSTAUTO.
NOTE: The data set WORK.IDNEW1 has 124695 observations and 5 variables.
NOTE: DATA statement used (Total process time):
real time 1.43 seconds
cpu time 1.22 seconds


106
107
108
109
110 /* 1. Generate macrovariables */
111
112
112 ! proc sql noprint;
113 /* Retrieve the list of distinct idchems (num) in a macrovariable : &list_idchem */
114 select distinct idchem into:list_idchem separated by " " from idnew1;
115 /* Retrieve the list of distinct idchems (format) in a macrovariable : &list_idchem_char */
116 select distinct idchem_char into:list_idchem_char separated by " " from idnew1;
117 /* Retrieve the number of distinct idchems in a macrovariable : &count_idchem */
118 select count(distinct idchem_char) into:count_idchem trimmed from idnew1;
119 quit;
NOTE: PROCEDURE SQL used (Total process time):
real time 0.46 seconds
cpu time 0.55 seconds


120
121
121 ! data _null_;
122 length temp $2000;
123 temp=cat('"',tranwrd("&list_idchem_char"," ",'" "'),'"');
124 call symput('list_idchem_char_q',temp);
125 run;

NOTE: DATA statement used (Total process time):
real time 0.00 seconds
cpu time 0.01 seconds


126
127 %put &list_idchem;
110001 110002 110004 110005 110007 110008 110009 110010 110011 110012 110013 110014 110015 110016 110017 110018 110019 110020
110021 110024 110025 110026 110028 110029 110030 110033 110501 111101 111102 111110 111301 111303 111401 111501 111600 112001
112002 112003 112005 112201 112400 112600 112601 112800 112900 113000 113001 114800 115000 118200 118201 118204 118205 130001
130101 140001 140002 140003 140004 140005 140006 140007 140008 140009 140010 140012 140013 140014 140015 140016 140017 140018
145001 150001 150002 150007 150008 150009 150010 150011 150012 150013 150014 150015 150016 150017 150018 150019 150020 150022
150023 150024 150025 150026 150027 150028 150029 150030 160001 170001 170002 170003 170004 170005 170006 170008 170009 201602
210100 210601 210602 210701 210703 210801 210901 211601 211602 211700 211701 211702 220001 220101 220102 220501 220603 221101
221102 221103 221301 225001 260001 260002 270001 310001 310002 310003 310004 311201 311301 312001 312201 312401 312501 312601
312801 312901 313001 314701 314801 315001 317901 318201 370001 370002 370004 370005 370006 370008 370009 370010 370011 370012
370013 370014 370015 370016 410001 410002 410003 410005 410704 410802 411501 411603 418000 420001 420002 420201 420202 420203
420204 420205 420303 420401 420403 420602 420701 420702 420801 420802 420803 420804 421001 421101 421302 421303 421404 421501
430101 430102 430103 430104 430201 430701 440001 440002 440003 440501 460002 460003 460004 460005 460006 460007 460008 460009
460011 460012 460013 460014 460015 460017 460019 460022 460025 460026 460027 460029 460030 460031 470001 470002 470003 470005
510001 510002 510003 510004 510005 510499 511299 511399 512299 512399 512499 512599 512699 512799 512899 512999 513099 513399
513499 514799 514899 515099 515199 515299 517499 517999 518099 518299 520197 520198 520199 520299 520599 520899 521199 521399
521499 521599 521699 521999 530193 530194 530195 530196 530197 530198 530199 530299 530399 531799 531899 890001 890002 890003
990005 990007 990008 990009 990012 990013 990014 990021 990022
128 %put &list_idchem_char;
ace_exp ali_exp amo_exp ben_exp bio_exp cau_exp chl_exp cla_exp for_exp hcl_exp hyp_exp iso_exp pho_exp sol_exp sty_exp tol_exp
wpo_exp xyl_exp
129 %put &list_idchem_char_q;
"" "ace_exp" "ali_exp" "amo_exp" "ben_exp" "bio_exp" "cau_exp" "chl_exp" "cla_exp" "for_exp" "hcl_exp" "hyp_exp" "iso_exp"
"pho_exp" "sol_exp" "sty_exp" "tol_exp" "wpo_exp" "xyl_exp"
130 %put &count_idchem;
18
131
132 /* 2. Transpose data and concatenate idchem */
133
134 proc sort data=idnew1;
135 by id_job;
136 run;

NOTE: There were 124695 observations read from the data set WORK.IDNEW1.
NOTE: The data set WORK.IDNEW1 has 124695 observations and 5 variables.
NOTE: PROCEDURE SORT used (Total process time):
real time 0.31 seconds
cpu time 0.31 seconds


137
138 data idnew_tr;
139 set idnew1;
140
141 array flag(&count_idchem) $ _temporary_;
142 by id_job;
143 retain flag;
144
145 if first.id_job then do;
146 count=0;
147 call missing(of flag(*));
148 end;
149
150 count+1;
151
152 do i=1 to dim(flag);
153 /*flag(count)=idchem_char;*/
154 end;
155
156 idchem_list = catx(" ", of flag(*));
157 if last.id_job then output;
158
159 /*drop count i idchem idchem_char id job;*/
160 drop idchem_list list1-list&count_idchem;
161 run;

WARNING: Not all variables in the list list1-list18 were found.
NOTE: There were 124695 observations read from the data set WORK.IDNEW1.
NOTE: The data set WORK.IDNEW_TR has 11058 observations and 7 variables.
NOTE: DATA statement used (Total process time):
real time 0.28 seconds
cpu time 0.28 seconds


162
163 OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
175
ed_sas_member
Meteorite | Level 14

Hi @ak2011 

 

It seems that there are two mistakes in the code:

  • Line 123 of your code, please insert the strip() function as follows:
temp=cat('"',tranwrd(strip("&list_idchem_char")," ",'" "'),'"');
  • Line 152-154 of your code, you have put into comment the wrong part Smiley Happy . It should be:
/*do i=1 to dim(flag);*/
flag(count)=idchem_char;
/*end;*/

idchem_list = catx(" ", of flag(*)); if last.id_job then output; drop count idchem idchem_char id job;
ak2011
Fluorite | Level 6
Thanks very much for your time. Your code works fine with the smaller dataset but there is still a problem with the flag(count)=idchem_char;
Please read log below:
1 OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
72
73
74 /* MACRO APPROACH*/
75
76 libname multi'/folders/myfolders/multisitedata';
NOTE: Libref MULTI refers to the same physical library as _TEMP3.
NOTE: Libref MULTI was successfully assigned as follows:
Engine: V9
Physical Name: /folders/myfolders/multisitedata
77
78 data idnew1(keep=id job idchem idchem_char id_job);
79 set multi.multiworkchempostauto;
NOTE: Data file MULTI.MULTIWORKCHEMPOSTAUTO.DATA is in a format that is native to another host, or the file encoding does not match
the session encoding. Cross Environment Data Access will be used, which might require additional CPU resources and might
reduce performance.
80 /*length idchem_char $7;*/
81 id_job=catx('_', id, job);
82
83 if idchem = 990005 then idchem_char = 'cla_exp';
84 else if idchem = 990021 then idchem_char = 'bio_exp';
85 else if idchem = 520299 then idchem_char = 'ali_exp';
86 else if idchem = 420204 then idchem_char = 'iso_exp'; /* to adapt */
87
88 else if idchem = 430101 then idchem_char = 'ben_exp';
89 else if idchem = 430103 then idchem_char = 'xyl_exp';
90 else if idchem = 430102 then idchem_char = 'tol_exp';
91 else if idchem = 430104 then idchem_char = 'sty_exp'; /* to adapt */
92
93 else if idchem = 460003 then idchem_char = 'sol_exp';
94 else if idchem = 220501 then idchem_char = 'for_exp';
95 else if idchem = 510004 then idchem_char = 'hyp_exp';
96 else if idchem = 210701 then idchem_char = 'amo_exp'; /* to adapt */
97
98
99 else if idchem = 211700 then idchem_char = 'chl_exp';
100 else if idchem = 410002 then idchem_char = 'cau_exp';
101 else if idchem = 411501 then idchem_char = 'pho_exp';
102 else if idchem = 420401 then idchem_char = 'ace_exp'; /* to adapt */
103
104 else if idchem = 211701 then idchem_char = 'hcl_exp';
105 else if idchem = 460004 then idchem_char = 'wpo_exp'; /* to adapt */
106 run;

NOTE: There were 124695 observations read from the data set MULTI.MULTIWORKCHEMPOSTAUTO.
NOTE: The data set WORK.IDNEW1 has 124695 observations and 5 variables.
NOTE: DATA statement used (Total process time):
real time 1.46 seconds
cpu time 1.20 seconds


107
108
109
110
111 /* 1. Generate macrovariables */
112
113
113 ! proc sql noprint;
114 /* Retrieve the list of distinct idchems (num) in a macrovariable : &list_idchem */
115 select distinct idchem into:list_idchem separated by " " from idnew1;
116 /* Retrieve the list of distinct idchems (format) in a macrovariable : &list_idchem_char */
117 select distinct idchem_char into:list_idchem_char separated by " " from idnew1;
118 /* Retrieve the number of distinct idchems in a macrovariable : &count_idchem */
119 select count(distinct idchem_char) into:count_idchem trimmed from idnew1;
120 quit;
NOTE: PROCEDURE SQL used (Total process time):
real time 0.44 seconds
cpu time 0.53 seconds


121
122
122 ! data _null_;
123 length temp $2000;
124 temp=cat('"',tranwrd(strip("&list_idchem_char")," ",'" "'),'"');
125 call symput('list_idchem_char_q',temp);
126 run;

NOTE: DATA statement used (Total process time):
real time 0.00 seconds
cpu time 0.00 seconds


127
128 %put &list_idchem;
110001 110002 110004 110005 110007 110008 110009 110010 110011 110012 110013 110014 110015 110016 110017 110018 110019 110020
110021 110024 110025 110026 110028 110029 110030 110033 110501 111101 111102 111110 111301 111303 111401 111501 111600 112001
112002 112003 112005 112201 112400 112600 112601 112800 112900 113000 113001 114800 115000 118200 118201 118204 118205 130001
130101 140001 140002 140003 140004 140005 140006 140007 140008 140009 140010 140012 140013 140014 140015 140016 140017 140018
145001 150001 150002 150007 150008 150009 150010 150011 150012 150013 150014 150015 150016 150017 150018 150019 150020 150022
150023 150024 150025 150026 150027 150028 150029 150030 160001 170001 170002 170003 170004 170005 170006 170008 170009 201602
210100 210601 210602 210701 210703 210801 210901 211601 211602 211700 211701 211702 220001 220101 220102 220501 220603 221101
221102 221103 221301 225001 260001 260002 270001 310001 310002 310003 310004 311201 311301 312001 312201 312401 312501 312601
312801 312901 313001 314701 314801 315001 317901 318201 370001 370002 370004 370005 370006 370008 370009 370010 370011 370012
370013 370014 370015 370016 410001 410002 410003 410005 410704 410802 411501 411603 418000 420001 420002 420201 420202 420203
420204 420205 420303 420401 420403 420602 420701 420702 420801 420802 420803 420804 421001 421101 421302 421303 421404 421501
430101 430102 430103 430104 430201 430701 440001 440002 440003 440501 460002 460003 460004 460005 460006 460007 460008 460009
460011 460012 460013 460014 460015 460017 460019 460022 460025 460026 460027 460029 460030 460031 470001 470002 470003 470005
510001 510002 510003 510004 510005 510499 511299 511399 512299 512399 512499 512599 512699 512799 512899 512999 513099 513399
513499 514799 514899 515099 515199 515299 517499 517999 518099 518299 520197 520198 520199 520299 520599 520899 521199 521399
521499 521599 521699 521999 530193 530194 530195 530196 530197 530198 530199 530299 530399 531799 531899 890001 890002 890003
990005 990007 990008 990009 990012 990013 990014 990021 990022
129 %put &list_idchem_char;
ace_exp ali_exp amo_exp ben_exp bio_exp cau_exp chl_exp cla_exp for_exp hcl_exp hyp_exp iso_exp pho_exp sol_exp sty_exp tol_exp
wpo_exp xyl_exp
130 %put &list_idchem_char_q;
"ace_exp" "ali_exp" "amo_exp" "ben_exp" "bio_exp" "cau_exp" "chl_exp" "cla_exp" "for_exp" "hcl_exp" "hyp_exp" "iso_exp" "pho_exp"
"sol_exp" "sty_exp" "tol_exp" "wpo_exp" "xyl_exp"
131 %put &count_idchem;
18
132
133 /* 2. Transpose data and concatenate idchem */
134
135 proc sort data=idnew1;
136 by id_job;
137 run;

NOTE: There were 124695 observations read from the data set WORK.IDNEW1.
NOTE: The data set WORK.IDNEW1 has 124695 observations and 5 variables.
NOTE: PROCEDURE SORT used (Total process time):
real time 0.38 seconds
cpu time 0.39 seconds


138
139 data idnew_tr;
140 set idnew1;
141
142 array flag(&count_idchem) $ _temporary_;
143 by id_job;
144 retain flag;
145
146 if first.id_job then do;
147 count=0;
148 call missing(of flag(*));
149 end;
150
151 count+1;
152
153 /*do i=1 to dim(flag);*/
154 flag(count)=idchem_char;
155 /*end;*/
156
157 idchem_list = catx(" ", of flag(*));
158 if last.id_job then output;
159
160 drop count idchem idchem_char id job;
161
162
163 run;

ERROR: Array subscript out of range at line 154 column 1.
id=OSaa03 job=4 idchem=460013 id_job=OSaa03_4 idchem_char= FIRST.id_job=0 LAST.id_job=0 count=19 idchem_list= _ERROR_=1 _N_=54
NOTE: The SAS System stopped processing this step because of errors.
NOTE: There were 55 observations read from the data set WORK.IDNEW1.
WARNING: The data set WORK.IDNEW_TR may be incomplete. When this step was stopped there were 7 observations and 2 variables.
WARNING: Data set WORK.IDNEW_TR was not replaced because this step was stopped.
NOTE: DATA statement used (Total process time):
real time 0.02 seconds
cpu time 0.04 seconds


164
165 OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
177
ak.

hackathon24-white-horiz.png

2025 SAS Hackathon: There is still time!

Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!

Register Now

SAS Enterprise Guide vs. SAS Studio

What’s the difference between SAS Enterprise Guide and SAS Studio? How are they similar? Just ask SAS’ Danny Modlin.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 13 replies
  • 2265 views
  • 0 likes
  • 3 in conversation