my data set has the vars listed below (along with a unique student ID). The values can = . , 1, 2, 3 , 4 or 5. For each observation I only want to keep the scores that are ge 3. Next, I want to create a variable named MET_AP, which will be a string of the names of the exams with a value ge 3 separated by a dash (if more than one occurrence).
Data I Want:
ID MET_AP
1 APbio2015 - APchem2015
2 APcalcBC2015 - APlit2015 - APphys12015
3 APworldhist2015
List of variables with values = . , 1, 2, 3, 4, 5
APbio2015 |
APcalcAB2015 |
APcalcABsub2015 |
APcalcBC2015 |
APchem2015 |
APengl2015 |
APlit2015 |
APenv_scr2015 |
APphysB2015 |
APphys_elect2015 |
APphys_mech2015 |
APphys12015 |
APphys22015 |
APstats2015 |
APworldhist2015 |
APmacro2015 |
APmicro2015 |
APeurohist2015 |
APgovt2015 |
APgovt_US2015 |
APgeogr2015 |
APint_engl2015 |
APhist2015 |
I have to assume that "get rid of the score" means set it to missing. After all, you can't get rid of a variable on one observation unless you get rid of the variable for all observations So that step might look like this:
data want;
set have;
array names {*} list of all variable names to process here;
do _n_=1 to dim(names);
if names{_n_} = 1 then names{_n_} = .O;
else if names{_n_} = 2 then names {_n_} = .T;
end;
run;
That changes the values to missing, but still preserves the original values as being distinct from one another. All the original "1" values are saved as the special missing value .O so you can get the average of all the 3+ values, but retain the knowledge of what the missing values used to be.
Stringing together names is rarely a good idea. With that being said, here is how you might modify the DATA step above to do it:
data want;
set have;
length _3_plus_names $ 500;
array names {*} list of all variable names to process here;
do _n_=1 to dim(names);
if names{_n_} = 1 then names{_n_} = .O;
else if names{_n_} = 2 then names {_n_} = .T;
else if names{_n_} >= 3 then _3_plus_names = catx(' - ', _3_plus_names, vname(names{_n_})) ;
end;
run;
I have to assume that "get rid of the score" means set it to missing. After all, you can't get rid of a variable on one observation unless you get rid of the variable for all observations So that step might look like this:
data want;
set have;
array names {*} list of all variable names to process here;
do _n_=1 to dim(names);
if names{_n_} = 1 then names{_n_} = .O;
else if names{_n_} = 2 then names {_n_} = .T;
end;
run;
That changes the values to missing, but still preserves the original values as being distinct from one another. All the original "1" values are saved as the special missing value .O so you can get the average of all the 3+ values, but retain the knowledge of what the missing values used to be.
Stringing together names is rarely a good idea. With that being said, here is how you might modify the DATA step above to do it:
data want;
set have;
length _3_plus_names $ 500;
array names {*} list of all variable names to process here;
do _n_=1 to dim(names);
if names{_n_} = 1 then names{_n_} = .O;
else if names{_n_} = 2 then names {_n_} = .T;
else if names{_n_} >= 3 then _3_plus_names = catx(' - ', _3_plus_names, vname(names{_n_})) ;
end;
run;
Try this:
data want;
set have;
array scores APbio2015 -- APhist2015;
do i = 1 to dim(scores);
if scores{i} >= 3 then MET_AP = catx(" - ", MET_AP, vname(scores{i}));
end;
keep ID MET_AP;
run;
(untested)
You can use (as already suggested by others) the CATX function to string the names - but I would much prefer to use CALL CATX, for two reasons:
So, a solution could be something like this:
data want; set have; array scores(*) APbio2015--APhist2015; length MET_AP $300; do _N_=1 to dim(scores); if scores(_N_)>=3 then call catx(' - ',MET_AP,scores(_N_)); end; keep ID MET_AP; run;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.