If you know the variables and the naming structure is pretty fixed you have a few options that I see:
1. Generate a dynamic list using a macro approach
2. Select the names using pattern matching from the list, assumed fixed pattern of table2_COUNTRY_YEAR_STATISTIC. Use SCAN() to extract those portion of the name.
3. Reformat the data such that it's in a long/longer format and then filter/manipulate it.
For the long run, I suspect #3 is better because the countries will be rows rather than in the variable name.
Approach #2 - note the index (2/4) may be wrong, you should test this.
proc sql noprint;
select name into :name_list separated by " "
from sashelp.vcolumn where memname = 'WORLDDATA'
and lower(scan(name, 2, "_.")) in ('use', 'canada', 'japan')
and lower(scan(name, 4, "_.")) in ('population', 'crimerate', 'gap');
quit;
%put &name_list.;
data want;
set worldData;
keep &name_list;
run;
For an example of #3 I'd need more information on the sample structure.
@RogerJones wrote:
I have a very large dataset with over 1000 columns, with column names formatted like this:
WORLDDATA.table2_usa_2017_population
WORLDDATA.table2_japan_2017_gnp
I only need to keep a subset of these parameters for a select few countries. I specify the custom lists:
%let list1 = usa canada uk japan southafrica;
%let list2 = population crimerate gnp;
How do I do a double for loop like so:
param_list = []
for (i in list1) {
for (j in list2) {
param_name = WORLDDATA.table2_{list1[i]}_2017_{list2[j]}
param_list.append(param_name)
}
}
such that I can use param_list in
data final_dataset;
set WORLDDATA.table2;
keep {param_list};
run;
Thank you!