Hi ,
I am fairly new to SAS.
I have a table in below format.
Table Name - main_table
M1 M2 M3 ..... M200 year
1 45 32 ..... 45 2001
6 4 23 ..... 48 2002
..... .....................................
..........................................
The table has 200 columns that are from M1 to M200 & last column is year column.
All these 200 columns have numeric values from 1 to 50.
I already have a proc sql that does job for me to get counts for one M column.
I want to iterate this proc sql for all columns in this table.
Below is proc sql i have -
proc sql;
create table M1 as
SELECT
M1,
SUM(CASE WHEN (year='2001') THEN 1 ELSE 0 END) AS M1_2001,
SUM(CASE WHEN (year='2002') THEN 1 ELSE 0 END) AS M2_2002,
...
...
from main_table
group by M1
order by M1
ASC;
Quit;
Please help me with iterating this proc sql through for all M1 to M200.
The sql query is giving me results i needed just for one column. Need those results for all columns.
Thanks
You are going about it the hard way. Something like this is a lot easier:
proc summary data = main_table nway;
  class year;
  var M1 - M200;
  output out = want
         sum = 
         ;
run;It gives me error as -
ERROR: Variable M1 in list does not match type prescribed for this list.
ERROR: Variable M2 in list does not match type prescribed for this list.
so on.. for all variables.
Are you wanting to count the M variables rather than sum them?
If so how to you want them counted - by each distinct value of M? Please supply sample input data and wanted output data.
Yes , I want them counted by distinct values of M.
Basically all my proc sql doing is - it is creating catching how many times a number which is 1 to 50 came for M1 in every year.
So my output data would look something like this for the proc sql i have now for M1.
Index 2001 2002 2003 ....
1 25 34 67
2 10 45 88
3
.
.
50
That means 1 came 25 times for 2001 , 2 came 88 times in 2003 & so on.
please note this is just for M1 using individual proc sql i provided in question.
I want it to create a table like above for all M1 to M200.
So basically it should create 200 tables.
You likely want this, but it's a single table. You'll need to reformat it for your final outcome, though I suspect this will be easier to work with overall.
proc freq data=have;
table year* (M1 - M200) / out=want;
run;
Then you can manipulate the 'want' data set to get your desired output, likely via PROC TRANSPOSE.
@pa4 wrote:
It gives me error as -
ERROR: Variable M1 in list does not match type prescribed for this list.
ERROR: Variable M2 in list does not match type prescribed for this list.
so on.. for all variables.
This means that your statement
All these 200 columns have numeric values from 1 to 50.
Is false. You have character values that contain digit characters. You do not have numeric values.
Hint: Provide a data set with 10 rows of your data and only 3 of the M variables with values of only 1 to 3 or 5 plus year. That should be small enough you can completely work that data by hand. Show us what the result should be. Make sure the variables have somewhat different values in each row and column as an example with identical rows/columns might be possibly solved with approaches that don't work with general data.
Also, does the result need to be a data set, something that will be fed into another procedure, or a report that people will read?
Hello ,
sorry for the delayed response.
I have added an attachment here that show input and output data.
In the input data - there are 4 M variables & year . Each M contains value either from 10,20,30,40,50.
Now on this input data column M1 if i use proc sql query I provided earlier , I get output data ( which is basically a table for M1 only)
I need to create such tables for M2, M3 & M4 in one go without writing this query 3 more times.
These tables will be later exported into excel.
Any help on this will be great.
Thanks
Can you elaborate on why the "sum =" is open-ended. Thanks.
First a general suggestion: create your own thread and reference this one with your questions. One reason is that as the creator of a thread you have the ability to indicate an answer is "correct". Second is to not confuse issues as to why you may be asking questions about things. Third, it is a good idea to copy and insert text or code that you want to question into a text box opened on the forum the </> so we can see which specific text or code you are questioning. Some of these thread can have similar code and that helps create a specific reference.
In Procs Summary or Means if you use an OUTPUT statement as in (see the text box)
proc summary data = main_table nway;
  class year;
  var M1 - M200;
  output out = want
         sum = 
         ;
run;If you request a single statistic, could be any of those available from the statistic but Sum= is used here, then every variable referenced on the VAR statement will have that statistic value and the name of the variable will be that of the variable of the on the VAR statement. The general form is statistic (variable list)=(output variable name list).
The output statements, yes you may have more than one if you want more than one output data set, for Proc Summary/Means can get fairly complex with ID group options, minimum id options, maximum id options, autoname for naming.
@sas_user_1001 wrote:
Can you elaborate on why the "sum =" is open-ended. Thanks.
The actual WHY is the programmers prerogative. Probably didn't want to create 200 new variable names.
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.
