I have four datasets, child, adult, senior and over100, each one has over 10000 obs.
the orginal code is very simple
proc sort data=bpchild;
by id;
run;
PROC means DATA= bpchild;
var wvr_:;
run;
proc contents data= bpchild;
run;
I just need to replace child with adult, senior, over100.
would anybody teach me how to use macro do this? Thanks.
Thanks. I am reading it. It takes while for me to finish.
@cynthiaI have read the note from the link you provided however, I still do not know how to apply it to solve my problem.
As Reeza said below (that is originally a kind of my plan, but not in details, as he provided). however, I still have trouble to do it
%macro summarize;
*your sas code;
%mend;
%macro sum; proc sort data=bp2∑ by patientid; run; PROC means DATA=bp2∑ var wvr_:; run; %mend; data want; set bp2∑ run; %sum(child, adult, senior, over100)
%summarize;
%macro summarize(dataset);
proc means data=&dataset;
...
run;
%summarize(bpchild);
Don't.
Append the files together, use the INDSNAME to identify each source data set and then run a proc means with a BY statement.
This is more efficient. Datasets of 100K aren't big to SAS.
data all;
set child adult senior over100 indsname=source;
dsetname=source;
run;
proc sort data=all;
by dsetname id;
run;
proc means data=all;
by dsetname;
var wvr:;
run;
Thanks. But my work is to generate four tables, not a whole picture. Actually, I do have the original dataset, which includes them all. But I am requried to generate four tables.
Please clarify on this sentance:
"But I am requried to generate four tables."
Why are you required to generate four tables. The reason I ask is that @Reeza has provided code which is both simple to read, and efficient in terms of execution. So the question is, why does it then need to be four tables when further procedures can also use by group processing on the data, for instance to report out four tables:
ods tagsets.excelxp file="abc.xlsx" options=(sheet_interval=...);
proc report data=...
by dsetname;
...
run;
ods tagsets.close;
The thing is, your taking data which can be operated on as one step, splitting it up, then lopping over it each time to do the same thing, and you will find once you split the data every step thereafter than needs to loop over these, which makes life more complicated.
Thanks.
First,my work is, after I generated these four tables, I need to compare the frequency of all those variables. For example, if I have 2o variables, I need to compare the frequency of each variable, and then list them according to the frequency. I prefer to have a small one so that I can compre by looking at the list of variables directly. I do not know how to use sas to do that step. if you can give me advice, that will be helpful. This is why I ask for four tables seperatedly
Second, I am not familiar with the following code
indsname=source;
dsetname=source
I searched online. I feel that I should list variable names there. I have many variables. Then how should I list. What is the difference between indsname and dsetname here? I cannot figure out what 'source' means here.
Could you provide test data in the form of a datastep, so that we can provide some code, just guessing from the text here. Help on the indsname option:
http://www.amadeus.co.uk/sas-training/tips/1/1/106/the-indsname-option.php
Basically you would create a variable which holds the name of the dataset that data comes from, in the example given by @Reeza, this would be child or adult etc.
This variable is then used as a bygroup, you can also do frequencies and other things using by group.
I have searched online and found the link you provided. Unfortunately, with my current computer, I could not access, maybe because of setting problems.
*Create sample data;
data childhave;
input patientid var_hed var_eye var_col var_leg var_hea var_lea var_alh;
cards;
371130 1 . . . . . 1
214144 . 1 . . . . .
181841 . . . . 1.1
168065 . . 1....
779199 . . . 1 1 .
155981 1 . 1 . . . .
603224 . . . . . 1 .
;
run;
this is my test data, for one dataset, child group
thanks
@Bal23 Have you tried the steps I've outlined. Besides writing the code letter for letter for you, I'm not sure how else to explain it.
If you're having trouble, post the code you've tried.
You've also stated you want 4 tables, the BY does generate 4 separate tables.
As I said, I do not understand how to use this
indsname=source;
dsetname=source;
I did search online, I got the same link that I could not access
Then, I provided my sample data a few minutes ago. Please check and it will be great if you can provide help regarding my sample data, since I do not know how to modify your above code
thanks.
indsname is a dataset option. It identifies records by creating a variable that stores the name of the data set that provides the record.
However, that variable isn't stored automatically, so you need to reassign it in your data step.
It is only valid in SAS 9.3+
INDSNAME=variable
creates and names a variable that stores the name of the SAS data set from which the current observation is read. The stored name can be a data set name or a physical name. The physical name is the name by which the operating environment recognizes the file.
Tips For data set names, SAS adds the library name to the variable value (for example, WORK.PRICE) and converts the two-level name to uppercase.
Unless previously defined, the length of the variable is set to 41 bytes. Use a LENGTH statement to make the variable length long enough to contain the value of the physical filename if the filename is longer than 41 bytes.
If the variable is previously defined as a character variable with a specific length, that length is not changed. If the value placed into the INDSNAME variable is longer than that length, then the value is truncated.
If the variable is previously defined as a numeric variable, an error will occur.
The variable is available in the DATA step, but the variable is not added to any output data set.
Example Retrieving the Name of the Data Set from Which the Current Observation Is Read
*Create sample data;
data childhave;
input patientid var_hed var_eye var_col var_leg var_hea var_lea var_alh;
cards;
371130 1 . . . . . 1
214144 . 1 . . . . .
181841 . . . . 1.1
168065 . . 1....
779199 . . . 1 1 .
155981 1 . 1 . . . .
603224 . . . . . 1 .
;
run;
@Reeza, I have read this. Without any example, it is hard for me to do. (i did click but was not able to get any example).
Therefore, I copied and pasted my sample data set again and hope to get sample code from you
Thanks
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.