BookmarkSubscribeRSS Feed
Bal23
Lapis Lazuli | Level 10

I have four datasets, child, adult, senior and over100, each one has over 10000 obs.

the orginal code is very simple

 

proc sort data=bpchild;
by id;
run;
PROC means DATA= bpchild;
var wvr_:;
run;
proc contents data= bpchild;
run;

 

 I just need to replace child with adult, senior, over100.

would anybody teach me how to use macro do this? Thanks.

38 REPLIES 38
Bal23
Lapis Lazuli | Level 10

Thanks. I am reading it. It takes while for me to finish.

Bal23
Lapis Lazuli | Level 10

@cynthiaI have read the note from the link you provided however, I still do not know how to apply it to solve my problem.

As Reeza said below (that is originally a kind of my plan, but not in details, as he provided). however, I still have trouble to do it

 

  1. %macro summarize;
    
    *your sas code;
    
    %mend;
  2. %macro sum; proc sort data=bp2∑ by patientid; run; PROC means DATA=bp2∑ var wvr_:; run; %mend; data want; set bp2∑ run; %sum(child, adult, senior, over100)
  3. Test that it works by calling it. 
    %summarize;
  4. Change macro to use a parameter  
    %macro summarize(dataset);
  5. Change variable name in code to be parameter value, eg:
    proc means data=&dataset;
    ...
    run;
  6. Test again 
    %summarize(bpchild);
  7. Make sure it works for multiple cases.
Reeza
Super User

Don't. 

 

Append the files together, use the INDSNAME to identify each source data set and then run a proc means with a BY statement.

This is more efficient. Datasets of 100K aren't big to SAS.

 

data all;
set child adult senior over100 indsname=source;
dsetname=source;
run;

proc sort data=all;
by dsetname id;
run;

proc means data=all;
by dsetname;
var wvr:;
run;

Bal23
Lapis Lazuli | Level 10

Thanks. But my work is to generate four tables, not a whole picture. Actually, I do  have the original dataset, which includes them all. But I am requried to generate four tables.

RW9
Diamond | Level 26 RW9
Diamond | Level 26

Please clarify on this sentance:

"But I am requried to generate four tables."

Why are you required to generate four tables.  The reason I ask is that @Reeza has provided code which is both simple to read, and efficient in terms of execution.  So the question is, why does it then need to be four tables when further procedures can also use by group processing on the data, for instance to report out four tables:

ods tagsets.excelxp file="abc.xlsx" options=(sheet_interval=...);

proc report data=...

  by dsetname;

...

run;

ods tagsets.close;

 

The thing is, your taking data which can be operated on as one step, splitting it up, then lopping over it each time to do the same thing, and you will find once you split the data every step thereafter than needs to loop over these, which makes life more complicated.

Bal23
Lapis Lazuli | Level 10

Thanks.

First,my work is, after I generated these four tables, I need to compare the frequency of all those variables. For example, if I have 2o variables, I need to compare the frequency of each variable, and then list them according to the frequency. I prefer to have a small one so that I can compre by looking at the list of variables directly. I do not know how to use sas to do that step. if you can give me advice, that will be helpful. This is why I ask for four tables seperatedly

 

Second, I am not familiar with the following code

indsname=source;
dsetname=source

 I searched online. I feel that I should list variable names there. I have many variables. Then how should I list. What is the difference between indsname and dsetname here? I cannot figure out what 'source' means here.

 

 

RW9
Diamond | Level 26 RW9
Diamond | Level 26

Could you provide test data in the form of a datastep, so that we can provide some code, just guessing from the text here.  Help on the indsname option:

http://www.amadeus.co.uk/sas-training/tips/1/1/106/the-indsname-option.php

 

Basically you would create a variable which holds the name of the dataset that data comes from, in the example given by @Reeza, this would be child or adult etc.

This variable is then used as a bygroup, you can also do frequencies and other things using by group.

Bal23
Lapis Lazuli | Level 10

I have searched online and found the link you provided. Unfortunately, with my current computer, I could not access, maybe because of setting problems.

Bal23
Lapis Lazuli | Level 10
*Create sample data;
data childhave;
input patientid var_hed var_eye var_col var_leg var_hea var_lea var_alh;
cards;
371130 1 . . . . . 1

214144 . 1 . . . . . 

181841 . . . . 1.1

168065 . . 1....
779199 . . . 1 1 .
155981 1 . 1 . . . .
603224 . . . . . 1 .

;

run;

this is my test data, for one dataset, child group

thanks

Reeza
Super User

@Bal23 Have you tried the steps I've outlined. Besides writing the code letter for letter for you, I'm not sure how else to explain it. 

 

If you're having trouble, post the code you've tried.

 

You've also stated you want 4 tables, the BY does generate 4 separate tables. 

Bal23
Lapis Lazuli | Level 10

As I said, I do not understand how to use this

indsname=source;
dsetname=source;

 I did search online, I got the same link that I could not access

 

Then, I provided my sample data a few minutes ago. Please check and it will be great if you can provide help regarding my sample data, since I do not know how to modify your above code

 

thanks.

Reeza
Super User

indsname is a dataset option. It identifies records by creating a variable that stores the name of the data set that provides the record. 

However, that variable isn't stored automatically, so you need to reassign it in your data step.

 

It is only valid in SAS 9.3+

 

INDSNAME=variable
creates and names a variable that stores the name of the SAS data set from which the current observation is read. The stored name can be a data set name or a physical name. The physical name is the name by which the operating environment recognizes the file.
Tips	For data set names, SAS adds the library name to the variable value (for example, WORK.PRICE) and converts the two-level name to uppercase.
Unless previously defined, the length of the variable is set to 41 bytes. Use a LENGTH statement to make the variable length long enough to contain the value of the physical filename if the filename is longer than 41 bytes.
If the variable is previously defined as a character variable with a specific length, that length is not changed. If the value placed into the INDSNAME variable is longer than that length, then the value is truncated.
If the variable is previously defined as a numeric variable, an error will occur.
The variable is available in the DATA step, but the variable is not added to any output data set.
Example	Retrieving the Name of the Data Set from Which the Current Observation Is Read

http://support.sas.com/documentation/cdl/en/lestmtsref/68024/HTML/default/viewer.htm#p00hxg3x8lwivcn...

Bal23
Lapis Lazuli | Level 10
*Create sample data;
data childhave;
input patientid var_hed var_eye var_col var_leg var_hea var_lea var_alh;
cards;
371130 1 . . . . . 1

214144 . 1 . . . . . 

181841 . . . . 1.1

168065 . . 1....
779199 . . . 1 1 .
155981 1 . 1 . . . .
603224 . . . . . 1 .

;

run;

@Reeza,  I have read this. Without any example, it is hard for me to do. (i did click but was not able to get any example).

Therefore, I copied and pasted my sample data set again  and hope to get sample code from you

 

Thanks

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 38 replies
  • 3571 views
  • 2 likes
  • 5 in conversation