BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.

Hi all:

 I  need to create 2 macro variable - var1  and var2 from excel sheet , so when I run ds1 , ds2 ,ds3 in macro ,I get corresponding &var1 ,&var2.

any suggestion would be appreciated. 

Thank you.

Purple

 

%macro want;

data &ds;

set &ds;

xxxx;

run;

%mend; 

%want

 

/*-------------*/

%put &var1;

/*log shows*/

(when run macro want, ds=ds1) var1=a b c d;
 var1=s r;
 var1=c w t ;

 

%put &var2;

/*log shows*/

(when run macro want, ds=ds1)  var2=o_a o_b o_c o_d ;

var2=o_s o_r;

var2=o_c o_w o_t;

/*------------*/

1 ACCEPTED SOLUTION

Accepted Solutions
Patrick
Opal | Level 21

@purpleclothlady It might be worth to investigate what Proc Compare could do for you.

 

Below just some sample code to get you started. Proc Compare got a lot of options that allow you to get what you want - and besides of an out= option you can also always use ods to capture results and store it in a table for further processing and shape the data exactly as you want them.

data new;
  input id ds $ ta $ f $ ccc $ z y $;
  datalines;
101 ds2 la cah  tx  15 s
;

data old;
  input id ds $ o_ta $ o_f $ o_ccc $ o_z o_y $;;
  datalines;
101 ds2 al cah tx  1 s
;

proc compare base=new comp=old;
  by id;
  var ds ta f ccc;
  with ds o_ta o_f o_ccc;
run;

 

Patrick_0-1667006981840.png

 

View solution in original post

12 REPLIES 12
ballardw
Super User

I am sorry but I do not understand what you expect.

 

Many users here don't want to download Excel files because of virus potential, others have such things blocked by security software. Also if you give us Excel we have to create a SAS data set and due to the non-existent constraints on Excel data cells the result we end up with may not have variables of the same type (numeric or character), name and even values.

 

Instructions here: https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat... will show how to turn an existing SAS data set into data step code that can be pasted into a forum code box using the </> icon or attached as text to show exactly what you have and that we can test code against. Or write your own small example data step.

 

Warning: Use of the same data set as input and output such as you have in your macro:

data &ds;
   set &ds;
  <other code>
run;

can cause problems as the code completely replaces your &ds data set. So a minor logic error can corrupt data a bit, a major logic error can result in an empty data set depending on what the other code actually does.

This can be compounded in macros as unless you set options like MPRINT you don't typically get as much information in the LOG and don't see what happened. And after the first pass replacing the data set you may have to go back in process to rebuild the original input.

 

Tom
Super User Tom
Super User

You need to provide much more information about what you want the code to do.

 

Also why did you go to trouble to create an XLSX file and then upload it to the forum to share 6 values?

 

You could have just typed (or pasted) them into your question and not only would have been easier for you it would have been clearer.

dsname varlist
ds1 a b c d
ds2 b r  
ds3 c w t 

So what is it want to do with this list of dataset names and corresponding variable lists?

purpleclothlady
Pyrite | Level 9

hi @Tom @ballardw  and all :

I rewrote the code in SAS , thank you for helping.

Purple

 

data new;
input ds $ a $ b $ c d $ e f $;
datalines;
ds1 la cah 10 tx  15 o
run;

data new2;
input ds $ ta $ f $ ccc $ z y $;
datalines;
ds2 la cah  tx  15 s
run;

data old2;
input ds $ o_ta $ o_f $ o_ccc $ o_z o_y $;;
datalines;
ds2 al cah tx  1 s
run;

/*QUESTION: how to create a program
which KEEP=  a list of variable from new and old by using macro variable 
because we have multiple datasets with different number of variables
ex new1 new2 
   old1 old2

THE GOAL is to compare new and old version and find out in KEEP= the difference*/

data new;
set new (keep= a b c d);
run;

data old;
set old (keep=o_a o_b o_c o_d);
run;

data new2;
set new2 (keep= ta f y);
run;

data old2;
set old2 (keep=o_ta o_f o_y);
run;


/*So that I may use these macro variables to compare version difference by all KEEP=var */
data dif;
  merge new (in=new)
       old (in=old);
	by id;
	      if new and not old 	then _cat ="New Record";
		if old and not new 	then _cat ="Deleted Record";
	     if new and old 		then _cat ="Both";
			
		    if _cat^="Both"   then _catf=_cat;
				
				array vnew{*} &varnew;	 /*a b c d*/
				array vold{*} &varold;	 /*o_ta o_f o_y*/
				array v3{*}   &ck;      /*ck_a ck_b ck_c ck_d*/
						

				do i = 1 to dim(vold);
						if _cat ="Both" and vnew{i} ne vold{i} then do;
							_catf ="Revised";
							v3{i}  = 1;
						end;									
				end;
				drop i old_:;
		run;
Tom
Super User Tom
Super User

I am really have a hard time understanding what application this is actually for.

At first is sounds like you want a way to generate a list of variables to keep, so perhaps to generate the KEEP= dataset option.

Such as in code like:

 

data want;
  merge ds1 (keep= a x y z )
        ds2 (keep= a b c )
  ;
by a ; run;

For which you would need a dataset like:

data keeplist;
  input dsname :$32. idvar :$32. varlist $200.;
cards;
ds1 a x y z
ds2 a b c
;

Or perhaps better would be to normalize the structure like this:

data keeplist;
  input dsname :$32. idvar :$32. varname :$32.;
cards;
ds1 a x
ds1 a y
ds1 a z
ds2 a b 
ds2 a c
;

Then it is not hard to generate the code from that data:

filename code temp;
data _null_;
  set keeplist end=eof;
  by dsname ;
  file code ;
  if _n_=1 then put 'data want;' / 'merge' ;
  if first.dsname then put dsname '(keep=' idvar @;
  put varname @;
  if last.dsname then put ')';
  if eof then put ';'/'by ' idvar ';'/ 'run;' ;
run;
%include code / source2;

 

purpleclothlady
Pyrite | Level 9

hi @Tom and all:

I got the answer of the question, hopefully the code explained the goal. have a great day 

 

/*create data sets*/
/*keepvar is required by doc*/
%let keepvar= ta f ccc;
%let keepvarold= o_ta o_f o_ccc;

data new;
input id ds $ ta $ f $ ccc $ z y $;
datalines;
101 ds2 la cah  tx  15 s
run;
/*keep the required var.*/
data newf;
set new (keep=id ds &keepvar.);
run;
data old;
input id ds $ o_ta $ o_f $ o_ccc $ o_z o_y $;;
datalines;
101 ds2 al cah tx  1 s
run;	
data oldf;
set old (keep=id ds &keepvarold.);
run;

/*add prefix to both new and old variables for compare the difference*/
data names;
set sashelp.vcolumn;
where libname='WORK' and memname='NEWF' and name not in ("ID","DS");
new_name = catt("NEW_", _n_);
keep new_name name;
run;

proc sql noprint;
select catx("=", name, new_name) into :rename_list separated by " "
from names;
quit;

proc datasets lib=work nodetails nolist;
modify newf;
rename &rename_list;
run;quit;

data oldnames;
set sashelp.vcolumn;
where libname='WORK' and memname='OLDF' and name not in ("ID","DS");
old_name = catt("OLD_", _n_);
keep OLD_name name;
run;

proc sql noprint;
select catx("=", name, old_name) into :oldrename_list separated by " "
from oldnames;
quit;

proc datasets lib=work nodetails nolist;
modify oldf;
rename &oldrename_list;
run;quit;

proc sort data=oldf;by id ds;run;
proc sort data=newf;by id ds;run;

/*Merge oldf and newf to find out WHICH VARIABLE is different by using array*/
data adif ;
		length _cat _catf $15.;
			merge oldf(in = old )
				    newf(in = new );
			by id ds;		

				if new and not old 	then _cat ="New Record";
				if old and not new 	then _cat ="Deleted Record";
				if new and old 			then _cat ="Both";
		
        array vcold{*} old_:;
				array vcnew{*} new_:;
				array ckc  {*} ckc1 -ckc3;
						

				do i = 1 to dim(vcold);
						if _cat ="Both" and vcnew{i} ne vcold{i} then do;
							_catf ="Revised";
							ckc{i}  = 1;
						end;									
				end;
  drop i old_:;
run;

purpleclothlady_0-1666988829794.png

 

Patrick
Opal | Level 21

@purpleclothlady It might be worth to investigate what Proc Compare could do for you.

 

Below just some sample code to get you started. Proc Compare got a lot of options that allow you to get what you want - and besides of an out= option you can also always use ods to capture results and store it in a table for further processing and shape the data exactly as you want them.

data new;
  input id ds $ ta $ f $ ccc $ z y $;
  datalines;
101 ds2 la cah  tx  15 s
;

data old;
  input id ds $ o_ta $ o_f $ o_ccc $ o_z o_y $;;
  datalines;
101 ds2 al cah tx  1 s
;

proc compare base=new comp=old;
  by id;
  var ds ta f ccc;
  with ds o_ta o_f o_ccc;
run;

 

Patrick_0-1667006981840.png

 

purpleclothlady
Pyrite | Level 9

@Patrick thank you for the suggestion. It works for simple compare, you are right. I need to explore out= options to output the difference. again this is a great place to learn and share .

 

purple

Tom
Super User Tom
Super User

That does not describe WHAT you want to do nor WHY you want to do it.  It might describe HOW you are trying to do something.   This really looks like an XY problem

 

If you can explain the actual problem you are trying to solve, the data you have (and perhaps since you seem have some type of metadata about the data you want to use to generate a variable list from then describe the metadata you have) and the output you want to get from that input data.  Once that is clear then we can begin to explore optional methods to implement the actual problem.

 

This comment in the log looks like it might be an attempt at a description:

how to create a program which KEEP= a list of variable from new and old by using macro variable
because we have multiple datasets with different number of variables (ex new1 new2 old1 old2)

THE GOAL is to compare new and old version and find out in KEEP= the difference

 

But KEEP= is a dataset option. So that seems to be more about how to implement a particular solution and now a description of the actual problem.  How does the dataset option KEEP= relate to actual problem?

 

What differences are you trying to detect?   Since you seem to want to select only some of the variables (that is what the KEEP= dataset option is for) what is the criteria for deciding which variables you want to keep?  Is the goal to identify the variables that exist in both datasets?

 

What is meant by "compare new and old version"?  Are you implying that you want to two datasets that are supposed to have the same information to detect the differences between them?  How is the program supposed to know which two datasets to compare?  Are you going to pass that information into the program?  Is it just going to guess to pick OLD1 and NEW1 because the names both end in 1?  Is it some how related the DS variable you had some of the datasets?  But none of the values of that DS variable in any of the example data you provided matches the names of any of the other sample datasets you provided. 

 

Any way if you want to compare datasets use PROC COMPARE.

If you want to extend the functionality of PROC COMPARE to make a nice print out of observations that appear in only one of the two datasets then do that separately.  You could use this 25 year old macro from the old master Tom Hoffman.

%macro compds
/*----------------------------------------------------------------------
Conpare two data sets.
----------------------------------------------------------------------*/
(old=      /* Original data set */
,new=      /* Changed data set */
,keys=     /* List of variables that uniquely identify a record */
);

/*----------------------------------------------------------------------
This code was developed by HOFFMAN CONSULTING as part of a FREEWARE
macro tool set. Its use is restricted to current and former clients of
HOFFMAN CONSULTING as well as other professional colleagues. Questions
and suggestions may be sent to TRHoffman@sprynet.com.
-----------------------------------------------------------------------
Usage:

one or more usage examples
------------------------------------------------------------------------
Notes:

additional information
-----------------------------------------------------------------------
History:

15MAR98 TRHoffman Creation
----------------------------------------------------------------------*/
%local macro parmerr lkey;
%let macro=compds;

/*  You can get PARMV macro from github
%*----------------------------------------------------------------------
Validate macro parameters
-----------------------------------------------------------------------;
%parmv(OLD,_req=1)
%parmv(NEW,_req=1)
%parmv(KEYS,_req=1,_words=1)
%if (&parmerr) %then %goto quit;
*/

%*----------------------------------------------------------------------
Replace call to the old PARSEM() macro
-----------------------------------------------------------------------;
%let lkey=%scan(&keys,-1);

data
  old
  new
  deleted
  added
  ;
  set
    &old(in=in1)
    &new
  ;
  by &keys;
  if ^(first.&lkey*last.&lkey) then do;
    if (in1) then output old;
    else output new;
  end;
  else if (in1) then output deleted;
  else output added;
run;

proc print data=added;
  title3 "Records in &new but not in &old";
run;

proc print data=deleted;
  title3 "Records in &old but not in &new";
run;

proc compare data=old compare=new;
  id &keys;
  title3 "OLD=&old - NEW=&new";
run; title3;

%quit:
%mend compds;

If you want proc compare to show a list of the variables that are not in both datasets then check out the LISTALL option of the PROC COMPARE statement. 

purpleclothlady
Pyrite | Level 9
Hi @Tom:
Thanks for the code. Let me explain the problem and goal again.
***Goal:
To compare a dataset extracted at different time (2 versions) to show
which record is (1)newly added, (2)deleted or (3) revised
***Problem:
1. There are about 20 datasets. each dataset must only check same variables against specification (dataset created from Data management had MORE variables than needed).
2. Variables in each dataset has different names . eg Test1- has 10 variables , Test2- has 21 variables , Test 3-has 11 variables.
3. Output the result in excel by different dataset name. the excel sheet should highlight (1)newly added, (2)deleted or (3) revised with different colors
4. for "Revised" , need to identify WHICH variable is revised by highlighting different color in excel cell.
eg. Test1- variable Score is different for 2 versions New (score=10) and Old(score=15) then in excel that cell will be highlighted with a color.
5. All data sets will use unique IDs -ID and RECORD ID to compare.

Thank you again.
Purple.



purpleclothlady
Pyrite | Level 9

and this is the sample output format.

 

Capture.JPG

 

Tom
Super User Tom
Super User

So you have TWO problems.

 

The main one is how to compare two datasets.  

 

The second one is how to generate calls to the first step bases on metadata about the list of datasets to compare.

 

So you should be able to solve the first problem by creating a macro that takes as input the dataset names.  So something like Tom Hoffman's %COMPDS() macro.  But perhaps with an added parameter to list of variables to compare.

%macro compds
/*----------------------------------------------------------------------
Conpare two data sets.
----------------------------------------------------------------------*/
(old=      /* Original data set */
,new=      /* Changed data set */
,keys=     /* List of variables that uniquely identify a record */
,vars=     /* List of variables to compare */
);
...

Then you can just use your metadata with the lists of datasets and variables to drive generating calls to the macro.

 

purpleclothlady
Pyrite | Level 9

@Tom and all:

I took all above suggestions and figured out. 

thanks all ,this is a great learning and sharing place.

purple

hackathon24-white-horiz.png

2025 SAS Hackathon: There is still time!

Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!

Register Now

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 12 replies
  • 4098 views
  • 3 likes
  • 4 in conversation