You define the tables and columns to change here:
data excel;
infile datalines dsd;
input dsname :$41. colname :$32.;
datalines;
class,sex
class,age
class,name
sashelp.air,air
;
Column DSNAME contains either <libref>.<table name> or <table name>
If you only pass in <table name> then the code will use WORK as the libref
Whatever you define here must actually exist (libref, table and column) without typo AND the library must be assigned. If it doesn't exist or the library is not assigned then the code won't fail but the SQL will not return any rows (as there was no match to what's in dictionary table dictionary.columns).
The SQL code and sample data for table EXCEL I've posted will pick-up SASHELP.AIR
SASHELP.AIR is a SAS provided table that should exist in any SAS environment.
If you also execute the data step at the beginning in comment then the SQL will also pick-up WORK.CLASS
"what is the use of code gen file can you please explain your code."
The data _null_ step generates code using PUT statements writing this code to a file in WORK.
filename codegen temp;
Keyword TEMP in the filename statement makes this a temporary file in WORK.
Below statement then reads the generated code back into the SAS program so that it gets executed. The SOURCE2 option instructs SAS to write the %included code also to the SAS log.
%include codegen / source2;
With the code I've posted that's what you'll see in the SAS log.
105 %include codegen / source2; NOTE: %INCLUDE (level 1) file CODEGEN is file <path and filename> 106 +data out.AIR ; 107 + set SASHELP.AIR ; 108 + AIR =9999; 109 +run;
And last but not least: You need to define a valid path for your environment to execute the code without error - or you use the libname definition in comment instead - but that's just for function testing and you shouldn't use it for your actual execution.
/* define library to store the tables with masked columns
- do not overwrite the source tables
*/
%let outlib=out;
/*libname &outlib "%sysfunc(pathname(work))";*/
libname &outlib "<path name for output library";
@Kadz_sas1990 wrote:
@Patrick
and also the code is copying all the datasets in library(excel sheet) to output path, but I need to copy any one or two of the datasets in excel sheet, how it can be done. just if I specify the name of the dataset it should copy to output path
Not sure that I understand your question.
The code will create a copy for any data set that you provide as control data via your excel sheet if/and only if at least one of the column names you provide actually exists in the source table.
It will not copy tables from a library if you don't provide the table name. - so just make sure that you don't have tables in your Excel sheet that you don't want copied.
@Kadz_sas1990 wrote:
actually i need to copy all datasets using a single program, this program is not copying the datasets without masked column, if there is no masking column it should copy the dataset as it is. i.e. without masking any column it should simply use SET function.
/*data class;*/
/* set sashelp.class;*/
/*run;*/
data excel;
infile datalines dsd truncover;
input dsname :$41. colname :$32.;
datalines;
class,sex
class,age
class,name
sashelp.air,air
sashelp.cars
sashelp.doesNotExist
;
proc sql;
create table colsToMask as
/* tables with matching col that actually exist in a currently assigned library */
select c.libname, c.memname, c.name, c.type, c.length as len
from
(
select
upcase(scan(cats('WORK.',dsname),-2,'.')) as libname length=8,
upcase(scan(dsname,-1,'.')) as memname length=32,
upcase(colname) as name length=32
from excel
) e
,
dictionary.columns c
where
e.libname=c.libname
and e.memname=c.memname
and e.name=upcase(c.name)
and c.memtype='DATA'
union corr all
/* tables with no matching col that actually exist in a currently assigned library */
select e.libname, e.memname, e.name, ' ' as type length=4, . as len
from
(
select
upcase(scan(cats('WORK.',dsname),-2,'.')) as libname length=8,
upcase(scan(dsname,-1,'.')) as memname length=32,
upcase(colname) as name length=32
from excel
) e
,
dictionary.tables t
where
e.libname=t.libname
and e.memname=t.memname
and t.memtype='DATA'
and missing(e.name)
order by libname, memname
;
quit;
/* define library to store the tables with masked columns
- do not overwrite the source tables
*/
%let outlib=out;
libname &outlib "%sysfunc(pathname(work))";
/*libname &outlib "<path name for output library";*/
/* generate code for masking */
filename codegen temp;
data _null_;
/* file print;*/
file codegen;
set colsToMask;
by libname memname;
if first.memname then
do;
put
"data &outlib.." memname ';' /
' set ' libname +(-1) '.' memname ';'
;
end;
if type='num' then
put ' ' name '=9999;';
else
if type='char' then
do;
_replace=repeat('Z',min(len-1,10));
put
@3 name '="' _replace +(-1)'" ;'
;
end;
if last.memname then
put 'run;' ;
run;
/* ensure that there is always a codegen file */
data _null_;
file codegen mod;
run;
/* execute code for masking */
%include codegen / source2;
i am getting error:
119
120 /* execute code for masking */
121 %include codegen / source2;
NOTE: %INCLUDE (level 1) file CODEGEN is file W:\SASWORK\TD765_\#L70.
122 +data out.TEST_1 ;
123 + set INPUT.TEST_1 ;
124 + ="ZZZZZZZZZZZ" ;
_
180
ERROR 180-322: Statement is not valid or it is used out of proper order.
125 +run;
@Kadz_sas1990 wrote:
i am getting error:
119
120 /* execute code for masking */
121 %include codegen / source2;
NOTE: %INCLUDE (level 1) file CODEGEN is file W:\SASWORK\TD765_\#L70.
122 +data out.TEST_1 ;
123 + set INPUT.TEST_1 ;
124 + ="ZZZZZZZZZZZ" ;
_
180
ERROR 180-322: Statement is not valid or it is used out of proper order.
125 +run;
The code I've posted works for me.
Looking at the error it appears you haven't copied over the data _null_ step bit. There is also a change in there.
if type='num' then
put ' ' name '=9999;';
else
if type='char' then
do;
HI @Patrick,
Hope you are fine and safe,
in the program i changed some lines like below,
if type='num' then
put ' ' name '=9999;';
else
if type='char' then
do;
put " rl = 'TEST';";
put @3 name ' = rl||Strip(_N_);';
end;
if last.memname then
put 'run;' ;
in the above program it is creating a new column (rl) with records like "Test" in the final dataset,
why i changed code is i need character data to be masked with Test1,Test2,Test3....and so on for the field which is character datatype,
example: i am going to mask SURNAME field its Character datatype and my desired output is below,
But what is the problem is i am getting a new field 'rl' is added in the final dataset,
Could you please help me to change the code
Thanks
Something like below should work:
if type='char' then
do;
if upcase(name)='SURNAME' then
do;
put @3 name '= cats("TEST", _n_) ;';
end;
else
do;
_replace=repeat('Z',min(len-1,10));
put
@3 name '="' _replace +(-1)'" ;'
;
end;
end;
Again i got a blocker here in some Columns which we are masking has no values or null, but the code is masking that value as well eg
obs |
First name |
Masked First name/ Current output |
Desired output |
1 |
John |
Test1 |
Test1 |
2 |
Andrew |
Test2 |
Test2 |
3 |
|
Test3 |
|
4 |
Milton |
Test4 |
Test3 |
5 |
|
Test5 |
|
6 |
|
Test6 |
|
7 |
Mark |
Test7 |
Test4 |
8 |
|
Test8 |
|
9 |
Rosy |
Test9 |
Test5 |
Obs 3, 5, 6, 8 are blank values in the original dataset so it should be left blank in final dataset as well as given in last column of the above table.
i.e blank values in the column should be left as it is it should not be masked in both cases either it is character or number (both cases)
Can you please help on this.
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.