Yes.
How does that differ from the code I wrote?
Can you explain this?
proc datasets nolist lib=to; (Here, in lib=to, "to" is the destination path, right?)
copy in=work out=to memtype=data; (Here, in copy in=work out=to, what are "work" and "to"? Again the source and destination?)
select data1 data2 data3; (What is the select line for? Is this where I choose which data sets to copy? For example, if I give "select data1", will it copy only data1, or will it throw an error?)
modify data1; (Do these modify statements work on their own, or together with the select statement? That is, if I give only "select data1;" and then "modify data2" in my code, will it work?)
...
modify data2;
...
Yes, 'to' is the destination path. In my example it is the libname that I specify above.
'Work' is the libname of the library in which your data is stored. In my example, my data sets are in the temporary work library, so change this to the library in which your data is stored.
Yes, exactly. The Select Statement lists the data sets you want to copy. If you omit this statement, SAS will copy all data sets from the library.
The Modify Statement names the individual data set you want to edit. For example, this
modify data1;
format phone_num $cmask. name $cmask. gender $cmask. dob nmask. mobile_no nmask. salary nmask.;
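means that we want to modify the data1 data set (in the to library). The Format Statement then formats the listed variables in data1. The Modify Statement does not depend on the Select Statement: it acts on whatever data set of that name exists in the to library, so "modify data2;" only works if data2 is already there.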
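To make the roles of lib=, in=, out=, select and modify concrete, here is a minimal self-contained sketch. The destination path, the sample data and the two format definitions are assumptions for illustration only; the formats used earlier in this thread may be defined differently.

```sas
/* Assumed destination library; the path is hypothetical */
libname to "/hypothetical/destination/path";

/* Assumed masking formats (illustrative definitions) */
proc format;
   value $cmask other = 'XXXX';   /* every character value displays as XXXX */
   value nmask  other = '9999';   /* every numeric value displays as 9999  */
run;

/* Sample data in WORK so the sketch can run on its own */
data work.data1;
   name = 'Alice';
   salary = 50000;
run;

proc datasets nolist lib=to;
   copy in=work out=to memtype=data;   /* source = work, destination = to */
      select data1;                    /* copy only data1 */
   run;
   modify data1;                       /* refers to to.data1, the copy */
      format name $cmask. salary nmask.;
   run;
quit;
```

Note that a format only changes how values are displayed. The stored values in to.data1 are unchanged, so this hides the data from casual viewing but does not remove it.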
Hi Draycut,
I tried that code, but my script is now too long. The problems are:
1) I have 500+ data sets in the place where I am going to use this code, so I would need to write 500+ modify statements, and many fields in 50+ data sets need to be masked.
2) If I write code for all those data sets, I will mostly copy only one or two of them per run. I will give those data set names in select, and all the remaining modify statements will produce error messages.
proc datasets nolist lib=TO;
copy in=FROM out=TO memtype=data;
select dataset1 dataset2;
modify dataset1;
format column1 $cmask. column3 $cmask.;
modify dataset2;
format column3 $cmask. column5 $cmask.;
modify dataset3;
format phone_num $cmask. name $cmask. gender $cmask. dob nmask. mobile_no nmask. salary nmask.;
modify dataset4;
format ........
modify dataset5;
format ........
......
.....
modify dataset500;
format ........
run;quit;
My code looks like the above,
so I will get errors for dataset3 ... dataset500, since I listed only dataset1 and dataset2 in the select statement (around 498 error messages).
3) Whenever I need to mark more fields as sensitive, I need to modify the code. Instead of doing this, I would like to use an Excel sheet I already have.
I have an Excel sheet like this:
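One possible way around all three problems is to generate the modify statements from a control table instead of writing them by hand. The sketch below is only an illustration: the table work.mask_list stands in for the Excel sheet, and its column names (dsname, varname, fmt) and the macro variable copy_list are assumptions, not the actual sheet layout. It also assumes that the librefs FROM and TO are assigned, that the formats $cmask. and nmask. exist, and that the listed data sets and columns are present.

```sas
/* Hypothetical control table standing in for the Excel sheet:
   one row per sensitive column. Column names are assumptions. */
data work.mask_list;
   length dsname $32 varname $32 fmt $32;
   input dsname $ varname $ fmt $;
   datalines;
dataset1 column1 $cmask.
dataset1 column3 $cmask.
dataset2 column3 $cmask.
dataset2 column5 $cmask.
dataset3 salary nmask.
;
run;

/* Data sets to copy in this run */
%let copy_list = dataset1 dataset2;

/* Copy only the selected data sets */
proc datasets nolist lib=TO;
   copy in=FROM out=TO memtype=data;
      select &copy_list;
   run;
quit;

proc sort data=work.mask_list;
   by dsname;
run;

/* Generate MODIFY/FORMAT statements only for the copied data sets */
data _null_;
   set work.mask_list end=last;
   where findw("&copy_list", strip(dsname), ' ', 'i') > 0;
   by dsname;
   if _n_ = 1 then call execute('proc datasets nolist lib=TO;');
   if first.dsname then call execute(catx(' ', 'modify', dsname, ';'));
   call execute(catx(' ', 'format', varname, fmt, ';'));
   if last.dsname then call execute('run;');
   if last then call execute('quit;');
run;
```

Because rows for data sets outside copy_list are filtered out, no modify statement is generated for dataset3 here, so it cannot raise an error. Marking a new field as sensitive then means adding a row to the control table rather than editing the code.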
Your "phone number" seems to be a number, as it is displayed right-justified, so you can only replace it with a number, not a string.
Do you need to mask the values, or do you need to anonymize them in a way that keeps relations intact? If the latter, you need to create a lookup table from all existing values so that you replace them consistently.
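As a rough sketch of the lookup-table idea: build one row per distinct original value, assign each a surrogate, and join the surrogate back in. The data set names, the sample values and the surrogate scheme (a prefix plus a running number) are all assumptions made for illustration.

```sas
/* Sample data: the same (fictional) phone number appears twice */
data work.clients;
   length phone_num $15;
   input phone_num $;
   datalines;
555-0100
555-0199
555-0100
;
run;

/* Lookup table: one row per distinct original value */
proc sort data=work.clients(keep=phone_num)
          out=work.phone_lookup nodupkey;
   by phone_num;
run;

/* Assign a surrogate to each distinct value */
data work.phone_lookup;
   set work.phone_lookup;
   length phone_mask $15;
   phone_mask = cats('P', put(_n_, z8.));
run;

/* Replace each original value with its surrogate */
proc sql;
   create table work.clients_masked as
   select l.phone_mask as phone_num
   from work.clients as c
        left join work.phone_lookup as l
        on c.phone_num = l.phone_num;
quit;
```

Identical originals receive identical surrogates, so joins on the masked column still line up. The lookup table itself maps surrogates back to real values and would need to be kept away from the recipients.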
As a side note, do not store real telephone numbers as numbers, store them as character. Phone numbers can have leading zeroes or special characters ("+" for international), and with extensions they can easily exceed the available precision of numbers in SAS (and other software that uses 8-byte real storage for numbers).
The phone number field is just an example. I have many other fields, such as name, address, relationship status and other personal information about my clients, which need to be masked when I copy a data set for other users in my team. I don't want to show these details to them, and they won't need these fields anyway. So when I copy a data set for them, I need to give them a copy of the SAS data set in which the sensitive fields are masked or replaced. This is my actual need.
Thanks for your help in this.
If integrity is not a requirement, just set the fields to missing.
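A minimal sketch of that approach, assuming the variable names from the earlier example and a hypothetical output name; in practice the output would go to the destination library instead of WORK:

```sas
/* Sample data so the sketch is self-contained */
data work.data1;
   length name $20 phone_num $15;
   name = 'Alice';
   phone_num = '555-0100';
   salary = 50000;
   dept = 'Sales';
run;

/* Write a copy in which the sensitive fields are blanked out */
data work.data1_masked;
   set work.data1;
   call missing(name, phone_num, salary);   /* accepts character and numeric variables */
run;
```

Unlike a display format, this removes the values from the copy itself; non-sensitive variables such as dept are left untouched.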
Maybe this will be useful? Especially if you want to maintain the integrity of the data, i.e. phone number XYZ is the same everywhere it is encountered.
https://gist.github.com/statgeek/fd94b0b6e78815430c1340e8c19f8644