11-15-2016 08:03 AM
I have written the following code:
%macro one_run (dataset1=, dataset2=, var1=, var2=);
%macro two_run (yourdate=01Feb2015); /* this is an example date */
proc sql;
  create table ab as
  select a.&var1, a.INDIV_ID, b.CD, b.&var2, b.AGENCY, b.DT, b.PAID_AMT,
         b.PROD_ID, b.CANCEL_DT, b.RECEIVED_CD, b.PAYMENT_DT,
         b.PAYMENT_STATUS, b.PAYMENT_TYPE, b.OKey, b.Cancel
  from &dataset1 a, &dataset2 b
  where a.MSR_INDIV_ID = b.INDIVIDUAL_ID and date ge "&yourdate."d;
run;
%mend one_run;
%mend two_run;

%one_run (dataset1=var.FF_ACX_MSR_ID10, dataset2=ora1.FACT_MKE_PUB_SUBSCRIPTION)
%two_run(&yourdate)
The challenge I am facing is that some of the datasets have more than 70 variables, so I cannot use var1, var2, and so on to reference each of them.
Kindly suggest how I should achieve this. If there is any alternative apart from proc sql or macros, kindly assist with that as well.
Looking forward to assistance as always.
11-15-2016 08:10 AM
What are you trying to achieve?
I ask this because I don't think your code can work at all, and it is written in a way that successfully obfuscates your intentions.
11-15-2016 08:52 AM
We have moved data from Oracle to SAS.
Now we are trying to verify whether all the data has moved over completely or whether there are discrepancies. The datasets involved can have anywhere from 10 to almost 68 variables.
If I write a proc sql statement, I would have to manually type in the names of all the variables every time, which would be a tedious activity. I have around 72 data tables.
I hope I have been able to convey my requirement.
11-15-2016 09:06 AM
If you want to create steps dynamically, you can get the column names and attributes from dictionary.columns (SQL) or sashelp.vcolumn (data step) and then let a data step write the compare steps dynamically with call execute.
Your initial example was just an inner join with no compare functionality at all.
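A minimal sketch of that approach. The library and table names (WORK, EMPLOYEE) are made up for illustration; substitute your own:

```sas
/* Pull the column names for one table from the SQL dictionary */
proc sql;
  create table cols as
  select name
  from dictionary.columns
  where libname = 'WORK' and memname = 'EMPLOYEE';
quit;

/* Let a data step generate one PROC FREQ per variable via CALL EXECUTE */
data _null_;
  set cols;
  call execute(cats('proc freq data=work.employee; tables ',
                    name, '; run;'));
run;
```

The same loop can be driven by a list of all 70 table names, so nothing has to be typed per table.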
11-15-2016 08:23 AM
Post example test data (in the form of a data step) and what the output should look like. Why do you have 70 variables to start with? That is a lot, and any programming you do will be complicated by it. If you have lots of variables you would mostly use arrays, which can be simple depending on how the variables are named. Your code as given doesn't make any sense, I am afraid.
11-15-2016 10:27 AM
Attached excel contains some example data.
Here I have added some mocked data for 2 tables: the Employee and emp_salary tables, which have 8 and 5 variables respectively.
Similarly, I have 70 tables which have 5 to 68 variables. With proc sql I would need to manually feed in the variables every time; with a macro I would need to define a macro variable and keep changing its value for each required variable from the respective table.
Hope this helps. Apologies, I am not familiar with SAS arrays.
Once we are able to achieve this, we would run a proc freq to see the frequencies of the variables in the select statement from the respective datasets and find the differences.
%macro freq(dsn, n);
  proc freq data=&dsn;
    tables &n;
  run;
%mend freq;
11-15-2016 10:39 AM
Please see if this is what you were asking for (in plain text):
Table Name Variable Name
Table A emp_code
Table A emp_name
Table A date_join
Table A department
Table A emp_phone
Table A emp_cell
Table A email_id
Table A emp_status
Table B emp_code
Table B emp_sal
Table B sal_month
Table B avg_tenure
So on and so forth. Table A and Table B are fictitious names, and the variables in these tables vary.
11-16-2016 04:02 AM
Not sure if my data was uploaded yesterday, so I am sharing it again. These are two tables, Employee and emp_salary, with various variables. These variables will keep changing for each table, and there are 70 tables we have to work on.
Table Name Variable Name
Employee emp_code
Employee emp_name
Employee date_of_join
Employee department
Employee emp_phone
Employee emp_mob
Employee email_id
Employee emp_status
emp_salary emp_code
emp_salary emp_sal
emp_salary sal_month
emp_salary salary_process_numbe
emp_salary Last_Salary
11-16-2016 04:34 AM - edited 11-16-2016 04:35 AM
Yes, I saw your Excel file; however, I will not open Office files from the net, as they are a security risk. Post test data, in the form of a data step, in the body of your post, e.g.:
data test;
  input a b;
datalines;
1 1
2 2
;
run;
Now looking at what you have posted below, you have two tables, and only one variable name matches between them; I have no idea about other properties such as format or length. Now, what do you want to do with those two variables? This is where showing a "want" table is useful. It seems you're looking to merge all 70 tables into one really wide table, which is not a recommended approach. Much like in your Oracle database, programming is easier if you keep the normalised structure (data goes down rather than across). Take this one: if you normalise the data to:
EMP_CODE PARAM RESULT
001 EMP_NAME Abc
001 DATE_OF_JOIN 01JAN2014
Then your task becomes trivial as you can just set all of the datasets into one big dataset. Later on if you need the transposed data, then use proc transpose or array processing to transpose the data up.
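A minimal sketch of that idea. The dataset names (employee_long, emp_salary_long) are illustrative and assume each table has already been normalised to the EMP_CODE/PARAM/RESULT layout above:

```sas
/* Stack all normalised tables into one long dataset */
data all_params;
  set employee_long emp_salary_long; /* each has EMP_CODE, PARAM, RESULT */
run;

/* Transpose back up to a wide layout only if and when it is needed */
proc sort data=all_params;
  by emp_code;
run;

proc transpose data=all_params out=wide(drop=_name_);
  by emp_code;
  id param;
  var result;
run;
```

With the long layout, comparing the SAS copy against the Oracle extract is a simple sort and compare on EMP_CODE and PARAM, regardless of how many variables each source table had.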
Also note that you can get one complete normalised dataset from the Oracle Clinical (OC) master dataset, i.e. avoiding all the retrieval and processing of individual data. In OC there is a Questions master table and a Responses master table; merge those two and export, and it contains the param/result setup from above.
11-15-2016 09:03 AM
Create a macro with a data step merge, and use keep= if you have only a few variables to consider, or drop= to drop a few variables and keep the rest.
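A minimal sketch of that suggestion. The dataset names, the by-variable emp_code, and the keepvars parameter are all made up for illustration:

```sas
%macro merge_two(ds1=, ds2=, keepvars=);
  proc sort data=&ds1; by emp_code; run;
  proc sort data=&ds2; by emp_code; run;

  data merged;
    merge &ds1(in=a keep=emp_code &keepvars)
          &ds2(in=b);
    by emp_code;
    if a and b; /* keep only records present in both tables */
  run;
%mend merge_two;

%merge_two(ds1=employee, ds2=emp_salary, keepvars=emp_name department)
```

The keep= option on the first dataset limits the merge to the variables of interest; use drop= instead when it is shorter to name the variables to exclude.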