Hi Friends,
Scenario : I have N Files with N Variables. Say for Instance a Particular Variable Customer_Name is available in some Files and missing in Some. So what i need is a Master file to be created with All the Unique and distinct Observervations of the Variable Customer_Name, File_name1..Filename 2..FileNameN to see if the particluar observation exists or not in that particular file.
Below is an Example of how the output is required. Please help. The Variables mentioned after the Customer_Name are File Names.
Customer_Name | ODGCR | WESTAR | SAS | BIA |
AB | N | Y | Y | Y |
AC | N | Y | N | Y |
AD | N | Y | N | Y |
AE | N | Y | N | N |
AB | N | Y | N | N |
AC | N | Y | N | N |
AD | N | Y | N | N |
AE | N | Y | N | Y |
AB | Y | Y | Y | Y |
AC | Y | Y | Y | Y |
AD | Y | N | N | Y |
AE | Y | N | N | Y |
AB | Y | N | N | N |
AC | Y | N | Y | N |
AD | N | N | N | Y |
AE | N | N | N | Y |
Thanks
Thanks Sharp... That a wonder..Exactly what I wanted.. Now let me start to try with my Data set that I have to work with and see the same wonders happening.
Thanks again,
Raghuraman Ramesh
I think you should import these files one by one . then merge them all into one dataset , and use indsname= option to hold the dataset name for preparing your final report.
Ksharp
Thanks Sharp,
I did try the code with My Dataset, I get the Filenames in the New variable created "dsetname" but then The Customer Name when I merge the data set - Is repeated. In your Example you had Two Variables called X & Y - In My Case it is Just Customer_Name - The Other variables are the File_Names - It almost like an attendance chart - Present - or - Not Present -
The Code That I Used :
data Master;
set Raghu_Pr.odgcr Raghu_Pr.Sas Raghu_Pr.Tangoe Raghu_Pr.Webstar indsname=dsn;
dsetname=dsn;
Thanks,
Raghuraman Ramesh
Thanks for taking up the trouble Sharp!
I have Four Files - ODGCR, Webstar, SAS and BIA with one valiable named Customer_Name. The Customer_Name can repeat in Mutiple files.
When I merge them to one Data set - I get the Customer_name along with the File Name as the New Variable but with Duplicate Values in Customer_name as the Same customer might exist in different Files.
So I want
1. The Distinct of All Customer's from 4 Files.
2. 4 New Variables (File Names)
3. The Variable File Name should just contain the Value - Yes or No. (If present in that Particular File or Not).
Eg: What I want:
Customer_Name | ODGCR | WESTAR | SAS | BIA |
AB | N | Y | Y | Y |
AC | N | Y | N | Y |
AD | N | Y | N | Y |
AE | N | Y | N | N |
AB | N | Y | N | N |
AC | N | Y | N | N |
AD | N | Y | N | N |
AE | N | Y | N | Y |
AB | Y | Y | Y | Y |
AC | Y | Y | Y | Y |
AD | Y | N | N | Y |
AE | Y | N | N | Y |
AB | Y | N | N | N |
AC | Y | N | Y | N |
AD | N | N | N | Y |
AE | N | N | N | Y |
Not Like This (Below) - Where you have mutiple entries of customer name.
Customer_Name | File_Name |
AB | ODGCR |
AC | ODGCR |
AD | ODGCR |
AE | ODGCR |
AB | ODGCR |
AC | ODGCR |
AB | WESTAR |
AC | WESTAR |
AD | WESTAR |
AE | WESTAR |
AB | WESTAR |
AC | WESTAR |
AD | WESTAR |
AE | WESTAR |
AB | WESTAR |
AC | WESTAR |
AB | SAS |
AB | SAS |
AC | SAS |
AC | SAS |
AB | BIA |
AC | BIA |
AD | BIA |
AE | BIA |
AB | BIA |
AC | BIA |
AD | BIA |
AE | BIA |
AD | BIA |
AE | BIA |
Thanks,
Raghuraman Ramesh
Hi Raghuraman Ramesh,
Your post confused people. in your example, the "Customer_Name" variable in table "what I want" is not unique.
Do you want the file name to be column name ?
data filename; set sashelp.class(keep=sex rename=(sex=filename)); run; data webstar; set sashelp.class(keep=weight rename=(weight=webstar)); run; data want; merge filename webstar; run;
Ksharp
Oh. Maybe I understood what you mean.
To see whether the following code is what you need?
data odgcr; input name $; cards; Patrick Peter Peter Joyce Thomas James Jane John Louise Robert Alice Barbara Jeffrey Alfred ; run; data westar; input name $; cards; Patrick Louise Robert Alice Barbara Jeffrey Alfred Alfred Carol Henry Judy Philip Philip ; run; data sas; input name $; cards; Louise Robert Alice Alice Alice Barbara Jeffrey Alfred Carol Henry Henry ; run; data bia; input name $; cards; Alfred Carol Henry Judy Janet Mary Ronald William William William Philip Philip Philip ; run; proc sort data=odgcr nodupkey;by name;run; proc sort data=westar nodupkey;by name;run; proc sort data=sas nodupkey;by name;run; proc sort data=bia nodupkey;by name;run; data want; merge odgcr(in=a) westar(in=b) sas(in=c) bia(in=d); by name; if a then odgcr='Y'; else odgcr='N'; if b then westar='Y'; else westar='N'; if c then sas='Y'; else sas='N'; if d then bia='Y'; else bia='N'; run;
Ksharp
Thanks Sharp... That a wonder..Exactly what I wanted.. Now let me start to try with my Data set that I have to work with and see the same wonders happening.
Thanks again,
Raghuraman Ramesh
Perfect! Eureka!! Just another small help... My Oberservations gets truncated due the length issue after I merge the files. how do I sort that problem now?
Thanks,
Raghuraman Ramesh
OK.
Add a length statement before merge statement.
data want;
length name $ 100;
merge odgcr(in=a) westar(in=b) sas(in=c) bia(in=d);
by name;
if a then odgcr='Y'; else odgcr='N';
if b then westar='Y'; else westar='N';
if c then sas='Y'; else sas='N';
if d then bia='Y'; else bia='N';
run;
Ksharp
Don't miss out on SAS Innovate - Register now for the FREE Livestream!
Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.