Hi Friends,
Scenario : I have N Files with N Variables. Say for Instance a Particular Variable Customer_Name is available in some Files and missing in Some. So what i need is a Master file to be created with All the Unique and distinct Observervations of the Variable Customer_Name, File_name1..Filename 2..FileNameN to see if the particluar observation exists or not in that particular file.
Below is an Example of how the output is required. Please help. The Variables mentioned after the Customer_Name are File Names.
Customer_Name | ODGCR | WESTAR | SAS | BIA |
AB | N | Y | Y | Y |
AC | N | Y | N | Y |
AD | N | Y | N | Y |
AE | N | Y | N | N |
AB | N | Y | N | N |
AC | N | Y | N | N |
AD | N | Y | N | N |
AE | N | Y | N | Y |
AB | Y | Y | Y | Y |
AC | Y | Y | Y | Y |
AD | Y | N | N | Y |
AE | Y | N | N | Y |
AB | Y | N | N | N |
AC | Y | N | Y | N |
AD | N | N | N | Y |
AE | N | N | N | Y |
Thanks
Thanks Sharp... That a wonder..Exactly what I wanted.. Now let me start to try with my Data set that I have to work with and see the same wonders happening.
Thanks again,
Raghuraman Ramesh
I think you should import these files one by one . then merge them all into one dataset , and use indsname= option to hold the dataset name for preparing your final report.
Ksharp
Thanks Sharp,
I did try the code with My Dataset, I get the Filenames in the New variable created "dsetname" but then The Customer Name when I merge the data set - Is repeated. In your Example you had Two Variables called X & Y - In My Case it is Just Customer_Name - The Other variables are the File_Names - It almost like an attendance chart - Present - or - Not Present -
The Code That I Used :
data Master;
set Raghu_Pr.odgcr Raghu_Pr.Sas Raghu_Pr.Tangoe Raghu_Pr.Webstar indsname=dsn;
dsetname=dsn;
Thanks,
Raghuraman Ramesh
Thanks for taking up the trouble Sharp!
I have Four Files - ODGCR, Webstar, SAS and BIA with one valiable named Customer_Name. The Customer_Name can repeat in Mutiple files.
When I merge them to one Data set - I get the Customer_name along with the File Name as the New Variable but with Duplicate Values in Customer_name as the Same customer might exist in different Files.
So I want
1. The Distinct of All Customer's from 4 Files.
2. 4 New Variables (File Names)
3. The Variable File Name should just contain the Value - Yes or No. (If present in that Particular File or Not).
Eg: What I want:
Customer_Name | ODGCR | WESTAR | SAS | BIA |
AB | N | Y | Y | Y |
AC | N | Y | N | Y |
AD | N | Y | N | Y |
AE | N | Y | N | N |
AB | N | Y | N | N |
AC | N | Y | N | N |
AD | N | Y | N | N |
AE | N | Y | N | Y |
AB | Y | Y | Y | Y |
AC | Y | Y | Y | Y |
AD | Y | N | N | Y |
AE | Y | N | N | Y |
AB | Y | N | N | N |
AC | Y | N | Y | N |
AD | N | N | N | Y |
AE | N | N | N | Y |
Not Like This (Below) - Where you have mutiple entries of customer name.
Customer_Name | File_Name |
AB | ODGCR |
AC | ODGCR |
AD | ODGCR |
AE | ODGCR |
AB | ODGCR |
AC | ODGCR |
AB | WESTAR |
AC | WESTAR |
AD | WESTAR |
AE | WESTAR |
AB | WESTAR |
AC | WESTAR |
AD | WESTAR |
AE | WESTAR |
AB | WESTAR |
AC | WESTAR |
AB | SAS |
AB | SAS |
AC | SAS |
AC | SAS |
AB | BIA |
AC | BIA |
AD | BIA |
AE | BIA |
AB | BIA |
AC | BIA |
AD | BIA |
AE | BIA |
AD | BIA |
AE | BIA |
Thanks,
Raghuraman Ramesh
Hi Raghuraman Ramesh,
Your post confused people. in your example, the "Customer_Name" variable in table "what I want" is not unique.
Do you want the file name to be column name ?
data filename; set sashelp.class(keep=sex rename=(sex=filename)); run; data webstar; set sashelp.class(keep=weight rename=(weight=webstar)); run; data want; merge filename webstar; run;
Ksharp
Oh. Maybe I understood what you mean.
To see whether the following code is what you need?
data odgcr; input name $; cards; Patrick Peter Peter Joyce Thomas James Jane John Louise Robert Alice Barbara Jeffrey Alfred ; run; data westar; input name $; cards; Patrick Louise Robert Alice Barbara Jeffrey Alfred Alfred Carol Henry Judy Philip Philip ; run; data sas; input name $; cards; Louise Robert Alice Alice Alice Barbara Jeffrey Alfred Carol Henry Henry ; run; data bia; input name $; cards; Alfred Carol Henry Judy Janet Mary Ronald William William William Philip Philip Philip ; run; proc sort data=odgcr nodupkey;by name;run; proc sort data=westar nodupkey;by name;run; proc sort data=sas nodupkey;by name;run; proc sort data=bia nodupkey;by name;run; data want; merge odgcr(in=a) westar(in=b) sas(in=c) bia(in=d); by name; if a then odgcr='Y'; else odgcr='N'; if b then westar='Y'; else westar='N'; if c then sas='Y'; else sas='N'; if d then bia='Y'; else bia='N'; run;
Ksharp
Thanks Sharp... That a wonder..Exactly what I wanted.. Now let me start to try with my Data set that I have to work with and see the same wonders happening.
Thanks again,
Raghuraman Ramesh
Perfect! Eureka!! Just another small help... My Oberservations gets truncated due the length issue after I merge the files. how do I sort that problem now?
Thanks,
Raghuraman Ramesh
OK.
Add a length statement before merge statement.
data want;
length name $ 100;
merge odgcr(in=a) westar(in=b) sas(in=c) bia(in=d);
by name;
if a then odgcr='Y'; else odgcr='N';
if b then westar='Y'; else westar='N';
if c then sas='Y'; else sas='N';
if d then bia='Y'; else bia='N';
run;
Ksharp
Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.
Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.