About rapt1

rapt1 · ‎08-18-2021

Oh. Let me revisit my code. Thanks!

rapt1 · ‎08-18-2021

Hi guys,

My data table has about 1,000 records with the following variables: Name, Age, City, State, MonthlyBill.

I am using the following code to retain those who are from NY and pay a monthly bill of $500 or less:

    proc sql;
      create table want as
      select t1.*
      from have t1
      where t1.state IN ('NY') and t1.MonthlyBill lt 500;
    quit;

I am left with 300 records and I am confused. Upon checking, my table has about 200 NY customers, and about 500 customers have a bill below 500. I was only expecting about 200 to be removed. Any thoughts?
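
One way to see where the 300 rows come from is to count each condition on its own and then both together. This is only a sketch: it assumes the table and variable names from the post (have, state, MonthlyBill), and the output column names are made up for illustration. Two things may matter here: `lt 500` leaves out a bill of exactly 500 (`le 500` would match "$500 or less"), and in SAS a missing numeric value compares as lower than any number, so rows with a missing MonthlyBill also satisfy `lt 500`.

```sas
proc sql;
  /* In SAS a true comparison evaluates to 1 and a false one to 0, */
  /* so summing a condition counts the rows that satisfy it.       */
  select count(*)                                       as total_rows,
         sum(t1.state = 'NY')                           as ny_rows,
         sum(t1.MonthlyBill lt 500)                     as under_500_rows,
         sum(missing(t1.MonthlyBill))                   as missing_bill_rows,
         sum(t1.state = 'NY' and t1.MonthlyBill lt 500) as both_rows
  from have t1;
quit;
```

Because AND keeps only the rows that meet both conditions, both_rows can never be larger than ny_rows or under_500_rows. If it looks larger than expected, the individual counts are the first place to look.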

rapt1 · ‎08-18-2021

Thank you everyone for the input! I will try all of your suggestions and look into the data step as well. I have been using proc sql since I am more familiar with SQL, but I will get into the data step soon. Also, thank you for the reminder about sharing the code. I often forget about this.

rapt1 · ‎08-17-2021

Thanks! Is this also valid syntax when a row/observation contains a missing value? Say,

    Client  Day1  Day2  Day3
    ABC     1     0     1
    DEF     0     0     0
    GHI     1     1     1
    JKL     .     .     1

I think I got this error:

    NOTE: Invalid (or missing) arguments to the XXX function have caused the function to return a missing value.
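
The syntax being asked about is not shown here, so what follows is only a sketch of one way to build the flag that tolerates missing values, using the sample data from the post. It relies on the SAS MAX function skipping missing arguments. The extra constant 0 is an assumption added here so that at least one argument is always non-missing, and the table names have and want are placeholders.

```sas
/* Sample data from the post; "." is read as a missing value */
data have;
  input Client $ Day1 Day2 Day3;
  datalines;
ABC 1 0 1
DEF 0 0 0
GHI 1 1 1
JKL . . 1
;
run;

proc sql;
  create table want as
  select t1.*,
         /* MAX ignores missing arguments; the trailing 0 keeps the   */
         /* result non-missing even when Day1-Day3 are all missing.   */
         /* The comparison itself evaluates to 1 (true) or 0 (false). */
         (max(t1.Day1, t1.Day2, t1.Day3, 0) > 0) as Visit
  from have t1;
quit;
```

With this data, JKL gets Visit = 1 because Day3 is 1, and a row with all three days missing would get Visit = 0.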

rapt1 · ‎08-17-2021

Hi everyone,

Is there a way to create another column based on this objective: as long as the client visits on any day from Day1 to Day3, they will be tagged in another column called Visit.

    Client  Day1  Day2  Day3
    ABC     1     0     1
    DEF     0     0     0
    GHI     1     1     1

I was thinking of using case when in proc sql, but I am stuck on the syntax, and I am not sure whether case when is even the proper statement to use. Basically, the output will be:

    Client  Day1  Day2  Day3  Visit
    ABC     1     0     1     1
    DEF     0     0     0     0
    GHI     1     1     1     1
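
For reference, a minimal sketch of the case when approach described in the post. The sample data is taken from the post; the table names have and want are placeholders, and this is one possible form rather than the answer given in the thread.

```sas
/* Sample data from the post */
data have;
  input Client $ Day1 Day2 Day3;
  datalines;
ABC 1 0 1
DEF 0 0 0
GHI 1 1 1
;
run;

proc sql;
  create table want as
  select t1.*,
         /* Flag the client if any of the three day columns is 1 */
         case
           when t1.Day1 = 1 or t1.Day2 = 1 or t1.Day3 = 1 then 1
           else 0
         end as Visit
  from have t1;
quit;
```

This gives Visit = 1 for ABC and GHI and Visit = 0 for DEF, matching the desired output in the post. A missing day value simply fails the `= 1` test, so it falls through to the else branch unless another day is 1.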

rapt1 · ‎07-21-2021

Thanks everyone for the input!

rapt1 · ‎07-06-2021

Hi @Amir ! Going through the documentation, coding via proc export is not allowed at the moment. I utilize the point and click export tools of Enterprise Guide. Do you happen to know how to create a dataset with no column names so from that output, I can do the export via point and click?

rapt1 · ‎07-06-2021

Hi everyone,

I am trying to export a dataset without column names. So instead of:

    ClientID
    -----------
    001234
    004321
    002341

It should appear as:

    001234
    004321
    002341

I am trying to export this in CSV format, so in Excel the first row should appear as 001234. I already tried this, but it requires a name:

    proc sql;
      create table ID_Output as
      select ClientID as " "
      from raw_data;
    quit;

It is easy to do if it is just one table, but I will be doing this about 30 times on each table, so I was wondering if there is a way to do it. I have been searching for 'remove column names' or 'remove variable names' but I can't seem to find the right solution. Any tips?
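
Every variable in a SAS dataset has to have a name, so the header is normally dropped when the file is written rather than in the dataset itself. Below is a sketch of one way to do that with a data step, wrapped in a macro so it can be repeated for many tables. It assumes that code which writes files is permitted in your environment, which may not be the case if PROC EXPORT is blocked. The macro name export_ids, its parameters, and the output path are all hypothetical.

```sas
/* Hypothetical helper: writes the ClientID values of one dataset */
/* to a text file, one value per line, with no header row.        */
%macro export_ids(ds, outfile);
  data _null_;
    set &ds;
    file "&outfile" dsd;   /* DSD writes comma-delimited output */
    put ClientID;          /* values only, no variable names    */
  run;
%mend export_ids;

/* Hypothetical call; replace the path with a location you can write to */
%export_ids(raw_data, /hypothetical/path/ID_Output.csv)
```

Each additional table then needs only one more `%export_ids(...)` call with its own dataset name and output file.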

rapt1 · ‎07-06-2021

This makes sense. I tried it and I am getting the output I am looking for. Thank you so much guys!

rapt1 · ‎07-05-2021

Sharing the full code. I am using EG and I try to run these by code block.

First, I made copies of the tables and checked for dupes at the same time. One for the userlist, which has 300,000 rows:

    proc sql;
      create table WORK.userlist as
      select * from filelocation1;
    quit;

    proc sort data=WORK.userlist nodupkey dupout=userdups_checking;
      by _all_;
    run;

There are 0 observations that came out of userdups_checking.

In another process flow, I made a table for the utilization (has 1 million rows). There are also no dupes as a result:

    proc sql;
      create table WORK.utilization as
      select * from filelocation2;
    quit;

    proc sort data=WORK.utilization nodupkey dupout=ussagedups_checking;
      by _all_;
    run;

Then I merged the tables in another process flow:

    proc sql;
      create table WORK.MERGED as
      select t1.*, t2.usage, t2.price
      from WORK.userlist t1
      left join WORK.utilization t2
      on t1.ID=t2.clientID;
    quit;

    proc sort data=WORK.MERGED nodupkey dupout=dups_checking;
      by _all_;
    run;

This is where I am getting the duplicates at this point (300,100). Let me know if there's anything else I need to share.

rapt1 · ‎07-05-2021

I used the same one for checking for dupes:

    proc sort data=WORK.utilization nodupkey dupout=dups_checking;
      by _all_;
    run;

"You could have distinct observations in UTILIZATION but it could still contain multiple observations for some of the values of CLIENTID because the observations differ in some other variable."

That might be the case. If so, how can we check for multiple CLIENTIDs, so I can see which variables they differ in?
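
One possible way to list the CLIENTID values that occur more than once, sketched with the table and key names from the post. The output table name multi_clientid is made up for illustration.

```sas
proc sql;
  /* Keep every row whose clientID appears more than once.        */
  /* PROC SQL remerges the group count back onto the detail rows, */
  /* so all variables stay available for comparison.              */
  create table multi_clientid as
  select *
  from WORK.utilization
  group by clientID
  having count(*) > 1
  order by clientID;
quit;
```

SAS writes a note to the log about remerging summary statistics when this runs, which is expected here. Browsing multi_clientid side by side for one clientID should show which variables differ between its rows.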

rapt1 · ‎07-05-2021

I have checked and there were no duplicates. I also checked the userlist and there were none. I forgot to mention that I was using left join because I had to capture those users in userlist whether they had usage/price values in utilization. So I was only expecting 300,000 users in the merged table.

rapt1 · ‎07-05-2021

Hi everyone,

I am trying to join tables again using the code below. userlist has 300,000 observations while utilization has about 1,000,000 observations:

    proc sql;
      create table WORK.MERGED as
      select t1.*, t2.usage, t2.price
      from WORK.userlist t1
      left join WORK.utilization t2
      on t1.ID=t2.clientID;
    quit;

However, MERGED had around 300,100 observations in the output. I found that there were duplicates in the MERGED table by using other code such as this to eliminate them. The output did show the 100 dupes:

    proc sort data=WORK.MERGED nodupkey dupout=dups_checking;
      by _all_;
    run;

This works fine for now. Here are my questions:

1. I made sure that there are no duplicates in both the left and right tables. So why is it still producing duplicates?
2. When I ran code block 1 with another set of tables, I still had duplicates. When I ran code block 2, it did show the number of duplicates, but I am still not getting the same number of observations in the original table and the output table.

Any insights on this problem?
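
A small, made-up illustration of how a left join can return more rows than the left table even when neither input contains fully identical rows. The demo table names, the month variable, and all values are invented for this sketch; only the join itself mirrors the code in the post.

```sas
/* Left table: one row per ID */
data demo_users;
  input ID $;
  datalines;
A
B
C
;
run;

/* Right table: no two rows are identical across ALL variables, */
/* but clientID A appears twice (the rows differ only in month) */
data demo_usage;
  input clientID $ usage price month $;
  datalines;
A 10 5 jan
A 10 5 feb
B 30 7 jan
;
run;

proc sql;
  /* Same join shape as in the post; month is not selected, so the */
  /* two rows for A come out looking exactly alike in the result   */
  create table demo_merged as
  select t1.*, t2.usage, t2.price
  from demo_users t1
  left join demo_usage t2
  on t1.ID = t2.clientID;
quit;
```

Here demo_merged has 4 rows from a 3-row left table: two identical rows for A, one for B, and one for C with missing usage and price. A `by _all_` duplicate check on either input finds nothing, yet the same check on the joined table does.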

rapt1 · ‎07-05-2021

Thanks, Kurt, and thanks to everyone for the responses. There is a lot to learn!

rapt1 · ‎07-05-2021

Thank you for this alternative. Will try to read into these additional options further.

Date Last Visited	‎08-22-2021 08:46 AM

Re: Where statements with "and"

Where statements with "and"

Re: Case when for multiple columns

Re: Case when for multiple columns

Case when for multiple columns

Re: Removing column names in output dataset

Re: Removing column names in output dataset

Removing column names in output dataset

Re: Join produces duplicates

Re: Join produces duplicates

Re: Join produces duplicates

Re: Join produces duplicates

Join produces duplicates

Re: SAS left join via proc sql

Re: SAS left join via proc sql