Please look over the following code, particularly toward the bottom where there is a NOTE and WARNING message:
1 proc summary data=sas_1.combined;
2 var
3 i_50401_Z
4 i_50402_Z
5 i_50403_Z
6 i_50404_Z
7 i_50405_Z
8 i_50408_Z
9 i_50501_Z
10 i_50502_Z
11 i_50503_Z
12 i_50504_Z
13 i_50505_Z
14 i_50508_Z
15 ;
16 output out=sas_1.combined_temp (drop= _:) p80= /autoname;
17 run;
NOTE: Multiple concurrent threads will be used to summarize data.
NOTE: There were 5000 observations read from the data set SAS_1.COMBINED.
NOTE: The data set SAS_1.COMBINED_TEMP has 1 observations and 12 variables.
NOTE: PROCEDURE SUMMARY used (Total process time):
real time 0.07 seconds
user cpu time 0.01 seconds
system cpu time 0.03 seconds
memory 13816.03k
OS Memory 30012.00k
Timestamp 02/11/2022 07:13:37 PM
Step Count 1 Switch Count 0
18
19 Proc sql;
20 create table sas_1.combined as
21 select a.*,
22 (a.i_50401_Z >= b.i_50401_Z_P80) as i_50401_Z_top20pct ,
23 (a.i_50402_Z >= b.i_50402_Z_P80) as i_50402_Z_top20pct ,
24 (a.i_50403_Z >= b.i_50403_Z_P80) as i_50403_Z_top20pct ,
25 (a.i_50404_Z >= b.i_50404_Z_P80) as i_50404_Z_top20pct ,
26 (a.i_50405_Z >= b.i_50405_Z_P80) as i_50405_Z_top20pct ,
27 (a.i_50408_Z >= b.i_50408_Z_P80) as i_50408_Z_top20pct ,
28 (a.i_50501_Z >= b.i_50501_Z_P80) as i_50501_Z_top20pct ,
29 (a.i_50502_Z >= b.i_50502_Z_P80) as i_50502_Z_top20pct ,
30 (a.i_50503_Z >= b.i_50503_Z_P80) as i_50503_Z_top20pct ,
31 (a.i_50504_Z >= b.i_50504_Z_P80) as i_50504_Z_top20pct ,
32 (a.i_50505_Z >= b.i_50505_Z_P80) as i_50505_Z_top20pct ,
33 (a.i_50508_Z >= b.i_50508_Z_P80) as i_50508_Z_top20pct
34
35 from sas_1.combined as a, sas_1.combined_temp as b;
NOTE: The execution of this query involves performing one or more Cartesian product joins that can not be optimized.
WARNING: This CREATE TABLE statement recursively references the target table. A consequence of this is a possible data integrity problem.
NOTE: Table SAS_1.COMBINED created, with 5000 rows and 447 columns.
36
37 quit;
I'm creating 'flag' variables, to designate the top 20% of the values for various variables. (Huge thanks to @ballardw for the code.)
I'd prefer to keep the name of the dataset the same. But that means overwriting the initial dataset. SAS gives a WARNING. However, the dataset seems to get updated just fine nonetheless.
What do you think? Just overlook the WARNING? Or is there a better way of going about this?
Thanks!
Nicholas Kormanik
Do not overwrite a dataset in a single step, a slight miscue will otherwise destroy it.
For simplicity of coding, I would use a DATA step and arrays.
I am afraid you need to create a new table .
Do not overwrite a dataset in a single step, a slight miscue will otherwise destroy it.
For simplicity of coding, I would use a DATA step and arrays.
Thanks guys....
/* Only works if the desired changed-to filename does not exist.... */
/*
proc datasets library=sas_1;
change combined_2=combined;
run;
*/
data sas_1.combined;
set sas_1.combined_2;
run;
proc datasets lib=sas_1;
delete combined_2;
delete combined_temp;
run;
quit;
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.