Data step into proc sql

Babloo · Posted 09-25-2019 08:18 AM

Is there a way that we can convert the below data steps into one proc SQL step?

data dis_entity_id ;
  set &library..&input_table ;
  format ENTITY_ID $ENTITY. REPORTING $50.;
  keep ENTITY_ID REPORTING;
run;

proc sort data=dis_entity_id nodupkey;
  by ENTITY_ID REPORTING;
run;

PeterClemmensen · Posted 09-25-2019 08:21 AM

I can't see your data so needless to say, this code is untested

proc sql;
    create table dis_entity_id as
    select distinct ENTITY_ID format=$ENTITY.,
           REPORTING format=$50.
    from &library..&input_table
    order by ENTITY_ID, REPORTING;
quit;

The DATA to DATA Step Macro
Blog: SASnrd

Babloo · Posted 09-26-2019 02:02 AM

Data steps which I mentioned in the intial post will create a variable(s) like ENTITY_ID, REPORTING... in case if it is not available in source dataset. However your code has ended with error as below. I forgot to mention this point in my post. How to tweak your code now to produce the desired result?

26         data test;
27         input REPORTING $;
28         datalines;

NOTE: The data set WORK.TEST has 1 observations and 1 variables.
NOTE: DATA statement used (Total process time):
      real time           0.00 seconds
      cpu time            0.01 seconds
      
30         ;

31         run;
32         
33         proc sql;
34             create table dis_entity_id as
35             select distinct ENTITY_ID format=$ENTITY.,
36                    REPORTING format=$50.
37             from test
38             order by ENTITY_ID, REPORTING;
ERROR: The following columns were not found in the contributing tables: ENTITY_ID.
NOTE: PROC SQL set option NOEXEC and will continue to check the syntax of statements.
39         quit;

PGStats · Posted 09-25-2019 05:37 PM

I guess the problem is of keeping all original variables but changing the format of just two columns. Can be done with:

proc sql;
create table dis_entity_id as
	select * from &library..&input_table
	order by ENTITY_ID, REPORTING;
alter table dis_entity_id 
	modify 
		ENTITY_ID CHARACTER format=$ENTITY.,
		REPORTING CHARACTER format=$50.;
quit;

PG

Babloo · Posted 09-26-2019 02:04 AM

Data steps which I mentioned in the intial post will create a variable(s) like ENTITY_ID, REPORTING... in case if it is not available in source dataset. However your code has ended with error as below. I forgot to mention this point in my post. How to tweak your code now to produce the desired result?

26         proc sql;
27         create table dis_entity_id as
28         	select * from test
29         	order by ENTITY_ID, REPORTING;
ERROR: The following columns were not found in the contributing tables: ENTITY_ID.
NOTE: PROC SQL set option NOEXEC and will continue to check the syntax of statements.
30         alter table dis_entity_id
31         	modify
32         		ENTITY_ID CHARACTER format=$ENTITY.,
33         		REPORTING CHARACTER format=$50.;
NOTE: Statement not executed due to NOEXEC option.
34         quit;

ballardw · Posted 09-26-2019 11:23 AM

@Babloo wrote:

Data steps which I mentioned in the intial post will create a variable(s) like ENTITY_ID, REPORTING... in case if it is not available in source dataset. However your code has ended with error as below. I forgot to mention this point in my post. How to tweak your code now to produce the desired result?
26         proc sql;
27         create table dis_entity_id as
28         	select * from test
29         	order by ENTITY_ID, REPORTING;
ERROR: The following columns were not found in the contributing tables: ENTITY_ID.
NOTE: PROC SQL set option NOEXEC and will continue to check the syntax of statements.
30         alter table dis_entity_id
31         	modify
32         		ENTITY_ID CHARACTER format=$ENTITY.,
33         		REPORTING CHARACTER format=$50.;
NOTE: Statement not executed due to NOEXEC option.
34         quit;

Did you change the FROM data set name to one you actually have with the appropriate data?

Babloo · Posted 09-26-2019 11:27 AM

Yes, I did changed the FROM dataset to have one variable REPORTING.

could you please help me to either combine your two data step into one step
either using data step or proc sql?

Tom · Posted 09-26-2019 11:53 AM

You didn't explain the PURPOSE of the data step.

From your later comments it sounds like the purpose is to force the creation of those two variables. Not sure why you would want to attach the $50. format to the second variable, especially if it could already exist and have a different length than 50 bytes.

Let's assume that the purpose is to create REPORTING as a character variable of length 50. You don't show how you want ENTITY_ID created (or the definition of the $ENTITY. format) so let's just assume it should be a character variable of length 3 and use the $CHAR3. format as place holder for the $ENTITY. format in the code. Here is code using OUTER UNION CORRESPONDING and pulling the NAME variable from SASHELP.CLASS to give SQL a source for the definition of the potential extra columns.

proc sql ;
create table dis_entity_id as 
select * from &library..&input_table
outer union corresponding 
select name as entity_id length=3 format=$char3.
     , name as reporting length=50 
from sashelp.class(obs=0)
order by entity_id, reporting
;
quit;

Note that is is much harder to implement the NODUPKEY functionality of PROC SORT in PROC SQL. Unless the table actually only has those two variables. In which case the query could then be something like this:

proc sql ;
create table dis_entity_id as 
select distinct
   entity_id format=$char3.
 , reporting 
from
(
select * from &library..&input_table
outer union corresponding 
select name as entity_id length=3 
     , name as reporting length=50 
from sashelp.class(obs=0)
)
order by entity_id, reporting
;
quit;

Data step into proc sql

Re: Data step into proc sql

Re: Data step into proc sql

Re: Data step into proc sql

Re: Data step into proc sql

Re: Data step into proc sql

Re: Data step into proc sql

Re: Data step into proc sql

Registration is open

Registration is open

SAS Training: Just a Click Away