DATA Step, Macro, Functions and more

Like to know how to start up

Reply
N/A
Posts: 0

Like to know how to start up

Hi,
I fairly new to SAS and like to know how to make an approach for a senario I have.

Need to create a dataset from different source where the out dataset created will have fields taking data from both the files and the tables. While creating dataset some conditions are applied inbetween so not all the records are fetched from the files or tables. Some fields may have a straight move and some are likely to have some transformation like change date or some conditions are applied.

Like to know how we can do it?

Is there any link I can refer to know how it can be done.

Is it possible to do in a single datastep or proc step? Would like to get some tips.
Thanks
Super Contributor
Super Contributor
Posts: 3,174

Re: Like to know how to start up

Posted in reply to deleted_user
For optimum data/program control (files and tables?), use the DATA step approach to read up your input(s) and to perform your SAS variable/column assignments. Suggest you declare all SAS variables up-front with ATTRIB/LENGTH/FORMAT statements, as preferred, and document well.

Scott Barry
SBBWorks, Inc.

Suggested Google advanced search argument, this topic / post:

data step programming intro site:sas.com
N/A
Posts: 0

Re: Like to know how to start up

I have couple of options in mind to process files individually and tables individually and the the DS's created can be merged. But wonder if we can do it in a single datastep or proc step?

I have seen in couple of ETL tools that we can pull data or records from various source and do transformations and the records were written out., can we do the same way in SAS?

Like take data from files and tables at the same time and process it. edit

Message was edited by: activa
Valued Guide
Posts: 2,177

Re: Like to know how to start up

Posted in reply to deleted_user
YES
data step is good for that kind of mixture
Occasional Contributor
Posts: 9

Re: Like to know how to start up

Posted in reply to deleted_user
A data step sounds like what you need. If you have SAS Access for the database in question that lets you define a libname statement for the database then this might help.

*---> nodupkey option makes "commonvariable" a unique key for each dataset;
*---> this is important because none unique records in a merge cause interesting results...;
proc sort data=dblib.dbtablename out=dbtable nodupkey;
by commonvariable;
run;

proc sort data=sasdatasetlibname.sasdatasetname out=sasdataset nodupkey;
by commonvariable;
run;

data inboth in_A_only in_B_only;
merge dbtable (in=a) sasdataset (in=b);
by commonvariable;
if a & b then output inboth;
else if a then output in_A_only;
else if b then output in_B_only;
run;

The "(in=a)" after the dataset cause a variable name "a" to have value of '1' when a record is read from that dataset and a zero when there is not matching record for the "commonvariable".

as far as keeping certain fields...

data inboth in_A_only in_B_only;
merge dblib.dbtablename (in=a keep=var1 var2 var3) sasdatasetlibname.sasdatasetname (in=b keep=var4 var5 var6);
by commonvariable;
if a & b then output inboth;
else if a then output in_A_only;
else if b then output in_B_only;
run;

All datasets will have var1 var2 var3 var4 var5 var6 in them. You can use a drop or keep statement on the output datasets if they only need some of the variables. Kepp in mind that the dataset "in+A+only" will have values in the var4 var5 var6 variables because the first dataset did not provide any values (no matching record for "commonvariable").

converting the "if a & b" and "else if" into "do-end" constructs can provide youthe opportunity to adjust the values you are interested in.

Hope this helps...JJ
Ask a Question
Discussion stats
  • 4 replies
  • 190 views
  • 0 likes
  • 4 in conversation