BookmarkSubscribeRSS Feed
deleted_user
Not applicable
Hi,
I fairly new to SAS and like to know how to make an approach for a senario I have.

Need to create a dataset from different source where the out dataset created will have fields taking data from both the files and the tables. While creating dataset some conditions are applied inbetween so not all the records are fetched from the files or tables. Some fields may have a straight move and some are likely to have some transformation like change date or some conditions are applied.

Like to know how we can do it?

Is there any link I can refer to know how it can be done.

Is it possible to do in a single datastep or proc step? Would like to get some tips.
Thanks
4 REPLIES 4
sbb
Lapis Lazuli | Level 10 sbb
Lapis Lazuli | Level 10
For optimum data/program control (files and tables?), use the DATA step approach to read up your input(s) and to perform your SAS variable/column assignments. Suggest you declare all SAS variables up-front with ATTRIB/LENGTH/FORMAT statements, as preferred, and document well.

Scott Barry
SBBWorks, Inc.

Suggested Google advanced search argument, this topic / post:

data step programming intro site:sas.com
deleted_user
Not applicable
I have couple of options in mind to process files individually and tables individually and the the DS's created can be merged. But wonder if we can do it in a single datastep or proc step?

I have seen in couple of ETL tools that we can pull data or records from various source and do transformations and the records were written out., can we do the same way in SAS?

Like take data from files and tables at the same time and process it. edit

Message was edited by: activa
Peter_C
Rhodochrosite | Level 12
YES
data step is good for that kind of mixture
jj030655
Calcite | Level 5
A data step sounds like what you need. If you have SAS Access for the database in question that lets you define a libname statement for the database then this might help.

*---> nodupkey option makes "commonvariable" a unique key for each dataset;
*---> this is important because none unique records in a merge cause interesting results...;
proc sort data=dblib.dbtablename out=dbtable nodupkey;
by commonvariable;
run;

proc sort data=sasdatasetlibname.sasdatasetname out=sasdataset nodupkey;
by commonvariable;
run;

data inboth in_A_only in_B_only;
merge dbtable (in=a) sasdataset (in=b);
by commonvariable;
if a & b then output inboth;
else if a then output in_A_only;
else if b then output in_B_only;
run;

The "(in=a)" after the dataset cause a variable name "a" to have value of '1' when a record is read from that dataset and a zero when there is not matching record for the "commonvariable".

as far as keeping certain fields...

data inboth in_A_only in_B_only;
merge dblib.dbtablename (in=a keep=var1 var2 var3) sasdatasetlibname.sasdatasetname (in=b keep=var4 var5 var6);
by commonvariable;
if a & b then output inboth;
else if a then output in_A_only;
else if b then output in_B_only;
run;

All datasets will have var1 var2 var3 var4 var5 var6 in them. You can use a drop or keep statement on the output datasets if they only need some of the variables. Kepp in mind that the dataset "in+A+only" will have values in the var4 var5 var6 variables because the first dataset did not provide any values (no matching record for "commonvariable").

converting the "if a & b" and "else if" into "do-end" constructs can provide youthe opportunity to adjust the values you are interested in.

Hope this helps...JJ

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 4 replies
  • 1823 views
  • 0 likes
  • 4 in conversation