02-19-2016 12:25 AM
I need a SAS program which extract only two variables from a dataset (.dat, .txt or .cum file) that has more than 30 variable data.
For Example: An input .dat file contains 25 variables. However i want to pull only the 18th and 22nd variable in my SAS output file. Condition is I dont want to define all 25 variables in my SAS program. Only the 2 variable names, length and position should be mentioned in program.
Basically, I need this program for a QC purpose wherein - if one variable has a value, then the second should be Blank. If both values are present then such data is of interest. Its a kind of filter that is required. I can use the IF and ELSE statement to populate or delete the data in output. However not sure how to filter the larger data into only area of interest.
I have attached a sample .dat file.
Kindly request to provide a solution for my requirement.
02-19-2016 02:12 AM
Import all variables in the DATA step (run PROC IMPORT and get the resulting DATA step from the log) and use KEEP (either as statement or KEEP= option in the output dataset) to only keep the variables you want.
Why do you do this? Because it is easier to maintain in the future when the input file structure changes.
You might get away with pointers in case the input file has fixed columns.
02-19-2016 03:13 AM
02-19-2016 04:43 AM
Several question are raised here:
- You posts contain no example data, no attachments.
- What is this data file you talk about, is it fixed width, delimited, database file etc.
- Why do you want to only read a certain couple of variables - not just once, but apparently many times across differing data, this sounds to me like you have a systematic problem with your data transfers which needs to be fixed rather than trying to handle it like this.
To my mind either the vendor for the data should alter their steps, so you get data in the way you want, or you import all the data into a repository and then extract the data you want from their. Whilst there are methods which can extract parts of the data as @Ksharp has shown, what if those columns are not in 4? Or they are, but next time they are not etc. Simply put the process of data transfer should be a) agreed upon by sender and recipient - and documented! b) programs written to import the data based on the agreement c) validated throughout. Following on from this "data transfer" procedure, you can then set to working on stage two, identifying the data you want to keep - logically by using the SAS metadata on the files, or by a documented process baased on the previous documentation of the data transfer.
02-19-2016 02:42 AM
data have; length second fourth $ 100; input; second=scan(_infile_,2,',','m'); fourth=scan(_infile_,4,',','m'); datalines; as,as,as,, ds,,sd,sd ,dsd,,sds, ; run;
Here is an example:
02-23-2016 09:37 AM
Download the macro in this thread https://communities.sas.com/message/210551#210551
You will need to define an empty data set as described in the header of the macro as the model for what fields are to be read from your file(s). Like named fields, ignoring case, will be read using the same INFORMAT data type etc. The order of the columns does not matter.