I have a dataset with a lot of observations. A variable of observation is a number. I want to replace this number with another one. Which number is the correct replacement is in a separate file (which can have chances over the years). My previous solution is that I wrote every possible substitute in the code:
if XYZ = 98765 then do XYZ = 12345; end; if XYZ = 98764 then do XYZ = 12345; end; if XYZ = 98763 then do XYZ = 12346; end;
The datafine is *.txt and looks like this:
98765;12345
98764;12345
98763;12346
198763;12346 and so on (yes, the first variable is some times 5 digits, sometime 6 digits. the second is allways 5 digits)
I would like a solution in which my program reads the * .txt file every time I start the program and replaces the specific value within the 150,000 observation. The program reads the dataset with:
data QWERT; infile 'C:\data\project\qwert.txt' lrecl=410; input @20 XYZ 6.; /* some other stuff happening that is not relevant for the challenge */ run;
I was thinking of some kind of loop while reading the file and than a replacement routine. But I dont really know how to start.
I am using SAS Enterpirse Guide 7.12 HF8 (German Version)
Please show an full example - what you have as your original data set, what you have as your modification data set and what you want as your final output. Please create a small representative example of your actual data structure.
Some options you can look into are UPDATE, MODIFY and COALESCE with a merge. If you'd like code examples please provide the information requested.
@Gedankenfee wrote:
I have a dataset with a lot of observations. A variable of observation is a number. I want to replace this number with another one. Which number is the correct replacement is in a separate file (which can have chances over the years). My previous solution is that I wrote every possible substitute in the code:
if XYZ = 98765 then do XYZ = 12345; end; if XYZ = 98764 then do XYZ = 12345; end; if XYZ = 98763 then do XYZ = 12346; end;The datafine is *.txt and looks like this:
98765;12345
98764;12345
98763;12346
198763;12346 and so on (yes, the first variable is some times 5 digits, sometime 6 digits. the second is allways 5 digits)
I would like a solution in which my program reads the * .txt file every time I start the program and replaces the specific value within the 150,000 observation. The program reads the dataset with:
data QWERT; infile 'C:\data\project\qwert.txt' lrecl=410; input @20 XYZ 6.; /* some other stuff happening that is not relevant for the challenge */ run;I was thinking of some kind of loop while reading the file and than a replacement routine. But I dont really know how to start.
I am using SAS Enterpirse Guide 7.12 HF8 (German Version)
how about this code.
/* sample dataset */
data have;
do XYZ=98000 to 200000;
output;
end;
run;
/* create sample txt file to explain */
filename txt temp;
data _null_;
file txt;
put '98765;12345';
put '98764;12345';
put '98763;12346';
put '198763;12346';
run;
*filename txt 'C:\data\project\qwert.txt'; /* <- In your environment. */
data QWERT;
infile txt dlm=';';
length xyz newnum 8;
input xyz newnum;
run;
filename txt;
proc sort;
by xyz;
run;
data want;
merge have qwert(in=inB);
by xyz;
if inB then xyz=newnum;
drop newnum;
run;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.