okay, so i have a field with names in it called name. i am trying to parallel a process. i am relatively new to sas as well so here is what i need to do...
i need to get the ascii values of the first 30 letters (ONLY because it can be more) of the name, sum them and update a field a_name with a concatenation of that and the fields record number. so, like this...
id name a_name
101 abcd 101-394
102 efgh 102-410
103 ijkl 103-426 ...
you get the idea. i am piecing it together but help would be appreciated.
This works for your example data:
data have; infile datalines truncover; informat id $5. name $30.; input id name; datalines; 101 abcd 102 efgh 103 ijkl ; run; data want; set have; do i= 1 to length(name); count= sum(count,rank(substr(name,i,1))); end; length a_name $ 12; a_name=catx('-',id,count); drop i count; run;
HOWEVER there may be a few things to consider. Does your "name" variable include trailing blanks as part of the value?
Also the lengths I assigned to id and a_name a guesses. You'll have to provide better based on your actual values.
this is an ~80,000 line table. i believe i have id set at 6 and then name can be up to 50.
yeah, my first attempt that did not work. i am sure it is the datalines part. so this needs to run for every value in a table that has 80k values in it. i cannot write out that many datalines. also, the name are business names so there could be values in them however i have removed all trailing spaces via trim or something...
Show your code, and post example data in a data step. Your intentions are very unclear, at least to me.
@me55 wrote:
yeah, my first attempt that did not work. i am sure it is the datalines part. so this needs to run for every value in a table that has 80k values in it. i cannot write out that many datalines. also, the name are business names so there could be values in them however i have removed all trailing spaces via trim or something...
YOU do not need the datalines. I did that to provide a workable SAS data set. You should only need the Data Want step if you already have a SAS data set. If you don't have a SAS data set then import or use a data step to read the text file.
The length I set for a_name as I said was a guess. If your ID was 10 characters and you have up to 50 characters in name then you need to make a_name at least 10(length of ID) +1(for the dash) + something long enough to hold the sum of 50 ASCII codes. Assuming (big dange but..) an average ASCII value of about 100 then 50 characters should have a sum near 50*100 = 5000 and fit into 4 characters. BUT I would probably push that upe to 6 or so for a total length of 10+1+6 = 17. Or ignore that and let CATX set the length to possibly some much larger value.
i need it to cut off at 30. i removed the data lines...i assumed that was not needed but i am getting another error. so the table i am trying to update in is table_a so...
data table_a;
infile table_a;
informat id $6. name $30.
input id name;
run;
data table_a;
set table_a;
do i=1 to length(name);
count=sum(count,rank(substr(name,i,1)));
end;
length name $ 12;
a_name=catx('-',id,count);
drop iaap count;
run;
basically i want to update a field in the same table. so i put want=have in your example and it seems to be having an issue there...
Your first data step does not make much sense. If do not have a dataset yet then ask a new question about how to create a dataset.
So let's just assume that you already have a dataset named TABLE_A with two variables named ID and NAME and want to create a new variable named A_NAME. If you already have a variable named A_NAME then remove the LENGTH statement.
Also let's make the new dataset have a different name than the input dataset so that when it fails we haven't lost the input dataset.
data new_table_a;
set table_a;
do i=1 to min(length(name),30);
count=sum(count,rank(char(name,i)));
end;
length a_name $12;
a_name=catx('-',id,count);
drop i count;
run;
This structure:
Data existingdataset;
set existingdataset;
Does not "update" it completely overwrites the existing data set and hence is a dangerous construct.
Start with sending output to a new data set. If you only want to test something you can restrict the number of observations processed by using the OBS option.
Do you a FILENAME statement referencing you external file as table_a? That is what your infile would expect.
You say "so i put want=have in your example and it seems to be having an issue there..." What exactly is the issue? I expect at least
a warning along the lines of
WARNING: Length of character variable name has already been set.
Use the LENGTH statement as the very first statement in the DATA STEP to declare the
length of a character variable.
because you have: Length name $ 12; instead of Length A_name.
Note that
do i= 1 to length(name);
should change to
do i= 1 to min(30,length(name));
to limit to the first 30 characters.
having a bit of a day here...okay so i need this data back in the original table. this is the last step before i build a final report and it is going to use most of the data in the table...
also, keep in mind i am very new to sas. while i have deep knowledge of sql, the data step stuff, i am learning as i am going here. personally i would rather do it in proc sql but i do need to learn sas and data steps so...i am still trying to fully understand these data steps but i do need the data back in the original table. the only other way is i need all of the data moved from the current table to a new table with the new data.
Database system have their own view of how to process things. Normally because they are geared around gathering and storing data. Usually making a new table (dataset) or database (library) is complex, time comsuming and left to specialist. So in that type of system it might make sense to think in terms on updating variables (fields, columns) in an existing dataset (table, object).
SAS is more designed for manipulating and reporting on data. So you take your inputs manipulate them to calculate the values you need and produce your report. You can save datasets (tables) easily and there is not much advantage to be gained from have pre-built datasets that you update in place.
In SAS if you have a current program that makes a report from table XXX it is trivial to update the program to report from table YYY instead (as long as it has the data you need). You can even use macro variables so that the actual code doesn't change just the value of the macro variable that contains the dataset name that the program will use.
id is an auto-numbered field that simply counts up to the last line in the table and that is record number...and as i said, it needs to find the ASCII VALUE of the characters and sum them.
The result depends on whether you include the spaces at the end of the names. Looks like you do not.
data have;
length id 8 name $50 want $40 ;
input id name want ;
cards;
101 abcd 101-394
102 efgh 102-410
103 ijkl 103-426
104 ThisNameHasMoreThanThirtyCharactersInIt ?
;
data want ;
set have ;
do _n_=1 to min(30,length(name));
sum_ascii=sum(sum_ascii,rank(char(name,_n_)));
end;
do _n_=1 to 30;
sum_ascii2=sum(sum_ascii2,rank(char(name,_n_)));
end;
do _n_=1 to length(name);
sum_ascii3=sum(sum_ascii3,rank(char(name,_n_)));
end;
a_name = catx('-',id,put(sum_ascii,32.));
run;
proc print; run;
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.