BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
sasgorilla
Pyrite | Level 9

I have a date set with oldvar comprised of a few thousand alphanumeric values. Big chunks of these values (60-100) represent respective organizations. I want to create newvar with a value based on the value of oldvar.

 

In a similar situation with only a few values for a variable I have done the below.  

data want;
     set have;
     If oldvar in ("123abc" "abc234" "789ghi") then newvar=1;
     else if oldvar in ("222xyz" "444qrs" "432opm") then newvar=2;
run;

In this case with a few thousand  values I am trying to paste from an excel file, what is the most efficient way to do this without having to put quotation marks around every single value? Thank you. 

1 ACCEPTED SOLUTION

Accepted Solutions
Tom
Super User Tom
Super User

@sasgorilla wrote:

Thank you, very helpful. I needed one more step and had one additional question. 

 

Step: Sort by oldvar

*I received an error message because I hadn't sorted by the "by" variable in each data set. This was an easy fix by sorting each of the data sets by oldvar. 

 

Question:

When I used your code with merge, the number of observations increased. Does this mean that there must be some "oldvar" in the "company_groups" that were not in the "have" dataset?  Is there an easy way to test for this if many observations in the "have" dataset have a blank or missing value for oldvar?


Yes. To use MERGE the datasets must be sorted (or have an index that can be used to retrieve the values) in the order specified in the BY statement.  You could use an SQL join instead, in which case SQL will sort the data for you.

 

If the number of observations increases there are two possible causes.  Like you said it could be that some extra observations were added.  But the subsetting IF in my code eliminates those.

data want;
  merge have(in=in1) company_groups;
  by oldvar;
  if in1;
run;

The other reason would be that you have multiple observations with the same value of OLDVAR. Either just plain duplicate observations.  Or OLDVAR values that are mapped to more than one NEWVAR value.

 

You could fix that by first eliminating the duplicates using PROC SORT.

proc sort data=company_groups nodupkey;
  by oldvar;
run;

View solution in original post

5 REPLIES 5
Tom
Super User Tom
Super User

Leave the list in DATA, not CODE.

So make a dataset that has OLDVAR and NEWVAR 

data company_groups;
  input oldvar $ newvar ;
cards;
123abc 1
abc234 1
789ghi 1
222xyz 2
444qrs 2
432opm 2
;

then merge it with HAVE.

data want;
  merge have(in=in1) company_groups;
  by oldvar;
  if in1;
run;
sasgorilla
Pyrite | Level 9

Thank you, very helpful. I needed one more step and had one additional question. 

 

Step: Sort by oldvar

*I received an error message because I hadn't sorted by the "by" variable in each data set. This was an easy fix by sorting each of the data sets by oldvar. 

 

Question:

When I used your code with merge, the number of observations increased. Does this mean that there must be some "oldvar" in the "company_groups" that were not in the "have" dataset?  Is there an easy way to test for this if many observations in the "have" dataset have a blank or missing value for oldvar?

Tom
Super User Tom
Super User

@sasgorilla wrote:

Thank you, very helpful. I needed one more step and had one additional question. 

 

Step: Sort by oldvar

*I received an error message because I hadn't sorted by the "by" variable in each data set. This was an easy fix by sorting each of the data sets by oldvar. 

 

Question:

When I used your code with merge, the number of observations increased. Does this mean that there must be some "oldvar" in the "company_groups" that were not in the "have" dataset?  Is there an easy way to test for this if many observations in the "have" dataset have a blank or missing value for oldvar?


Yes. To use MERGE the datasets must be sorted (or have an index that can be used to retrieve the values) in the order specified in the BY statement.  You could use an SQL join instead, in which case SQL will sort the data for you.

 

If the number of observations increases there are two possible causes.  Like you said it could be that some extra observations were added.  But the subsetting IF in my code eliminates those.

data want;
  merge have(in=in1) company_groups;
  by oldvar;
  if in1;
run;

The other reason would be that you have multiple observations with the same value of OLDVAR. Either just plain duplicate observations.  Or OLDVAR values that are mapped to more than one NEWVAR value.

 

You could fix that by first eliminating the duplicates using PROC SORT.

proc sort data=company_groups nodupkey;
  by oldvar;
run;
Ksharp
Super User

You could make a informat by proc format or Hash Table.

 

data company_groups;
  input start $ label ;
  retain fmtname 'fmt' type 'i' ;
cards;
123abc 1
abc234 1
789ghi 1
222xyz 2
444qrs 2
432opm 2
;
proc format cntlin=company_groups;
run;




data have;
  input oldvar $ ;
cards;
123abc 
abc234 
789ghi 
222xyz 
444qrs 
432opm 
;
data want;
 set have;
 newvar=input(oldvar,fmt.);
run;
PaigeMiller
Diamond | Level 26

@sasgorilla wrote:

In this case with a few thousand  values I am trying to paste from an excel file, what is the most efficient way to do this without having to put quotation marks around every single value? Thank you. 


See this explanation of using the %QLIST macro for avoiding adding quotation marks. https://communities.sas.com/t5/SAS-Programming/Assign-the-values-to-a-macro-variable/m-p/955804#M373...

 

Or as @Tom said, put these values in a data set rather than code.

--
Paige Miller

hackathon24-white-horiz.png

The 2025 SAS Hackathon Kicks Off on June 11!

Watch the live Hackathon Kickoff to get all the essential information about the SAS Hackathon—including how to join, how to participate, and expert tips for success.

YouTube LinkedIn

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 5 replies
  • 1062 views
  • 2 likes
  • 4 in conversation