BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
jswinford
Obsidian | Level 7

Hi, I am relatively new to SAS.  I have a dataset that I'm trying to merge with a completely different database, so I've been editing fields in order for the merge to go smoothly. I uploaded this dataset from an .csv file using the import wizard.  For some reason, I can only edit things using PROC SQL with this dataset, and being relativley new to this, I would rather use simpler syntax (PROC DATASETS?).  Can any one think of a reason my dataset would be doing this?  It's been realy frustrating. 

 

If not, can anyone give me the PROC SQL to do this (a simple if/then statement):

data output;

set input;

If variable1 = 1 then variable2 = 2;

run;

 

Let me know if you need more information...and bear with me, I am super new at this.

 

Thanks,

Jillian

1 ACCEPTED SOLUTION

Accepted Solutions
Tom
Super User Tom
Super User

Not sure why you are being forced to use SQL, unless you are actually programming in the external database and not programming in SAS.

The general way to do conditional things in SQL is the CASE statement.

So if your input table as two variables named VARIABLE1 and VARIABLE2 and you wnat to make a new table that is the same data except that variable2 is set to 2 when variable1=1 then your program might look like this.

proc sql ;
create table output  as
select variable1
    , case when (variable1=1) then 2 else variable2 end as variable2 
from input
;
quit;

View solution in original post

10 REPLIES 10
Tom
Super User Tom
Super User

Not sure why you are being forced to use SQL, unless you are actually programming in the external database and not programming in SAS.

The general way to do conditional things in SQL is the CASE statement.

So if your input table as two variables named VARIABLE1 and VARIABLE2 and you wnat to make a new table that is the same data except that variable2 is set to 2 when variable1=1 then your program might look like this.

proc sql ;
create table output  as
select variable1
    , case when (variable1=1) then 2 else variable2 end as variable2 
from input
;
quit;
jswinford
Obsidian | Level 7

I figure out how to make this syntax work....since my second variable was a character variable I had to put a name instead of a numerical value... the problem now is that I only have 2 fields with my output.. how do I bring all the other fields along with that input?

Kurt_Bremser
Super User

Post the complete log from the data step code you tried. SAS has no restrictions against the use of a data step. The only place where you cannot use a data step is within an SQL pass-through, which is not the case here.

Doc_Duke
Rhodochrosite | Level 12

Jillian,

 

Are you using SAS Enterprise Guide?  If so, the default task for manipulating data is SAS SQL.  To use standard SAS programming, you need to click 'Program' and select 'New Program' to put a code node into the flow.

jswinford
Obsidian | Level 7

No I'm using the SAS 9.4 editor.  It just doesn't make sense.  At first I was wondering if it was because I imported it using the import wizard versus PROC IMPORT, but that didn't solve any problems either.  

jswinford
Obsidian | Level 7

Okay, here is an example:

 

This is my input code, CBPdata being my out put, winter_84_16 being my input, and Layer being my variable:

data work.CBPdata;
set out1.winter_84_16;
length Layer $ 1;
label Layer='LEVEL';
run;

 

This is the Log message:

data work.CBPdata;
8 set out1.winter_84_16;
9 length Layer $ 1;
WARNING: Length of character variable Layer has already been set.
Use the LENGTH statement as the very first statement in the DATA STEP to declare the length
of a character variable.
10 label Layer='LEVEL';
11 run;

NOTE: There were 58274 observations read from the data set OUT1.WINTER_84_16.
NOTE: The data set WORK.CBPDATA has 58274 observations and 30 variables.
NOTE: DATA statement used (Total process time):
real time 0.16 seconds
cpu time 0.10 seconds

 

It keeps giving me these messages.  I'm assuming when I imported the excel file for some reason it set all of the lenghts and labels and I have to use PROC SQL to change things.  Which is a problem because from what I can tell you can't RENAME fields using PROC SQL.

 

I hope this is helpful

ballardw
Super User

@jswinford wrote:

Okay, here is an example:

 

This is my input code, CBPdata being my out put, winter_84_16 being my input, and Layer being my variable:

data work.CBPdata;
set out1.winter_84_16;
length Layer $ 1;
label Layer='LEVEL';
run;

 

This is the Log message:

data work.CBPdata;
8 set out1.winter_84_16;
9 length Layer $ 1;
WARNING: Length of character variable Layer has already been set.
Use the LENGTH statement as the very first statement in the DATA STEP to declare the length
of a character variable.
10 label Layer='LEVEL';
11 run;

NOTE: There were 58274 observations read from the data set OUT1.WINTER_84_16.
NOTE: The data set WORK.CBPDATA has 58274 observations and 30 variables.
NOTE: DATA statement used (Total process time):
real time 0.16 seconds
cpu time 0.10 seconds

 

It keeps giving me these messages.  I'm assuming when I imported the excel file for some reason it set all of the lenghts and labels and I have to use PROC SQL to change things.  Which is a problem because from what I can tell you can't RENAME fields using PROC SQL.

 

I hope this is helpful


The messages do not indicate any error. What has happened is every time you import a file with Proc Import the procedure has to guess characteristics such as variable type and length of character variables. Every time. So if you have multiple Excel files with the same layout the results can change because the Excel engine only shows a few rows of data to the SAS import engine. The online documentation for Proc Import has examples of how to increase the number rows examined but if the longest value for variable X is 15 in one set and then 10 in the next the lengths will be different regardless of the number of rows examined.

You get the specific warning

WARNING: Length of character variable Layer has already been set.

Because you use this code:

 

set out1.winter_84_16;
length Layer $ 1;

when the variable Layer is in the data set(s) on the Set statement the length from the version in the data set is used and you can't change it that way. The rest of the message is very explicit and simple:

 

Use the LENGTH statement as the very first statement in the DATA STEP to declare the length
of a character variable.

So:

 

length Layer $ 1;
set out1.winter_84_16;

actually the length only has to be before the Set statement bringing in the data that contains the variable(s) you want.

 

 

Depending on what you want to do to an existing data set you might investigate Proc Datasets which will allow you to rename data sets or variables, change formats and labels. Proc Datasets, and actually no procedure, will allow to change variable type of an exisiting variable.

 

Before using the length statement as demostrated I would verify that the values of the variable in the data set on the SET statement will not get truncated (unless that is exactly what your are attempting).

jswinford
Obsidian | Level 7

Thanks,

I think this solved a lot of my problems.  I'm super new to this so sometimes it's hard to determine what I'm doing wrong in the syntax since I'm not really familiar with it in the first place.  

 

Hopefully this will make editing this dataset a lot easier.

 

ballardw
Super User

One important thing is that WARNINGS are not ERRORS. Warnings will occur for a number of things that you might do that will result in possibly unexpected or unintended results. For instance the Warning in this thread was telling you that the length result might be different than 1 as you attempted with the length statement.

 

A very common warning when combining data, especially that started from different proc import steps, is that some variables have different lengths betweee the different sets. Depending on the order and method of combining them the data may actually get truncated to the shorter version. There is nothing quite a frustrating a tracking down an if condition such as: IF variable = "Black" that seems not work and finding out because of the above truncation at some step that the actual value has become "Blac".

 

 

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 10 replies
  • 1635 views
  • 2 likes
  • 5 in conversation