Adambo
Obsidian | Level 7

When I change the name of a dataset with a DATA step so that I can manipulate it further, I see that the POS (position) values of some variables have changed. This results in some of the values shifting to the wrong variables.

How can I prevent changes other than the dataset's name when performing the following steps?

 

DATA want;
  SET have;
RUN;

 

Thanks,

Adam

20 Replies
PGStats
Opal | Level 21

In my experience with SAS, what you describe does not happen. Please post a reproducible example. 

PG
Adambo
Obsidian | Level 7

Here is a screenshot of the dataset before and after the DATA step. A value seems to appear in the first column. Below is the PROC CONTENTS output for the two datasets. Note how the position changes for some variables.

 

lab.PNG

 

LAB1 Proc Contents

lab1.PNG

 

BIOMEDIC Proc Contents:

biomedic.PNG

 

PGStats
Opal | Level 21

Very confusing! Nothing matches after a simple copy operation? What is your theory as to what might have happened?

PG
Adambo
Obsidian | Level 7

My original file might be carrying some value from previous manipulations (e.g., RETAIN), hidden values, or be corrupted. I have no idea. Thanks for confirming this is abnormal.

Reeza
Super User

@Adambo wrote:

My original file might be carrying some value from previous manipulations (e.g., RETAIN), hidden values, or be corrupted. I have no idea. Thanks for confirming this is abnormal.


That doesn't make sense. 

 

If the code is as shown, with no RETAIN or any other processing in between, the position in the buffer shouldn't change. Note that your variable order is staying the same.

 

Can you post the code that causes this, including the PROC CONTENTS output before and after?

 

If you can replicate it I would contact SAS Tech Support. 

 

Not sure what version you're on, but in SAS 9.4 POS isn't even included in the output from PROC CONTENTS; there's an NPOS, though.
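For anyone who wants to look at these attributes directly, here is a minimal sketch, assuming the dataset is called have as in the original question. WORK.HAVE_ATTR is a made-up output name. The OUT= dataset from PROC CONTENTS carries VARNUM (the variable's order in the dataset) and NPOS (its offset in the observation buffer).

```sas
/* Write the variable attributes to a dataset instead of the listing. */
/* "have" comes from the question; WORK.HAVE_ATTR is a hypothetical   */
/* name chosen for this sketch.                                       */
proc contents data=have
              out=WORK.HAVE_ATTR(keep=name type length varnum npos)
              noprint;
run;

/* List the variables in dataset order so VARNUM and NPOS can be read side by side. */
proc sort data=WORK.HAVE_ATTR;
  by varnum;
run;

proc print data=WORK.HAVE_ATTR noobs;
run;
```

Running the same steps against the copied dataset gives a second listing to compare against.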

Adambo
Obsidian | Level 7

Someone else created the file I'm working on. I don't know what specific manipulations were done before it got to me.

Reeza
Super User

@Adambo wrote:

Someone else created the file I'm working on. I don't know what specific manipulations were done before it got to me.


Those don't matter, they're not maintained with the dataset. 

What matters is what you're doing. Are you seeing a note in the log? About CEDA access, perhaps?

Please post the code and log.

Adambo
Obsidian | Level 7

6635      DATA SAS.BIOMEDIC;
6636      SET SAS.LAB1;
6637      RUN;

NOTE: 51588 observations were read from "SAS.LAB1"
NOTE: Data set "SAS.BIOMEDIC" has 51588 observation(s) and 32 variable(s)
NOTE: The data step took :
      real time : 0.118
      cpu time  : 0.109


6638      quit; run;
6639      ODS _ALL_ CLOSE;

Reeza
Super User

POS -> Position in Buffer

NUMBER -> Position in Dataset

 

This results in some of the values shifting to the wrong variables.

 

According to Proc Compare output all variables and values are identical. 

 

 

Personally, I don't see an issue. If you want to understand why the location in the buffer has changed it could be as simple as you're writing the data to a different system. I would consider asking tech support for an answer if interested, but given everything provided the change in POS should not impact your work.
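If the goal really is only a new name, one option that avoids rewriting the observations at all is the CHANGE statement in PROC DATASETS. This is a sketch, assuming the libref SAS and the member names LAB1 and BIOMEDIC from the log above. Unlike the DATA step, it renames in place, so no LAB1 remains afterwards, and I have not checked what it does to the reported positions on your system.

```sas
/* Rename SAS.LAB1 to SAS.BIOMEDIC without reading or rewriting the data. */
/* Assumes SAS.BIOMEDIC does not already exist; CHANGE will not replace   */
/* an existing member of that name.                                       */
proc datasets library=SAS nolist;
  change LAB1=BIOMEDIC;
quit;
```

If you need to keep the original as well, the DATA step copy you already have is the simpler route.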

Shmuel
Garnet | Level 18

Please run the compare procedure:

 

proc compare data=LAB1 compare=BIOMEDIC;
  id I_LABIM_KITID;
run;

 

Look at the output of this procedure. Are there unequal values?

If you don't understand the output report, post it and I shall try to clarify it.

 

You need not be concerned about changes in the positions of variables, as long as all observations are equal.

 

Shmuel

Adambo
Obsidian | Level 7

compare.PNG

SASKiwi
PROC Star

The PROC COMPARE output proves that the two datasets are identical as far as column names and values are concerned, which is what you would expect. It doesn't explain the change in column positions.

Shmuel
Garnet | Level 18

Clarification of results:

 

You got the note that all observations are duplicates; that is because I_LABIM_KITID is not a unique key.

Change the ID line and add the other key variables in the format:  ID key1 key2 key3 ... ;

 

You also got the messages "Number of Observations with Some Compared Variables Unequal = 0" and "Number of Observations with All Compared Variables Equal = 51588", which is the whole dataset.

Those two messages are proof that the two datasets are equal by values.

 

Regards, Shmuel
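Because BIOMEDIC is a straight copy of LAB1, the check can also be sketched without any ID statement, so that PROC COMPARE matches observations by their order. The example assumes the libref SAS and the dataset names from the posted log. COMPARE_RC is a made-up macro variable name. PROC COMPARE stores its return code in the automatic macro variable SYSINFO, which has to be read immediately after the step.

```sas
/* Compare observation by observation (no ID statement), suppressing the report. */
proc compare base=SAS.LAB1 compare=SAS.BIOMEDIC noprint;
run;

/* Capture SYSINFO right away, before another step resets it. */
%let compare_rc = &sysinfo;
%put NOTE: PROC COMPARE return code = &compare_rc (0 means no differences found);
```

A value of 0 means PROC COMPARE found the two datasets equal; any other value encodes which kinds of differences were detected.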

 

Adambo
Obsidian | Level 7

I tried PROC COMPARE with different key IDs. I thought this would not work, as the key ID values are different in the two datasets (as displayed to me), but it appears SAS still sees them as the same regardless.

 

So I reckon the two datasets are "technically" the same. So I guess this is something related to the display and my system.

 

compare.PNG

 


