riccardo88
Calcite | Level 5

Dear Experts,

 

I'm taking my first steps with SAS and I need some help.

I have about 100 datasets that look like this:

 

[Screenshot: Unbenannt.png]

 

There are five different values for each dataset for the variable "Durchgang". In this case you can see two of them (112500, 121000). These numbers are different for every dataset.

 

What I would like to do is write a program that identifies these different values in every dataset and substitutes them with the values 1, 2, 3, 4 and 5. So:

 

112500 -> 1

121000 -> 2

123040 -> 3

etc...

 

Thank you very much for your help!

 

(Version 9.4)

RW9
Diamond | Level 26

As with all questions about data that has been split up, putting like data together eliminates the problem:

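data want;
  /* Read every tmp3 dataset whose name starts with out_auf_ */
  set tmp3.out_auf_: indsname=tmp;
  length dsname $200;
  dsname=tmp;  /* keep the source dataset name on each row */
  /* Recode the session IDs (example values from the question) */
  if durchgang=112500 then durchgang=1;
  else if durchgang=121000 then durchgang=2;
  else if durchgang=123040 then durchgang=3;
run;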

This takes all datasets from library tmp3 which have the prefix out_auf_, sets them all together, and creates a variable which is the name of the incoming dataset (in case you need to split the data later on).  You can then work with the one dataset and vastly simplify your coding and processing, avoiding messy code generation, macro etc.
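
Once the data is stacked, a quick check of which durchgang values occur in which source dataset could look like the sketch below. It assumes the combined dataset is called want, as in the step above, and only lists the combinations; it changes nothing.

```sas
/* List each source dataset together with the durchgang values it contains */
proc freq data=want;
  tables dsname*durchgang / list nocum nopercent;
run;
```

If every file really holds five recording sessions, each dsname should appear on five rows of this listing.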

riccardo88
Calcite | Level 5

Dear RW9,

 

Thank you for your answer!

 

I also think creating a single dataset makes a lot of sense.

The problem is that those numbers are different for each dataset...

 

So for the first one I want to assign:

112500 -> 1

121000 -> 2

123040 -> 3

etc.

 

The second one may be:

109389 -> 1

182937 -> 2

100703 -> 3

etc.

 

And so on, one hundred times.

Is there a simple way to automate this process?

RW9
Diamond | Level 26

You would have to give me some logical reason why those values are being changed. Is the decode information held in a dataset? Can it be deduced from the position in the dataset, or from some other variable? Just saying that some values, which differ from one dataset to the next, need to be changed is, I am afraid, not enough information to program anything. I need rules.

riccardo88
Calcite | Level 5

The electroencephalography device assigns a number to every recording session (e.g. 120500).

I have one dataset for every person who took part in the experiment, with 5 different recording sessions, each one identified by a different number.

 

To compare different experiment subjects, I would like the first session of every subject to be identified by the same number, 1.

The second session should be identified by 2, the third by 3, and so on.

 

After that I can, for example, calculate the mean of a value for ALL the experiment subjects during the same session.

 

 

RW9
Diamond | Level 26

Something like:

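data want;
  /* Stack all tmp3.out_auf_* datasets and record each row's source */
  set tmp3.out_auf_: indsname=tmp;
  length dsname $200;
  dsname=tmp;
run;

data want;
  set want;
  by dsname durchgang;
  retain cnt;
  if first.dsname then cnt=0;         /* restart the count for each file */
  if first.durchgang then cnt=cnt+1;  /* next session within the file */
  if cnt <= 5 then durchgang=cnt;
run;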

I don't have any test data, but something like that should work. Basically, cnt is incremented each time a new durchgang value starts, and the first five per file are replaced by the count.
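
To see the counting logic in isolation, here is a small sketch on made-up data. The dataset names demo and demo_numbered, the file labels A and B, and the variable session are all hypothetical, and the rows are deliberately entered in BY order (dsname, then durchgang) so that a plain BY statement is valid. The count is written to a new variable rather than over durchgang so that both can be compared.

```sas
/* Made-up test data: two "files" with session IDs, already in BY order */
data demo;
  length dsname $20;
  input dsname $ durchgang;
  datalines;
A 109389
A 109389
A 182937
B 112500
B 121000
B 121000
B 123040
;
run;

data demo_numbered;
  set demo;
  by dsname durchgang;
  retain cnt;
  if first.dsname then cnt = 0;           /* new file: restart the count */
  if first.durchgang then cnt = cnt + 1;  /* new session within the file */
  session = cnt;                          /* keep durchgang, add the running number */
  drop cnt;
run;

proc print data=demo_numbered noobs;
run;
```

On this input, session should read 1, 1, 2 for file A and 1, 2, 2, 3 for file B.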

riccardo88
Calcite | Level 5

Thank you very much! This works, although incompletely.

I'm probably doing something wrong.

 

My code:

 

data try;
  set tmp3.out_auf_: indsname=tmp;
  length dsname $200;
  dsname=tmp;
run;

data try_2;
  set try;
  by dsname Durchgang;
  retain cnt;
  if first.dsname then cnt=0;
  if first.durchgang then cnt=cnt+1;
  if cnt <= 5 then durchgang=cnt;
run;

 

My log (for the second DATA step):

 

 

61   data try_2;
62     set try;
63     by dsname Durchgang;
64     retain cnt;
65     if first.dsname then cnt=0;
66     if first.durchgang then cnt=cnt+1;
67     if cnt <= 5 then durchgang=cnt;
68   run;

NOTE: Numeric values have been converted to character values at the places given by:
      (Line):(Column).
      67:30
ERROR: BY variables are not properly sorted on data set WORK.TRY.
Durchgang=122900 VPNummer=100 Signalname=C4-A1 Date=2017/11/17 Time=12:35:04 FFT=1 Hz0_5=33.356
Hz1_0=31.843 Hz1_5=0.041 Hz2_0=6.043 Hz2_5=1.812 Hz3_0=4.102 Hz3_5=0.831 Hz4_0=6.459 Hz4_5=3.986
Hz5_0=1.192 Hz5_5=2.506 Hz6_0=0.76 Hz6_5=1.469 Hz7_0=1.388 Hz7_5=2.092 Hz8_0=3.081 Hz8_5=1.694
Hz9_0=2.093 Hz9_5=0.202 Hz10_0=0.863 Hz10_5=0.545 Hz11_0=0.242 Hz11_5=1.025 Hz12_0=0.625
Hz12_5=0.467 Hz13_0=0.136 Hz13_5=0.6 Hz14_0=0.452 Hz14_5=0.543 Hz15_0=1.699 Hz15_5=1.601
Hz16_0=0.899 Hz16_5=1.493 Hz17_0=1.723 Hz17_5=1.788 Hz18_0=1.611 Hz18_5=0.182 Hz19_0=0.11
Hz19_5=0.662 Hz20_0=0.969 Hz20_5=1.239 Hz21_0=1.353 Hz21_5=1.04 Hz22_0=0.292 Hz22_5=0.007
Hz23_0=0.102 Hz23_5=0.217 Hz24_0=0.641 Hz24_5=0.533 Hz25_0=0.304 new_time=12:35:04
starting_time_1=36372 starting_time_2=38772 starting_time_3=41354 starting_time_4=45124
starting_time_5=48064 dsname=TMP3.OUT_AUF_100_C4A1 FIRST.dsname=0 LAST.dsname=0 FIRST.Durchgang=0
LAST.Durchgang=0 cnt=4 _ERROR_=1 _N_=318
NOTE: The SAS System stopped processing this step because of errors.
NOTE: There were 319 observations read from the data set WORK.TRY.
WARNING: The data set WORK.TRY_2 may be incomplete.  When this step was stopped there were 317
         observations and 64 variables.
WARNING: Data set WORK.TRY_2 was not replaced because this step was stopped.
NOTE: DATA statement used (Total process time):
      real time           0.03 seconds
      cpu time            0.04 seconds

 

 

 

 

RW9
Diamond | Level 26

For the BY statement, use:

by dsname Durchgang notsorted;
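
For context, NOTSORTED tells SAS that rows with the same BY values are stored together but that the groups need not be in ascending order, so FIRST.durchgang is set whenever the value changes. The sketch below applies it to the second step from the post above. It assumes the rows of each recording session are contiguous within each file, which means sessions are numbered in order of appearance, not by value. The log note about numeric-to-character conversion suggests that Durchgang is a character variable, so this version writes the count to a new numeric variable; the name session is hypothetical.

```sas
data try_2;
  set try;
  by dsname durchgang notsorted;  /* groups are contiguous, not necessarily sorted */
  retain session;
  if first.dsname then session = 0;                /* new file: restart the count */
  if first.durchgang then session = session + 1;   /* session ID changed */
run;
```

If a session ID reappeared later in the same file, it would be counted as a new session, so it may be worth checking that session never exceeds 5.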
riccardo88
Calcite | Level 5

Oh Maaaan, you are good!

 

Thank you a lot!!
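
With the sessions renumbered, the comparison described earlier in the thread (the mean of a value across all subjects within the same session) could be sketched with PROC MEANS. This assumes the output dataset try_2 with durchgang now holding 1 to 5; Hz0_5 is one of the measurement variables visible in the log and stands in for whichever variable is of interest.

```sas
/* Mean of one measurement per session number, pooled over all subjects */
proc means data=try_2 mean;
  class durchgang;
  var Hz0_5;
run;
```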

 
