Lecture 4
Performance Analysis
Performance Metrics, Postulates
IKC-MH.57 Introduction to High Performance and Parallel Computing, November 03, 2023
Dr. Cem Özdoğan
Engineering Sciences Department
İzmir Kâtip Çelebi University
Contents
1 Performance Analysis
Computational Models
Equal Duration Model
Parallel Computation with Serial Sections Model
Skeptic Postulates For Parallel Architectures
Amdahl’s Law
Performance Analysis
Analysis of the performance measures of parallel programs.
Two computational models:
1 the equal duration processes,
2 parallel computation with serial sections.
Two measures:
1 speed-up factor,
2 efficiency.
The impact of the communication overhead on the overall speed performance of multiprocessors.
The scalability of parallel systems.
Computational Models - Equal Duration Model I
Assume that a given computation can be divided into concurrent tasks for execution on the multiprocessor.
In this model (t_s: the execution time of the whole task using a single processor),
a given task can be divided into n equal subtasks,
each of which can be executed by one processor;
the time taken by each processor to execute its subtask is

t_p = \frac{t_s}{n}

and since all processors are executing their subtasks simultaneously, the time taken to execute the whole task is also t_p = t_s/n.
The speed-up factor of a parallel system can be defined as the ratio of the time taken by a single processor to solve a given problem to the time taken by a parallel system consisting of n processors to solve the same problem.
Computational Models - Equal Duration Model II
Speed-up:

S(n) = \frac{t_s}{t_p} = \frac{t_s}{t_s/n} = n \qquad (1)
This equation indicates that, according to the equal
duration model, the speed-up factor resulting from using n
processors is equal to the number of processors used (n).
One important factor has been ignored in the above
derivation.
This factor is the communication overhead, t_c, which results from the time needed for processors to communicate and possibly exchange data while executing their subtasks.
Then the actual time taken by each processor to execute its subtask is t_p = t_s/n + t_c, and the speed-up factor becomes

S(n) = \frac{t_s}{t_p} = \frac{t_s}{t_s/n + t_c} = \frac{n}{1 + n\,t_c/t_s} \qquad (2)
This equation indicates that the relative values of t_s and t_c affect the achieved speed-up factor.
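As a quick numerical check of Eqs. (1) and (2), the following minimal Python sketch compares the ideal and overhead-adjusted speed-up; the parameter values t_s = 100 and t_c = 1 are assumptions chosen only for illustration.

# Minimal sketch: speed-up under the equal duration model,
# with and without communication overhead (Eqs. 1 and 2).
def speedup_ideal(n):
    # S(n) = t_s / (t_s / n) = n
    return n

def speedup_with_overhead(n, t_s, t_c):
    # S(n) = t_s / (t_s/n + t_c) = n / (1 + n * t_c / t_s)
    return n / (1.0 + n * t_c / t_s)

if __name__ == "__main__":
    t_s, t_c = 100.0, 1.0   # assumed example values (arbitrary time units)
    for n in (2, 4, 8, 16):
        print(n, speedup_ideal(n), round(speedup_with_overhead(n, t_s, t_c), 2))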
Computational Models - Equal Duration Model III
A number of cases can then be studied:
1 if $t_c \ll t_s$, then the potential speed-up factor is approximately n;
2 if $t_c \gg t_s$, then the potential speed-up factor is $t_s/t_c \ll 1$;
3 if $t_c = t_s$, then the potential speed-up factor is $n/(n+1) \approx 1$, for $n \gg 1$.
In order to scale the speed-up factor to a value between 0
and 1, we divide it by the number of processors, n.
The resulting measure is called the efficiency, E.
The efficiency is a measure of the speed-up achieved
per processor.
According to the simple equal duration model, the efficiency E is equal to 1 if the communication overhead is ignored.
However, if the communication overhead is taken into consideration, the efficiency can be expressed as

E = \frac{1}{1 + n\,t_c/t_s} \qquad (3)
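The short Python sketch below evaluates Eq. (3) together with the three limiting cases listed above; the values n = 8, t_s = 100 and the three choices of t_c are assumptions made only for illustration.

# Minimal sketch: efficiency E = 1 / (1 + n*t_c/t_s), Eq. (3),
# evaluated for the three limiting cases of the equal duration model.
def efficiency(n, t_s, t_c):
    return 1.0 / (1.0 + n * t_c / t_s)

n, t_s = 8, 100.0                      # assumed example values
cases = [("t_c << t_s", 0.01), ("t_c >> t_s", 10000.0), ("t_c = t_s", 100.0)]
for label, t_c in cases:
    E = efficiency(n, t_s, t_c)
    S = n * E                          # speed-up, Eq. (2)
    print(f"{label}: S = {S:.3f}, E = {E:.3f}")
# Case 1 gives S close to n, case 2 gives S close to t_s/t_c (much less
# than 1), and case 3 gives S = n/(n+1), i.e. about 1.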
Computational Models - Equal Duration Model IV
Although simple, the equal duration model is unrealistic.
This is because it is based on the assumption that a given
task can be divided into a number of equal subtasks.
However, real algorithms contain some (serial) parts that
cannot be divided among processors.
These (serial) parts must be executed on a single processor.
Figure: Example program segments.
Computational Models - Equal Duration Model V
In the figure of example program segments, we assume that we start with a value from each of the two arrays (vectors) a and b stored in a processor of the available n processors.
The first program block can be done in parallel; that is, each processor can compute an element of the array (vector) c. The elements of array c are now distributed among processors, and each processor has an element.
The next program segment cannot be executed in parallel. This block requires that the elements of array c be communicated to one processor and added up there.
The last program segment can be done in parallel: each processor can update its elements of a and b.
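Since the figure itself is not reproduced here, the following Python sketch only mirrors the parallel-serial-parallel structure just described; the concrete operations (an element-wise product for c and the particular updates to a and b) are assumptions made purely for illustration.

# Illustrative sketch of the three program segments described above.
a = [1.0, 2.0, 3.0, 4.0]
b = [5.0, 6.0, 7.0, 8.0]

# Segment 1 (parallelizable): each processor computes one element of c.
c = [a[i] * b[i] for i in range(len(a))]     # assumed operation

# Segment 2 (serial): the elements of c must be gathered on one
# processor and added up there.
total = sum(c)

# Segment 3 (parallelizable): each processor updates its own a[i] and b[i].
a = [a[i] + total for i in range(len(a))]    # assumed update
b = [b[i] - total for i in range(len(b))]    # assumed update
print(total, a, b)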
Computational Models - Parallel Computation I
It is assumed (or known) that a fraction f of the given task (computation) is not dividable into concurrent subtasks.
The remaining part (1 - f) is assumed to be dividable into concurrent subtasks.
The time required to execute the task on n processors is

t_p = t_s f + (1 - f)\,\frac{t_s}{n}
The speed-up factor is therefore given by

S(n) = \frac{t_s}{t_s f + (1 - f)\,t_s/n} = \frac{n}{1 + (n - 1) f} \qquad (4)
According to this equation, the potential speed-up due to the use of n processors is determined primarily by the fraction of code that cannot be divided.
If the task (program) is completely serial, that is, f = 1,
then no speed-up can be achieved regardless of the
number of processors used.
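A minimal Python sketch of Eq. (4); the serial fractions and processor counts below are assumed example values.

# Minimal sketch: Amdahl speed-up S(n) = n / (1 + (n-1)*f), Eq. (4).
def amdahl_speedup(n, f):
    return n / (1.0 + (n - 1) * f)

for f in (0.0, 0.1, 0.5, 1.0):
    print(f, [round(amdahl_speedup(n, f), 2) for n in (2, 8, 32)])
# f = 0.0 reproduces the ideal S(n) = n, while f = 1.0 gives S(n) = 1
# for every n: no speed-up for a completely serial task.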
Computational Models - Parallel Computation II
This principle is known as Amdahl’s law.
It is interesting to note that, according to this law, the maximum speed-up factor is given by

\lim_{n \to \infty} S(n) = \frac{1}{f}
Therefore, the improvement in performance (speed) of a parallel algorithm over a sequential one is limited not by the number of processors employed but rather by the fraction of the algorithm that cannot be parallelized.
According to Amdahl’s law, researchers were led to believe that a substantial increase in speed-up factor would not be possible by using parallel architectures.
NOT parallelizable:
communication overhead,
a sequential fraction, f.
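As a concrete illustration of the 1/f bound (the value f = 0.05 is an assumption chosen only for this example), the sketch below shows S(n) approaching its limit of 20 as n grows.

# S(n) approaches 1/f as n grows, no matter how many processors are used.
f = 0.05                                          # assumed serial fraction
for n in (10, 100, 1000, 10**6):
    print(n, round(n / (1.0 + (n - 1) * f), 2))   # tends to 1/f = 20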
Postulates - Amdahl’s Law I
Amdahl’s law made the prospects for building parallel computer systems look very pessimistic,
due to the intrinsic limit it sets on the performance improvement (speed) regardless of the number of processors used.
An interesting observation to make here is that, according to Amdahl’s law, f is fixed and does not scale with the problem size, n.
However, it has been practically observed that some real
parallel algorithms have a fraction that is a function of n.
Let us assume that f is a function of n such that \lim_{n \to \infty} f(n) = 0. Then

\lim_{n \to \infty} S(n) = \lim_{n \to \infty} \frac{n}{1 + (n - 1)\,f(n)} = n \qquad (5)
This is clearly in contradiction to Amdahl’s law.
It is therefore possible to achieve a linear speed-up factor for large-sized problems, given that \lim_{n \to \infty} f(n) = 0, a condition that has been practically observed.
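A brief Python sketch of Eq. (5) under an assumed scaled serial fraction f(n) = 1/n^2 (chosen so that (n - 1) f(n) tends to 0); the speed-up then grows almost linearly with the number of processors.

# Minimal sketch: S(n) = n / (1 + (n-1)*f(n)) with an assumed f(n) = 1/n**2.
def scaled_speedup(n, f_of_n):
    return n / (1.0 + (n - 1) * f_of_n(n))

for n in (4, 16, 64, 256):
    print(n, round(scaled_speedup(n, lambda m: 1.0 / m**2), 2))
# The printed speed-up approaches n as n grows (nearly linear speed-up),
# in contrast to the fixed-f bound 1/f of Amdahl's law.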