ARCS 2008 - Architecture of Computing Systems

Workshops & Tutorials

Workshops

Tutorials

  • The CELL Broadband Engine: Hardware and Software Architecture
    Roland Seiffert, IBM Germany
    Jochen Roth, IBM Germany
    Mijo Safradin, IBM Germany
  • Evolvable Hardware
    Jim Torresen and Kyrre Glette, University of Oslo (Norway)
    Marco Platzner and Paul Kaufmann, University of Paderborn (Germany)
    (Handout)
  • An Introduction to General-Purpose Computing on Graphics Processing Units
    Dominik Göddeke, University of Dortmund (Germany)
    Robert Strzodka, Max Planck Center, Germany
  • The NVidia Compute Unified Device Architecture (CUDA) and Programming Model
    Simon Green, NVidia Inc., UK
  • Quantitative Emergence and Self-Organisation: A Specification of Terms
    Christian Müller-Schloer, Leibniz-Universität Hannover
    Hartmut Schmeck, Universität Karlsruhe (TH)

Tutorial Timetable

Monday, February 25

Time         Slot    Room E005                      Room E006            Room E007  Room E009
09:30-11:00  Slot 1  CELL: HW Architecture          GPGPU: Introduction  EVO
11:00-11:30          Coffee break
11:30-13:00  Slot 2  CELL: SW Architecture          GPGPU: Introduction  EVO        EMERG
13:00-14:00          Lunch
14:00-15:30  Slot 3  CELL: Programming + Debugging  GPGPU: NVIDIA                   EMERG
15:30-16:00          Coffee break
16:00-17:30  Slot 4  CELL: Demos + Q&A              GPGPU: NVIDIA

Tutorial Abstracts

The CELL Broadband Engine: Hardware and Software Architecture

The Cell Broadband Engine (Cell BE) architecture has been designed to support a wide range of applications including digital media, entertainment, communications, medical imaging, security and surveillance, and HPC workloads. The first implementation, the Cell BE processor, is a single-chip multicore processor with nine processor elements operating on a shared, coherent memory. Each Cell BE processor comprises one Power Processor Element (PPE) and eight Synergistic Processor Elements (SPEs).

This tutorial will introduce the Cell Broadband Engine architecture platform from a hardware and software perspective. First, we will briefly highlight the features of the Cell architecture and talk about the history and future of the Cell Broadband Engine. The second part focuses on the Cell microprocessor architecture, including the PPE, SPE, memory flow controller, element interconnect bus, resource allocation management, I/O, and memory interfaces. Afterwards we will continue with the Linux kernel, explaining the platform abstraction layer, integrated interrupt handler, I/O memory management unit, power management, hypervisor abstractions, south bridge drivers, and the SPU file system. The last part of the tutorial will focus on Cell programming models and software design methodology, including PPE-SPE hybrid programming models and multitasking SPEs.

Evolvable Hardware

Traditional hardware design aims at creating circuits which, once fabricated, remain static during run-time. This changed with the introduction of reconfigurable technology and devices, which opened up the possibility of dynamic hardware. However, the potential of dynamic hardware for the construction of self-adaptive, self-optimizing and self-healing systems can only be realized if automatic design schemes are available. One such method for automatic design is evolvable hardware. Evolvable hardware was introduced more than ten years ago as a new way of designing electronic circuits. Only the input/output relations of the desired function need to be specified; the design process is then left to an adaptive algorithm inspired by natural evolution. The design is based on the incremental improvement of a population of initially randomly generated circuits. The best circuits have the highest probability of being combined to generate new and possibly better circuits. New circuits are produced by applying crossover and mutation operators to the circuit descriptions.
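The evolutionary loop described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the MOVES framework or any presenter's method: the netlist encoding, the gate set, the 3-input majority target, tournament selection, and every name in the code (`random_circuit`, `fitness`, `evolve`, and so on) are hypothetical choices made for the example.

```python
import random
from itertools import product

# Hypothetical gate library; a real platform would offer its own primitives.
GATES = {
    "AND": lambda a, b: a & b,
    "OR": lambda a, b: a | b,
    "XOR": lambda a, b: a ^ b,
    "NAND": lambda a, b: 1 - (a & b),
}
N_INPUTS = 3   # primary inputs of the circuit
N_GATES = 6    # gates per circuit; the last gate drives the output


def target(bits):
    """Desired input/output relation (assumed here: 3-input majority)."""
    return int(sum(bits) >= 2)


def random_gate(position, rng):
    """A gate may read any primary input or any earlier gate output."""
    n_sources = N_INPUTS + position
    return (rng.choice(list(GATES)), rng.randrange(n_sources), rng.randrange(n_sources))


def random_circuit(rng):
    return [random_gate(i, rng) for i in range(N_GATES)]


def evaluate(circuit, bits):
    signals = list(bits)
    for name, a, b in circuit:
        signals.append(GATES[name](signals[a], signals[b]))
    return signals[-1]


def fitness(circuit):
    """Number of truth-table rows on which the circuit matches the target."""
    rows = product((0, 1), repeat=N_INPUTS)
    return sum(evaluate(circuit, bits) == target(bits) for bits in rows)


def crossover(parent_a, parent_b, rng):
    """One-point crossover; gate positions are preserved, so wiring stays valid."""
    cut = rng.randrange(1, N_GATES)
    return parent_a[:cut] + parent_b[cut:]


def mutate(circuit, rng, rate=0.2):
    """Replace each gate by a fresh random gate with probability `rate`."""
    return [random_gate(i, rng) if rng.random() < rate else gate
            for i, gate in enumerate(circuit)]


def select(population, rng, k=3):
    """Tournament selection: better circuits are more likely to become parents."""
    return max(rng.sample(population, k), key=fitness)


def evolve(generations=200, pop_size=30, seed=0):
    rng = random.Random(seed)
    population = [random_circuit(rng) for _ in range(pop_size)]
    history = []  # best fitness per generation
    for _ in range(generations):
        best = max(population, key=fitness)
        history.append(fitness(best))
        if history[-1] == 2 ** N_INPUTS:
            break  # every truth-table row is matched
        children = [best]  # elitism: the best circuit always survives
        while len(children) < pop_size:
            child = crossover(select(population, rng), select(population, rng), rng)
            children.append(mutate(child, rng))
        population = children
    return max(population, key=fitness), history


if __name__ == "__main__":
    best, history = evolve()
    print("best fitness:", fitness(best), "of", 2 ** N_INPUTS)
    print("generations run:", len(history))
```

Only the truth table in `target` describes the desired function; the wiring is found by the search. Because the fitness function enumerates all input combinations, its cost doubles with every added input, which gives one intuition for why evolved circuits have so far remained small.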

In the first part of this tutorial, we will give an overview of the large variety of schemes and architectures for evolvable hardware. Then we will present a number of real-world applications to which evolvable hardware has already been applied. We will also discuss current limitations of evolvable hardware and explain why only rather small circuits have been evolved so far. After pointing to recent research that tries to overcome these limitations, we turn to a novel feature provided by hardware evolution: run-time adaptation.

The second part of this tutorial follows a hands-on approach. Participants will have the opportunity to experiment with the MOVES simulation framework, and to observe and steer the evolutionary design of their own circuits. The MOVES framework is freely available and comes with several hardware representation models and evolutionary optimizers. Due to its modular and extensible design, it is an ideal tool for starting your own evolvable hardware research.

Contents of the tutorial
  • Introduction to evolvable hardware
  • Examples of real-world applications
  • Current limitations and research
  • Run-time adaptable systems
  • Hands-on: experiments with the MOVES framework

An Introduction to General-Purpose Computing on Graphics Processing Units
and
The NVidia Compute Unified Device Architecture (CUDA) and Programming Model

The graphics processing unit (GPU) on today's commodity video cards has evolved into an extremely powerful and flexible processor. GPUs provide superior memory bandwidth and computational horsepower, creating interest in them beyond the field of computer graphics. GPGPU stands for General-Purpose Computation on GPUs. Researchers have found that exploiting the GPU can accelerate some non-graphics problems by more than an order of magnitude compared with the CPU. Several high-level languages have emerged for graphics hardware, making this computational power accessible.

However, significant barriers still exist for the developer who wishes to use the inexpensive power of GPUs. These chips are designed for and driven by video game development; the programming model is unusual, resources are tightly constrained, and the underlying architectures are largely secret. This course provides detailed coverage of general-purpose computation on graphics hardware. We emphasize core computational building blocks, for example linear algebra, and review the tools, perils, and tricks of the trade in GPU programming. Example applications and case studies are discussed and evaluated, with a special focus on large-scale GPU cluster computing. Common misconceptions and concerns about GPGPU (e.g. precision vs. accuracy) are addressed.

NVIDIA's CUDA is a new system for general-purpose computing on GPUs. CUDA is based on a new programming API which is entirely separate from the graphics driver. It uses the standard C language with extensions, and exposes new hardware features that are not available from OpenGL or Direct3D. The most important of these new features are shared memory, which can greatly improve the performance of bandwidth-limited applications, and an arbitrary load/store memory model, which enables many new algorithms that were previously difficult or impossible on the GPU.
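To give a feel for why shared memory helps bandwidth-limited code, the sketch below emulates a block-wise tree reduction on the CPU in plain Python. It is not CUDA code and not taken from the course material: the function name `block_sum`, the parameter `block_dim`, and the list `shared` standing in for a block's on-chip shared memory are assumptions made for illustration. In a real CUDA kernel the inner loops would run as parallel threads, with a `__syncthreads()` barrier between reduction steps.

```python
def block_sum(global_data, block_dim=8):
    """Emulate a per-block tree reduction; returns one partial sum per block."""
    if block_dim <= 0 or block_dim & (block_dim - 1):
        raise ValueError("block_dim must be a power of two")
    n_blocks = (len(global_data) + block_dim - 1) // block_dim
    partial_sums = []
    for block in range(n_blocks):
        # Each "thread" reads exactly one element from global memory
        # into the block's shared tile (padding with zeros at the end).
        shared = [0] * block_dim
        for tid in range(block_dim):
            idx = block * block_dim + tid
            if idx < len(global_data):
                shared[tid] = global_data[idx]
        # Tree reduction: all further reads and writes hit the shared tile only.
        stride = block_dim // 2
        while stride > 0:
            for tid in range(stride):
                shared[tid] += shared[tid + stride]
            # A GPU kernel would place a thread barrier here.
            stride //= 2
        # One write back to global memory per block.
        partial_sums.append(shared[0])
    return partial_sums


if __name__ == "__main__":
    data = list(range(20))
    partial = block_sum(data, block_dim=8)
    print(partial, sum(partial) == sum(data))
```

Each input element is read from global memory once, and each block writes a single result; all intermediate traffic stays in the tile. That access pattern is what the shared memory feature is meant to make cheap.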

Course material and additional information will be available online.

Course Outline
  • Introduction
  • GPU architecture
  • Data parallel algorithms
  • Languages and programming environments
  • Successful GPU applications
  • Introduction to CUDA
  • CUDA performance
  • CUDA application examples (GPU physics and imaging)
  • GPU integration in existing large applications
  • Roundup, summary and conclusions

Quantitative Emergence and Self-Organisation: A Specification of Terms

Organic Computing (OC) has emerged as a challenging vision for future information processing systems, based on the insight that in the near future we will be surrounded by large collections of autonomous systems equipped with sensors and actuators to be aware of their environment, to communicate freely, and to organize themselves. The presence of networks of intelligent systems in our environment opens fascinating application areas but, at the same time, bears the problem of their controllability. Hence, we have to construct these systems - which we increasingly depend on - to be as robust, safe, flexible, and trustworthy as possible. In particular, a strong orientation of these systems towards human needs, as opposed to a pure implementation of the technologically possible, seems absolutely central. In order to achieve these goals, our technical systems will have to act more independently, flexibly, and autonomously, i.e. they will have to exhibit life-like properties. We call those systems "organic". Hence, an "Organic Computing System" is a technical system that adapts dynamically to the current conditions of its environment. It will be self-organizing, self-configuring, self-healing, self-protecting, self-explaining, and context-aware.

Two key concepts of OC are the phenomena of emergence and self-organization. They are not defined consistently across research fields. In particular, quantitative notions of these terms have been missing so far. In this tutorial we will report the state of affairs as derived from the many discussions within the DFG Special Priority Programme Organic Computing.

Content

The first part of this tutorial will concentrate on a quantitative definition of emergence. The presentation will try to reduce the somewhat fuzzy meaning of this term to a technically usable interpretation. It will start with examples of emergence from nature and technical systems and will discuss order and entropy in relation to emergence. Emergence will be defined as self-organized order. It will be shown that emergence is formally equivalent to information-theoretical redundancy. Application examples and some considerations of architectures using quantitative emergence will conclude the lecture.
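The link between order, entropy, and redundancy can be made concrete with a small numerical sketch. The code below is an illustration under assumptions, not the presenters' definition: it takes Shannon entropy over one observed attribute of the system elements, uses the textbook redundancy R = H_max - H, and treats emergence between two observations as the drop in entropy. The function names and the example data are hypothetical.

```python
import math
from collections import Counter


def entropy(observations):
    """Shannon entropy (in bits) of the observed attribute values."""
    counts = Counter(observations)
    total = len(observations)
    return sum((c / total) * math.log2(total / c) for c in counts.values())


def redundancy(observations, n_values):
    """Redundancy R = H_max - H, with H_max = log2(number of possible values)."""
    return math.log2(n_values) - entropy(observations)


def emergence(before, after):
    """Assumed measure: entropy decrease between two observations of a system."""
    return entropy(before) - entropy(after)


if __name__ == "__main__":
    # Hypothetical attribute: heading of eight agents, four possible directions.
    disordered = ["N", "E", "S", "W", "N", "E", "S", "W"]  # uniformly spread
    ordered = ["N"] * 8                                    # all agents aligned
    print("entropy before:", entropy(disordered))
    print("entropy after:", entropy(ordered))
    print("redundancy after:", redundancy(ordered, n_values=4))
    print("emergence:", emergence(disordered, ordered))
```

When the starting state is maximally disordered, so that its entropy equals H_max, the entropy drop and the redundancy of the final state coincide. Note that a measure like this only captures the gain in order; whether that order is self-organized has to be established separately.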

The second part will briefly summarise the state of the art and present a characterisation of (controlled) self-organisation and adaptivity that is motivated by the main objectives of the OC initiative. We propose a system classification of robust, adaptable, and adaptive systems and define a degree of autonomy to quantify how autonomously a system is working. The degree of autonomy measures where a system lies between external control, exerted directly by the user (no autonomy), and internal control, in which the system may be fully controlled by an observer/controller architecture that is part of the system itself (full autonomy).


Contact

Prof. Dr.-Ing. Christian Hochberger
E-Mail: christian.hochberger@inf.tu-dresden.de

Prof. Dr.-Ing. habil. R.G. Spallek
E-Mail: rgs@ite.inf.tu-dresden.de

Telefax
+49 (0)351 463 38324
Street Address
Nöthnitzer Straße 46
01187 Dresden
Mail Address
Technische Universität Dresden
Fakultät Informatik
Institut für Technische Informatik
D-01062 Dresden