The subject of this course is the implementation of numerical algorithms on state-of-the-art high-performance computers. The following topics will be covered: high-performance computing architectures (modern CPUs, memory hierarchies, shared-memory vs. distributed-memory systems), performance optimization techniques (inlining, loop unrolling, profiling tools, assembly code inspection, optimized libraries), and methods of parallel programming (MPI, OpenMP). Modern accelerator systems (Cell, GPU) are also discussed. The course has a lab component in which students put into practice the knowledge acquired in the lectures. (lec 3) cr 3. Lecture (3.00).