

KEK Computing Newsletter




                                                 Published 16 May 2006 (No.201)
                                                     KEK Computing Newsletter


                  KEK Computing Research Center (CRC) Seminar Notice

       Title:    Personal Lessons from Parallel Computing
       Speaker:  Prof. Gene Cooperman
                 Northeastern University, Boston, USA
       Place:    Computing Research Center North Building, Meeting Room 1
       Date:     22 May 2006, from 13:30

Prof. Cooperman is a computer scientist specializing in parallel computing.
His research has addressed efficient parallelization across a wide range of
settings, from applications that demand fast memory access, as typified by
supercomputers, to Data GRID environments. In this seminar he will present
the lessons he has learned from parallelizing a variety of applications.
Everyone is welcome to attend.



Personal Lessons from Parallel Computing

Gene Cooperman,
College of Computer and Information Science, Northeastern University,
Boston, MA, USA


13:30 on 22nd May 2006
KEK Computing Research Center North Building Meeting Room 1


One of the lessons of the last two decades of parallel computing has been 
the immense influence of commodity components. This often seems to create 
a world turned upside down:  a world in which technologically superior 
computers sometimes fail in the marketplace, while a Sony or Nintendo video 
game machine may have more computing power than scientific workstations. 
This is a world in which scientists sometimes buy a high-end video card for 
their workstation, and then run their CPU-intensive computations on the video
card instead of on the CPU.  (Note, for example, Stanford University's
BrookGPU, a general-purpose compiler for the Graphics Processing Unit.)

In one lesson from my own work, I found that the following excerpt of code
from my parallel shared-memory program was running too slowly.

  Object *Z[n], *Y[k];   /* Z and Y are arrays of pointers to Object      */
  int X[n];              /* each X[i] is an index into Y, in [0, k)       */
  for (int i = 0; i < n; i++) Z[i] = Y[X[i]];  /* gather: random reads of Y */

This led my student and me to invent a new, faster algorithm for the above
problem, one that runs twice as fast as the code above. The key observation 
leading to the new algorithm was that the nature of commodity components had 
changed.  For this program, faster CPUs provide no advantage: the running
time of the above program is limited purely by the speed of RAM.
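
The abstract does not spell out the new algorithm, so the following is only a
minimal micro-benchmark sketch of the key observation itself: once Y is much
larger than the cache, the gather loop runs at the speed of RAM rather than of
the CPU, which can be seen by timing it once with the identity permutation in
X and once with a random permutation. The struct Object, the helper
time_gather, and all sizes below are illustrative assumptions, not code from
the talk.

  // Minimal sketch (not from the talk): time the gather loop with sequential
  // and then random indices; the gap shows how strongly it is bound by RAM.
  #include <algorithm>
  #include <chrono>
  #include <cstddef>
  #include <cstdio>
  #include <numeric>
  #include <random>
  #include <vector>

  struct Object { double payload; };     // stand-in for the abstract's Object

  static double time_gather(const std::vector<int>& X,
                            const std::vector<Object*>& Y,
                            std::vector<Object*>& Z) {
      auto t0 = std::chrono::steady_clock::now();
      for (std::size_t i = 0; i < X.size(); i++) Z[i] = Y[X[i]];  // the loop above
      auto t1 = std::chrono::steady_clock::now();
      return std::chrono::duration<double>(t1 - t0).count();
  }

  int main() {
      const std::size_t n = 1 << 24;     // ~16.7M entries, far larger than cache
      std::vector<Object>  pool(n);
      std::vector<Object*> Y(n), Z(n);
      for (std::size_t i = 0; i < n; i++) Y[i] = &pool[i];

      std::vector<int> X(n);
      std::iota(X.begin(), X.end(), 0);          // identity: sequential reads of Y
      double seq = time_gather(X, Y, Z);

      std::mt19937 rng(42);
      std::shuffle(X.begin(), X.end(), rng);     // random permutation: cache-hostile
      double rnd = time_gather(X, Y, Z);

      // Print a result that depends on Z so the work is not optimized away.
      std::printf("sequential: %.3f s   random: %.3f s   (Z[0]=%p)\n",
                  seq, rnd, (void*)Z[0]);
  }

On a typical machine the random run is several times slower than the
sequential one even though both execute exactly the same instructions, which
is the sense in which a faster CPU would not help.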

In a second lesson from my work, I wished to use a cluster to execute 
a certain orbit computation from computational algebra, for a mathematical
group called the Baby Monster group.  Prior algorithms already required many 
thousands of CPU hours to enumerate the 173,067,389 points of an orbit of 
the J4 group.  The orbit of the Baby Monster group is almost a hundred times 
as large: 13,571,955,000 points.  Yet, my student and I enumerated this larger
orbit in only 750 CPU hours. The solution was to use disk instead of RAM 
for the primary data structures.  The key observation was that the problem 
allowed for a space-time tradeoff. By using the larger space provided 
by disks, we could correspondingly lower the time requirements.
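
The abstract gives only the high-level idea, so the following is a
deliberately tiny sketch of such a space-time tradeoff, not the speaker's
actual algorithm: a breadth-first orbit enumeration in which the set of
visited points lives in a sorted binary file on disk and each new level is
deduplicated against it by a merge. The toy generators gen1 and gen2, the
file name visited.bin, and the three-level cutoff are all illustrative
assumptions.

  // Toy sketch of disk-based orbit enumeration (illustrative only).
  #include <algorithm>
  #include <cstdint>
  #include <cstdio>
  #include <iterator>
  #include <vector>

  // Toy generators standing in for the real group action on orbit points.
  static uint64_t gen1(uint64_t x) { return x * 6364136223846793005ULL + 1442695040888963407ULL; }
  static uint64_t gen2(uint64_t x) { return x ^ (x >> 29) ^ 0x9E3779B97F4A7C15ULL; }

  // Read and write whole binary files of points.  (A real disk-based code
  // would stream fixed-size blocks instead of loading everything at once.)
  static std::vector<uint64_t> read_points(const char* path) {
      std::vector<uint64_t> v;
      if (std::FILE* f = std::fopen(path, "rb")) {
          uint64_t x;
          while (std::fread(&x, sizeof x, 1, f) == 1) v.push_back(x);
          std::fclose(f);
      }
      return v;
  }

  static void write_points(const char* path, const std::vector<uint64_t>& v) {
      if (std::FILE* f = std::fopen(path, "wb")) {
          std::fwrite(v.data(), sizeof(uint64_t), v.size(), f);
          std::fclose(f);
      }
  }

  int main() {
      const char* visitedFile = "visited.bin";   // sorted set of all points seen so far
      std::vector<uint64_t> frontier = {12345};  // seed point of the orbit
      write_points(visitedFile, frontier);

      for (int level = 0; level < 3 && !frontier.empty(); level++) {
          // Apply every generator to every frontier point.
          std::vector<uint64_t> candidates;
          for (uint64_t x : frontier) {
              candidates.push_back(gen1(x));
              candidates.push_back(gen2(x));
          }
          std::sort(candidates.begin(), candidates.end());
          candidates.erase(std::unique(candidates.begin(), candidates.end()),
                           candidates.end());

          // Deduplicate against the on-disk visited set: keep only new points.
          std::vector<uint64_t> visited = read_points(visitedFile);
          std::vector<uint64_t> fresh;
          std::set_difference(candidates.begin(), candidates.end(),
                              visited.begin(), visited.end(),
                              std::back_inserter(fresh));

          // Merge the new points back into the (sorted) visited file.
          std::vector<uint64_t> merged;
          std::merge(visited.begin(), visited.end(),
                     fresh.begin(), fresh.end(), std::back_inserter(merged));
          write_points(visitedFile, merged);

          std::printf("level %d: %zu new points, %zu total\n",
                      level, fresh.size(), merged.size());
          frontier = std::move(fresh);
      }
  }

A real implementation would stream the files in fixed-size blocks and sort
externally, but the shape of the tradeoff is the same: the visited set lives
on disk, so RAM never has to hold it.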

As time permits, I will then discuss my own favorite parallel software 
architecture, task-oriented parallelism, and its large benefits for certain 
types of applications.


                                                   Computing Research Center


KEK Computing Newsletter : No.201