Thursday, June 26, 2008

Eka: The story of an Indian champion

Sometimes it takes a trillion flops to make a superhit. The world's fourth-fastest supercomputer, from India, shocked everyone… pleasantly. But we know little of the actual epicenter. Here's the lowdown on what makes Eka a simple masterpiece!

Pratima Harigunani

OCTOBER was cruising towards its close. The clock was steering viciously towards the stroke of midnight. Sweat-swathed eyebrows, nervous eyeballs, twitching palms, pounding heartbeats, crossed fingers, distraught glances and agitated minds. The air was bubbling with tension, anticipation, fears, prayers and tumultuous hope. A little-known campus on the outskirts of Pune's Hinjewadi was living its own NASA rocket-launch moment, a hair's breadth from making history, and it fell nothing short of one in stature or excitement. The first 90 per cent of the run of India's supercomputer-in-the-making had gone through successfully, but the clincher was the last 10 per cent of the run-time. It might happen, it might not. Some kilometers away, on his way in and continuously on the phone, Dr N Seetha Ram Krishna, project manager at CRL and one of the key architects, kept arming his team against Murphy's ways: "It may fail, be prepared for everything." As the countdown began, every heart and hope in the jitter-packed room started racing. Five, four, three, two, one and … YES! The supercomputer hit the 117.9 teraflop mark. At about 11 pm on October 31, at CRL's Tata facility on the outskirts of Pune's Hinjewadi IT park, shrieks of joy, sighs of achievement and euphoria were all that could be heard, seen and felt. India's technological razor had made its sharpest cut again. The dream was finally alive. And one hour later, when Dr Krishna looked around the same room, he met another once-in-a-lifetime sight.


Exhausted by six months of grinding toil, with days that ran to 22 hours, and worn out by the peak of excitement and tension of just minutes before, everyone in the room had dozed off into a blithe, well-earned sleep. "That's a lifetime experience." It surely was. India, through CRL (Computational Research Labs), a wholly owned subsidiary of Tata Sons, had claimed its place in the world's top 10 supercomputer league: fourth in the global ranking and the fastest in Asia. The 120 teraflop (sustained rating) supercomputer, with a peak of 172 teraflop, was a reality in October 2007, with the added pride of being the largest privately funded supercomputer in the world. And does the story of this distinctive feat start just six months earlier? Maybe not.

India's supercomputing lineage

Supercomputing may have turned into as spruce, breakneck and swanky a game as F-1 today. But turn back the leaves of history and you find all the flavors of a long, patient and solitary marathon.

Performance was at the core then too, but it was accompanied by stamina, persistence and time. The annals have their origins in the late 1980s, when India reportedly began developing supercomputers after being refused a Cray supercomputer by the US. For India, the epic of supercomputing scribbled its first page with the Param supercomputer programme. Param, the first Indian supercomputer, was developed by the Centre for Development of Advanced Computing (C-DAC). C-DAC pioneered the supercomputing movement in the 1990s, giving India her first indigenous supercomputer, PARAM 8000, in 1991. The PARAM stock continued with PARAM 8000-600, PARAM 9000, PARAM open frame, PARAM 10000 (with 100 gigaflops, a flop being one floating point operation per second) and finally PARAM Padma in 2003. Param Padma, with one teraflop of power (a tenfold increase over the country's previous supercomputer), heralded India's entry into the top-500 league. It was ranked 171 in the list of the world's most powerful supercomputers by Top 500, a respected rating agency for the high-end computing fraternity. Another member of India's elite circuit was Kabru, a cluster of smaller computers built at the Chennai-based Institute of Mathematical Sciences (IMS), which crossed the teraflop barrier. Other stellar names among the crème de la crème are Flosolver (National Aerospace Laboratory), Anupam (Bhabha Atomic Research Center), PACE (Advanced Numerical Research Group), CHIPPS (Center for Development of Telematics, or CDOT, Bangalore) and the supercomputer at the Institute of Genomics and Integrative Biology, New Delhi, which finds a place among the top 500 supercomputers of the world. That was the pedigree that supercomputing had to take over from as it moved on to Eka, this time in a privately funded environment. The epic entered a new epoch.

The rules, the tracks, the gear, the pitstop: everything was new, faster and more competitive this time. And the chequered flag was only six months away.

Eka incubates

Tata's HPC (High Performance Computing) initiative dates back to June 2006, with the aim of becoming a one-stop shop and making the iconic journey from atoms to applications. It armed CRL, its subsidiary, with the mandate of the Eka (Sanskrit for the number one) dream. The 75-member team, divided into hardware, system software and applications groups, had, beyond the obvious challenge of achieving the supercomputing power it had set for itself, the nigh-impossible goalpost of doing all that in six months flat. It had Dr Sunil Sherlekar, the Head of R&D at CRL, and Dr Narendra Karmarkar, both among the founders. The concept was presented to the Tata Sons Board to get the funding. Dr Karmarkar has since left, and Dr Sherlekar stays on as the remaining founder. Incidentally, Eka also claims the distinction of being the only supercomputer funded by a corporate. CRL had the task of designing and fully integrating Eka with technology developed in-house. The race was flagged off. In June 2007 the building infrastructure was made ready in Hinjewadi: a 4000 sq ft floor area, set up in record time, became the data center to house Eka. August saw the initial machines, with a 16 teraflop peak, and the first prototype going operational. In September the building blocks were ordered and set in place in due time, and October saw the 172 teraflop peak system in operation. In a record six weeks, the actual 120 teraflops (that is, 120 trillion calculations per second) happened, and Eka was born.

Nuts, bolts and paraphernalia

Eka's mandates and its appetite for technical depth were not simple by any account. Among CRL's many objectives were scalable parallel storage in the smallest possible footprint, HVAC for air distribution and cooling optimization, energy-optimized operations, the use of indigenous building blocks, fully automated monitoring and control, and the accommodation of multi-system architectures and networks. Eka is built from 1794 blade servers using common off-the-shelf hardware and quad-core Intel Clovertown processors. It has a 400 ton cooling capacity and a 2.5 MW power requirement. Its benchmark result is 117.9 teraflop; it achieved a final performance of 120 teraflop on a sustained basis, with 172 teraflop as the peak score. There are 28 terabytes of memory (RAM), with a storage counterpart of 80 terabytes of disk. Eka used DDR (20 Gbit/sec) InfiniBand interconnect technology to connect the processors and storage, along with a Linux OS and the XC3 3.2.1 open MPI developer environment, compilers and libraries. The cabling, spread over 45 km overall, includes 10 km of electrical, 15 km of InfiniBand and 10 km of Ethernet cable.
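For readers who like to see how such headline numbers hang together, here is a minimal back-of-the-envelope sketch in Python. The blade count, quad-core processors, peak score, benchmark result and power figure come from the article. The dual-socket blades, the 3.0 GHz clock, the four floating point operations per core per cycle, and the reading of "16 GB RAM" in the Vital Stats as a per-blade figure are assumptions made purely for illustration, not details confirmed by CRL.

```python
# Figures quoted in the article.
BLADES = 1794
CORES_PER_SOCKET = 4        # "quad-core"
PEAK_TFLOPS = 172.0         # quoted peak score
BENCHMARK_TFLOPS = 117.9    # quoted benchmark result
POWER_MW = 2.5              # quoted power requirement

# Assumptions, NOT stated in the article.
SOCKETS_PER_BLADE = 2       # assumed dual-socket blades
CLOCK_GHZ = 3.0             # assumed processor clock
FLOPS_PER_CYCLE = 4         # assumed flops per core per clock cycle
RAM_PER_BLADE_GB = 16       # assumed per-blade reading of "16 GB RAM"


def theoretical_peak_tflops(blades=BLADES):
    """Peak = cores x clock (GHz) x flops per cycle, converted to teraflops."""
    cores = blades * SOCKETS_PER_BLADE * CORES_PER_SOCKET
    return cores * CLOCK_GHZ * FLOPS_PER_CYCLE / 1000.0


def benchmark_efficiency(benchmark=BENCHMARK_TFLOPS, peak=PEAK_TFLOPS):
    """Fraction of the peak rate actually delivered on the benchmark."""
    return benchmark / peak


def total_ram_tb(blades=BLADES, per_blade_gb=RAM_PER_BLADE_GB):
    """Aggregate memory in TB, taking 1 TB as 1000 GB."""
    return blades * per_blade_gb / 1000.0


def mflops_per_watt(benchmark=BENCHMARK_TFLOPS, power_mw=POWER_MW):
    """Benchmark megaflops delivered per watt of the quoted power figure."""
    return (benchmark * 1e12) / (power_mw * 1e6) / 1e6


if __name__ == "__main__":
    print(f"theoretical peak    : {theoretical_peak_tflops():.1f} teraflop")
    print(f"benchmark efficiency: {benchmark_efficiency():.1%}")
    print(f"aggregate memory    : {total_ram_tb():.1f} TB")
    print(f"energy efficiency   : {mflops_per_watt():.2f} megaflops per watt")
```

Under these assumptions the peak works out to roughly 172 teraflop and the aggregate memory to roughly 28 TB, which sits comfortably with the quoted figures, while the benchmark run delivers a little over two-thirds of the peak rate. Change any of the assumed constants and the arithmetic shifts accordingly.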

The revolutions in the evolution

There were a good many new approaches and dimensions in the making of Eka. The key lay in the design of the architecture, then in designing algorithms for application classes (a term that is itself nascent in the scientific domain and an ongoing research game), and in the mapping of algorithms to the architecture. CRL also chose a not-tried-before circular layout, which serendipitously took the form of an octagon, for stacking the servers and switches. Supercomputers don't come without their share of burdens. Scalability is a major issue. CRL addressed it by doing away with the need to connect every node to every other node. Judicious balancing of compute and communication, and guaranteed load balancing, were some of the ideas attempted here. Another concern was the cost per teraflop, which the Eka team managed by reducing the number of interconnect switches, cables and connectors so that the interconnect scales up linearly. Similarly, usability, another issue with supercomputers, was looked in the eye with innovations in the library and maths kernels that hide the complexity of the underlying hardware, as shared by Seetha Ram Krishna. In addition, there were solutions like a novel interconnect architecture based on projective geometry, which takes care of complexity as well as near-linear speed-up of applications. There were also better algorithms for specific applications that helped Eka. The key points of uniqueness in the architecture are high-density packing, the projective geometry interconnect, a hybrid parallel programming paradigm, optical interconnects and, of course, the circular layout mentioned before.

Props and poles

As Dr Sherlekar points out, a fair share of the credit for Eka goes to enabling technologies in networking, storage, power, cabling and so on. While some of them had already evolved to an adequate extent, thanks to ongoing innovation, some of the path-breaking innovations that made Eka possible in fact happened during the making of Eka, through the concerted efforts of CRL and its partners in the respective areas.

Cabling, for instance, is one major example. Eka became the testing ground for early fiber-cable technology, since it was not feasible to use the erstwhile copper cables beyond seven meters because of breakage concerns. The new technology, which covers 20 meters, is already in production. From processors, multi-core, programmable voltage, the InfiniBand interconnect and sensors to cooling technology, every piece of paraphernalia was as cutting-edge as Eka itself turned out to be.
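The article does not describe how CRL's projective geometry interconnect is actually wired, but the general idea of avoiding all-to-all connections can be sketched with a textbook construction. The Python below is a toy model under stated assumptions: it builds a finite projective plane over a prime p, treats its points as compute nodes and its lines as switches (a hypothetical mapping chosen for illustration, not Eka's real topology), and checks that every pair of nodes shares exactly one switch even though each node has only p + 1 links.

```python
from itertools import combinations, product


def projective_points(p):
    """Points of the projective plane over GF(p), p prime.

    Each point is a nonzero triple normalized so that its first
    nonzero coordinate is 1, giving p*p + p + 1 points in all.
    """
    points = []
    for v in product(range(p), repeat=3):
        if any(v) and next(x for x in v if x) == 1:
            points.append(v)
    return points


def build_plane(p):
    """Return (points, incidence), where incidence maps each line to its points.

    Lines use the same normalized triples as points (duality); a point
    lies on a line when their dot product is 0 modulo p.
    """
    points = projective_points(p)
    incidence = {
        line: [pt for pt in points
               if sum(a * b for a, b in zip(line, pt)) % p == 0]
        for line in projective_points(p)
    }
    return points, incidence


def shared_switches(incidence, a, b):
    """Hypothetical 'switches' (lines) that both nodes a and b attach to."""
    return [line for line, pts in incidence.items() if a in pts and b in pts]


def every_pair_shares_one_switch(points, incidence):
    """True if any two distinct nodes meet at exactly one switch."""
    return all(len(shared_switches(incidence, a, b)) == 1
               for a, b in combinations(points, 2))


if __name__ == "__main__":
    for p in (2, 3, 5):
        points, incidence = build_plane(p)
        n = len(points)
        print(f"p={p}: {n} nodes, {p + 1} links per node "
              f"(all-to-all would need {n - 1}), "
              f"one shared switch per pair: "
              f"{every_pair_shares_one_switch(points, incidence)}")
```

The attraction of such a structure is that the number of links per node grows roughly with the square root of the number of nodes rather than with the node count itself, which is one way to read the article's remark about reducing switches, cables and connectors.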

120 Teraflops: milking the COW

The proof of the pudding lies in the eating. The actual work starts now, when the supercomputer can start working on problems that have been waiting for the power of a superhero. Eka's usage is outlined wide and deep. Its applications, current and potential, cover a long spectrum. From system architecture research, system software research, mathematical library development, large scientific problems, application porting, optimization and development to future technology development and data center development, the possibilities have just started surfacing. Talking of simulations, Eka's purview covers computational fluid dynamics and nanomaterial simulation. As Dr Rajendra Lagu, Head of the Applications Group, rightly says, "The young scientific minds are the ones to be looked to now. There are for sure many applications that nobody has ever thought of yet." Eka is open for grand challenges. The exciting areas ahead range from aeroacoustics, weather modeling, carbon nanotube modeling, CFD (Computational Fluid Dynamics), number theory, motif discovery, molecular docking, aircraft simulation and bio-medical simulation to business applications like SCM (Supply Chain Management), BI (Business Intelligence), email scanning, pattern detection, video surveillance and so on.

Maybe the next Tata car will be completely simulated on Eka.

Some 'grey' matter

Eka is not without its share of criticisms and correction points. At a CSI seminar, an elderly and senior scientist in this domain pointed out areas like the use of analog devices, conspicuous by its absence in Eka; the complex use of FPGAs; and the perception that differential calculus is the ultimate in mathematics. CRL's explanation goes thus. The last of these is still a topic of research, and even CRL is doing work on inverse problems in this area. As for analog devices, there are constraints on programmability and the limits of a fixed function in an era where digital is in vogue. Even switched-capacitor filters, the hybrids, are not in vogue, explained Dr Lagu in response.

The human in the superhuman

The real beauty behind the mystery of Eka, the supercomputer, is the very absence of mystery. As Seetha Ram Krishna emphatically demystifies it, "Any company can achieve this. There's nothing complex behind it. Just some knowledge, expertise, logic, ideas and electronics put together in a system can make it possible. Today, with the Internet as the ever-accessible ocean of knowledge and science, anyone can make their own supercomputer, albeit with levels of scale." It was not a cakewalk, though. Every day was an ordeal, every run taxing and expensive. Problems were constant, dead-ends kept coming, speed bumps were frequent guests, pressure levels always pointed northwards, a new bottleneck knocked every day and the Eka team was permanently on its toes. But the dream, the passion and the resolve were the fuels that never ran out and made Eka a reality. The future of supercomputing is something that will unfold interestingly. "A supercomputer is a force multiplier and the third pillar of science. They are becoming all-pervasive to the games we play and the digital content we consume," says Dr Lagu. The Top500 supercomputer list will remain on Eka's and CRL's dashboard. But the list, updated every six months, will be as fickle as the speed with which faster systems keep popping up. The top-ranker, IBM's BlueGene/L system, has achieved a benchmark of 478.2 teraflop. Pitted against computing superpowers like China, Japan and the US, India's sprint will be something to watch out for. Competition with the likes of China's Dawning, NEC's Earth Simulator from Japan and IBM's Blue Gene stirps will not be easy. The race is going to turn ruthless, but it will surely have India on the fast track.

Next on the cards for CRL are plans to build a bigger machine that incorporates accelerators, in addition to building domain-specific software libraries tuned for the Eka architecture. Besides this, innovations on the software stack and productivity tools for the imminent many-core revolution are on the anvil. CRL also intends to explore hybrid architectural solutions for peta-class machines with the use of FPGAs and accelerators. "What's a factor of concern, however, is the critical need for development of human resources in this space. It was tough to put together the 75 to 80 member Eka team, and we know how we did it. The future problems won't be the hardware or software, but the peopleware," Dr Lagu stresses. Eka and its likes could buck the trend and drive more Indian cerebra to experience the inimitable thrill that CRL's team experienced that historic night. After all, it was indeed an eka moment. It is intriguing at this juncture to think of some words from Seymour Cray, the father of supercomputing: "It's always easy to do the next step and it's always impossible to do two steps at a time."

(Acknowledgements to Dr Sunil Sherlekar, Head of R&D at CRL; Dr Rajendra Lagu, Head of Applications Group, CRL; and N Seetha Ram Krishna, project manager, CRL, all key members of the Eka team; and to Gireendra Kasmalkar of CSI (Computer Society of India), Pune, for sharing the Eka story)


Vital Stats
· 1794 blade servers
· 400 ton cooling capacity
· 2.5 MW power
· Benchmark performance: 117.9 teraflop (one teraflop is one trillion floating point operations per second)
· Peak score: 172 teraflop
· Sustained level: 120 teraflop
· Gigs: 16 GB RAM, 80 GB hard disk


©CyberMedia News
