Researchers save money using central research grid
John Kenyon at work on the University Research Grid, a rack full of Sun 4100 servers used for massive data simulations on campus. (Photo by: Ted Cook)
Friday, November 16, 2007
By: Jan Jones
One year ago, the Information Technology (IT) Division announced plans to house and administer a centralized university research grid. Today, that grid is alive and well, and campus researchers are using it to run complicated simulations involving massive amounts of data at a much lower cost than they would pay to set up their own computer clusters.
According to John Kenyon, who was hired as Research Grid Systems Administrator early this summer, the arrangement works like this: IT provides the server rack, an air-conditioned server room, a head node, a file server, and Kenyon's services as server administrator. Researchers, in turn, contribute the computation nodes they buy.
At present, the cluster contains 63 computation nodes, with a combined 312 gigabytes of RAM and 252 processor cores, supplemented by a 24-terabyte file server. As more researchers join the grid, those numbers will continue to climb. Currently, the cluster is used predominantly for artificial intelligence (AI) simulations, but it also runs some protein analysis and network simulations.
Who is Using the Grid?
So far, two researchers are contributing members of the grid. Dr. Sushill Louis of the Computer Sciences (CS) Department, who is researching genetic algorithms, owns 26 nodes. Dr. Bobby Bryant, also of CS, owns 23 nodes and is researching neuro-evolutionary algorithms.
Others using the grid as guest researchers include Dr. Kevin Facemyer of Biochemistry, researching protein interaction and evolution; Fares Quedan of Mathematics, researching atmospherics; Murat Yuksel, CS, researching computer networking and communication; and George Bebis, CS, researching computer vision. Two other researchers have expressed interest in joining once the grid acquires specific software they need.
Resource Management is Key
Kenyon noted that IT is using an enterprise-grade queuing system to make sure jobs are run in the fairest and most efficient way possible. The Sun Grid Engine chooses whose job to run based on past usage and priority. For example, those who own nodes have top priority on the nodes they own. A job submitted by someone who does not own nodes will first attempt to run in the common pool. If the common pool is unavailable, the job will run on any owner’s idle nodes. However, if the owner of those nodes submits a job, the “guest” jobs are put on hold until the owner’s job is complete.
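The placement order described above can be sketched in a few lines of Python. This is an illustration only, not the Sun Grid Engine's actual configuration or API: the `Node` class, the `place_job` function, and the user names are all hypothetical, and the sketch assumes one job per node and that owners fall back to the common pool the same way guests do.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Node:
    name: str
    owner: Optional[str] = None    # None marks a common-pool node (assumption)
    running: Optional[str] = None  # user whose job occupies the node, if any


def place_job(user: str, nodes: List[Node]) -> Tuple[Optional[str], Optional[str]]:
    """Return (node name, guest put on hold); node name is None if the job must wait."""
    # 1. Owners get their own idle nodes first.
    for node in nodes:
        if node.owner == user and node.running is None:
            node.running = user
            return node.name, None
    # 2. An owner's job displaces a guest job running on the owner's node.
    for node in nodes:
        if node.owner == user and node.running != user:
            guest = node.running
            node.running = user
            return node.name, guest
    # 3. Otherwise try an idle node in the common pool.
    for node in nodes:
        if node.owner is None and node.running is None:
            node.running = user
            return node.name, None
    # 4. Finally, borrow any idle node that belongs to someone else.
    for node in nodes:
        if node.running is None:
            node.running = user
            return node.name, None
    return None, None  # nothing available; the job stays queued


nodes = [Node("owned-1", owner="alice"), Node("pool-1")]
print(place_job("bob", nodes))    # guest lands in the common pool
print(place_job("carol", nodes))  # pool is busy, so carol borrows alice's idle node
print(place_job("alice", nodes))  # alice reclaims her node; carol's job goes on hold
```

The sketch leaves out the "past usage" factor the article mentions; a real scheduler would also weigh how much time each user has already consumed.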
“Let me be clear,” Kenyon said in his October presentation. “You always have 100 percent access to the nodes that you buy!” However, he noted, any contribution to the grid helps all groups doing computational research.
“The common pool is not owned by anybody and no one gets priority there,” he said. To make sure those resources are fairly distributed, all departments get an equal share of time; all groups within a department get an equal share of the department’s time, unless the department chair directs otherwise; and all users in a group get an equal share of the group’s time, unless the professor directs otherwise.
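As a rough illustration of that three-level split, the sketch below computes each user's default fraction of common-pool time. The function name `equal_shares`, the nested-dictionary layout, and the department, group, and user names are hypothetical; the sketch assumes every department and group is non-empty, that each user belongs to exactly one group, and it does not model the overrides a department chair or professor may direct.

```python
from fractions import Fraction
from typing import Dict, List


def equal_shares(tree: Dict[str, Dict[str, List[str]]]) -> Dict[str, Fraction]:
    """Map each user to a fraction of common-pool time.

    tree has the shape {department: {group: [users]}}.
    """
    shares: Dict[str, Fraction] = {}
    dept_share = Fraction(1, len(tree))           # departments split the pool equally
    for groups in tree.values():
        group_share = dept_share / len(groups)    # groups split their department's share
        for users in groups.values():
            user_share = group_share / len(users)  # users split their group's share
            for user in users:
                shares[user] = user_share
    return shares


example = {
    "dept_a": {"group_1": ["u1", "u2"], "group_2": ["u3"]},
    "dept_b": {"group_3": ["u4"]},
}
for user, share in equal_shares(example).items():
    print(user, share)
```

In this example the lone user in `dept_b` receives half of the pool, while the two users who share `group_1` each receive one eighth, which shows that the split is equal at each level rather than equal per person.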
Future Plans
Kenyon has monitored the campus research grid for the past six months. His first priority for the future is to add more computers and to make the grid well known in the University research community so that it lives up to its potential.
He also thinks it would be helpful to create a curriculum that teaches people how to manipulate, analyze, and generally work with massive amounts of data. For example, he would like to see classes or workshops that teach University researchers computer programming, how to do a proper statistical analysis of research data, and how best to present that data graphically for maximum impact.
