
ECAD GRID SOLUTION



THE GRID INDUSTRY PROBLEM

HPC, E-CAD and supercomputing clusters run a variety of workloads with varying data sizes. Grid computing improves datacenter efficiency and machine utilization by mixing workloads and allocating each one to the first available compute machine. Even though grid architectures have greatly increased machine utilization, they have left the gains from intelligent data management on the table.

To fulfill the vision of grid computing, it will be important to make the grid more intelligent by matching the compute machine to the workload type and availability of data. Another emerging trend is to identify and prepare the data for the workloads ahead of time. The new datacenter solutions need a close understanding of the workload and the data so that the data management can be completely automated for maximum performance gains.
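
As an illustration of matching a compute machine to the availability of data, the following is a minimal sketch, not the PerfAccel implementation. It assumes a hypothetical scheduler that knows which input files each machine already holds locally; the `Machine` class, the `pick_machine` function, and the file and node names are all invented for this example.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Machine:
    """Hypothetical view of a compute node as seen by the scheduler."""
    name: str
    free_slots: int
    cached_files: set = field(default_factory=set)


def pick_machine(job_inputs: set, machines: list) -> Optional[Machine]:
    """Return the machine with a free slot that already holds the most inputs."""
    candidates = [m for m in machines if m.free_slots > 0]
    if not candidates:
        return None
    # Rank by number of inputs already cached; break ties by free capacity.
    return max(
        candidates,
        key=lambda m: (len(job_inputs & m.cached_files), m.free_slots),
    )


# Illustrative data only: node-c holds everything but has no free slot.
machines = [
    Machine("node-a", free_slots=2, cached_files={"netlist.v"}),
    Machine("node-b", free_slots=1, cached_files={"netlist.v", "testbench.sv"}),
    Machine("node-c", free_slots=0,
            cached_files={"netlist.v", "testbench.sv", "lib.db"}),
]
chosen = pick_machine({"netlist.v", "testbench.sv", "lib.db"}, machines)
print(chosen.name if chosen else "no machine available")  # prints "node-b"
```

A "first available machine" policy would ignore the cached files entirely; the sketch differs only in ranking the free machines by how much of the job's data is already local.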

 

CURRENT GRID ARCHITECTURE

In a large grid, the data is managed away from the servers, in storage farms built from pools of NAS or SAN servers. For example, in the E-CAD industry, validation and testing are regression jobs that operate on the same data set over and over again so that defects are detected early. Current grid architectures have already realized the first round of efficiencies, which came from making applications independent of the underlying compute architecture. The next round of efficiencies is going to come from efficient and intelligent data management.

Today, data movement is managed using scripts, and many moving parts must work correctly to make the right data available on the right set of machines. Very often the mechanics of data movement break down and require manual intervention. Most of the time, grid administrators observe that the IO is slow and the NFS machines are crawling, but without any IO control the administrator is severely challenged to solve the problem. Buying more storage hardware is often the recourse.

DATAGRES PERFACCEL GRID DATA MANAGEMENT SOLUTION

GRID DATA MANAGEMENT FEATURE SET

A solution is required that can do the following:
  • Reduce the latency of data movement to the server from the storage servers, so that CPU idling can be avoided or minimized
  • Provide grid aware intelligent data movement control with data consistency
  • Manage the dynamics of hot data inside a massively distributed environment to maximize the CPU utilization
  • Provide data locality awareness to the job schedulers so that jobs can be deployed on machines where the data is already present
  • Provide inherent dynamic management capabilities, spread across a very large number of computing nodes, that deliver adaptive intelligence and data analytics support for intelligent data movement
  • Provide some nodes for data staging so that the data is locally available in proximity to the compute nodes
  • Grid Analytics to provide the IO heat map of the files and their use across thousands of jobs instantaneously
  • WAN Caching: As grids get distributed around the world, the ability of the grids to access data sets in remote locations with consistency can help improve productivity
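
To make the IO heat map item in the list above more concrete, here is a minimal sketch of how per-file heat could be aggregated from access records. It is an assumption-laden illustration, not the Datagres analytics engine: the record layout `(job_id, file_path, bytes_read)`, the sample paths, and the `io_heat_map` function are hypothetical.

```python
from collections import Counter

# Hypothetical access log records: (job_id, file_path, bytes_read).
access_log = [
    ("job-1", "/proj/chip/netlist.v", 4_000_000),
    ("job-2", "/proj/chip/netlist.v", 4_000_000),
    ("job-2", "/proj/chip/testbench.sv", 500_000),
    ("job-3", "/proj/chip/netlist.v", 4_000_000),
]


def io_heat_map(records):
    """Aggregate total bytes read and distinct job count per file."""
    bytes_per_file = Counter()
    jobs_per_file = {}
    for job_id, path, nbytes in records:
        bytes_per_file[path] += nbytes
        jobs_per_file.setdefault(path, set()).add(job_id)
    # most_common() orders files from hottest to coldest by bytes read.
    return [
        (path, total, len(jobs_per_file[path]))
        for path, total in bytes_per_file.most_common()
    ]


for path, total, jobs in io_heat_map(access_log):
    print(f"{path}: {total} bytes read by {jobs} job(s)")
```

Files at the top of such a ranking are the natural candidates for staging close to the compute nodes, since many jobs read them repeatedly.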
 
What will Datagres PerfAccel Software do with the Grid data?

Further optimization of the IO can lead to several business benefits:
  • Reduced time-to-market
  • Defect-free chip design and validation, which requires enormous compute and storage resources
  • Productive global teams: designers and collaborators spread around the world require grids that can function globally
 

DATAGRES PERFACCEL SOLUTION CAPABILITIES

Productivity Gains

Minimizes the time that a developer needs to wait for grid jobs to compile, both on the developer's own machine and on the shared server and storage resources.

Productivity Gains

Different developers can collaborate using the right, consistent data sets. Very often jobs come to depend on other jobs, so it is important that the right consistent data sets are available to each job before it starts and that the output of each job is consistently available in storage.

Optimize the shared computing cloud/grid/cluster

Savings on cost and complexity: Datagres automatically detects the storage needs of the grid jobs and manages them without any application integration, both within one datacenter and across datacenters.
 

TECHNICAL SPECIFICATIONS