Computación y Sistemas

Online version ISSN 2007-9737; print version ISSN 1405-5546

Comp. y Sist. vol. 17 no. 3, Mexico City, Jul./Sep. 2013

 

Articles

 

Load Balancing for Parallel Computations with the Finite Element Method

 

Balanceo de Cargas para Computación en Paralelo con el Método de Elementos Finitos

 

José Luis González García¹, Ramin Yahyapour¹, and Andrei Tchernykh²

 

¹ GWDG - Gesellschaft für wissenschaftliche Datenverarbeitung mbH Göttingen, Göttingen, Lower Saxony, Germany. jose-luis.gonzalez-garcia@gwdg.de, ramin.yahyapour@gwdg.de

² CICESE Research Center, Ensenada, Baja California, Mexico. chernykh@cicese.mx

 

Article received on 25/02/2013;
accepted on 27/07/2013.

 

Abstract

In this paper, we give an overview of efforts to improve current load-balancing techniques and the efficiency of finite element method (FEM) computations on large-scale parallel machines, and we introduce a multilevel load balancer to reduce local load imbalance. FEM is used to numerically approximate solutions of partial differential equations (PDEs) as well as integral equations. The PDE domain is discretized into a mesh of information, and the resulting problem is usually solved using iterative methods. Distributing the mesh among the processors of a parallel computer, known as the mesh-partitioning problem, has been shown to be NP-complete. Many efforts focus on graph partitioning to parallelize and distribute the mesh of information. Data partitioning is important for executing applications efficiently on distributed systems. To address this problem, a variety of general-purpose libraries and techniques have been developed, and they are highly effective. However, the load-balancing problem is not yet fully solved. Today's large simulations require new techniques to scale on clusters of thousands of processors and to be resource-aware, due to the increasing use of heterogeneous computing architectures such as those found in many-core computer systems. Existing libraries and algorithms need to be enhanced to support more complex applications and hardware architectures. We present trends in this field and discuss new ideas and approaches that take these emerging requirements into account.

Keywords: Load balancing, FEM, HPC efficiency.
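
The mesh-partitioning problem described in the abstract is commonly judged by two quantities: the edge cut (a proxy for communication between processors) and the load imbalance (the largest part size relative to the average). The following minimal sketch, which is not taken from the paper, computes both for three hand-made partitions of a small structured grid. The function names, the 8-by-8 grid, and the three partitions are illustrative assumptions only.

```python
from collections import Counter

def grid_graph(nx, ny):
    """Edge list of an nx-by-ny grid; vertex (i, j) has id i * ny + j."""
    edges = []
    for i in range(nx):
        for j in range(ny):
            v = i * ny + j
            if i + 1 < nx:
                edges.append((v, (i + 1) * ny + j))  # neighbour in next row
            if j + 1 < ny:
                edges.append((v, v + 1))             # neighbour in next column
    return edges

def edge_cut(edges, part):
    """Number of edges whose endpoints lie in different parts."""
    return sum(1 for u, v in edges if part[u] != part[v])

def imbalance(part, nparts):
    """Largest part size divided by the average part size (1.0 is perfect)."""
    sizes = Counter(part)
    average = len(part) / nparts
    return max(sizes.get(p, 0) for p in range(nparts)) / average

# Hypothetical example: 64 mesh vertices distributed over 4 processors.
nx, ny, nparts = 8, 8, 4
edges = grid_graph(nx, ny)
rows = [v // ny for v in range(nx * ny)]
cols = [v % ny for v in range(nx * ny)]

strips = [i // 2 for i in rows]                             # four 2-row strips
blocks = [(i // 4) * 2 + (j // 4) for i, j in zip(rows, cols)]  # four 4x4 blocks
skewed = [0 if i < 5 else i - 4 for i in rows]              # one oversized part

for name, part in [("strips", strips), ("blocks", blocks), ("skewed", skewed)]:
    print(name, "cut =", edge_cut(edges, part),
          "imbalance =", imbalance(part, nparts))
```

In this toy case the block partition has a smaller cut than the strip partition at equal balance, while the skewed partition has the same cut as the strips but a much worse balance, which suggests why partitioners typically treat both quantities together.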

 

Summary

In this article we give an overview of efforts to improve current load-balancing techniques and the efficiency of computations with the finite element method (MEF in Spanish, FEM in English) on large-scale parallel machines. We also introduce a multilevel load balancer to reduce local imbalances. FEM is used to numerically approximate solutions of partial differential equations (EDP in Spanish, PDE in English) or of integral equations. The PDE domain is discretized into a mesh of information and is usually solved using iterative methods. Distributing the mesh among the processors of a parallel computer, also known as the mesh-partitioning problem, is NP-complete. Many efforts focus on graph partitioning to parallelize and distribute the mesh of information. Data partitioning is important for executing applications efficiently on distributed systems. To address this problem, a variety of general-purpose libraries and techniques have been developed, and they are highly effective. However, the load-balancing problem is not yet fully solved. Today's large simulations require new techniques to run efficiently on systems with thousands of processors and to take the available resources into account, given the widespread use of heterogeneous architectures today. Current libraries and algorithms must be adapted to handle more complex applications and different hardware architectures. We present the trends in this field and discuss new ideas that consider the emerging requirements.

Keywords: Load balancing, finite element method, efficiency in high-performance computing.

 


 

Acknowledgment

This work was partially supported by CONACYT under grant number 309370.

 


Creative Commons License: All content of this journal, except where otherwise noted, is licensed under a Creative Commons License.