Basilio B. Fraguela publications

Only the most representative papers are available online, but feel free to ask me for any paper on the list.

R. L. Castro, D. Andrade, B. B. Fraguela. AdAPT-S: Effective DNN pruning via unified accuracy and performance tuning. 39th IEEE International Parallel and Distributed Processing Symposium (IPDPS'25), pp. 106-117. Milan (Italy), June 2025.

R. L. Castro (Advisors: D. Andrade and B.B. Fraguela). Development of techniques to optimize automated machine learning processes in high performance computing systems. PhD Thesis, Department of Computer Engineering, University of A Coruña (Spain). February 2025.

M. A. Martínez, B. B. Fraguela, J. C. Cabaleiro, F. F. Rivera. A new thread-level speculative automatic parallelization model and library based on duplicate code execution. The Journal of Supercomputing, 80(10):13714-13737, July 2024.

R. L. Castro, D. Andrade, B. B. Fraguela. STuning-DL: Model-driven autotuning of sparse GPU kernels for deep learning. IEEE Access, 12:70581-70599, May 2024.

R. L. Castro, A. Ivanov, D. Andrade, T. Ben-Nun, B. B. Fraguela, T. Hoefler. VENOM: A vectorized N:M format for unleashing the power of sparse tensor cores. International Conference for High Performance Computing, Networking, Storage and Analysis (SC 2023), Article No.: 72, pp. 1-14. Denver (USA), November 2023.

R. L. Castro, D. Andrade, B. B. Fraguela. Probing the efficacy of hardware-aware weight pruning to optimize the SpMM routine on Ampere GPUs. 31st International Conference on Parallel Architectures and Compilation Techniques (PACT 2022), pp. 135-147. Chicago (USA), October 2022.

B.B. Fraguela, D. Andrade. The new UPC++ DepSpawn high performance library for data-flow computing with hybrid parallelism. International Conference on Computational Science (ICCS 2022), Lecture Notes in Computer Science Vol. 13350, Springer, pp. 761–774. London (United Kingdom), June 2022.

M. A. Martínez, B. B. Fraguela, J. C. Cabaleiro. A highly optimized skeleton for unbalanced and deep divide-and-conquer algorithms on multi-core clusters. The Journal of Supercomputing, 78(8):10434–10454, May 2022.

M. A. Martínez, B. B. Fraguela, J. C. Cabaleiro. A parallel skeleton for divide-and-conquer unbalanced and deep problems. International Journal of Parallel Programming, 49(6):820-845, December 2021.

B. B. Fraguela, D. Andrade. A software cache autotuning strategy for dataflow computing with UPC++ DepSpawn. Computational and Mathematical Methods, 3(6):e1148, November 2021.

B. B. Fraguela, D. Andrade, J. González-Domínguez. ScalaParBiBit: Scaling the binary biclustering in distributed-memory systems. Cluster Computing, 24(3):2249-2268, September 2021.

R. L. Castro, D. Andrade, B. B. Fraguela. OpenCNN: a Winograd’s minimal filtering algorithm implementation in CUDA. Mathematics, 9(17):2033, September 2021.

B. B. Fraguela, D. Andrade. High performance dataflow computing in hybrid memory systems with UPC++ DepSpawn. The Journal of Supercomputing, 77(7):7676-7689, July 2021.

M. A. Martínez, B. B. Fraguela, J. C. Cabaleiro. A divide-and-conquer parallel skeleton for unbalanced and deep problems. 13th Intl. Symp. on High-level Parallel Programming and Applications (HLPP 2020), pp. 76-95. Porto (Portugal), July 2020.

J. F. Fabeiro, D. Andrade, B.B. Fraguela, R. Doallo. An automatic optimizer for heterogeneous devices. Future Generation Computer Systems, 106:572-584, May 2020.

P. Valero-Lara, D. Andrade, R. Sirvent, J. Labarta, B.B. Fraguela, R. Doallo. A fast solver for large tridiagonal systems on multi-core processors (Lass library). IEEE Access, 7:23365-23378, December 2019.

D. Barreiro-Ures, M. Francisco-Fernández, R. Cao, B. B. Fraguela, R. Doallo, J. L. González-Andújar, M. Reyes. Analysis of interval-grouped data in weed science: The binnednp Rcpp package. Ecology and Evolution, 9(19):10903-10915, October 2019.

B.B. Fraguela, D. Andrade. Easy dataflow programming in clusters with UPC++ DepSpawn. IEEE Transactions on Parallel and Distributed Systems, 30(6):1267-1282, June 2019.

D. R. Penas, A. Gómez, B.B. Fraguela, M.J. Martín, S. Cerviño. Enhanced global optimization methods applied to complex fisheries stock assessment models. Applied Soft Computing, 77:50-66, April 2019.

S. Vázquez, M. Amor, B.B. Fraguela. Portable and efficient FFT and DCT algorithms with the Heterogeneous Butterfly Processing Library. Journal of Parallel and Distributed Computing, 125:135–146, March 2019.

M. Viñas, B.B. Fraguela, D. Andrade, R. Doallo. Heterogeneous distributed computing based on high level abstractions. Concurrency and Computation: Practice and Experience, 30(17):e4664, September 2018.

D. Andrade, B.B. Fraguela, R. Doallo. Guiding the optimization of parallel codes on multicores using an analytical cache model. International Conference on Computational Science (ICCS 2018), Lecture Notes in Computer Science Vol. 10862, Springer, pp. 387-394. Wuxi (China), June 2018.

C. Bozkus, B.B. Fraguela. Accelerating the HyperLogLog cardinality estimation algorithm. Scientific Programming, vol. 2017, Article ID 2040865, September 2017.

C.H. González, B.B. Fraguela. A general and efficient divide-and-conquer algorithm framework for multi-core clusters. Cluster Computing, 20(3):2605-2626, September 2017.

J. F. Fabeiro (Advisors: D. Andrade and B.B. Fraguela). Tools for improving performance portability in heterogeneous environments. PhD Thesis, Department of Computer Engineering, University of A Coruña (Spain). July 2017.

M. Viñas, B.B. Fraguela, D. Andrade, R. Doallo. Facilitating the development of stencil applications using the Heterogeneous Programming Library. Concurrency and Computation: Practice and Experience, 29(12):e4152, June 2017.

N. Losada, B.B. Fraguela, P. González, M.J. Martín. A portable and adaptable fault tolerance solution for heterogeneous applications. Journal of Parallel and Distributed Computing, 104:146-158, June 2017.

M. Viñas, B.B. Fraguela, D. Andrade, R. Doallo. High productivity multi-device exploitation with the Heterogeneous Programming Library. Journal of Parallel and Distributed Computing, 101:51-68, March 2017.

B.B. Fraguela. A comparison of task parallel frameworks based on implicit dependencies in multi-core environments. 50th Hawaii Intl. Conf. on System Sciences (HICSS 50), pp. 6202-6211. Waikoloa (USA), January 2017.

S. Vázquez, M.J. Martín, B.B. Fraguela, A. Gómez, A. Rodríguez, B. Elvarsson. Novel parallelization of Simulated Annealing and Hooke & Jeeves search algorithms for multicore systems with application to complex fisheries stock assessment models. Journal of Computational Science, 17(part 3):599-608, November 2016.

M. Viñas, B.B. Fraguela, D. Andrade, R. Doallo. Towards a high level approach for the programming of heterogeneous clusters. 45th Intl. Conf. on Parallel Processing Workshops (ICPPW 2016), pp. 106-114. Philadelphia (USA), August 2016.

J. F. Fabeiro, D. Andrade, B.B. Fraguela, R. Doallo. How to write performance portable codes using the Heterogeneous Programming Library. 19th Workshop on Compilers for Parallel Computing (CPC 2016). Valladolid (Spain), July 2016.

M. Viñas (Advisors: B.B. Fraguela and D. Andrade). Improving the programmability of heterogeneous systems by means of libraries. PhD Thesis, Department of Electronics and Systems, University of A Coruña (Spain). July 2016.

S. Altuntaş, Z. Bozkus, B.B. Fraguela. GPU accelerated molecular docking simulation with genetic algorithms. 19th European Conf. on Applications of Evolutionary Computation (EvoApps 2016), pp. 134-146. Porto (Portugal), April 2016.

J. F. Fabeiro, D. Andrade, B.B. Fraguela. Writing a performance-portable matrix multiplication. Parallel Computing, 52:65-77, February 2016.

J. F. Fabeiro, D. Andrade, B.B. Fraguela, R. Doallo. Automatic generation of optimized OpenCL codes using OCLoptimizer. The Computer Journal, 58(11):3057-3073, November 2015.

M. Viñas, B.B. Fraguela, Z. Bozkus, D. Andrade. Improving OpenCL programmability with the Heterogeneous Programming Library. International Conference on Computational Science (ICCS 2015), pp. 110-119. Reykjavik (Iceland), June 2015.

M. Viñas, Z. Bozkus, B.B. Fraguela, D. Andrade, R. Doallo. Developing adaptive multi-device applications with the Heterogeneous Programming Library. The Journal of Supercomputing, 71(6):2204-2220, June 2015.

C.H. González (Advisor: B.B. Fraguela). Library-based solutions for algorithms with complex patterns of parallelism. PhD Thesis, Department of Electronics and Systems, University of A Coruña (Spain). April 2015.

C.H. González, B.B. Fraguela. Enhancing and evaluating the configuration capability of a skeleton for irregular computations. 23rd Euromicro Intl. Conf. on Parallel, Distributed, and Network-based Processing (EuroPDP 2015), pp. 119-127. Turku (Finland), March 2015.

C.H. González, B.B. Fraguela. An algorithm template for domain-based parallel irregular algorithms. International Journal of Parallel Programming, 42(6):948-967, December 2014.

J.F. Fabeiro, D. Andrade, B.B. Fraguela, R. Doallo. Writing self-adaptive codes for heterogeneous systems. 20th International Euro-par Conference (Euro-Par 2014), Lecture Notes in Computer Science Vol. 8632, Springer, pp. 800-811. Porto (Portugal), August 2014.

M. Viñas, Z. Bozkus, B.B. Fraguela, D. Andrade, R. Doallo. Exploiting multi-GPU systems using the Heterogeneous Programming Library. 14th Intl. Conf. on Computational and Mathematical Methods in Science and Engineering (CMMSE 2014), pp. 1280-1291. Rota (Spain), July 2014.

D. Rolán, D. Andrade, B.B. Fraguela, R. Doallo. A fine-grained thread-aware management policy for shared caches. Concurrency and Computation: Practice and Experience, 26(6):1355-1374, April 2014.

D. Andrade, B.B. Fraguela, R. Doallo. Address independent estimation of the boundaries of cache performance. Microprocessors and Microsystems, 38(2):137-151, March 2014.

D. Rolán, B.B. Fraguela, R. Doallo. Virtually Split Cache: An efficient mechanism to distribute instructions and data. ACM Transactions on Architecture and Code Optimization (TACO), 10(4):27, December 2013.

M. Viñas, Z. Bozkus, B.B. Fraguela. Exploiting heterogeneous parallelism with the Heterogeneous Programming Library. Journal of Parallel and Distributed Computing, 73(12):1627-1638, December 2013.

J. Lobeiras, M. Viñas, M. Amor, B.B. Fraguela, M. Arenaz, J.A. García, M. Castro. Parallelization of shallow water simulations on current multi-threaded systems. International Journal of High Performance Computing Applications, 27(4):493-512, November 2013.

C.H. González, B.B. Fraguela. A framework for argument-based task synchronization with automatic detection of dependencies. Parallel Computing, 39(9):475-489, September 2013.

C.H. González, B.B. Fraguela, D. Andrade, J.A. García, M.J. Castro. Numerical simulation of pollutant transport in a shallow-water system on the Cell heterogeneous processor. The Journal of Supercomputing, 65(3):1089-1103, September 2013.

J.F. Fabeiro, D. Andrade, B.B. Fraguela. OCLoptimizer: an iterative optimization tool for OpenCL. International Conference on Computational Science (ICCS 2013), pp. 1322-1331. Barcelona (Spain), June 2013.

M. Viñas, J. Lobeiras, B.B. Fraguela, M. Arenaz, M. Amor, J.A. García, M.J. Castro, R. Doallo. A multi-GPU shallow water simulation with transport of contaminants. Concurrency and Computation: Practice and Experience, 25(8):1153-1169, June 2013.

D. Andrade, B.B. Fraguela, R. Doallo. Accurate prediction of the behavior of multithreaded applications in shared caches. Parallel Computing, 39(1):36-57, January 2013.

D. Andrade, B.B. Fraguela, R. Doallo. Static analysis of the worst-case memory performance for irregular codes with indirections. ACM Transactions on Architecture and Code Optimization (TACO), 9(3):20, September 2012.

B.B. Fraguela, G. Bikshandi, J. Guo, M.J. Garzarán, D. Padua, C. von Praun. Optimization techniques for efficient HTA programs. Parallel Computing, 38(9):465–484, September 2012.

D. Rolán (Advisors: B.B. Fraguela and R. Doallo). Cache design strategies for efficient adaptive line placement. PhD Thesis, Department of Electronics and Systems, University of A Coruña (Spain). June 2012.

Z. Bozkus, B.B. Fraguela. A portable high-productivity approach to program heterogeneous systems. 21st International Heterogeneity in Computing Workshop (HCW 2012), in conjunction with IPDPS'12, pp. 163-173. Shanghai (China). May 2012.

J. González-Domínguez, G.L. Taboada, B.B. Fraguela, M.J. Martín, J. Touriño. Automatic mapping of parallel applications on multicore architectures using the Servet Benchmark Suite. Computers and Electrical Engineering, 38(2):258-269, March 2012.

D. Rolán, B.B. Fraguela, R. Doallo. Adaptive set-granular cooperative caching. 18th Intl. Symp. on High Performance Computer Architecture (HPCA-18), pp. 213-224. New Orleans (USA), February 2012.

A. de Vega, D. Andrade, B.B. Fraguela. An efficient parallel set container for multicore architectures. International Conference on Parallel Computing 2011 (ParCo 2011), pp. 369-376. Ghent (Belgium), September 2011.

M. Viñas, J. Lobeiras, B.B. Fraguela, M. Arenaz, M. Amor, R. Doallo. Simulation of pollutant transport in shallow water on a CUDA architecture. II Workshop on Exploitation of Hardware Accelerators (WEHA 2011), in conjunction with 2011 Intl. Conf. on High Performance Computing and Simulation (HPCS), pp. 664-670. Istanbul (Turkey), July 2011.

D. Rolán, B.B. Fraguela, R. Doallo. Reducing capacity and conflict misses using Set Saturation Levels. 17th Annual Intl. Conf. on High Performance Computing (HiPC 2010). Goa (India), December 2010.

B.B. Fraguela, D. Andrade, R. Doallo. Address-independent estimation of the worst-case memory performance. IEEE Transactions on Industrial Informatics, 6(4):664-677, November 2010.

C.H. González, B.B. Fraguela. A generic algorithm template for divide-and-conquer in multicore systems. 12th IEEE International Conference on High Performance Computing and Communications (IEEE HPCC-10), pp. 79-88. Melbourne (Australia), September 2010.

J. Lobeiras, M. Amor, M. Arenaz, B.B. Fraguela. Streaming-oriented parallelization of domain-independent irregular kernels. 3rd Workshop on UnConventional High Performance Computing 2010 (UCHPC 2010), in conjunction with the 16th International Euro-par Conference (Euro-Par 2010), pp. 64-71. Ischia (Italy), September 2010.

J. González-Domínguez, G.L. Taboada, B.B. Fraguela, M.J. Martín, J. Touriño. Servet: a benchmark suite for autotuning on multicore clusters. 24th IEEE International Parallel and Distributed Processing Symposium (IPDPS'10). Atlanta (USA), April 2010.

D. Rolán, B.B. Fraguela, R. Doallo. Adaptive line placement with the Set Balancing Cache. 42nd International Symposium on Microarchitecture (MICRO-42), pp. 529-540. New York (USA), December 2009.

D.A. Mallón, J.C. Mouriño, A. Gómez, G.L. Taboada, C. Teijeiro, J. Touriño, B.B. Fraguela, R. Doallo, B. Wibecan. UPC performance evaluation on a multicore system. 3rd Conference on Partitioned Global Address Space Programming Models (PGAS'09). Ashburn (USA), October 2009.

C. Teijeiro, G.L. Taboada, J. Touriño, B.B. Fraguela, R. Doallo, D.A. Mallón, A. Gómez, J.C. Mouriño, B. Wibecan. Evaluation of UPC programmability using classroom studies. 3rd Conference on Partitioned Global Address Space Programming Models (PGAS'09). Ashburn (USA), October 2009.

D.A. Mallón, G.L. Taboada, C. Teijeiro, J. Touriño, B.B. Fraguela, A. Gómez, R. Doallo, J.C. Mouriño. Performance evaluation of MPI, UPC and OpenMP on multicore architectures. 16th European PVM/MPI Users' Group Meeting (EuroPVM/MPI'09), Lecture Notes in Computer Science Vol. 5759, Springer-Verlag, pp. 174-184. Espoo (Finland), September 2009.

B.B. Fraguela, Y. Voronenko, M. Püschel. Automatic tuning of Discrete Fourier Transforms driven by analytical modeling. 18th International Conference on Parallel Architectures and Compilation Techniques (PACT'09), pp. 271-280. Raleigh (USA), September 2009.

G.L. Taboada, C. Teijeiro, J. Touriño, B.B. Fraguela, R. Doallo, J.C. Mouriño, D.A. Mallón, A. Gómez. Performance evaluation of Unified Parallel C collective communications. 11th IEEE International Conference on High Performance Computing and Communications (IEEE HPCC-09), pp. 329-338. Seoul (Korea), June 2009.

D. Andrade, B.B. Fraguela, R. Doallo. Static prediction of worst-case data cache performance in the absence of base address information. 15th IEEE Real-Time and Embedded Technology and Applications Symposium (RTAS'09), pp. 45-54. San Francisco (USA), April 2009.

J. Brodman, B.B. Fraguela, M.J. Garzarán, D. Padua. New abstractions for data parallel programming. First USENIX Workshop on Hot Topics in Parallelism (HotPar'09). Berkeley (USA), March 2009.

D. Andrade, B.B. Fraguela, J. Brodman, D. Padua. Task-parallel versus data-parallel library-based programming in multicore systems. 17th EUROMICRO International Conference on Parallel, Distributed, and Network-based Processing (PDP 2009), pp. 101-110. Weimar (Germany), February 2009.

J. Guo, G. Bikshandi, B.B. Fraguela, D. Padua. Writing productive stencil codes with overlapped tiling. Concurrency and Computation: Practice and Experience, 21(1):25-39, January 2009.

J. Brodman, B.B. Fraguela, M.J. Garzarán, D. Padua. Design issues in parallel array languages for shared memory. Embedded Computer Systems: Architectures, Modeling, and Simulation, 8th International Workshop, SAMOS 2008, Lecture Notes in Computer Science Vol. 5114, Springer-Verlag, pp. 208-217. Samos (Greece). July 2008.

J. Guo, G. Bikshandi, B.B. Fraguela, M.J. Garzarán, D. Padua. Programming with tiles. ACM SIGPLAN 2008 Symposium on Principles and Practice of Parallel Programming (PPoPP'08), pp. 111-122. Salt Lake City (USA). February 2008.

D. Andrade, J. Brodman, B.B. Fraguela, D. Padua. Hierarchically Tiled Arrays Vs. Intel Threading Building Blocks for programming multicore systems. Programmability Issues for Multi-Core Computers (MULTIPROG'08), in conjunction with HiPEAC'08. Goteborg (Sweden). January 2008.

D. Andrade, M. Arenaz, B.B. Fraguela, J. Touriño, R. Doallo. Automated and accurate cache behavior analysis for codes with irregular access patterns. Concurrency and Computation: Practice and Experience, 19(18):2407-2423, December 2007.

D. Andrade, B.B. Fraguela, R. Doallo. Precise automatable analytical modeling of the cache behavior of codes with indirections. ACM Transactions on Architecture and Code Optimization (TACO), 4(3), September 2007.

J. Guo (Advisors: D. Padua and B.B. Fraguela). Exploiting locality and parallelism with Hierarchically Tiled Arrays. PhD Thesis, Department of Computer Science, University of Illinois at Urbana-Champaign. August 2007.

D. Andrade (Advisors: B.B. Fraguela and R. Doallo). Systematic analysis of the cache behavior of irregular codes. PhD Thesis, Department of Electronics and Systems, University of A Coruña (Spain). April 2007.

D. Andrade, B.B. Fraguela, R. Doallo. Cache behavior modelling for codes involving banded matrices. 19th Intl. Workshop on Languages and Compilers for Parallel Computing (LCPC'06), Lecture Notes in Computer Science Vol. 4382, Springer-Verlag, pp. 205-219. New Orleans (USA), November 2006.

G. Bikshandi, J. Guo, C. von Praun, G. Tanase, B.B. Fraguela, M.J. Garzarán, D. Padua, L. Rauchwerger. Design and use of htalib -- a library for Hierarchically Tiled Arrays. 19th Intl. Workshop on Languages and Compilers for Parallel Computing (LCPC'06), Lecture Notes in Computer Science Vol. 4382, Springer-Verlag, pp. 17-32. New Orleans. November 2006.

D. Andrade, B.B. Fraguela, R. Doallo. Analytical modeling of codes with arbitrary data-dependent conditional structures. Journal of Systems Architecture, 52(7):394-410, July 2006.

J. Guo, G. Bikshandi, D. Hoeflinger, G. Almási, B.B. Fraguela, M.J. Garzarán, D. Padua, C. von Praun. Hierarchically Tiled Arrays for parallelism and locality. Workshop on Performance Engineering Technology and Research Sponsored under the NSF Next Generation Software (NGS 06), in conjunction with IPDPS'06. Rhodes (Greece). April 2006.

G. Bikshandi, J. Guo, D. Hoeflinger, G. Almási, B.B. Fraguela, M.J. Garzarán, D. Padua, C. von Praun. Programming for parallelism and locality with Hierarchically Tiled Arrays. ACM SIGPLAN 2006 Symposium on Principles and Practice of Parallel Programming (PPoPP'06), pp. 48-57. New York (USA). March 2006.

B.B. Fraguela, M.G. Carmueja, D. Andrade. Optimal tile size selection guided by analytical models. Parallel Computing 2005 (ParCo 2005). Published as volume 33 of the Publication Series of the John von Neumann Institute for Computing (NIC), pp. 565-572. Málaga (Spain), September 2005.

B.B. Fraguela, J. Guo, G. Bikshandi, M.J. Garzarán, G. Almási, J. Moreira, D. Padua. The Hierarchically Tiled Arrays programming approach. 7th Workshop on Languages, Compilers, and Run-time Support for Scalable Systems (LCR 2004), pp. 35-46. Houston (USA), October 2004.

G. Bikshandi, B.B. Fraguela, J. Guo, M.J. Garzarán, G. Almási, J. Moreira, D. Padua. Implementation of parallel numerical algorithms using Hierarchically Tiled Arrays. 17th Intl. Workshop on Languages and Compilers for Parallel Computing (LCPC'04). Lecture Notes in Computer Science Vol. 3602, Springer-Verlag, pp. 87-101. West Lafayette (USA), September 2004.

D. Andrade, B.B. Fraguela, R. Doallo. Modeling the cache behavior of codes with arbitrary data-dependent conditional structures. 9th Asia-Pacific Computer Systems Architecture Conference (ACSAC 2004), Lecture Notes in Computer Science Vol. 3189, Springer-Verlag, pp. 44-57. Beijing (China), September 2004.

B.B. Fraguela, R. Doallo, J. Touriño, E.L. Zapata. A compiler tool to predict memory hierarchy performance of scientific codes. Parallel Computing, 30(2):225-248, February 2004.

G. Almási, L. De Rose, B.B. Fraguela, J. Moreira, D. Padua. Programming for locality and parallelism with Hierarchically Tiled Arrays. 16th Intl. Workshop on Languages and Compilers for Parallel Computing (LCPC'03), Lecture Notes in Computer Science Vol. 2958, Springer-Verlag, pp. 162-176. College Station (USA), October 2003.

D. Andrade, B.B. Fraguela, R. Doallo. Cache behavior modeling of codes with data-dependent conditionals. 7th Intl. Workshop on Software and Compilers for Embedded Systems (SCOPES 2003), Lecture Notes in Computer Science Vol. 2826, Springer-Verlag, pp. 373-387. Vienna (Austria), September 2003.

B.B. Fraguela, D. Andrade, R. Doallo. Efficient and accurate analytical modeling of the cache behavior of complete scientific codes. IASTED Intl. Conf. on Applied Simulation and Modelling 2003 (ASM 2003), pp. 106-111. Marbella (Spain), September 2003.

B.B. Fraguela, J. Renau, P. Feautrier, D. Padua, J. Torrellas. Programming the FlexRAM parallel intelligent memory system. ACM SIGPLAN 2003 Symp. on Principles and Practice of Parallel Programming (PPoPP'03), pp. 49-60. San Diego (USA), June 2003.

B.B. Fraguela, R. Doallo, E.L. Zapata. Probabilistic Miss Equations: evaluating memory hierarchy performance. IEEE Transactions on Computers, 52(3):321-336, March 2003.

B.B. Fraguela, R. Doallo, E.L. Zapata. Automatic analytical modeling for the estimation of cache misses. Intl. Conf. on Parallel Architectures and Compilation Techniques (PACT'99), pp. 221-231. Newport Beach (USA), October 1999.

B.B. Fraguela, R. Doallo, E.L. Zapata. Memory hierarchy performance prediction for blocked sparse algorithms. Parallel Processing Letters, 9(3):347-360, September 1999.

R. Doallo, B.B. Fraguela, E.L. Zapata. Set associative cache behavior optimization. 5th International Euro-Par Conference (Euro-Par'99), Lecture Notes in Computer Science Vol. 1685, Springer-Verlag, pp. 229-238. Toulouse (France), August-September 1999.

R. Doallo, B.B. Fraguela, E.L. Zapata. Direct mapped cache performance modeling for sparse matrix operations. 7th Euromicro Workshop on Parallel and Distributed Processing (PDP'99), IEEE Computer Society, pp. 331-338. Madeira (Portugal), February 1999.

B.B. Fraguela, R. Doallo, E.L. Zapata. Cache misses prediction for high performance sparse algorithms. 4th International Euro-Par Conference (Euro-Par'98), Lecture Notes in Computer Science Vol. 1470, Springer-Verlag, pp. 224-233. Southampton (UK), September 1998.

R. Doallo, B.B. Fraguela, E.L. Zapata. Cache probabilistic modeling for basic sparse algebra kernels involving matrices with a non uniform distribution. EUROMICRO'98, pp. 345-348. Vasteras (Sweden), August 1998.

B.B. Fraguela, R. Doallo, E.L. Zapata. Modeling set associative caches behavior for irregular computations. ACM Performance Evaluation Review, Special Issue (Proc. ACM SIGMETRICS'98/ PERFORMANCE'98), 26(1):192-201. Madison (USA), June 1998.

C.A. Moritz, K. Al-Tawil, B.B. Fraguela. Performance comparison of MPI on MPP and workstation clusters. 10th ISCA International Conference on Parallel and Distributed Computing Systems (PDCS'97), pp. 167-172. New Orleans (USA), October 1997.

R. Doallo, B.B. Fraguela, A. Quintela. Evaluation of vectorization/parallelization techniques: application to nonparametric curve estimation. Statistics and Computing, 6(4):347-351. December 1996.

R. Doallo, B.B. Fraguela, J. Touriño, E.L. Zapata. Parallel sparse modified Gram-Schmidt QR decomposition. 4th Intl. Conference on High-Performance Computing and Networking (HPCN'96), Lecture Notes in Computer Science Vol. 1067, Springer-Verlag, pp. 646-653. Brussels (Belgium), April 1996.

J. L. Freire, B. B. Fraguela, V. M. Gulías. Extending Caml Light to perform distributed computation. GULP-PRODE'95, pp. 113-124. Marina di Vietri (Italy), September 1995.

© Copyright notice: the papers listed above are included by the contributing authors as a means to ensure timely dissemination of scholarly and technical work on a non-commercial basis. Copyright and all rights therein are maintained by the authors or by other copyright holders, notwithstanding that they have offered their works here electronically. It is understood that all persons copying this information will adhere to the terms and constraints invoked by each author's copyright. These works may not be reposted without the explicit permission of the copyright holder.