Publications

21.

Giorgi, Roberto

Exploring Future Many-Core Architectures: The TERAFLUX Evaluation Framework Book Chapter

In: vol. Advances in Computers (ADV COMPUT), Elsevier, 2016, ISSN: 0065-2458.

22.

Giorgi, Roberto; Bettin, Nicola; Gai, Paolo; Martorell, Xavier; Rizzo, Antonio

AXIOM: A Flexible Platform for the Smart Home Book Chapter

In: Keramidas, Georgios; Voros, Nikolaos; Hübner, Michael (Ed.): vol. Springer International Publishing, pp. 57-74, Springer International Publishing, Cham, 2016, ISBN: 978-3-319-42304-3.


@inbook{Giorgi2016b,

title = {AXIOM: A Flexible Platform for the Smart Home},

author = {Giorgi, Roberto and Bettin, Nicola and Gai, Paolo and Martorell, Xavier and Rizzo, Antonio},

editor = {Keramidas, Georgios and Voros, Nikolaos and Hübner, Michael},

url = {http://dx.doi.org/10.1007/978-3-319-42304-3_3},

doi = {10.1007/978-3-319-42304-3_3},

isbn = {978-3-319-42304-3},

year  = {2016},

date = {2016-09-24},

journal = {Components and Services for IoT Platforms: Paving the Way for IoT Standards},

volume = {Springer International Publishing},

pages = {57-74},

publisher = {Springer International Publishing},

address = {Cham},

abstract = {The AXIOM hardware/software platform aims at bringing easy programmability on top of a cluster of processors by using a fast interconnect and FPGA as a basis for building a scalable embedded system. The Smart Home is one of the key scenarios in which AXIOM could be useful for the Internet-of-Things (IoT). In Smart Homes, everything is linked to the flow of information that needs to travel from the devices in the field to the cloud servers. The information sensed in the environment will not be transmitted as is to the higher layers, but is somehow interpreted to provide a synthetic light-weight representation of the environment. In such a scenario, it is then clear that there is a need for peripheral nodes as well as intermediate gateways that need to be able to perform high-performance computational loads. AXIOM provides the possibility of designing a cluster of low-power/low-budget boards, which could be packed inside a high-performance embedded low-cost product. The AXIOM boards are heterogeneous, thus allowing for even greater diversity, which is needed in these kinds of IoT scenarios. The cluster itself can then be integrated inside the IoT architectures as a computational-power node, which could be the center of a distributed intelligence near the edges of the IoT network.},

howpublished = {Springer International Publishing},

keywords = {},

pubstate = {published},

tppubtype = {inbook}

}


23.

Llort, Germán; Filgueras, Antonio; Jiménez-González, Daniel; Servat, Harald; Teruel, Xavier; Mercadal, Estanislao; Álvarez, Carlos; Giménez, Judit; Martorell, Xavier; Ayguadé, Eduard; Labarta, Jesús

The Secrets of the Accelerators Unveiled: Tracing Heterogeneous Executions Through OMPT Proceedings

Springer International Publishing, vol. OpenMP: Memory, Devices and Tasks, 2016.


@proceedings{Llort2016,

title = {The Secrets of the Accelerators Unveiled: Tracing Heterogeneous Executions Through OMPT},

author = {Germán Llort and Antonio Filgueras and Daniel Jiménez-González and Harald Servat and Xavier Teruel and Estanislao Mercadal and Carlos Álvarez and Judit Giménez and Xavier Martorell and Eduard Ayguadé and Jesús Labarta},

url = {https://link.springer.com/chapter/10.1007/978-3-319-45550-1_16},

doi = {10.1007/978-3-319-45550-1_16},

year  = {2016},

date = {2016-09-21},

volume = {OpenMP: Memory, Devices and Tasks},

publisher = {Springer International Publishing},

abstract = {Heterogeneous systems are an important trend in the future of supercomputers, yet they can be hard to program and developers still lack powerful tools to gain understanding about how well their accelerated codes perform and how to improve them.



Having different types of hardware accelerators available, each with their own specific low-level APIs to program them, there is not yet a clear consensus on a standard way to retrieve information about the accelerator’s performance. To improve this scenario, OMPT is a novel performance monitoring interface that is being considered for integration into the OpenMP standard. OMPT allows analysis tools to monitor the execution of parallel OpenMP applications by providing detailed information about the activity of the runtime through a standard API. For accelerated devices, OMPT also facilitates the exchange of performance information between the runtime and the analysis tool. We implement part of the OMPT specification that refers to the use of accelerators both in the Nanos++ parallel runtime system and the Extrae tracing framework, obtaining detailed performance information about the execution of the tasks issued to the accelerated devices to later conduct insightful analysis.



Our work extends previous efforts in the field to expose detailed information from the OpenMP and OmpSs runtimes, regarding the activity and performance of task-based parallel applications. In this paper, we focus on the evaluation of FPGA devices studying the performance of two common kernels in scientific algorithms: matrix multiplication and Cholesky decomposition. Furthermore, this development is seamlessly applicable for the analysis of GPGPU accelerators and Intel® Xeon Phi™ co-processors operating under the OmpSs programming model.},

keywords = {},

pubstate = {published},

tppubtype = {proceedings}

}


24.

Mazumdar, Somnath; Ayguade, Eduard; Bettin, Nicola; Bueno, Javier; Ermini, Sara; Filgueras, Antonio; Jimenez-Gonzalez, Daniel; Martinez, Alvarez; Martorell, Xavier; Montefoschi, Francesco; Oro, David; Pnevmatikatos, Dionisis; Rizzo, Antonio; Theodoropoulos, Dimitris; Giorgi, Roberto

AXIOM: A Hardware-Software Platform for Cyber Physical Systems Journal Article

In: pp. 539–546, 2016, ISBN: 978-1-5090-2817-7.


25.

Theodoropoulos, Dimitris; Pnevmatikatos, Dionisis; Garzarella, Stefano; Gai, Paolo; Rizzo, Antonio; Giorgi, Roberto

AXIOM: enabling parallel processing in cyber-physical systems. Proceedings Article

In: International Conference on Field-Programmable Logic and Applications, 2016.


26.

Alvarez, Carlos; Ayguade, Eduard; Bosch, Jaume; Bueno, Javier; Cherkashin, Artem; Filgueras, Antonio; Jimenez-Gonzalez, Daniel; Martorell, Xavier; Navarro, Nacho; Vidal, Miquel; Theodoropoulos, Dimitris; Pnevmatikatos, Dionisios N.; Catani, Davide; Oro, David; Fernandez, Carles; Segura, Carlos; Rodriguez, Javier; Hernando, Javier; Scordino, Claudio; Gai, Paolo; Passera, Pierluigi; Pomella, Alberto; Bettin, Nicola; Rizzo, Antonio; Giorgi, Roberto

The AXIOM Software Layers Journal Article

In: ELSEVIER Microprocessors and Microsystems, 2016, ISSN: 0141-9331.


@article{Alvarez2016,

title = {The AXIOM Software Layers},

author = {Carlos Alvarez and Eduard Ayguade and Jaume Bosch and Javier Bueno and Artem Cherkashin and Antonio Filgueras and Daniel Jimenez-Gonzalez and Xavier Martorell and Nacho Navarro and Miquel Vidal and Dimitris Theodoropoulos and Dionisios N. Pnevmatikatos and Davide Catani and David Oro and Carles Fernandez and Carlos Segura and Javier Rodriguez and Javier Hernando and Claudio Scordino and Paolo Gai and Pierluigi Passera and Alberto Pomella and Nicola Bettin and Antonio Rizzo and Roberto Giorgi},

url = {http://www.sciencedirect.com/science/article/pii/S0141933116300850},

doi = {10.1016/j.micpro.2016.07.002},

issn = {0141-9331},

year  = {2016},

date = {2016-07-09},

journal = {ELSEVIER Microprocessors and Microsystems},

abstract = {People and objects will soon share the same digital network for information exchange in a world named as the age of the cyber-physical systems. The general expectation is that people and systems will interact in real-time. This puts pressure on systems design to support increasing demands on computational power, while keeping a low power envelope. Additionally, modular scaling and easy programmability are also important to ensure that these systems become widespread. The whole set of expectations imposes scientific and technological challenges that need to be properly addressed. The AXIOM project (Agile, eXtensible, fast I/O Module) will research new hardware/software architectures for cyber-physical systems to meet such expectations. The technical approach aims at solving fundamental problems to enable easy programmability of heterogeneous multi-core multi-board systems. AXIOM proposes the use of the task-based OmpSs programming model, leveraging low-level communication interfaces provided by the hardware. Modular scalability will be possible thanks to a fast interconnect embedded into each module. To this aim, an innovative ARM and FPGA-based board will be designed.},

keywords = {},

pubstate = {published},

tppubtype = {article}

}


27.

Giorgi, Roberto

Exploring Dataflow-based Thread Level Parallelism in Cyber-physical Systems Proceedings Article

In: pp. 295-300, ACM, New York, NY, USA, 2016, ISBN: 978-1-4503-4128-8.


28.

Scordino, Claudio; Morelli, Bruno

Sharing memory in modern distributed applications Proceedings

2016, ISBN: 978-1-4503-3739-7.


29.

Verdoscia, Lorenzo; Giorgi, Roberto

A Data-Flow Soft-Core Processor for Accelerating Scientific Calculation on FPGAs Journal Article

In: Mathematical Problems in Engineering, vol. 2016, no. 1, pp. 1-21, 2016, ISSN: 1563-5147.


30.

Burgio, Paolo; Alvarez, Carlos; Ayguadé, Eduard; Filgueras, Antonio; Jiménez-González, Daniel; Martorell, Xavier; Navarro, Nacho; Giorgi, Roberto

Simulating next-generation Cyber-physical computing platforms Journal Article

In: Ada User Journal, vol. 37, no. 1, pp. 59-63, 2016, ISSN: 1381-6551, (TO APPEAR).


31.

Mazumdar, Somnath; Giorgi, Roberto

A Survey on Hardware and Software Support for Thread Level Parallelism Journal Article

In: 2016.


@article{Mazumdar2016b,

title = {A Survey on Hardware and Software Support for Thread Level Parallelism},

author = {Somnath Mazumdar and Roberto Giorgi},

url = {https://arxiv.org/abs/1603.09274},

year = {2016},

date = {2016-03-01},

abstract = {To support growing massive parallelism, functional components and also the capabilities of current processors are changing and continue to do so. Today's computers are built upon multiple processing cores and run applications consisting of a large number of threads, making runtime thread management a complex process. Further, each core can support multiple, concurrent thread execution. Hence, hardware and software support for threads is more and more needed to improve peak-performance capacity and overall system throughput, and has therefore been the subject of much research. This paper surveys many of the proposed or currently available solutions for executing, distributing and managing threads both in hardware and software. The nature of current applications is diverse. To increase the system performance, all programming models may not be suitable to harness the built-in massive parallelism of multicore processors. Due to the heterogeneity in hardware, the hybrid programming model (which combines the features of the shared and distributed models) has currently become very promising. In this paper, we first give an overview of threads, threading mechanisms and their management issues during execution. Next, we discuss different parallel programming models with regard to their explicit thread support. We also review the programming models with respect to their support for shared memory, distributed memory and heterogeneity. Hardware support at execution time is crucial to the performance of the system, thus different types of hardware support for threads also exist or have been proposed, primarily based on widely used programming models. We further discuss software support for threads, mainly to increase deterministic behavior during runtime. Finally, we conclude the paper by discussing some common issues related to thread management.},

keywords = {},

pubstate = {published},

tppubtype = {article}

}


32.

Giorgi, R.; Scionti, A.

A scalable thread scheduling co-processor based on data-flow principles Journal Article

In: vol. 53, pp. 100–108, 2015, ISSN: 0167-739X.


33.

Giorgi, Roberto

Scalable Embedded Systems: Towards the Convergence of High-Performance and Embedded Computing Proceedings Article

In: Proceedings of the 13th IEEE/IFIP International Conference on Embedded and Ubiquitous Computing (EUC 2015), 2015.


@inproceedings{Giorgi15d,

title = {Scalable Embedded Systems: Towards the Convergence of High-Performance and Embedded Computing},

author = {Roberto Giorgi},

url = {http://www.axiom-project.eu/wp-content/uploads/2016/03/EUC15.pdf},

year  = {2015},

date = {2015-10-20},

booktitle = {Proceedings of the 13th IEEE/IFIP International Conference on Embedded and Ubiquitous Computing (EUC 2015)},

abstract = {Embedded System toolchains are highly customized for a specific System-on-Chip (SoC). When the application needs more performance, the designer is typically forced to adopt a new SoC and possibly another toolchain. The rationale for not scaling performance by using, e.g., two SoCs, is that maintaining most of the operations on-chip may allow for higher energy efficiency. We are exploring the feasibility and trade-offs of designing and manufacturing a new Single Board Computer (SBC) that could serve flexibly for a number of current and future applications, by allowing scalability through clusters of SBCs while keeping the same programming model for the SBC. This board is based on FPGAs and embedded processors, and its key points are: i) a fast custom interconnect for board-to-board communication and ii) an easily programmable environment which would allow both the off-loading of code into accelerators (either soft-IP blocks or hard-IP blocks) and, at the same time, the distribution of computation across boards. A key challenge to successfully deploying this paradigm is to properly distribute the threads across several boards without the explicit intervention of the programmer. In this paper we describe how to dynamically and efficiently distribute the computational threads in symbiosis with an appropriate memory model to allow the system scalability, so that we can double the performance by simply connecting two boards without i) changing the basic hardware components (e.g., to a different System-On-Chip) and ii) changing the programming model to follow the vendor specific toolchain. Our approach is to reduce data movement across boards. Our initial experiments have confirmed the feasibility of our approach.},

keywords = {},

pubstate = {published},

tppubtype = {inproceedings}

}


34.

Jimenez-Gonzalez, Daniel; Alvarez-Martinez, Carlos; Filgueras, Antonio; Martorell, Xavier; Langer, Jan; Noguera, Juanjo; Vissers, Kees

Coarse-Grain Performance Estimator for Heterogeneous Parallel Computing Architectures like Zynq All-Programmable SoC Journal Article

In: Second International Workshop on FPGAs for Software Programmers FSP 2015, vol. CoRR, 2015.


35.

Alvarez, Carlos; Ayguade, Eduard; Bueno, Javier; Filgueras, Antonio; Jimenez-Gonzalez, Daniel; Martorell, Xavier; Navarro, Nacho; Theodoropoulos, Dimitris; Pnevmatikatos, Dionisios; Catani, Davide; Scordino, Claudio; Gai, Paolo; Segura, Carlos; Fernandez, Carles; Oro, David; Rodriguez-Saeta, Javier; Passera, Pierluigi; Pomella, Alberto; Rizzo, Antonio; Giorgi, Roberto

The AXIOM Software Layers Journal Article

In: DSD 2015, 18th Euromicro Conference on Digital Systems Design (DSD), 2015.


36.

Mondelli, Andrea; Ho, Nam; Scionti, Alberto; Solinas, Marco; Portero, Antoni; Giorgi, Roberto

Dataflow Support in x86_64 Multicore Architectures through Small Hardware Extensions Conference

2015.


37.

Theodoropoulos, Dimitris; Pnevmatikatos, Dionisis; Alvarez, Carlos; Ayguade, Eduard; Bueno, Javier; Filgueras, Antonio; Jimenez-Gonzalez, Daniel; Martorell, Xavier; Navarro, Nacho; Segura, Carlos; Fernandez, Carles; Oro, David; Saeta, Javier Rodriguez; Gai, Paolo; Rizzo, Antonio; Giorgi, Roberto

The AXIOM project (Agile, eXtensible, fast I/O Module) Journal Article

In: International Conference on Embedded Computer Systems: Architectures, Modeling and Simulation - SAMOS XV 2015, 2015.


@article{Theodoropoulos2015,

title = {The AXIOM project (Agile, eXtensible, fast I/O Module)},

author = {Dimitris Theodoropoulos and Dionisis Pnevmatikatos and Carlos Alvarez and Eduard Ayguade and Javier Bueno and Antonio Filgueras and Daniel Jimenez-Gonzalez and Xavier Martorell and Nacho Navarro and Carlos Segura and Carles Fernandez and David Oro and Javier Rodriguez Saeta and Paolo Gai and Antonio Rizzo and Roberto Giorgi},

url = {http://samos-conference.com/Resources_Samos_Websites/Proceedings_Repository_SAMOS/2015/Files/SS0_03.pdf},

year  = {2015},

date = {2015-07-21},

journal = {International Conference on Embedded Computer Systems: Architectures, Modeling and Simulation - SAMOS XV 2015},

abstract = {The AXIOM project (Agile, eXtensible, fast I/O Module) aims at researching new software/hardware architectures for the future Cyber-Physical Systems (CPSs). These systems are expected to react in real-time, provide enough computational power for the assigned tasks, consume the least possible energy for such task (energy efficiency), scale up through modularity, allow for an easy programmability across performance scaling, and exploit at best existing standards at minimal costs. Current solutions for providing enough computational power are mainly based on multi- or many-core architectures. For example, some current research projects (such as ADEPT or PSOCRATES) are already investigating how to join efforts from the High-Performance Computing (HPC) and the Embedded Computing domains, which are both focused on high power efficiency, while GPUs and new Dataflow platforms such as Maxeler, or in general FPGAs, are claimed as the most energy efficient. We present the project’s initial approach, ideas and key concepts, and describe the AXIOM preliminary architecture. Our starting point uses power efficient multi-core nodes, such as ARM cores and FPGA accelerators on the same die, as in the Xilinx Zynq. We will work to provide an integrated environment that supports programmability of the parallel, interconnected nodes that form a CPS system, and evaluate our ideas using demanding test application scenarios.},

keywords = {},

pubstate = {published},

tppubtype = {article}

}


38.

Burresi, Giovanni; Giorgi, Roberto

A Field Experience for a Vehicle Recognition System using Magnetic Sensors Proceedings Article

In: IEEE MECO 2015, pp. 178-181, 2015, ISBN: 978-1-4799-8999-7.


39.

Verdoscia, Lorenzo; Vaccaro, Roberto; Giorgi, Roberto

A matrix multiplier case study for an evaluation of a configurable Dataflow-Machine Proceedings Article

In: ACM CF'15 - LP-EMS, pp. 1-6, 2015, ISBN: 978-1-4503-3358-0.


40.

Mondelli, Andrea; Ho, Nam; Scionti, Alberto; Solinas, Marco; Portero, Antoni; Giorgi, Roberto

Enhancing an x86_64 Multi-Core Architecture with Data-Flow Execution Support Proceedings Article

In: Article, ACM 2015 (Ed.): 2015, ISBN: 978-1-4503-3358-0.

