Ne-XVP

Ne-XVP was a research project carried out at NXP Semiconductors from 2006 to 2008. The project took a holistic approach to defining a next-generation multimedia processing architecture for embedded MPSoCs, targeting programmability, performance scalability, and silicon efficiency in an evolutionary way. Here "evolutionary" means reusing existing processor cores, such as the NXP TriMedia, as building blocks and supporting industry programming standards such as POSIX threads. Based on technology-aware design space exploration, the project concluded that hardware accelerators that support task management and coherency, combined with appropriately dimensioned compute cores, deliver good programmability, scalable performance, and competitive silicon efficiency.

Research

Ne-XVP's research subjects and corresponding publications:

Asymmetric multicore architecture with generic accelerators ^[1]
Hardware multithreading in VLIWs ^[2]
Low-complexity cache coherence ^[1]
Hardware accelerators for task scheduling and synchronization:
1. A Hardware Task Scheduler ^[3]
2. Hardware Synchronization Unit for thread synchronization ^[1]^[2]
3. Task Management Unit ^[4]
Instruction cache sharing ^[1]
Design Space Exploration with Performance Density as the optimization function ^[1]
Technology modeling for embedded processors ^[1]^[5]^[6]
Parallelization of complex multimedia algorithms (H.264, Frame Rate Conversion) ^[7]^[8]^[9]^[10]
Auto-parallelizing compilers
Time-aware programming languages in cooperation with the ACOTES project ^[11]
Visual programming
Task-level speculation
Porting GCC to Exposed Pipeline VLIW Processors ^[12]
Multiprogram workload for embedded processing
A 1-GHz embedded VLIW processor

Project members

Ghiath Al-Kadi
Zbigniew Chamski
Dmitry Cheresiz
Marc Duranton (project leader)
Surendra Guntur
Jan Hoogerbrugge
Anirban Lahiri
Ondrej Popp
Andrei Terechko
Alex Turjan
Clemens Wust
...

References

[1] A. Terechko, J. Hoogerbrugge, G. Al-Kadi, S. Guntur, A. Lahiri, M. Duranton, C. Wust, P. Christie, A. Nackaerts, A. Kumar, "Balancing programmability and silicon efficiency of heterogeneous multicore architectures", ACM Transactions on Embedded Computing Systems, Special Issue on Real-time Multimedia, 2010.
[2] J. Hoogerbrugge, A. Terechko, "A multithreaded multicore system for embedded media processing", Transactions on High-Performance Embedded Architectures and Compilers, Volume 4, Issue 2, 2008.
[3] G. Al-Kadi, A. S. Terechko, "A Hardware Task Scheduler for Embedded Video Processing", in Proceedings of the 4th International Conference on High Performance and Embedded Architectures and Compilers, Paphos, Cyprus, January 25–28, 2009.
[4] M. Sjalander, A. Terechko, M. Duranton, "A Look-Ahead Task Management Unit for Embedded Multi-Core Architectures", in Proceedings of the 2008 11th EUROMICRO Conference on Digital System Design Architectures, Methods and Tools, pp. 149-157, IEEE Computer Society, Washington, DC, USA, 2008, ISBN 978-0-7695-3277-6.
[5] A. Terechko, J. Hoogerbrugge, G. Al-Kadi, A. Lahiri, S. Guntur, M. Duranton, P. Christie, A. Nackaerts, A. Kumar, "Performance Density Exploration of Heterogeneous Multicore Architectures", invited presentation at Rapid Simulation and Performance Evaluation: Methods and Tools (RAPIDO'09), January 25, 2009, in conjunction with the 4th International Conference on High-Performance and Embedded Architectures and Compilers (HiPEAC), Paphos, Cyprus, January 25–28, 2009.
[6] P. Christie, A. Nackaerts, A. Kumar, A. S. Terechko, G. Doornbos, "Rapid Design Flows for Advanced Technology Pathfinding", invited paper, International Electron Devices Meeting, San Francisco, 2008.
[7] G. Al-Kadi, J. Hoogerbrugge, S. Guntur, A. Terechko, M. Duranton, "Meandering based parallel 3DRS algorithm for the multicore era", in IEEE International Conference on Consumer Electronics, Las Vegas, USA, January 11–13, 2010.
[8] A. Azevedo, B. Juurlink, C. Meenderinck, A. Terechko, J. Hoogerbrugge, M. Alvarez, A. Ramirez, M. Valero, "A Highly Scalable Parallel Implementation of H.264", in Transactions on High-Performance Embedded Architectures and Compilers, Volume 4, Issue 2, pp. 404-418, 2009.
[9] A. Azevedo, C. Meenderinck, B. Juurlink, A. Terechko, J. Hoogerbrugge, M. Alvarez, A. Ramirez, "Parallel H.264 Decoding on an Embedded Multicore Processor", in Proceedings of the 4th International Conference on High Performance and Embedded Architectures and Compilers, Paphos, Cyprus, January 2009.
[10] M. Alvarez, A. Azevedo, C. Meenderinck, B. Juurlink, A. Terechko, J. Hoogerbrugge, A. Ramirez, "Analyzing Scalability Limits of H.264 Decoding Due to TLP Overhead", in Proceedings of 6th HiPEAC Industrial Workshop, November 2008.
[11] ACOTES: http://www.hitech-projects.com/euprojects/ACOTES/
[12] A. Turjan, D. Cheresiz, "Porting GCC to an exposed pipeline vector VLIW processor", GCC Developers' Summit, Montreal, Québec, Canada, June 8–10, 2009.
