# Massively parallel processor array

> Mediated Wiki article. Canonical URL: https://mediated.wiki/source/Massively_parallel_processor_array
> Markdown URL: https://mediated.wiki/source/Massively_parallel_processor_array.md
> Source: https://en.wikipedia.org/wiki/Massively_parallel_processor_array
> Source revision: 1304098387
> License: Creative Commons Attribution-ShareAlike 4.0 International (https://creativecommons.org/licenses/by-sa/4.0/)


A **massively parallel processor array**, also known as a **multi purpose processor array** (**MPPA**), is a type of [integrated circuit](/source/Integrated_circuit) containing a [massively parallel](/source/Massively_parallel) array of hundreds or thousands of [CPUs](/source/Central_processing_unit) and [RAM](/source/Random-access_memory) blocks. The processors pass work to one another through a [reconfigurable](/source/Reconfigurability) interconnect of [channels](/source/Channel_(communications)). By running a large number of processors in parallel, an MPPA chip can handle more demanding tasks than conventional chips. MPPAs are programmed using a software parallel [programming model](/source/Programming_model) intended for developing high-performance [embedded system](/source/Embedded_system) applications.

## Architecture

An MPPA is a [MIMD](/source/Multiple_instruction%2C_multiple_data) (multiple instruction streams, multiple data streams) architecture with [distributed memory](/source/Distributed_memory) that is accessed locally rather than shared globally. Each processor is strictly encapsulated, accessing only its own code and memory. Point-to-point communication between processors is implemented directly in the configurable interconnect.[1]
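The encapsulation described above can be modeled in software. The following is a minimal sketch, not a description of any particular MPPA chip: Python threads stand in for processors, a bounded `queue.Queue` stands in for one point-to-point channel, and the names `Processor`, `producer`, `accumulator` and `run_pair` are hypothetical. Each simulated processor touches only its own `local_memory` and exchanges data solely through the channel.

```python
import queue
import threading


class Processor(threading.Thread):
    """One simulated processor: private memory plus channel endpoints."""

    def __init__(self, name, program, inbox=None, outbox=None):
        super().__init__(name=name)
        self.local_memory = {}   # private state, never handed to another processor
        self.program = program   # this processor's own instruction stream
        self.inbox = inbox       # receiving end of a point-to-point channel
        self.outbox = outbox     # sending end of a point-to-point channel

    def run(self):
        self.program(self)


def producer(p):
    """Stream locally stored samples out over the channel."""
    p.local_memory["samples"] = [1, 2, 3, 4]
    for sample in p.local_memory["samples"]:
        p.outbox.put(sample)     # blocks while the channel is full
    p.outbox.put(None)           # end-of-stream marker (a convention of this sketch)


def accumulator(p):
    """Sum the squares of everything received on the channel."""
    p.local_memory["total"] = 0
    while (item := p.inbox.get()) is not None:   # blocks while the channel is empty
        p.local_memory["total"] += item * item


def run_pair():
    channel = queue.Queue(maxsize=1)   # one dedicated channel with capacity 1
    sender = Processor("P0", producer, outbox=channel)
    receiver = Processor("P1", accumulator, inbox=channel)
    sender.start()
    receiver.start()
    sender.join()
    receiver.join()
    return receiver.local_memory["total"]
```

Calling `run_pair()` returns 30 (1 + 4 + 9 + 16). Because the channel blocks on both ends, the two programs synchronize through communication alone, with no shared variables or locks in the programs themselves.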

The MPPA's massive parallelism and its distributed-memory MIMD architecture distinguish it from [multicore](/source/Multi-core_(computing)) and [manycore](/source/Manycore_processor) architectures, which have fewer processors and an [SMP](/source/Symmetric_multiprocessing) or other [shared memory](/source/Shared_memory_architecture) architecture and are mainly intended for general-purpose computing. It is also distinguished from [GPGPUs](/source/GPGPU), which have [SIMD](/source/Single_instruction%2C_multiple_data) architectures and are used for [HPC](/source/High-performance_computing) applications.[2]
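The SIMD and MIMD distinction drawn here can be illustrated with a toy model. This sketch only models the structure of the instruction streams; the Python code itself runs sequentially, and the function names `simd_execute` and `mimd_execute` are hypothetical.

```python
def simd_execute(instruction, lanes):
    """SIMD: a single instruction stream; every lane applies the same operation."""
    return [instruction(x) for x in lanes]


def mimd_execute(programs, local_data):
    """MIMD: each processor runs its own program on its own local data."""
    return [program(data) for program, data in zip(programs, local_data)]


# One operation broadcast to four data lanes.
simd_result = simd_execute(lambda x: x * 2, [1, 2, 3, 4])

# Four different programs, each paired with its own private data.
mimd_result = mimd_execute(
    [sum, max, len, sorted],
    [[1, 2], [3, 9, 4], [5, 6, 7], [8, 1]],
)
```

Here `simd_result` is `[2, 4, 6, 8]` and `mimd_result` is `[3, 9, 3, [1, 8]]`: the SIMD call varies only the data, while the MIMD call varies both the program and the data per processor.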

## Programming

An MPPA application is developed by expressing it as a hierarchical [block diagram](/source/Block_diagram) or [workflow](/source/Workflow) whose basic objects run in parallel, each on its own processor. Likewise, large data objects may be broken up and distributed across local memories with parallel access. Objects communicate over a parallel structure of dedicated channels. The objective is to maximize aggregate throughput while minimizing local latency, optimizing performance and efficiency. An MPPA's [model of computation](/source/Model_of_computation) is similar to a [Kahn process network](/source/Kahn_process_network) or [communicating sequential processes](/source/Communicating_sequential_processes) (CSP).[3]
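A block diagram of this kind can be approximated in software as a small process network. The sketch below is an illustration under stated assumptions rather than an actual MPPA toolchain: threads play the role of processors, bounded FIFO queues play the role of dedicated channels, and the names `source`, `worker`, `sink`, `run_network` and `END` are hypothetical. A source block deals the input across several workers (the data object is split across them), and a sink block reads the results back in the same fixed order, so the output does not depend on thread timing.

```python
import queue
import threading

END = object()  # end-of-stream token, a convention of this sketch


def source(data, outs):
    """Deal the input round-robin onto the workers' input channels."""
    for i, x in enumerate(data):
        outs[i % len(outs)].put(x)
    for ch in outs:
        ch.put(END)


def worker(fn, inp, out):
    """Apply fn to each token, using blocking reads on the input channel."""
    while (x := inp.get()) is not END:
        out.put(fn(x))
    out.put(END)


def sink(ins, result):
    """Read the workers' outputs round-robin, restoring the input order."""
    live = list(ins)
    while live:
        for ch in list(live):
            x = ch.get()
            if x is END:
                live.remove(ch)
            else:
                result.append(x)


def run_network(data, fn, n_workers=3, depth=2):
    """Wire source -> workers -> sink and run every block on its own thread."""
    to_workers = [queue.Queue(maxsize=depth) for _ in range(n_workers)]
    from_workers = [queue.Queue(maxsize=depth) for _ in range(n_workers)]
    result = []
    blocks = [threading.Thread(target=source, args=(data, to_workers))]
    blocks += [threading.Thread(target=worker, args=(fn, i, o))
               for i, o in zip(to_workers, from_workers)]
    blocks.append(threading.Thread(target=sink, args=(from_workers, result)))
    for b in blocks:
        b.start()
    for b in blocks:
        b.join()
    return result
```

For example, `run_network(list(range(7)), lambda x: x * x)` returns `[0, 1, 4, 9, 16, 25, 36]`. Each block reads from a channel chosen by its own program and blocks until a token arrives, which is the property that makes a Kahn-style network produce the same result regardless of scheduling.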

## Applications

MPPAs are used in high-performance [embedded systems](/source/Embedded_system) and for [hardware acceleration](/source/Hardware_acceleration) of [desktop computer](/source/Desktop_computer) and [server](/source/Server_(computing)) applications, such as [video compression](/source/Video_compression),[4][5] [image processing](/source/Image_processing),[6] [medical imaging](/source/Medical_imaging), [network processing](/source/Network_processing), [software-defined radio](/source/Software-defined_radio) and other compute-intensive streaming media applications, which would otherwise use [FPGA](/source/FPGA), [DSP](/source/Digital_signal_processor) or [ASIC](/source/Application-specific_integrated_circuit) chips.

## Examples

MPPAs developed by companies include designs from [Ambric](/source/Ambric), [PicoChip](/source/PicoChip), [Intel](/source/Intel),[7] IntellaSys, GreenArrays, [ASOCS](/source/ASOCS), [Tilera](/source/Tilera), [Kalray](/source/Kalray), Coherent Logix, [Tabula](/source/Tabula_(company)), and [Adapteva](/source/Adapteva). The Aspex (Ericsson) Linedancer differs in that it was a massively wide *SIMD* array rather than an MPPA. Strictly speaking, it could qualify as [associative processing](/source/Flynn's_taxonomy), because each of its 4096 cores (3,000 gates each) had its own content-addressable memory.[8][9][10]

Fabricated MPPAs developed in universities include: 36-core[11] and 167-core[12] [Asynchronous Array of Simple Processors (AsAP)](/source/Asynchronous_Array_of_Simple_Processors) arrays from the [University of California, Davis](/source/University_of_California%2C_Davis), 16-core RAW[13] from [MIT](/source/MIT), and 16-core[14] and 24-core[15] arrays from [Fudan University](/source/Fudan_University).

The Chinese [Sunway](/source/Sunway_(processor)) project developed its own 260-core [SW26010](/source/SW26010) manycore chip for the [TaihuLight](/source/TaihuLight) supercomputer, which was the world's fastest supercomputer from June 2016 to June 2018.[16][17]

Anton 3 processors, designed by [D. E. Shaw Research](/source/D._E._Shaw_Research) for [molecular dynamics](/source/Molecular_dynamics) simulations, contain arrays of 576 processors arranged in a 12×24 tiled grid of pairs of cores; a routed network links these tiles together and extends off-chip to other nodes in a full system.[18][19]
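The core count follows from the grid dimensions: 12 × 24 = 288 tiles, and two cores per tile gives 576 cores. The sketch below checks that arithmetic and maps a flat core index to a tile position. Treating 12 as the row count and 24 as the column count, the row-major numbering, and the name `core_location` are all assumptions made for illustration; the chip's actual core numbering is not given in this article.

```python
ROWS, COLS = 12, 24        # tile grid dimensions quoted in the text
CORES_PER_TILE = 2         # each tile holds a pair of cores

TILES = ROWS * COLS                 # 288 tiles
CORES = TILES * CORES_PER_TILE      # 576 cores


def core_location(core_id):
    """Map a flat core index to (row, column, slot) under a hypothetical
    row-major numbering of tiles."""
    if not 0 <= core_id < CORES:
        raise ValueError(f"core_id must be in [0, {CORES})")
    tile, slot = divmod(core_id, CORES_PER_TILE)
    row, col = divmod(tile, COLS)
    return row, col, slot
```

Under this numbering, `core_location(0)` is `(0, 0, 0)` and `core_location(575)` is `(11, 23, 1)`, the second core of the last tile.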

## See also

- [Manycore processor](/source/Manycore_processor)
- [AI accelerator](/source/AI_accelerator)
- [Asynchronous array of simple processors](/source/Asynchronous_array_of_simple_processors)
- [SW26010](/source/SW26010)
- [Array processor](/source/Array_processor)
- [Transputer](/source/Transputer)

## References

1. Mike Butts (September–October 2007). "Synchronization through Communication in a Massively Parallel Processor Array". *[IEEE Micro](/source/IEEE_Micro)*. **27** (5). [IEEE Computer Society](/source/IEEE_Computer_Society): 32. [Bibcode](/source/Bibcode_(identifier)):[2007IMicr..27e..32A](https://ui.adsabs.harvard.edu/abs/2007IMicr..27e..32A). [doi](/source/Doi_(identifier)):[10.1109/MM.2007.4378781](https://doi.org/10.1109%2FMM.2007.4378781).

2. Mike Butts. "Multicore and Massively Parallel Platforms and Moore's Law Scalability". *Proceedings of the Embedded Systems Conference - Silicon Valley, April 2008*.

3. Mike Butts; Brad Budlong; Paul Wasson; Ed White (April 2008). *Reconfigurable Work Farms on a Massively Parallel Processor Array*. 2008 16th International Symposium on Field-Programmable Custom Computing Machines. [IEEE Computer Society](/source/IEEE_Computer_Society). [doi](/source/Doi_(identifier)):[10.1109/FCCM.2008.6](https://doi.org/10.1109%2FFCCM.2008.6).

4. Laurent Bonetto (May 16, 2008). ["Massively parallel processing arrays (MPPAs) for embedded HD video and imaging (Part 1)"](https://www.eetimes.com/massively-parallel-processing-arrays-mppas-for-embedded-hd-video-and-imaging-part-1/). Video/Imaging DesignLine. *[EE Times](/source/EE_Times)*.

5. Laurent Bonetto (July 18, 2008). ["Massively parallel processing arrays (MPPAs) for embedded HD video and imaging (Part 2)"](https://www.eetimes.com/massively-parallel-processing-arrays-mppas-for-embedded-hd-video-and-imaging-part-2/). Video/Imaging DesignLine. *[EE Times](/source/EE_Times)*.

6. Paul Chen (March 18, 2008). ["Multimode sensor processing using Massively Parallel Processor Arrays (MPPAs)"](https://www.eetimes.com/multimode-sensor-processing-using-massively-parallel-processor-arrays-mppas/). Programmable Logic DesignLine. *[EE Times](/source/EE_Times)*.

7. Vangal, Sriram R.; Howard, Jason; Ruhl, Gregory; Dighe, Saurabh; Wilson, Howard; Tschanz, James; Finan, David; et al. (2008). "An 80-Tile Sub-100-W TeraFLOPS Processor in 65-nm CMOS". *[IEEE Journal of Solid-State Circuits](/source/IEEE_Journal_of_Solid-State_Circuits)*. **43** (1): 29–41. [Bibcode](/source/Bibcode_(identifier)):[2008IJSSC..43...29V](https://ui.adsabs.harvard.edu/abs/2008IJSSC..43...29V). [doi](/source/Doi_(identifier)):[10.1109/JSSC.2007.910957](https://doi.org/10.1109%2FJSSC.2007.910957).

8. Krikelis, A. (1990). ["Artificial Neural Network on a Massively Parallel Associative Architecture"](https://link.springer.com/chapter/10.1007/978-94-009-0643-3_39). *International Neural Network Conference*. p. 673. [doi](/source/Doi_(identifier)):[10.1007/978-94-009-0643-3_39](https://doi.org/10.1007%2F978-94-009-0643-3_39). [ISBN](/source/ISBN_(identifier)) [978-0-7923-0831-7](https://en.wikipedia.org/wiki/Special:BookSources/978-0-7923-0831-7).

9. ["Effective Monte Carlo simulation on System-V massively parallel associative string processing architecture"](https://web.archive.org/web/20210606003056/https://core.ac.uk/download/pdf/25268094.pdf) (PDF). Archived from [the original](https://core.ac.uk/download/pdf/25268094.pdf) (PDF) on 2021-06-06.

10. ["A Programmable Processor with 4096 Processing Units for Media Applications"](https://www.researchgate.net/publication/2915463).

11. Yu, Zhiyi; Meeuwsen, Michael; Apperson, Ryan; Sattari, Omar; Lai, Michael; Webb, Jeremy; Work, Eric; Mohsenin, Tinoosh; Singh, Mandeep; Baas, Bevan (2006). *An asynchronous array of simple processors for DSP applications*. IEEE International Solid-State Circuits Conference (ISSCC’06). Vol. 49. pp. 428–429. [doi](/source/Doi_(identifier)):[10.1109/ISSCC.2006.1696225](https://doi.org/10.1109%2FISSCC.2006.1696225).

12. Truong, Dean; Cheng, Wayne; Mohsenin, Tinoosh; Yu, Zhiyi; Jacobson, Toney; Landge, Gouri; Meeuwsen, Michael; et al. (2008). *A 167-processor 65 nm computational platform with per-processor dynamic supply voltage and dynamic clock frequency scaling*. Symposium on VLSI Circuits. pp. 22–23. [doi](/source/Doi_(identifier)):[10.1109/VLSIC.2008.4585936](https://doi.org/10.1109%2FVLSIC.2008.4585936).

13. Michael Bedford Taylor; Jason Kim; Jason Miller; David Wentzlaff; Fae Ghodrat; Ben Greenwald; Henry Hoffmann; Paul Johnson; Walter Lee; Arvind Saraf; Nathan Shnidman; Volker Strumpen; Saman Amarasinghe; Anant Agarwal (February 2003). "A 16-issue multiple-program-counter microprocessor with point-to-point scalar operand network". *Proceedings of the IEEE International Solid-State Circuits Conference*. [doi](/source/Doi_(identifier)):[10.1109/ISSCC.2003.1234253](https://doi.org/10.1109%2FISSCC.2003.1234253).

14. Yu, Zhiyi; You, Kaidi; Xiao, Ruijin; Quan, Heng; Ou, Peng; Ying, Yan; Yang, Haofan; Zeng, Xiaoyang (2012). "An 800MHz 320mW 16-core processor with message-passing and shared-memory inter-core communication mechanisms". *2012 IEEE International Solid-State Circuits Conference Digest of Technical Papers (ISSCC)*. IEEE. pp. 64–66. [doi](/source/Doi_(identifier)):[10.1109/ISSCC.2012.6176931](https://doi.org/10.1109%2FISSCC.2012.6176931).

15. Ou, Peng; Zhang, Jiajie; Quan, Heng; Li, Yi; He, Maofei; Yu, Zheng; Yu, Xueqiu; et al. (2013). "A 65nm 39GOPS/W 24-core processor with 11 Tb/s/W packet-controlled circuit-switched double-layer network-on-chip and heterogeneous execution array". *2013 IEEE International Solid-State Circuits Conference Digest of Technical Papers (ISSCC)*. IEEE. pp. 56–57. [doi](/source/Doi_(identifier)):[10.1109/ISSCC.2013.6487635](https://doi.org/10.1109%2FISSCC.2013.6487635).

16. Dongarra, Jack (June 20, 2016). ["Report on the Sunway TaihuLight System"](http://www.netlib.org/utk/people/JackDongarra/PAPERS/sunway-report-2016.pdf) (PDF). *www.netlib.org*. Retrieved June 20, 2016.

17. Fu, Haohuan; Liao, Junfeng; Yang, Jinzhe; et al. (2016). ["The Sunway TaihuLight Supercomputer: System and Applications"](https://doi.org/10.1007%2Fs11432-016-5588-7). *Science China Information Sciences*. **59** (7): 072001. [doi](/source/Doi_(identifier)):[10.1007/s11432-016-5588-7](https://doi.org/10.1007%2Fs11432-016-5588-7).

18. Shaw, David E.; Adams, Peter J.; Azaria, Asaph; Bank, Joseph A.; Batson, Brannon; Bell, Alistair; Bergdorf, Michael; Bhatt, Jhanvi; Butts, J. Adam; Correia, Timothy; Dirks, Robert M.; Dror, Ron O.; Eastwood, Michael P.; Edwards, Bruce; Even, Amos (2021-11-14). "Anton 3". *Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis*. St. Louis, Missouri: ACM. pp. 1–11. [doi](/source/Doi_(identifier)):[10.1145/3458817.3487397](https://doi.org/10.1145%2F3458817.3487397). [ISBN](/source/ISBN_(identifier)) [978-1-4503-8442-1](https://en.wikipedia.org/wiki/Special:BookSources/978-1-4503-8442-1). [S2CID](/source/S2CID_(identifier)) [239036976](https://api.semanticscholar.org/CorpusID:239036976).

19. Adams, Peter J.; Batson, Brannon; Bell, Alistair; Bhatt, Jhanvi; Butts, J. Adam; Correia, Timothy; Edwards, Bruce; Feldmann, Peter; Fenton, Christopher H.; Forte, Anthony; Gagliardo, Joseph; Gill, Gennette; Gorlatova, Maria; Greskamp, Brian; Grossman, J.P. (2021-08-22). "The ΛNTON 3 ASIC: A Fire-Breathing Monster for Molecular Dynamics Simulations". *2021 IEEE Hot Chips 33 Symposium (HCS)*. Palo Alto, CA, USA: IEEE. pp. 1–22. [doi](/source/Doi_(identifier)):[10.1109/HCS52781.2021.9567084](https://doi.org/10.1109%2FHCS52781.2021.9567084). [ISBN](/source/ISBN_(identifier)) [978-1-6654-1397-8](https://en.wikipedia.org/wiki/Special:BookSources/978-1-6654-1397-8). [S2CID](/source/S2CID_(identifier)) [239039245](https://api.semanticscholar.org/CorpusID:239039245).

---
Adapted from the Wikipedia article [Massively parallel processor array](https://en.wikipedia.org/wiki/Massively_parallel_processor_array) by Wikipedia contributors ([contributor history](https://en.wikipedia.org/wiki/Massively_parallel_processor_array?action=history)). Available under [Creative Commons Attribution-ShareAlike 4.0 International](https://creativecommons.org/licenses/by-sa/4.0/). Changes may have been made.
