# Dataflow

> Mediated Wiki article. Canonical URL: https://mediated.wiki/source/Dataflow
> Markdown URL: https://mediated.wiki/source/Dataflow.md
> Source: https://en.wikipedia.org/wiki/Dataflow
> Source revision: 1302387372
> License: Creative Commons Attribution-ShareAlike 4.0 International (https://creativecommons.org/licenses/by-sa/4.0/)

Computing concept

This article is about software engineering. For the flow of data within a computer network, see [Traffic flow (computer networking)](/source/Traffic_flow_(computer_networking)). For the graphical representation of flow of data within an information system, see [data flow diagram](/source/Data_flow_diagram). For the hardware architecture, see [Dataflow architecture](/source/Dataflow_architecture). For the Dubai-based company, see [DataFlow Group](/source/DataFlow_Group).

In [computing](/source/Computing), **dataflow** is a broad concept, which has various meanings depending on the application and context. In the context of [software architecture](/source/Software_architecture), data flow relates to [stream processing](/source/Stream_processing) or [reactive programming](/source/Reactive_programming).

## Software architecture

[Dataflow computing](/source/Dataflow_programming) is a software paradigm based on the idea of representing computations as a [directed graph](/source/Directed_graph), where nodes are computations and data flow along the edges.[1] Dataflow can also be called [stream processing](/source/Stream_processing) or [reactive programming](/source/Reactive_programming).[2]
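As a minimal sketch of the directed-graph view, the following Python example evaluates each node once the values on its incoming edges are available. The graph layout, the node names, and the `run` helper are hypothetical and chosen only for illustration; they do not correspond to any particular dataflow system.

```python
from graphlib import TopologicalSorter
import operator

# Hypothetical graph: each node maps to (function, list of input node names).
# Source nodes have no function and take their value from `inputs`.
graph = {
    "a": (None, []),
    "b": (None, []),
    "sum": (operator.add, ["a", "b"]),
    "prod": (operator.mul, ["sum", "b"]),
}

def run(graph, inputs):
    """Evaluate every node after the nodes it depends on."""
    # TopologicalSorter expects a mapping of node -> predecessors.
    sorter = TopologicalSorter({n: deps for n, (_, deps) in graph.items()})
    values = dict(inputs)
    for node in sorter.static_order():
        func, deps = graph[node]
        if func is not None:
            # Data "flows along the edges": inputs come from upstream nodes.
            values[node] = func(*(values[d] for d in deps))
    return values

result = run(graph, {"a": 2, "b": 3})
print(result["sum"], result["prod"])  # 5 15
```

The sketch runs nodes one at a time in a topological order. Nodes with no path between them, such as `a` and `b`, do not depend on each other and could equally be evaluated in parallel.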

There have been multiple data-flow/stream processing languages of various forms (see [Stream processing](/source/Stream_processing)). Data-flow hardware (see [Dataflow architecture](/source/Dataflow_architecture)) is an alternative to the classic [von Neumann architecture](/source/Von_Neumann_architecture). The most obvious example of data-flow programming is the subset known as [reactive programming](/source/Reactive_programming) with spreadsheets. As a user enters new values, they are instantly transmitted to the next logical "actor" or formula for calculation.
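The spreadsheet behaviour described above can be sketched in a few lines of Python. The `Sheet` class and its method names are assumptions made for this illustration, not the design of any real spreadsheet. The sketch uses naive push-based propagation, with no cycle detection and no attempt to avoid recomputing a cell more than once per update.

```python
class Sheet:
    """Toy spreadsheet: cells hold either a constant or a formula over other cells."""

    def __init__(self):
        self.values = {}      # cell name -> current value
        self.formulas = {}    # cell name -> (function, input cell names)
        self.dependents = {}  # cell name -> cells whose formulas read it

    def set_value(self, name, value):
        self.values[name] = value
        self._propagate(name)

    def set_formula(self, name, func, inputs):
        self.formulas[name] = (func, inputs)
        for cell in inputs:
            self.dependents.setdefault(cell, []).append(name)
        self._recompute(name)

    def _recompute(self, name):
        func, inputs = self.formulas[name]
        # A formula only fires once every one of its inputs has a value.
        if all(cell in self.values for cell in inputs):
            self.values[name] = func(*(self.values[c] for c in inputs))
            self._propagate(name)

    def _propagate(self, name):
        # Push the change to every formula that reads this cell.
        for dependent in self.dependents.get(name, []):
            self._recompute(dependent)


sheet = Sheet()
sheet.set_formula("C1", lambda a, b: a + b, ["A1", "B1"])
sheet.set_formula("D1", lambda c: c * 10, ["C1"])
sheet.set_value("A1", 1)
sheet.set_value("B1", 2)
print(sheet.values["D1"])  # 30
sheet.set_value("A1", 5)   # the change flows through C1 into D1
print(sheet.values["D1"])  # 70
```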

[Distributed data flows](/source/Distributed_data_flow) have also been proposed as a programming abstraction that captures the dynamics of distributed multi-protocols. The data-centric perspective characteristic of data flow programming promotes high-level functional specifications and simplifies formal reasoning about system components.

## Hardware architecture

Main article: [Dataflow architecture](/source/Dataflow_architecture)

Hardware architectures for dataflow were a major topic in [computer architecture](/source/Computer_architecture) research in the 1970s and early 1980s. [Jack Dennis](/source/Jack_Dennis) of the [Massachusetts Institute of Technology](/source/Massachusetts_Institute_of_Technology) (MIT) pioneered the field of static dataflow architectures. Designs that use conventional memory addresses as data dependency tags are called static dataflow machines. These machines did not allow multiple instances of the same routine to be executed simultaneously because the simple tags could not differentiate between them. Designs that use [content-addressable memory](/source/Content-addressable_memory) are called dynamic dataflow machines by [Arvind](/source/Arvind_(computer_scientist)). They use tags in memory to facilitate parallelism. Data flows through the components of the computer: it enters through input devices and can leave through output devices such as printers. Analog computers, or more precisely differential analyzers, are an example of hardware structured like a dataflow machine.
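The difference between the two tagging schemes can be illustrated with a toy operand-matching store in Python. The `Token` format, the `run_matching` function, and the `use_tags` flag are hypothetical and heavily simplified. The sketch shows only why an untagged slot cannot tell two instances of a routine apart; it does not model how real static or dynamic dataflow machines were built.

```python
from collections import namedtuple

# Hypothetical token format: `tag` identifies the routine instance,
# `node` the destination instruction, `port` the operand slot (0 or 1).
Token = namedtuple("Token", "tag node port value")

def run_matching(tokens, use_tags):
    """Pair up operands for a two-input add node and fire when both are present."""
    waiting = {}  # matching store: key -> {port: value}
    fired = []
    for tok in tokens:
        # Static-style key ignores the instance tag; dynamic-style key includes it.
        key = (tok.tag, tok.node) if use_tags else tok.node
        slots = waiting.setdefault(key, {})
        slots[tok.port] = tok.value
        if len(slots) == 2:
            fired.append(slots[0] + slots[1])
            del waiting[key]
    return fired

# Two instances of the same routine, with operands arriving interleaved.
tokens = [
    Token("call1", "add", 0, 1),
    Token("call2", "add", 0, 100),
    Token("call1", "add", 1, 2),
    Token("call2", "add", 1, 200),
]
print(run_matching(tokens, use_tags=True))   # [3, 300]
print(run_matching(tokens, use_tags=False))  # [102]
```

With instance tags, each call's operands are matched separately. Without them, the second call overwrites the first call's operand and the node fires once with mixed inputs, which is the kind of ambiguity that led static designs to forbid simultaneous instances.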

Main article: [Spatial architecture](/source/Spatial_architecture)

In [hardware accelerators](/source/Hardware_accelerators) composed of many processing elements that collectively coordinate to parallelize a [compute kernel](/source/Compute_kernel), dataflow refers to the pattern in which data is transferred between processing elements to satisfy data dependencies and complete the computation. These architectures inherit many of the concepts of dataflow architectures and apply them to more specialized workloads, such as [AI acceleration](/source/AI_accelerator). However, unlike in dataflow architectures, the computation is not actively driven by data dependencies; rather, the simple data dependencies of the accelerated kernel are used to program the whole architecture prior to its execution.[3]
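As a rough illustration of a dataflow that is fixed before execution, the Python sketch below simulates a one-dimensional convolution in which each processing element (PE) keeps one weight for the whole run while inputs and partial sums move past it. This "weight-stationary" arrangement, the function name, and the schedule are assumptions chosen for the example; the text above does not prescribe any particular mapping.

```python
def conv1d_weight_stationary(inputs, weights):
    """Sliding dot product with one weight pinned to each processing element."""
    num_pes = len(weights)
    out_len = len(inputs) - num_pes + 1
    outputs = [0] * out_len
    # The schedule is known in advance: at step t, PE p reads inputs[t + p]
    # and adds its product to the partial sum for outputs[t].
    for t in range(out_len):
        for p in range(num_pes):  # conceptually parallel across PEs
            outputs[t] += weights[p] * inputs[t + p]
    return outputs

print(conv1d_weight_stationary([1, 2, 3, 4], [1, 0, -1]))  # [-2, -2]
```

Nothing in the loop nest reacts to data arriving at run time. Which PE touches which input at which step follows entirely from the regular dependencies of the kernel, so the movement pattern can be decided when the accelerator is programmed.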

## Concurrency

A dataflow network is a network of concurrently executing processes or automata that can communicate by sending data over *channels* (see [message passing](/source/Message_passing)).

In [Kahn process networks](/source/Kahn_process_networks), named after [Gilles Kahn](/source/Gilles_Kahn), the processes are *determinate*. This implies that each determinate process computes a [continuous function](/source/Continuous_function) from input streams to output streams, and that a network of determinate processes is itself determinate, thus computing a continuous function. Consequently, the behavior of such networks can be described by a set of recursive equations, which can be solved using [fixed point theory](/source/Fixed_point_theory). The movement and transformation of the data are represented by a series of shapes and lines.
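A small network in this style can be sketched with Python threads and unbounded FIFO queues standing in for channels. The process names and the `DONE` end-of-stream sentinel are assumptions of this example. Each process consumes input only through blocking reads and never tests whether a channel is empty, so the output stream is the same however the threads happen to be scheduled.

```python
import queue
import threading

DONE = object()  # sentinel marking end of stream (an assumption of this sketch)

def producer(out):
    for n in range(1, 6):
        out.put(n)
    out.put(DONE)

def square(inp, out):
    while True:
        item = inp.get()  # blocking read: the only way to consume input
        if item is DONE:
            out.put(DONE)
            return
        out.put(item * item)

def running_sum(inp, out):
    total = 0
    while True:
        item = inp.get()
        if item is DONE:
            out.put(DONE)
            return
        total += item
        out.put(total)

def run_network():
    a, b, c = queue.Queue(), queue.Queue(), queue.Queue()
    threads = [
        threading.Thread(target=producer, args=(a,)),
        threading.Thread(target=square, args=(a, b)),
        threading.Thread(target=running_sum, args=(b, c)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    # Drain the final channel into a list.
    result = []
    while True:
        item = c.get()
        if item is DONE:
            return result
        result.append(item)

print(run_network())  # [1, 5, 14, 30, 55]
```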

## Other meanings

Dataflow can also refer to:

- [Power BI](/source/Power_BI) Dataflow, a [Power Query](/source/Power_Query) implementation in the cloud used for transforming source data into [cleansed](/source/Data_cleansing) Power BI Datasets to be used by Power BI report developers through the [Microsoft Dataverse](/source/Microsoft_Dataverse) (formerly called Microsoft Common Data Service).

- [Google Cloud Dataflow](/source/Google_Cloud_Dataflow), a fully managed service for executing Apache Beam pipelines within the Google Cloud Platform ecosystem.

## See also

The dictionary definition of [*dataflow*](https://en.wiktionary.org/wiki/dataflow) at Wiktionary

- [Binary Modular Dataflow Machine](/source/Binary_Modular_Dataflow_Machine) (BMDFM)

- [Communicating sequential processes](/source/Communicating_sequential_processes)

- [Complex event processing](/source/Complex_event_processing)

- [Data-flow diagram](/source/Data-flow_diagram)

- [Data-flow analysis](/source/Data-flow_analysis), a type of program analysis

- [Data stream](/source/Data_stream)

- [Dataflow programming](/source/Dataflow_programming) (a programming language paradigm)

- [Erlang (programming language)](/source/Erlang_(programming_language))

- [Flow-based programming](/source/Flow-based_programming) (FBP)

- [Flow control (data)](/source/Flow_control_(data))

- [Functional reactive programming](/source/Functional_reactive_programming)

- [Lazy evaluation](/source/Lazy_evaluation)

- [Lucid (programming language)](/source/Lucid_(programming_language))

- [Oz (programming language)](/source/Oz_(programming_language))

- [Packet flow](/source/Packet_flow)

- [Pipeline (computing)](/source/Pipeline_(computing))

- [Pure Data](/source/Pure_Data)

- [State transition](/source/State_transition)

- [TensorFlow](/source/TensorFlow)

- [Theano](/source/Theano_(software))

- [Ward-Mellor methodology](/source/Ward-Mellor_methodology)

## References

1. Schwarzkopf, Malte (7 March 2020). ["The Remarkable Utility of Dataflow Computing"](https://www.sigops.org/2020/the-remarkable-utility-of-dataflow-computing/). *ACM SIGOPS*. Retrieved 31 July 2022.

2. [A Short Intro to Stream Processing](http://www.jonathanbeard.io/blog/2015/09/19/streaming-and-dataflow.html)

3. Parashar, Angshuman; Raina, Priyanka; Shao, Yakun Sophia; Chen, Yu-Hsin; Ying, Victor A.; Mukkara, Anurag; Venkatesan, Rangharajan; Khailany, Brucek; Keckler, Stephen W.; Emer, Joel (2019). "Timeloop: A Systematic Approach to DNN Accelerator Evaluation". *2019 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS)*. pp. 304–315. [doi](/source/Doi_(identifier)):[10.1109/ISPASS.2019.00042](https://doi.org/10.1109%2FISPASS.2019.00042). [ISBN](/source/ISBN_(identifier)) [978-1-7281-0746-2](https://en.wikipedia.org/wiki/Special:BookSources/978-1-7281-0746-2).

---
Adapted from the Wikipedia article [Dataflow](https://en.wikipedia.org/wiki/Dataflow) by Wikipedia contributors ([contributor history](https://en.wikipedia.org/wiki/Dataflow?action=history)). Available under [Creative Commons Attribution-ShareAlike 4.0 International](https://creativecommons.org/licenses/by-sa/4.0/). Changes may have been made.
