A frequently proposed way to solve the computational problem flexibly is to run multiple identical general-purpose processors in parallel (a multiple-instruction, multiple-data, or MIMD, approach). Such methods, though, may not achieve the needed number of operations per second without large numbers of processors, at which point communications bottlenecks can arise and programmers can have difficulty parallelizing software efficiently. A less well-known form of parallel computation, based on streams, is conceptually closer to the ways in which people think about algorithms and seems to the authors to offer a more cost-effective, scalable, and highly integrable approach to flexible computing for video.
In this paper, we explain the concept of stream-based processing and describe why it is a good match for video data. Stream-based computing combined with automatic resource allocation can parallelize the computation automatically at run time, permitting both scalable computing (the same software runs on differently configured systems) and multitasking. Within this framework, we discuss our implementation of streams on Cheops, a compact data-flow digital video processor developed by the MIT Media Laboratory. We also discuss stream implementations on several other architectures, and how the lessons learned can be applied to future programmable hardware for digital video processing.
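As an illustration of the scalability claim only, the following minimal Python sketch expresses a computation as an ordered chain of stages applied to a stream of samples and leaves the degree of parallelism to a run-time parameter. It is not the Cheops resource manager or its programming interface: the names `run_stream`, `gain`, `offset`, `clip`, and `n_processors` are hypothetical, and a thread pool stands in for whatever processing elements a real system might provide.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


def run_stream(stages, stream, n_processors):
    """Apply every stage, in order, to each sample of the stream.

    The caller describes only WHAT is computed (the stage chain).
    How many workers share the samples is a run-time decision, so the
    same stage list can run on differently configured systems.
    """
    def pipeline(sample):
        # Feed one sample through the whole chain of stages.
        return reduce(lambda value, stage: stage(value), stages, sample)

    # Hypothetical stand-in for automatic resource allocation:
    # a pool of n_processors workers; map() preserves sample order.
    with ThreadPoolExecutor(max_workers=n_processors) as pool:
        return list(pool.map(pipeline, stream))


# Three illustrative per-pixel stages (assumed, not from the paper).
def gain(x):
    return x * 2


def offset(x):
    return x + 16


def clip(x):
    return max(0, min(255, x))


STAGES = [gain, offset, clip]
scanline = [0, 10, 100, 200]

# Identical program, two different "hardware configurations".
out_small = run_stream(STAGES, scanline, n_processors=1)
out_large = run_stream(STAGES, scanline, n_processors=4)
```

Because the stages carry no shared state and the stream order is preserved, `out_small` and `out_large` are identical; only the amount of available parallelism differs between the two calls.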