He has developed a number of techniques used in modern interconnection networks, including routing-based deadlock avoidance, wormhole routing, link-level retry, virtual channels, global adaptive routing, and high-radix routers. He has also developed efficient mechanisms for communication, synchronization, and naming in parallel computers, including message-driven computing and fast capability-based addressing. Starting in 1995, he developed a number of stream processors, including Imagine, for graphics, signal, and image processing, and Merrimac, for scientific computing.
He has published over 200 papers as well as the textbooks Digital Systems Engineering, with John Poulton, and Principles and Practices of Interconnection Networks, with Brian Towles. He is an inventor or co-inventor on over 70 granted patents.
He has been quoted as saying: "Locality is efficiency, Efficiency is power, Power is performance, Performance is king".[5]
From 1986 to 1997 he taught at MIT, where he and his group built the J-Machine and the M-Machine,[8] parallel machines emphasizing low-overhead synchronization and communication. He says that during his time at MIT he collaborated on the design of the Cray T3D and Cray T3E supercomputers. He later became the Willard R. and Inez Kerr Bell Professor in the Stanford University School of Engineering and chairman of the computer science department at Stanford. He served as chairman for twelve years before moving on to Nvidia.[9]
Dally's corporate involvements include various collaborations with Cray Research since 1989. He worked on Internet routers at Avici Systems starting in 1997, was chief technical officer at Velio Communications from 1999 until its 2003 acquisition by LSI Logic, and was founder and chairman of Stream Processors, Inc. until it folded.[7]
In January 2009 he was appointed chief scientist of Nvidia.[11] He worked full-time at Nvidia while supervising about 12 of his graduate students at Stanford.[12] He is currently chief scientist and senior vice president of Nvidia Research.[13]
Among his contributions to technology at Nvidia, Dally initiated work on optical interconnects for GPUs[14] and computing systems[15] using micro-ring modulators that operate on multiple wavelengths.[16][17] These systems could lead to the adoption of very high-bandwidth, low-energy-per-bit optical interconnects[18] in GPUs[19] and to circuit-switched GPU datacenters with a significant boost to AI computing efficiency.
In 2009, he was elected to the National Academy of Engineering for contributions to the design of high-performance interconnect networks and parallel computer architectures.
He received the 2010 ACM/IEEE Eckert–Mauchly Award for "outstanding contributions to the architecture of interconnection networks and parallel computers."[20]
Personal life
Dally is married and has two children. In 1992, the Cessna 210 he was flying from Hanscom Field, Massachusetts, to Farmingdale, New York, in bad weather developed an oil leak. He was forced to make a crash landing in Long Island Sound and was rescued by a sailboat.[21]
Johnson, Matt (2011). An Analysis of Linux Scalability to Many Cores. p. 4. "Locality is efficiency, Efficiency is power, Power is performance, Performance is king"