Question 2. Discuss the architectural trends in today’s microprocessor.
Introduction to Today’s Microprocessor Trends
From their humble beginning 25 years ago, microprocessors have proliferated into an astounding range of chips, powering devices ranging from telephones to supercomputers. Today, microprocessors for personal computers get widespread attention--and have enabled Intel to become the world's largest semiconductor maker. In addition, embedded microprocessors are at the heart of a diverse range of devices that have become staples of affluent consumers worldwide.
The past decade has seen microprocessor packaging evolve from a simple protective scheme into a complex combination of elements that enable microprocessor performance while still providing the basic function of protection. On the one hand, packaging today's microprocessor entails tailoring the package to enable microprocessor performance, a task made difficult, in both schedule and technical complexity, by the rapid rate of microprocessor performance growth. On the other hand, the package forms the interface between the microprocessor and the external world of the motherboard and the computing system. In this capacity, package design must allow for an easy interface and must meet a diverse set of form factor requirements.
The package provides a conduit for the microprocessor through a space transformation, allowing small-scale features on the silicon to be electrically connected to the external environment. This is a challenging geometrical problem and requires that packaging interconnection densities closely track the evolution of microprocessor interconnection densities. In connecting the die to the motherboard, the package must also ensure that the connections do not unduly inhibit microprocessor performance by introducing unnecessary electrical impediments, usually referred to as package “parasitics.” As microprocessors have evolved, they have increased in speed, which in turn requires increasingly sophisticated power delivery schemes. Another consequence of microprocessor evolution has been increasing power dissipation. Package design must now provide a path for thermal dissipation, which requires a better understanding of the thermal characteristics of packaging materials and design. Package design also requires a good understanding of the structural characteristics of the package to ensure that it is reliable and robust. Attention is increasingly focused today on understanding the electrical, thermal and mechanical characteristics of packaging in order to optimize all of these aspects.
The package is also the interface that connects the microprocessor to the motherboard. In this capacity it must have a compatible interface that allows easy acceptance on the motherboard and in the system design. The form factor of the package is a critical element of that interface. Requirements usually differ between market segments and often drive the need for form factors tailored to each segment. For instance, package height is critical in the mobile market, where a slim, lightweight package is essential to success. On the other hand, the ability to dissipate high power, and hence the features that enable it, is critical in the server and desktop market segments. Cost, compatibility and fit within the computer system are key parameters that must be designed for in making a microprocessor successful. This requires multiple solutions and technologies, each geared towards a specific market segment, to be developed concurrently.
Aside from the challenges of package design, there is a need to develop efficient and cost-effective manufacturing processes that allow us to meet the schedule and volume demands of today's marketplace. These have presented us with interesting challenges in understanding the manufacturability, testability and reliability of packaging. Some of these issues are discussed in greater detail in this issue.
MAJC is an example of the design architecture of today’s microprocessors. The current trends are easy to relate to by analysing the architecture of MAJC.
MAJC (pronounced "magic") is an acronym for "Microprocessor Architecture for Java Computing." MAJC is a microprocessor architecture designed to meet the broadband demands of the 21st century. Addressing the challenge of high bandwidth and the need for state-of-the-art computational performance, MAJC architecture is characterized by:
- Scalability to take full advantage of advances in semiconductor technology.
- Broad scalability to systems with large numbers of processors.
- A new standard of performance for applications with DSP or New Media computational needs.
- Focus on bandwidth throughput.
Processor Needs into the 21st Century
Several microprocessor trends were identified and accommodated in the design of the MAJC Architecture:
- Convergence of communication media and computers (audio, video, and data) requires processors to compute information at wire speed.
- Advancements in semiconductor technology will provide rapidly increasing resources on each microprocessor chip.
- As microprocessors are used in increasingly disparate applications, from smart cards to supercomputers, there is great value in the ability to create a wide span of implementations from a given processor architecture.
- Software, over time, will become independent of specific instruction sets; Just-In-Time (JIT) compilation techniques are expected to predominate for general-purpose processors and eliminate binary compatibility issues.
- Bandwidth between processors, memory, and I/O devices needs to be available to move information in real-time.
- The content processed by computers is becoming increasingly media-rich; DSP-like functions are required to process this media content.
Features of Today’s Microprocessors
- Modular Architecture
To support the creation of a wide range of implementations, the architecture is modular. A basic implementation might comprise a single processor unit with four functional units. By replicating those design elements, an implementation can be built that includes a few or even hundreds of processors, each with four functional units, each of which can operate on many data items simultaneously with parallel-operation (SIMD) instructions. Conversely, a tiny application-specific implementation can be derived from the basic one by trimming the complement of functional units down to one or two and/or removing hardware support for any instructions not needed in its target application.
- Software Portability
The architecture was designed to efficiently execute code generated by installation-time or just-in-time (JIT) compilation techniques. It may be the first commercial architecture designed without a requirement for binary compatibility between generations. This allows implementations to evolve over time without accumulating the baggage required to support old binaries, as traditional architectures have always done. Instead, software portability across implementations is obtained through use of architecture-neutral means of software distribution.
- Multiple Levels of Parallelism
The architecture provides the ability to exploit parallelism at many levels: at the data word level through SIMD instructions, at the instruction level through multiple functional units per processor, at the thread-of-execution level through support for multithreaded software, and at the system level through its intrinsic support for "MPs-on-a-chip" (multiple processor units per implementation). An implementation with more than one functional unit per processor unit provides MSIMD: multiple single-instruction multiple-data parallelism.
- Multiple Processor Units per Cluster
Although a MAJC implementation can be a single processor unit, the architecture explicitly incorporates the concept of multiple processors per implementation. Given 21st century semiconductor density, each such array of processor units or "processor cluster" can be implemented on a single chip. As semiconductor technology advances, clusters with more processors per chip can be implemented.
- Multiple Functional Units per Processor Unit
Every MAJC processor unit can issue multiple instructions simultaneously, one to each of its functional units. Most implementations are expected to provide two to four functional units per processor unit.
- Multithreaded Software
Execution of multithreaded software comes naturally given the architecture's ability to execute multiple threads simultaneously on multiple processor units. MAJC implementations with hardware support for vertical microthreading can efficiently execute multiple threads on each processor unit.
- SIMD Instructions
At the lowest level of parallelism, MAJC architecture provides SIMD (Single Instruction/Multiple Data) or "vector" instructions. A SIMD instruction executing in a single functional unit can perform the same operation on multiple data items simultaneously.
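The idea of one instruction operating on several data items can be illustrated with a minimal Python sketch. It models the semantics only: two 16-bit lanes packed into one 32-bit word, added lane by lane so that overflow in one lane never carries into its neighbour. The lane width, the lane count and the function names (pack, unpack, simd_add16) are assumptions made for illustration and are not taken from the MAJC instruction set.

```python
LANE_BITS = 16                      # assumed lane width, for illustration only
LANES = 2                           # assumed number of lanes in one 32-bit word
LANE_MASK = (1 << LANE_BITS) - 1

def pack(values):
    """Pack lane values into one integer word, lane 0 in the low bits."""
    word = 0
    for i, v in enumerate(values):
        word |= (v & LANE_MASK) << (i * LANE_BITS)
    return word

def unpack(word, lanes=LANES):
    """Split a packed word back into its individual lane values."""
    return [(word >> (i * LANE_BITS)) & LANE_MASK for i in range(lanes)]

def simd_add16(a, b, lanes=LANES):
    """Add corresponding lanes of a and b; each lane wraps around independently."""
    result = 0
    for i in range(lanes):
        shift = i * LANE_BITS
        lane_a = (a >> shift) & LANE_MASK
        lane_b = (b >> shift) & LANE_MASK
        result |= ((lane_a + lane_b) & LANE_MASK) << shift
    return result

# Lane 1 overflows (0xFFFF + 1) and wraps to 0 without disturbing lane 0.
print(unpack(simd_add16(pack([1, 0xFFFF]), pack([2, 1]))))  # [3, 0]
```

In hardware the lanes are processed at the same time by one functional unit; the Python loop only reproduces the result, not the timing.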
- Integral Support for Media-Rich Data
The MAJC architecture is particularly well-suited for processing media-rich content because it directly supports common media data types and can process multiple simultaneous operations on that data. Processing power is multiplied on three levels: Single Instruction/Multiple Data (SIMD) DSP-like instructions in each functional unit, multiple functional units per processor unit, and multiple processor units per processor cluster.
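The three-level multiplication described above can be made concrete with a small arithmetic sketch. The Implementation class and its field names are hypothetical. The figure of four functional units comes from the basic implementation described earlier, while the lane count and the eight-processor cluster are assumed values chosen only to show the calculation. The result is an idealised upper bound, not a measured figure.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Implementation:
    """Hypothetical description of one implementation's parallel resources."""
    processor_units: int     # processor units per cluster
    functional_units: int    # functional units per processor unit
    simd_lanes: int          # data items handled by one SIMD instruction

    def peak_ops_per_cycle(self) -> int:
        # Ideal peak: every lane of every functional unit is busy each cycle.
        return self.processor_units * self.functional_units * self.simd_lanes

basic = Implementation(processor_units=1, functional_units=4, simd_lanes=2)
cluster = Implementation(processor_units=8, functional_units=4, simd_lanes=2)

print(basic.peak_ops_per_cycle())    # 8
print(cluster.peak_ops_per_cycle())  # 64
```

Real workloads rarely keep every unit busy, so sustained throughput depends on how much parallelism the compiler and the application can expose.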
- Balanced Performance: Processor versus Memory and I/O
A MAJC implementation is designed to utilize several techniques to balance processor speed with access to external memory and I/O devices:
- Hundreds of general-purpose registers per processor unit, which reduce the frequency of memory accesses
- Load-Group instructions, which increase bandwidth into the processor by simultaneously loading multiple registers from memory or an I/O device
- Store buffering, which increases bandwidth out of the processor by optimizing Store operations initiated by software
- Data Type-Independent Registers
The general-purpose register file in a MAJC implementation is datatype-agnostic: any register can hold information of any data type and be accessed by any instruction. In particular, there is no distinction between integer and floating-point registers. This allows registers to be allocated as needed by each application, without restrictions imposed by hardware partitioning of the register set.
- Instruction Grouping
Grouping instructions across multiple functional units can be performed dynamically in hardware (as in a superscalar processor), statically by a compiler, or by some combination of the two. Rather than devoting valuable chip area to hardware grouping logic, MAJC relies primarily on software compilers to group instructions across functional units.
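Static grouping by a compiler can be sketched as follows. This is a deliberately simple greedy, in-order packer and not the algorithm of any actual MAJC compiler: it starts a new group whenever the current one is full or the next instruction reads or writes a register already written in the current group. The Instr type, the register names and the default width of four are assumptions for illustration. The sketch also assumes that all reads within a group happen before any writes.

```python
from typing import NamedTuple

class Instr(NamedTuple):
    """Hypothetical three-address instruction: op dest, srcs..."""
    op: str
    dest: str
    srcs: tuple

def group_instructions(program, width=4):
    """Greedily pack instructions, in program order, into groups of at most `width`."""
    groups, current, written = [], [], set()
    for ins in program:
        # Conflict: the instruction needs a result produced in this same group,
        # or it would overwrite a register already written in this group.
        conflict = ins.dest in written or any(s in written for s in ins.srcs)
        if current and (len(current) == width or conflict):
            groups.append(current)
            current, written = [], set()
        current.append(ins)
        written.add(ins.dest)
    if current:
        groups.append(current)
    return groups

program = [
    Instr("add", "r1", ("r2", "r3")),
    Instr("mul", "r4", ("r5", "r6")),
    Instr("sub", "r7", ("r1", "r4")),   # needs r1 and r4, so it starts a new group
    Instr("ld",  "r8", ("r9",)),
]
for group in group_instructions(program):
    print([ins.op for ins in group])
# ['add', 'mul']
# ['sub', 'ld']
```

A production compiler would also reorder independent instructions to fill groups more densely; the sketch keeps program order so the dependency rule stays easy to follow.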
- Data and Address Size
A MAJC implementation may implement either 32- or 64-bit addressing and data operations, as dictated by the needs of its target applications.
- Context Switch Optimization
Process (task) context switch time can be reduced by using the architecture's "register dirty bits", which allow an operating system to minimize the number of registers saved and restored during a context switch.
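The saving side of this optimization can be modelled in a few lines. The text does not specify how the dirty bits are exposed to software, so the sketch assumes one dirty bit per register, set on every write and cleared once the register has been copied to the task's save area. The RegisterFile class, its methods and the register count of 128 are hypothetical.

```python
class RegisterFile:
    """Toy register file with one assumed dirty bit per register."""

    def __init__(self, count=128):
        self.values = [0] * count
        self.dirty = [False] * count

    def write(self, index, value):
        """Write a register and mark it as modified since the last save."""
        self.values[index] = value
        self.dirty[index] = True

    def save_dirty(self, save_area):
        """Copy only modified registers to save_area; return how many were copied."""
        copied = 0
        for i, is_dirty in enumerate(self.dirty):
            if is_dirty:
                save_area[i] = self.values[i]
                self.dirty[i] = False
                copied += 1
        return copied

regs = RegisterFile()
save_area = [0] * 128
regs.write(3, 42)
regs.write(10, 7)
print(regs.save_dirty(save_area))  # 2  (only two of the 128 registers are copied)
print(regs.save_dirty(save_area))  # 0  (nothing has changed since the last save)
```

With hundreds of registers per processor unit, copying only the modified ones is where the saving in context switch time would come from.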
- Memory Byte Order
The MAJC architecture's native byte order is "big-endian"; that is, multibyte values are stored in memory with the most significant byte at the lowest address and the least significant byte at the highest address. However, a MAJC implementation can manipulate data stored in any memory byte-order (notably "little-endian"). The BYTESHUFFLE instruction can reorder bytes efficiently in an arbitrary manner. Also, an implementation may define an Alternate Space Identifier (ASI) dedicated to performing automatic byte reordering whenever corresponding Load and Store from Alternate Address Space instructions are executed.
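The difference between the two byte orders, and the effect of an arbitrary byte reordering, can be shown with a short sketch. The byte_shuffle function below is a hypothetical software model of a generic byte permutation. It does not reproduce the operand format of the BYTESHUFFLE instruction, which the text does not describe.

```python
def byte_shuffle(data: bytes, order) -> bytes:
    """Return a new byte string where output byte k is data[order[k]]."""
    return bytes(data[i] for i in order)

def swap_endianness(data: bytes) -> bytes:
    """Reverse the byte order, converting big-endian to little-endian or back."""
    return byte_shuffle(data, range(len(data) - 1, -1, -1))

value = 0x11223344
big = value.to_bytes(4, "big")        # most significant byte at the lowest address
little = value.to_bytes(4, "little")  # least significant byte at the lowest address

print(big.hex())                       # 11223344
print(little.hex())                    # 44332211
print(swap_endianness(big) == little)  # True
```

Because the permutation is arbitrary, the same helper can express other layouts as well, for example swapping the two 16-bit halves of a word with the order (2, 3, 0, 1).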