"This series of books is truly an important part of my library. ... I would recommend them to anyone doing hardware design or support, as well as to any developers who write low-level system code." -Paula Tomlinson, Windows Developer's Journal
Pentium Pro and Pentium II System Architecture, Second Edition, details the internal architecture of these two processors, describing their hardware and software characteristics, the bus protocol they use to communicate with the system, and the overall machine architecture. It also describes the BIOS Update Feature.
Written for computer hardware and software engineers, this book offers insight into how the Pentium Pro and Pentium II family of processors translates legacy x86 code into RISC instructions, executes them out-of-order, and then reassembles the result to match the original program flow. In detailing the Pentium Pro and Pentium II processors' internal operations, the book reveals why the processors generate various transaction types and how they monitor bus traffic generated by other entities to ensure cache consistency.
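The out-of-order execution and in-order reassembly described above can be pictured with a toy model. The sketch below is only an illustration under simplifying assumptions, not the processor's actual mechanism: the function name `run_out_of_order`, the micro-op labels, and the completion order are all hypothetical, and each micro-op label is assumed to be unique.

```python
from collections import deque

def run_out_of_order(micro_ops, completion_order):
    """Toy model: micro-ops enter a buffer in program order, finish in an
    arbitrary order, and retire strictly in program order.
    Assumes every micro-op label is unique (illustrative simplification)."""
    # The buffer holds micro-ops in original program order.
    rob = deque({"op": op, "done": False} for op in micro_ops)
    retired = []
    for finished in completion_order:
        # Mark the finished micro-op complete, wherever it sits in the buffer.
        for entry in rob:
            if entry["op"] == finished:
                entry["done"] = True
                break
        # Retire only from the head: a completed micro-op must wait
        # behind any older micro-op that has not finished yet.
        while rob and rob[0]["done"]:
            retired.append(rob.popleft()["op"])
    return retired

program = ["uop0", "uop1", "uop2"]
# Hypothetical completion order: uop1 finishes before uop0.
result = run_out_of_order(program, ["uop1", "uop0", "uop2"])
print(result)
```

Whatever order the micro-ops complete in, the retired sequence matches the original program order, which is the property the paragraph above refers to.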
This new edition includes comprehensive coverage of the Pentium II processor. It highlights the differences between the Pentium Pro and Pentium II processors, in particular, the Slot 1 connector and the processor cartridge design utilized by the Pentium II and intended for use in future Intel processors. It features the Pentium II's support for the MMX instruction set and registers, and shows how it is optimized for 16-bit code execution. This book also describes the Pentium II's L2 cache and its support for power-conservation modes.
Pentium Pro and Pentium II System Architecture, Second Edition, also covers:
About This Book.
The MindShare Architecture Series.
Cautionary Note.
What This Book Covers.
What This Book Does Not Cover.
Organization of This Book.
Who This Book Is For.
Prerequisite Knowledge.
Documentation Conventions.
Hexadecimal Notation.
Binary Notation.
Decimal Notation.
Signal Name Representation.
Warning.
Identification of Bit Fields (Logical Groups of Bits or Signals).
Register Field References.
Resources.
Visit Our Web Site.
We Want Your Feedback.
I. SYSTEM OVERVIEW.
1. System Overview.
Introduction.
What Is a Cluster?
What Is a Quad or 4-Way System?
Bootstrap Processor.
Starting Up Other Processors.
Relationship of Processors to Main Memory.
Processors' Relationships to Each Other.
Host/PCI Bridges.
Bridges' Relationship to Processors.
Bridges' Relationship to PCI Masters and Main Memory.
Bridges' Relationship to PCI Targets.
Bridges' Relationship to EISA or ISA Targets.
Bridges' Relationship to Each Other.
Bridge's Relationship to EISA and ISA Masters and DMA.
II. PROCESSOR'S HARDWARE CHARACTERISTICS.
Hardware Section 1: The Processor.
Two Bus Interfaces.
External Bus.
Bus on Earlier Processors Inefficient for Multiprocessing.
Pentium Bus Has Limited Transaction Pipelining Capability.
Pentium Pro Bus Tuned for Multiprocessing.
IA = Legacy.
Instruction Set.
IA Instructions Vary in Length and Are Complex.
Pentium Pro Translates IA Instructions into RISC Instructions.
In-Order Front End.
Out-of-Order Middle.
In-Order Rear End.
Register Set.
IA Register Set Is Small.
Pentium Pro Has 40 General-Purpose Registers.
Elimination of False Register Dependencies.
Introduction to the Internal Architecture.
3. Processor Power-On Configuration.
Automatically Configured Features.
Example of Captured Configuration Information.
Setup and Hold Time Requirements.
Run BIST Option.
Error Observation Options.
In-Order Queue Depth Selection.
Power-On Restart Address Selection.
FRC Mode Enable/Disable.
APIC ID Selection.
Selecting Tri-State Mode.
Processor Core Speed Selection.
Processor's Agent and APIC ID Assignment.
FRC Mode.
Program-Accessible Startup Features.
4. Processor Startup.
Selection of Processor's Agent and APIC IDs.
Processor's State After Reset.
EDX Contains Processor Identification Info.
State of Caches and Processor's Ability to Cache.
Selection of Bootstrap Processor (BSP).
Introduction.
BSP Selection Process.
APIC Arbitration Background.
Startup APIC Arbitration ID Assignment.
BSP Selection Process.
Example of APIC Bus Traffic Captured during BSP Selection.
Initial BSP Memory Accesses.
General.
When Caching Disabled, Prefetcher Always Does 32-byte Code Reads.
State 2: 1st Line Read (and jump at FFFFFFF0h executed).
State 10: Branch Trace Message for Jump at FFFFFFF0h.
State 16: Branch Target Line Read (and 2nd jump executed).
State 26: Branch Trace Message for 2nd Jump.
State 32: CLI Fetched and Executed.
State 42: CLD Fetched and Executed.
State 50: JMP Rel/Fwd Fetched and Executed.
State 58: Branch Trace Message for JMP Rel/Fwd.
State 64: SHL EDX,10h Fetched and Executed.
State 74: MOV DX,048Bh Fetched and Executed.
State 82: AND Fetched and Executed.
State 90: OUT Fetched and Executed.
State 98: IO Write to 48Bh.
State 106: OR Fetched and Executed.
State 114: MOV BX,CS Fetched and Executed.
State 122: MOV SS,BX Fetched and Executed.
State 130: JE Fetched and Executed.
State 138: Branch Trace Message for JE.
How APs Are Started.
AP Detection by the POST/BIOS.
Introduction.
The POST/BIOS Code.
The FindAndInitAllCPUs Routine.
The OS Is Loaded and Control Is Passed to It.
Uni-Processor OS.
MP OS.
5. The Fetch, Decode, Execute Engine.
Please Note.
Introduction.
Enabling the Caches.
Prefetcher.
Issues Sequential Read Requests to Code Cache.
Introduction to Prefetcher Operation.
Brief Description of Pentium Pro Processor.
Beginning, Middle and End.
In-Order Front End.
Out-of-Order (OOO) Middle.
In-Order Rear End.
Intro to the Instruction Pipeline.
In-Order Front End.
Instruction Fetch Stages.
IFU1 Stage: 32-Byte Line Fetched from Code Cache.
IFU2 Stage: Marking Boundaries and Dynamic Branch Prediction.
IFU3 Stage: Align Instructions for Delivery to Decoders.
Decode Stages.
DEC1 Stage: Translate IA Instructions into Micro-Ops.
Micro Instruction Sequencer (MIS).
DEC2 Stage: Move Micro-Ops to ID Queue.
Queue Micro-Ops for Placement in Pool.
Second Chance for Branch Prediction.
RAT Stage: Overcoming the Small IA Register Set.
ReOrder Buffer (ROB) Stage.
Instruction Pool (ROB) Is a Circular Buffer.
Out-of-Order (OOO) Middle.
In-Order Rear End (RET1 and RET2 Stages).
Three Scenarios.
Scenario One: Reset Just Removed.
Starvation!
First Instruction Fetch.
First Memory Read Bus Transaction.
Eight Bytes Placed in Prefetch Streaming Buffer.
Instruction Boundaries Marked and BTB Checked.
Between One and Three Instructions Decoded into Micro-Ops.
Source Operand Location Selected (RAT).
Micro-Ops Advanced to ROB and RS.
Micro-Ops Dispatched for Execution.
Micro-Ops Executed.
Result to ROB Entry (and other micro-ops if necessary).
Micro-op Ready for Retirement?
Micro-Op Retired.
Scenario Two: Processor's Caches Just Enabled.
Scenario Three: Caches Enabled for Some Time.
Memory Data Accesses — Loads and Stores.
Handling Loads.
Handling Stores.
Description of Branch Prediction.
486 Branch Handling.
Pentium Branch Prediction.
Pentium Pro Branch Prediction.
Mispredicted Branches Are VERY Costly!
Dynamic Branch Prediction.
General.
Yeh's Prediction Algorithm.
Return Stack Buffer (RSB).
Static Branch Prediction.
Code Optimization.
General.
Reduce Number of Branches.
Follow Static Branch Prediction Algorithm.
Identify and Improve Unpredictable Branches.
Don't Intermingle Code and Data.
Align Data.
Avoid Serializing Instructions.
Where Possible, Do Context Switches in Software.
Eliminate Partial Stalls: Small Write Followed by Full-Register Read.
Data Segment Register Changes Serialize Execution.
6. Rules of Conduct.
The Problem.
General.
A Memory-Mapped IO Example.
Pentium Solution.
Pentium Pro Solution.
State of the MTRRs after Reset.
Memory Types.
Uncacheable (UC) Memory.
Write-Combining (WC) Memory.
Write-Through (WT) Memory.
Write-Protect (WP) Memory.
Write-Back (WB) Memory.
Rules as Defined by MTRRs.
Rules of Conduct Provided in Bus Transaction.
MTRRs and Paging: When Worlds Collide.
Detailed Description of the MTRRs.
7. The Processor Caches.
Cache Overview.
Introduction to Data Cache Features.
Introduction to Code Cache Features.
Introduction to L2 Cache Features.
Introduction to Snooping.
Determining Processor's Cache Sizes and Structures.
L1 Code Cache.
Code Cache Uses MESI Subset: S and I.
Code Cache Contains Only Raw Code.
Code Cache View of Memory Space.
Code TLB (ITLB).
Code Cache Lookup.
Code Cache Hit.
Code Cache Miss.
Code Cache LRU Algorithm: Make Room for the New Guy.
Code Cache Castout.
Code Cache Snoop Ports.
L1 Data Cache.
Data Cache Uses MESI Cache Protocol.
Data Cache View of Memory Space.
Data TLB (DTLB).
Data Cache Lookup.
Data Cache Hit.
Relationship of L2 and L1 Caches.
Relationship of L2 to L1 Code Cache.
Relationship of L2 and L1 Data Cache.
Read Miss on L1 and L2.
Read Miss on All Other Caches.
Read Hit on E or S Line in One or More Other Caches.
Read Hit on Modified Line in One Other Cache.
Write Hit on L1 Data Cache.
Write Hit on S Line in Data Cache.
Write Hit on E Line in Data Cache.
Write Hit on M Line in Data Cache.
Write Miss on L1 Data Cache.
L1 Data Cache Castout.
Data Cache LRU Algorithm: Make Room for the New Guy.
Data Cache Pipeline.
Data Cache Is Non-Blocking.
Earlier Processor Caches Blocked, but Who Cares?
Pentium Pro Data Cache Is Non-Blocking, and That's Important!
Data Cache has Two Service Ports.
Two Address and Two Data Buses.
Simultaneous Load/Store Constraint.
Data Cache Snoop Ports.
Unified L2 Cache.
L2 Cache Uses MESI Protocol.
L2 Cache View of Memory Space.
Request Received.
L2 Cache Lookup.
L2 Cache Hit.
L2 Cache Miss.
L2 Cache LRU Algorithm: Make Room for the New Guy.
L2 Cache Pipeline.
L2 Cache Snoop Ports.
Toggle Mode Transfer Order.
Self-Modifying Code and Self-Snooping.
Description.
Don't Let Your Data and Code Get Too Friendly!
ECC Error Handling.
Procedure to Disable All Caching.
Hardware Section 2: Bus Intro and Arbitration.
Introduction.
Everything's Relative.
All Signals Active Low.
Powerful Pullups Snap Lines High Fast.
The Layout.
Synchronous Bus.
Setup and Hold Specs.
Setup Time.
Hold Time.
How High Is High and How Low Is Low?
After You See Something, You have One Clock to Do Something About It.
9. Bus Basics.
Agents.
Agent Types.
Multiple Personalities.
Uniprocessor vs. Multiprocessor Bus.
Request Agents.
Request Agent Types.
Agent ID.
What Agent ID Used For.
How Agent ID Assigned.
Transaction Phases.
Pentium Transaction Phases.
Pentium Pro Transaction Phases.
Transaction Pipelining.
Bus Divided into Signal Groups.
Step One: Gain Ownership of Request Signal Group.
Step Two: Issue Transaction Request.
Step Three: Yield Request Signal Group, Proceed to Next Signal Group.
Phases Proceed in Predefined Order.
Request Phase.
Error Phase.
Snoop Phase.
Response Phase.
Data Phase.
Next Agent Can't Use Signal Group Until Current Agent Done with It.
Transaction Tracking.
Request Agent Transaction Tracking.
Snoop Agent Transaction Tracking.
Response Agent Transaction Tracking.
The IOQ.
10. Obtaining Bus Ownership.
Request Phase.
Symmetric Agent Arbitration — Democracy at Work.
No External Arbiter Required.
Agent ID Assignment.
Arbitration Algorithm.
Rotating ID.
Busy/Idle State.
Bus Parking.
Be Fair!
What Signal Group Are You Arbitrating For?
Requesting Ownership.
Example of One Symmetric Agent Requesting Ownership.
Example of Two Symmetric Agents Requesting Ownership.
Definition of an Arbitration Event.
Once BREQn# Asserted, Keep Asserted Until Ownership Attained.
Example Case Where Transaction Cancelled Before Started.
Bus Parking Revisited.
Priority Agent Arbitration — Despotism.
Example Priority Agents.
Priority Agent Beats Symmetric Agents, Unless....
Using Simple Approach, Priority Agent Suffers Penalty.
Smarter Priority Agent Gets Ownership Faster.
Ownership Attained in 2 BCLKs.
Ownership Attained in 3 BCLKs.
Be Fair to the Common People.
Priority Agent Parking.
Locking — Shared Resource Acquisition.
Shared Resource Concept.
Testing Availability and Gaining Ownership of Shared Resources.
Race Condition Can Present Problem.
Guaranteeing Atomicity of Read/Modify/Write.
LOCK Instruction Prefix.
Processor Automatically Asserts LOCK# for Some Operations.
Use Locked RMW to Obtain and Give Up Semaphore Ownership.
Duration of Locked Transaction Series.
Back-to-Back RMW Operations.
Locking a Cache Line.
Advantage of Cache Line Locking.
New Directory Bit — Cache Line Locked.
Read and Invalidate Transaction (RWITM, or Kill).
Line in E or M State.
Semaphore Not in Processor's L1 or L2 Cache.
Semaphore in Cache in E State.
Semaphore in Cache in S State.
Semaphore in Cache in M State.
Blocking New Requests — Stop! I'm Full!
BNR# Is Shared Signal.
Stalled/Throttled/Free Indicator.
Open Gate, Let One Out, Close Gate.
Open Gate, Leave It Open, Let Them All Out.
Gate Wide Open and then Slammed Shut.
BNR# Behavior at Powerup.
BNR# and the Built-In Self-Test (BIST).
BNR# Behavior During Runtime.
Hardware Section 3: The Transaction Phases.
Caution.
Request Phase.
Introduction to the Request Phase.
Request Signal Group Is Multiplexed.
Introduction to the Transaction Types.
Contents of Request Packet A.
32-bit vs. 36-bit Addresses.
Contents of Request Packet B.
Error Phase.
In-Flight Corruption.
Who Samples AERR#?
Request Agent.
Other Bus Agents.
Who Drives AERR#?
Request Agent's Response to AERR# Assertion.
Other Guys Are Very Polite.
12. The Snoop Phase.
Agents Involved in Snoop Phase.
Snoop Phase Has Two Purposes.
Snoop Result Signals Are Shared, DEFER# Isn't.
Snoop Phase Duration Is Variable.
Is There a Snoop Stall Duration Limit?
Memory Transaction Snooping.
Snoop's Effects on Caches.
After Snoop Stall, How Soon Can Next Snoop Result be Presented?
Self-Snooping.
Non-Memory Transactions Have a Snoop Phase.
Transaction Retry and Deferral.
Permission to Defer Transaction Completion.
DEFER# Assertion Delays Transaction Completion.
Transaction Retry.
Transaction Deferral.
Mail Delivery Analogy.
Example System Operation Overview.
The Wrong Way.
The Right Way.
Bridge Should be a Faithful Messenger.
Detailed Deferred Transaction Description.
What if HITM# and DEFER# both Asserted?
How Does Locking Change Things?
13. The Response and Data Phases.
Note on Deferred Transactions.
Purpose of Response Phase.
Response Phase Signal Group.
Response Phase Start Point.
Response Phase End Point.
List of Responses.
Response Phase May Complete Transaction.
Data Phase Signal Group.
Five Example Scenarios.
Transaction That Doesn't Transfer Data.
Read That Doesn't Hit a Modified Line and Is Not Deferred.
Basics.
Detailed Description.
How Does Response Agent Know Transfer Length?
What's the Earliest That DBSY# Can be Deasserted?
Relaxed DBSY# Deassertion.
Write That Doesn't Hit a Modified Line and Isn't Deferred.
Basics.
Previous Transaction May Involve a Write.
Earliest TRDY# Assertion Is 1 Clock After Previous Response Issued.
When Does Request Agent First Sample TRDY#?
When Does Request Agent Start Using Data Bus?
When Can TRDY# Be Deasserted?
When Does Request Agent Take Ownership of Data Bus?
Deliver the Data.
On AERR# or Hard Failure Response.
Snoop Agents Change State of Line from E->I or S->I.
Read That Hits a Modified Line.
Basics.
Transaction Starts as a Read from Memory.
From Memory Standpoint, Changes from Read to Write.
Memory Asserts TRDY# to Accept Data.
Memory Must Drive Response At Right Time.
Snoop Agent Asserts DBSY# and Memory Drives Response.
Snoop Agent Supplies Line to Memory and to Request Agent.
Snoop Agent Changes State of Line from M->S.
Write That Hits a Modified Line.
Data Phase Wait States.
Special Case — Single Quadword, 0-Wait State Transfer.
Response Phase Parity.
Hardware Section 4: Other Bus Topics.
Introduction to Transaction Deferral.
Example System Model.
Typical PC Server Model.
The Problem.
Possible Solutions.
An Example Read.
Read Transaction Memorized and Deferred Response Issued.
Bridge Performs PCI Read Transaction.
Deferred Reply Transaction Issued.
Original Request Agent Selected.
Bridge Provides Snoop Result.
Response Phase — Role Reversal.
Data Phase.
Trackers Retire Transaction.
Other Possible Responses.
An Example Write.
Transaction and Write Data Memorized, Deferred Response Issued.
PCI Transaction Performed and Data Delivered to Target.
Deferred Reply Transaction Issued.
Original Request Agent Selected.
Bridge Provides Snoop Result.
Response Phase — Role Reversal.
There Is No Data Phase.
Trackers Retire Transaction.
Other Possible Responses.
Pentium Pro Support for Transaction Deferral.
15. IO Transactions.
Introduction.
IO Address Range.
Data Transfer Length.
Behavior Permitted by Specification.
How Pentium Pro Processor Operates.
16. Central Agent Transactions.
Point-to-Point vs. Broadcast.
Interrupt Acknowledge Transaction.
Background.
How Pentium Pro Is Different.
Host/PCI Bridge Is Response Agent.
Special Transaction.
General.
Message Types.
Branch Trace Message Transaction Used for Program Debug.
What's the Problem?
What's the Solution?
Enabling Branch Trace Message Capability.
Branch Trace Message Transaction.
Packet A Composition.
Packet B Composition.
Proper Response.
Data Composition.
17. Other Signals.
Error Reporting Signals.
Bus Initialize (BINIT#).
Description.
Assertion/Deassertion Protocol.
Bus Error (BERR#).
Description.
BERR#/BINIT# Assertion/Deassertion Protocol.
Internal Error (IERR#).
Functional Redundancy Check Error (FRCERR).
PC-Compatibility Signals.
A20 Mask (A20M#).
FERR# and IGNNE#.
Diagnostic Support Signals.
Interrupt-Related Signals.
Processor Present Signals.
Power Supply Pins.
Miscellaneous Signals.
III. PENTIUM II PROCESSOR.
18. Pentium II Processor.
Introduction.
Single-Edge Cartridge.
Pentium and Pentium Pro Sockets.
Pentium II Processor Cartridge.
Processor Side of SEC Substrate.
General.
Processor Core.
Non-Processor Side of SEC Substrate.
Cartridge Block Diagram.
Dual-Independent Bus Architecture (DIBA).
Caches.
L1 Code and Data Caches.
L2 Cache.
Cache Error Protection.
Processor Signature.
CPUID Cache Geometry Information.
Fast System Call Instructions.
Frequency of the Processor Core and Buses.
Signal Differences Between Pentium II and Pentium Pro.
MMX.
16-bit Code Optimization.
Pentium Pro Not Optimized.
Pentium II Shadows the Data Segment Registers.
Multiprocessor Capability.
Pentium Pro Processor Bus Arbitration.
Pentium II Processor Bus Arbitration.
Power-Conservation Modes.
Introduction.
Normal State.
AutoHalt Power Down State.
Stop Grant State.
Halt/Grant Snoop State.
Sleep State.
Deep Sleep State.
Voltage Identification.
Treatment of Unused Bus Pins.
Unused Reserved Pins.
TESTHI Pins.
When APIC Signals Are Unused.
Unused GTL+ Inputs.
Unused Active Low CMOS Inputs.
Unused Active High Inputs.
Unused Outputs.
Test Access Port (TAP).
Deschutes Version of the Pentium II Processor.
Slot 2.
Pentium II Chip Sets.
Boxed Processor.
IV. PROCESSOR'S SOFTWARE CHARACTERISTICS.
19. Instruction Set Enhancements.
Introduction.
CPUID Instruction Enhanced.
Before Executing CPUID, Determine if Supported.
Basic Description.
Vendor ID and Max Input Value Request.
Request for Vendor ID String and Max EAX Value.
Request for Version and Supported Features.
Request for Cache and TLB Information.
CPUID Is a Serializing Instruction.
Serializing Instructions Impact Performance.
Conditional Move (CMOV) Eliminates Branches.
Conditional FP Move (FCMOV) Eliminates Branches.
FCOMI, FCOMIP, FUCOMI, and FUCOMIP.
Read Performance Monitoring Counter (RDPMC).
What's RDPMC Used For?
Who Can Execute RDPMC?
RDPMC Not Serializing Instruction.
RDPMC Description.
Read Time Stamp Counter (RDTSC).
What's RDTSC Used For?
Who Can Execute RDTSC?
RDTSC Doesn't Serialize.
RDTSC Description.
My Favorite — UD2.
Accessing MSRs.
Testing for Processor MSR Support.
Causes GP Exception If....
Input Parameters.
20. Register Set Enhancements.
New Registers.
Introduction.
DebugCTL, LastBranch and LastException MSRs.
Introduction.
Last Branch, Interrupt or Exception Recording.
Single-Step Exception on Branch, Exception or Interrupt.
MSR Not Defined in Earlier Pentium Pro Documentation.
Disable Instruction Streaming Buffer.
Disable Cache Line Boundary Lock.
New Bits in Pre-Existent Registers.
CR4 Enhancements.
CR3 Enhancements.
Local APIC Base Address Relocation.
21. BIOS Update Feature.
The Problem.
The Solution.
The BIOS Update Image.
Introduction.
BIOS Update Header Data Structure.
The BIOS Update Loader.
CPUID Instruction Enhanced.
Determining if New Update Supersedes Previously-Loaded Update.
Effect of RESET# on Previously-Loaded Update.
When Must Update Load Take Place?
Updates in a Multiprocessor System.
22. Paging Enhancements.
Background on Paging.
Page Size Extension (PSE) Feature.
The Problem.
The Solution — Big Pages.
How It Works.
Physical Address Extension (PAE) Feature.
How Paging Normally Works.
What Is the PAE?
How Is the PAE Enabled?
Changes to the Paging-Related Data Structures.
Programmer Still Restricted to 32-bit Addresses and 2^20 Pages.
Pages Can be Distributed Throughout Lower 64GB.
CR3 Contains Base Address of PDPT.
Format of PDPT Entry.
TLB Flush Necessary after PDPT Entry Change.
Format of Page Directory Entry.
Format of Page Table Entry.
The PAE and the Page Size Extension (PSE).
Global Page Feature.
The Problem.
The Solution.
Propagation of Page Table Entry Changes to Multiple Processors.
23. Interrupt Enhancements.
New Exceptions.
Added APIC Functionality.
VM86 Mode Extensions.
VM86 Mode Background.
Interrupt-Related Problems and VM86 Tasks.
Software Overhead Associated with CLI/STI Execution.
Attempted Execution of CLI by VM86 Task.
Attempted Execution of STI Instruction.
Servicing of Software Interrupts by DOS or OS.
Solution — VM86 Mode Extensions.
Introduction.
CLI/STI Solution.
EFLAGS[VIF] = 1, EFLAGS[IF] = 1, Interrupt Occurs.
EFLAGS[VIF] = 0, EFLAGS[IF] = 1, Interrupt Occurs.
Software Interrupt Redirection Solution.
Virtual Interrupt Handling in Protected Mode.
24. Machine Check Architecture.
Purpose of Machine Check Architecture.
Machine Check Architecture in the Pentium Processor.
Testing for Machine Check Support.
Machine Check Exception.
Machine Check Architecture Register Set.
Composition of Global Register Set.
MCG_CAP Register.
MCG_STATUS Register.
MCG_CTL Register.
Composition of Each Register Bank.
General.
MCi_STATUS Register.
MSR Addresses of the Machine Check Registers.
Initialization of Register Set.
Machine Check Architecture Error Format.
Simple Error Codes.
Compound Error Codes.
External Bus Error Interpretation.
25. Performance Monitoring and Timestamp.
Time Stamp Counter Facility.
Time Stamp Counter (TSC) Definition.
Detecting Presence of the TSC.
Accessing the Time Stamp Counter.
Reading the TSC Using RDTSC Instruction.
Reading the TSC Using RDMSR Instruction.
Writing to the TSC.
Performance Monitoring Facility.
Purpose of the Performance Monitoring Facility.
Performance Monitoring Registers.
PerfEvtSel0 and PerfEvtSel1 MSRs.
PerfCtr0 and PerfCtr1.
Accessing the Performance Monitoring Registers.
Accessing the PerfEvtSel MSRs.
Accessing the PerfCtr MSRs.
Accessing Using RDPMC Instruction.
Accessing Using RDMSR/WRMSR Instructions.
Event Types.
Starting and Stopping the Counters.
Starting the Counters.
Stopping the Counters.
Performance Monitoring Interrupt on Overflow.
26. MMX: Matrix Math Extensions.
Please Note.
Problems Addressed by MMX.
Problem: Math on Packed Bytes/Words/Dwords.
Solution: MMX Matrix Math/Logical Operations.
Problem: Data Not Packed.
Solution: MMX Pack and Unpack Instructions.
Problem: Math Overflows/Underflows.
Solution: Saturating Math.
Problem: Comparisons and Branches.
Solution: MMX Parallel Comparisons.
Single Instruction, Multiple Data (SIMD).
Detecting Presence of MMX.
Changes to Programming Environment.
General.
Handling a Task Switch.
When Exiting MMX Routine, Execute EMMS.
MMX Instruction Set.
Instruction Groups.
Instruction Syntax.
Instruction Set.
Pentium II MMX Execution Units.
The MindShare Architecture Series
The MindShare Architecture book series includes: ISA System Architecture, EISA System Architecture, 80486 System Architecture, PCI System Architecture, Pentium System Architecture, PCMCIA System Architecture, PowerPC System Architecture, Plug-and-Play System Architecture, CardBus System Architecture, Protected Mode Software Architecture, USB System Architecture, Pentium Pro and Pentium II System Architecture, and FireWire System Architecture: IEEE 1394. The book series is published by Addison-Wesley.
Rather than duplicating common information in each book, the series uses the building-block approach. ISA System Architecture is the core book upon which most of the others build. The figure below illustrates the relationship of the books to each other.
Cautionary Note
The reader should keep in mind that MindShare's book series often deals with rapidly-evolving technologies. This being the case, it should be recognized that each book is a "snapshot" of the state of the targeted technology at the time that the book was completed. We attempt to update each book on a timely basis to reflect changes in the targeted technology, but, due to various factors (waiting for the next version of the spec to be "frozen," the time necessary to make the changes, and the time to produce the books and get them out to the distribution channels), there will always be a delay.
What This Book Covers
The purpose of this book is to provide a detailed description of the Pentium Pro and Pentium II processors from both the hardware and the software perspectives. As with our other x86 processor books, this book builds upon and does not duplicate information provided in our books on the previous-generation processors. As an example, our Pentium Processor System Architecture book provided a detailed description of the APIC module, while this book only describes differences between the two implementations.
What This Book Does Not Cover
This book does not describe the x86 instruction repertoire, because a host of books on the market already provide this information. It does, however, describe the new instructions added to the instruction set.
Who This Book Is For
This book is intended for use by hardware and software design and support personnel. Due to the clear, concise explanatory methods used to describe each subject, personnel outside of the design field may also find the text useful.