⚙️ V8 引擎优化深度解析

从 Turbofan Pipeline 到图归约、Lowering 与调度器

基于源码深度解析
2026-03-11 | 技术深度解读

目录

  1. 优化编译总体链路
  2. 核心数据结构:Graph / Node / Operator
  3. 图归约 GraphReducer
  4. 类型驱动 Lowering
  5. Schedule 与代码生成
  6. 性能特征、调试方法与实践

V8 优化编译器在做什么?

  • Turbofan 负责把热点 JavaScript/内建代码编译为高质量机器码
  • 核心不是“直接翻译源码”,而是构建 IR 图 后不断规约、Lowering、调度、选择指令
  • 优化目标:减少动态类型开销、降低分支与内存访问成本、生成适合目标架构的代码
Pipeline = 分阶段降低动态性

本次解读的核心文件

文件职责
pipeline.h / pipeline.cc串起整个优化编译流程
graph-reducer.h / .cc统一执行 reducer 链,反复消除冗余节点
simplified-lowering.h / .cc把高层节点降到机器可执行表示
scheduler.cc为后端生成合法且高效的执行顺序
node.hIR 图节点和边的底层结构

src/compiler/pipeline.cc · graph-reducer.cc · simplified-lowering.cc · scheduler.cc

入口:Pipeline 类暴露的能力

class TFGraph;
class InstructionSequence;
class JSGraph;
class JSHeapBroker;
class MachineGraph;
class Schedule;
class SourcePositionTable;
struct WasmCompilationData;
class TFPipelineData;
class ZoneStats;

struct InstructionRangesAsJSON {
  const InstructionSequence* sequence;
  const ZoneVector<std::pair<int, int>>* instr_origins;
};

std::ostream& operator<<(std::ostream& out, const InstructionRangesAsJSON& s);

class Pipeline : public AllStatic {
 public:
  // Returns a new compilation job for the given JavaScript function.
  static V8_EXPORT_PRIVATE std::unique_ptr<TurbofanCompilationJob>
  NewCompilationJob(Isolate* isolate, Handle<JSFunction> function,
                    CodeKind code_kind, bool has_script,
                    BytecodeOffset osr_offset = BytecodeOffset::None());

  using CodeAssemblerGenerator =
      std::function<void(compiler::CodeAssemblerState*)>;
  using CodeAssemblerInstaller =
      std::function<void(Builtin builtin, Handle<Code> code)>;

  static std::unique_ptr<TurbofanCompilationJob>
  NewCSLinkageCodeStubBuiltinCompilationJob(
      Isolate* isolate, Builtin builtin, CodeAssemblerGenerator generator,
      CodeAssemblerInstaller installer,
      const AssemblerOptions& assembler_options,
      CallDescriptors::Key interface_descriptor, const char* name,
      const ProfileDataFromFile* profile_data, int finalize_order);

  static std::unique_ptr<TurbofanCompilationJob>
  NewJSLinkageCodeStubBuiltinCompilationJob(
      Isolate* isolate, Builtin builtin, CodeAssemblerGenerator generator,
      CodeAssemblerInstaller installer,
      const AssemblerOptions& assembler_options, int argc, const char* name,
      const ProfileDataFromFile* profile_data, int finalize_order);

  using TurboshaftAssemblerGenerator =
      std::function<void(compiler::turboshaft::PipelineData*, Isolate*,
                         compiler::turboshaft::Graph&, Zone*)>;
  using TurboshaftAssemblerInstaller = CodeAssemblerInstaller;
  • NewCompilationJob:为 JS 函数创建优化编译任务
  • GenerateCodeForTesting:测试路径,方便独立跑管线
  • 同一套 Pipeline 还支持 builtin / wasm 等不同来源

Pipeline 为什么是中心中枢?

前端输入
  • 字节码 / Builtin / Wasm
  • 类型反馈与编译信息
  • Linkage / 调用约定
后端输出
  • Schedule
  • InstructionSequence
  • Code 对象

它不直接完成所有优化,而是像 orchestrator 一样按阶段把数据交给专门组件。

NewCompilationJob:优化任务如何被创建

std::unique_ptr<TurbofanCompilationJob> Pipeline::NewCompilationJob(
    Isolate* isolate, Handle<JSFunction> function, CodeKind code_kind,
    bool has_script, BytecodeOffset osr_offset) {
  Handle<SharedFunctionInfo> shared(function->shared(), isolate);
  return std::make_unique<PipelineCompilationJob>(isolate, shared, function,
                                                  osr_offset, code_kind);
}

bool PipelineImpl::ComputeScheduledGraph() {
  TFPipelineData* data = this->data_;

  // We should only schedule the graph if it is not scheduled yet.
  DCHECK_NULL(data->schedule());

  RUN_MAYBE_ABORT(ComputeSchedulePhase);
  TraceScheduleAndVerify(data->info(), data, data->schedule(), "schedule");
  return true;
}

#if V8_ENABLE_WASM_SIMD256_REVEC
bool PipelineImpl::Revectorize() { return Run<RevectorizePhase>(); }
#endif  // V8_ENABLE_WASM_SIMD256_REVEC

OptimizedCompilationInfo* PipelineImpl::info() const { return data_->info(); }

Isolate* PipelineImpl::isolate() const { return data_->isolate(); }

CodeGenerator* PipelineImpl::code_generator() const {
  return data_->code_generator();
}

ObserveNodeManager* PipelineImpl::observe_node_manager() const {
  return data_->observe_node_manager();
}

std::ostream& operator<<(std::ostream& out, const InstructionRangesAsJSON& s) {
  const int max = static_cast<int>(s.sequence->LastInstructionIndex());

  out << ", \"nodeIdToInstructionRange\": {";
  bool need_comma = false;
  for (size_t i = 0; i < s.instr_origins->size(); ++i) {
    std::pair<int, int> offset = (*s.instr_origins)[i];
    if (offset.first == -1) continue;
    const int first = max - offset.first + 1;
    const int second = max - offset.second + 1;
    if (need_comma) out << ", ";
    out << "\"" << i << "\": [" << first << ", " << second << "]";
    need_comma = true;
  }
  out << "}";
  out << ", \"blockIdToInstructionRange\": {";
  need_comma = false;
  for (auto block : s.sequence->instruction_blocks()) {
    if (need_comma) out << ", ";
    out << "\"" << block->rpo_number() << "\": [" << block->code_start() << ", "
        << block->code_end() << "]";
    need_comma = true;
  }
  out << "}";
  return out;
}

}  // namespace compiler
}  // namespace internal
}  // namespace v8
  • 接收 Isolate、函数句柄、CodeKind、OSR 信息
  • 根据函数与反馈准备 OptimizedCompilationInfo
  • 最终返回 TurbofanCompilationJob

优化编译全景图

Bytecode / Builtin ↓ 构建 TFGraph / JSGraph ↓ GraphReducer 链式规约 ↓ Typer / SimplifiedLowering ↓ Scheduler / Instruction Selection ↓ Register Allocation / CodeGen ↓ Machine Code
真正的性能来自每一步都把“高层语义”变成“更确定、更接近硬件”的表示。

Node:IR 图的最小计算单元

// Copyright 2013 the V8 project authors. All rights reserved.
// Use of this source code is governed by a BSD-style license that can be
// found in the LICENSE file.

#ifndef V8_COMPILER_NODE_H_
#define V8_COMPILER_NODE_H_

#include "src/common/globals.h"
#include "src/compiler/graph-zone-traits.h"
#include "src/compiler/opcodes.h"
#include "src/compiler/operator.h"
#include "src/compiler/turbofan-types.h"
#include "src/zone/zone-containers.h"

namespace v8 {
namespace internal {
namespace compiler {

// Forward declarations.
class Edge;
class TFGraph;

// Marks are used during traversal of the graph to distinguish states of nodes.
// Each node has a mark which is a monotonically increasing integer, and a
// {NodeMarker} has a range of values that indicate states of a node.
using Mark = uint32_t;

// NodeIds are identifying numbers for nodes that can be used to index auxiliary
// out-of-line data associated with each node.
using NodeId = uint32_t;

// A Node is the basic primitive of graphs. Nodes are chained together by
// input/use chains but by default otherwise contain only an identifying number
// which specific applications of graphs and nodes can use to index auxiliary
// out-of-line data, especially transient data.
//
// In addition Nodes only contain a mutable Operator that may change during
// compilation, e.g. during lowering passes. Other information that needs to be
// associated with Nodes during compilation must be stored out-of-line indexed
// by the Node's id.
class V8_EXPORT_PRIVATE Node final {
 public:
  static Node* New(Zone* zone, NodeId id, const Operator* op, int input_count,
                   Node* const* inputs, bool has_extensible_inputs);
  static Node* Clone(Zone* zone, NodeId id, const Node* node);

  inline bool IsDead() const;
  void Kill();

  const Operator* op() const { return op_; }

  constexpr IrOpcode::Value opcode() const {
    DCHECK_GE(IrOpcode::kLast, op_->opcode());
    return static_cast<IrOpcode::Value>(op_->opcode());
  }

  NodeId id() const { return IdField::decode(bit_field_); }

  int InputCount() const {
    return has_inline_inputs() ? InlineCountField::decode(bit_field_)
                               : outline_inputs()->count_;
  }

#ifdef DEBUG
  void Verify();
#else
  inline void Verify() {}
#endif

  Node* InputAt(int index) const {
    DCHECK_LE(0, index);
    DCHECK_LT(index, InputCount());
    return *GetInputPtrConst(index);
  }

  void ReplaceInput(int index, Node* new_to) {
    DCHECK_LE(0, index);
    DCHECK_LT(index, InputCount());
    ZoneNodePtr* input_ptr = GetInputPtr(index);
    Node* old_to = *input_ptr;
    if (old_to != new_to) {
      Use* use = GetUsePtr(index);
      if (old_to) old_to->RemoveUse(use);
      *input_ptr = new_to;
      if (new_to) new_to->AppendUse(use);
    }
  }

  void AppendInput(Zone* zone, Node* new_to);
  void InsertInput(Zone* zone, int index, Node* new_to);
  void InsertInputs(Zone* zone, int index, int count);
  // Returns the removed input.
  Node* RemoveInput(int index);
  void NullAllInputs();
  void TrimInputCount(int new_input_count);
  // Can trim, extend by appending new inputs, or do nothing.
  void EnsureInputCount(Zone* zone, int new_input_count);

  int UseCount() const;
  int BranchUseCount() const;
  void ReplaceUses(Node* replace_to);

  class InputEdges;
  inline InputEdges input_edges();

  class Inputs;
  inline Inputs inputs() const;
  inline base::Vector<Node*> inputs_vector() const;

  class UseEdges final {
   public:
    using value_type = Edge;

    class iterator;
    inline iterator begin() const;
    inline iterator end() const;

    bool empty() const;

    explicit UseEdges(Node* node) : node_(node) {}
  • Node 带有 operator、输入、use 链与 id
  • 值边 / effect 边 / control 边共同表达 JS 语义与执行顺序
  • V8 通过图而不是 AST 做优化,更适合跨基本块重写

Node 的输入边 / 使用边设计

class Node::InputEdges final {
 public:
  using value_type = Edge;

  class iterator;
  inline iterator begin() const;
  inline iterator end() const;

  bool empty() const { return count_ == 0; }
  int count() const { return count_; }

  inline value_type operator[](int index) const;

  InputEdges(ZoneNodePtr* input_root, Use* use_root, int count)
      : input_root_(input_root), use_root_(use_root), count_(count) {}

 private:
  ZoneNodePtr* input_root_;
  Use* use_root_;
  int count_;
};

class V8_EXPORT_PRIVATE Node::Inputs final {
 public:
  using value_type = Node*;

  class const_iterator;
  inline const_iterator begin() const;
  inline const_iterator end() const;

  bool empty() const { return count_ == 0; }
  int count() const { return count_; }

  inline value_type operator[](int index) const;

  explicit Inputs(ZoneNodePtr const* input_root, int count)
      : input_root_(input_root), count_(count) {}

 private:
  ZoneNodePtr const* input_root_;
  int count_;
};

// An encapsulation for information associated with a single use of a node as an
// input from another node, allowing access to both the defining node and
// the node having the input.
class Edge final {
 public:
  Node* from() const { return use_->from(); }
  Node* to() const { return *input_ptr_; }
  int index() const {
    int const index = use_->input_index();
    DCHECK_LT(index, use_->from()->InputCount());
    return index;
  }

  bool operator==(const Edge& other) { return input_ptr_ == other.input_ptr_; }
  bool operator!=(const Edge& other) { return !(*this == other); }

  void UpdateTo(Node* new_to) {
    Node* old_to = *input_ptr_;
    if (old_to != new_to) {
      if (old_to) old_to->RemoveUse(use_);
      *input_ptr_ = new_to;
      if (new_to) new_to->AppendUse(use_);
    }
  }

 private:
  friend class Node::UseEdges::iterator;
  friend class Node::InputEdges;
  friend class Node::InputEdges::iterator;

  Edge(Node::Use* use, ZoneNodePtr* input_ptr)
      : use_(use), input_ptr_(input_ptr) {
    DCHECK_NOT_NULL(use);
    DCHECK_NOT_NULL(input_ptr);
    DCHECK_EQ(input_ptr, use->input_ptr());
  }

  Node::Use* use_;
  ZoneNodePtr* input_ptr_;
};

bool Node::IsDead() const {
  Node::Inputs inputs = this->inputs();
  return inputs.count() > 0 && inputs[0] == nullptr;
}

Node::InputEdges Node::input_edges() {
  int inline_count = InlineCountField::decode(bit_field_);
  if (inline_count != kOutlineMarker) {
    return InputEdges(inline_inputs(), reinterpret_cast<Use*>(this) - 1,
                      inline_count);
  } else {
    return InputEdges(outline_inputs()->inputs(),
                      reinterpret_cast<Use*>(outline_inputs()) - 1,
                      outline_inputs()->count_);
  }
}

Node::Inputs Node::inputs() const {
  int inline_count = InlineCountField::decode(bit_field_);
  if (inline_count != kOutlineMarker) {
    return Inputs(inline_inputs(), inline_count);
  } else {
    return Inputs(outline_inputs()->inputs(), outline_inputs()->count_);
  }
}

base::Vector<Node*> Node::inputs_vector() const {
  int inline_count = InlineCountField::decode(bit_field_);
  if (inline_count != kOutlineMarker) {
    return base::VectorOf<Node*>(inline_inputs(), inline_count);
  } else {
    return base::VectorOf<Node*>(outline_inputs()->inputs(),
                                 outline_inputs()->count_);
  }
}

// A forward iterator to visit the edges for the input dependencies of a node.
class Node::InputEdges::iterator final {
 public:
  using iterator_category = std::forward_iterator_tag;
  using difference_type = std::ptrdiff_t;
  using value_type = Edge;
  using pointer = Edge*;
  using reference = Edge&;

  iterator() : use_(nullptr), input_ptr_(nullptr) {}
  iterator(const iterator& other) = default;

  Edge operator*() const { return Edge(use_, input_ptr_); }
  bool operator==(const iterator& other) const {
    return input_ptr_ == other.input_ptr_;
  }
  bool operator!=(const iterator& other) const { return !(*this == other); }
  iterator& operator++() {
    input_ptr_++;
    use_--;
    return *this;
  }
  iterator operator++(int);
  iterator& operator+=(difference_type offset) {
    input_ptr_ += offset;
    use_ -= offset;
    return *this;
  }
  iterator operator+(difference_type offset) const {
    return iterator(use_ - offset, input_ptr_ + offset);
  }
  difference_type operator-(const iterator& other) const {
    return input_ptr_ - other.input_ptr_;
  }

 private:
  friend class Node;

  explicit iterator(Use* use, ZoneNodePtr* input_ptr)
      : use_(use), input_ptr_(input_ptr) {}

  Use* use_;
  ZoneNodePtr* input_ptr_;
};


Node::InputEdges::iterator Node::InputEdges::begin() const {
  return Node::InputEdges::iterator(use_root_, input_root_);
}


Node::InputEdges::iterator Node::InputEdges::end() const {
  return Node::InputEdges::iterator(use_root_ - count_, input_root_ + count_);
}

Edge Node::InputEdges::operator[](int index) const {
  return Edge(use_root_ + index, input_root_ + index);
}

// A forward iterator to visit the inputs of a node.
class Node::Inputs::const_iterator final {
 public:
  using iterator_category = std::forward_iterator_tag;
  using difference_type = std::ptrdiff_t;
  using value_type = Node*;
  using pointer = const value_type*;
  using reference = value_type&;

  const_iterator(const const_iterator& other) = default;

  Node* operator*() const { return *input_ptr_; }
  bool operator==(const const_iterator& other) const {
    return input_ptr_ == other.input_ptr_;
  }
  bool operator!=(const const_iterator& other) const {
    return !(*this == other);
  }
  const_iterator& operator++() {
    ++input_ptr_;
    return *this;
  }
  const_iterator operator++(int);
  const_iterator& operator+=(difference_type offset) {
    input_ptr_ += offset;
    return *this;
  }
  const_iterator operator+(difference_type offset) const {
    return const_iterator(input_ptr_ + offset);
  }
  difference_type operator-(const const_iterator& other) const {
    return input_ptr_ - other.input_ptr_;
  }

 private:
  friend class Node::Inputs;

  explicit const_iterator(ZoneNodePtr const* input_ptr)
      : input_ptr_(input_ptr) {}

  ZoneNodePtr const* input_ptr_;
};


Node::Inputs::const_iterator Node::Inputs::begin() const {
  return const_iterator(input_root_);
}


Node::Inputs::const_iterator Node::Inputs::end() const {
  return const_iterator(input_root_ + count_);
}

Node* Node::Inputs::operator[](int index) const { return input_root_[index]; }

// A forward iterator to visit the uses edges of a node.
class Node::UseEdges::iterator final {
 public:
  iterator(const iterator& other) = default;

  Edge operator*() const { return Edge(current_, current_->input_ptr()); }
  bool operator==(const iterator& other) const {
    return current_ == other.current_;
  }
  bool operator!=(const iterator& other) const { return !(*this == other); }
  iterator& operator++() {
    DCHECK_NOT_NULL(current_);
    current_ = next_;
    next_ = current_ ? static_cast<Node::Use*>(current_->next) : nullptr;
    return *this;
  }
  iterator operator++(int);

 private:
  friend class Node::UseEdges;

  iterator() : current_(nullptr), next_(nullptr) {}
  explicit iterator(Node* node)
      : current_(node->first_use_),
        next_(current_ ? static_cast<Node::Use*>(current_->next) : nullptr) {}

  Node::Use* current_;
  Node::Use* next_;
};


Node::UseEdges::iterator Node::UseEdges::begin() const {
  return Node::UseEdges::iterator(this->node_);
}


Node::UseEdges::iterator Node::UseEdges::end() const {
  return Node::UseEdges::iterator();
}


// A forward iterator to visit the uses of a node.
class Node::Uses::const_iterator final {
 public:
  using iterator_category = std::forward_iterator_tag;
  using difference_type = int;
  using value_type = Node*;
  using pointer = Node**;
  using reference = Node*&;

  Node* operator*() const { return current_->from(); }
  bool operator==(const const_iterator& other) const {
    return other.current_ == current_;
  }
  bool operator!=(const const_iterator& other) const {
    return other.current_ != current_;
  }
  const_iterator& operator++() {
    DCHECK_NOT_NULL(current_);
    // Checking no use gets mutated while iterating through them, a potential
    // very tricky cause of bug.
    current_ = current_->next;
#ifdef DEBUG
    DCHECK_EQ(current_, next_);
    next_ = current_ ? current_->next : nullptr;
#endif
    return *this;
  }
  const_iterator operator++(int);

 private:
  friend class Node::Uses;

  const_iterator() : current_(nullptr) {}
  explicit const_iterator(Node* node)
  • 显式维护 input 与 use 关系,让替换节点时能高效更新所有依赖
  • 优化器最常做的事就是:换节点、删节点、重接边

为什么优化器喜欢图,不喜欢线性 IR?

图 IR 优势具体收益
显式依赖常量折叠、公共子表达式消除更自然
Effect / Control 可分离区分“值相同”和“副作用顺序必须保留”
节点可重写Reducer 可以局部改图而不必整段重建
适配多后端同一高层图可 Lower 到不同机器表示

GraphReducer:统一驱动所有 reducer


class TFGraph;
class JSHeapBroker;
class Node;
class ObserveNodeManager;

// NodeIds are identifying numbers for nodes that can be used to index auxiliary
// out-of-line data associated with each node.
using NodeId = uint32_t;

// Possible outcomes for decisions.
enum class Decision : uint8_t { kUnknown, kTrue, kFalse };

// Represents the result of trying to reduce a node in the graph.
class Reduction final {
 public:
  explicit Reduction(Node* replacement = nullptr) : replacement_(replacement) {}

  Node* replacement() const { return replacement_; }
  bool Changed() const { return replacement() != nullptr; }
  Reduction FollowedBy(Reduction next) const {
    if (next.Changed()) return next;
    return *this;
  }

 private:
  Node* replacement_;
};


// A reducer can reduce or simplify a given node based on its operator and
// inputs. This class functions as an extension point for the graph reducer for
// language-specific reductions (e.g. reduction based on types or constant
// folding of low-level operators) can be integrated into the graph reduction
// phase.
class V8_EXPORT_PRIVATE Reducer {
 public:
  virtual ~Reducer() = default;

  // Only used for tracing, when using the --trace_turbo_reduction flag.
  virtual const char* reducer_name() const = 0;

  // Try to reduce a node if possible.
  Reduction Reduce(Node* node, ObserveNodeManager* observe_node_manager);

  // Invoked by the {GraphReducer} when all nodes are done.  Can be used to
  // do additional reductions at the end, which in turn can cause a new round
  // of reductions.
  virtual void Finalize();

  // Helper functions for subclasses to produce reductions for a node.
  static Reduction NoChange() { return Reduction(); }
  static Reduction Replace(Node* node) { return Reduction(node); }
  static Reduction Changed(Node* node) { return Reduction(node); }

 private:
  virtual Reduction Reduce(Node* node) = 0;
};


// An advanced reducer can also edit the graphs by changing and replacing nodes
// other than the one currently being reduced.
class AdvancedReducer : public Reducer {
 public:
  // Observe the actions of this reducer.
  class Editor {
   public:
    virtual ~Editor() = default;

    // Replace {node} with {replacement}.
    virtual void Replace(Node* node, Node* replacement) = 0;
    virtual void Replace(Node* node, Node* replacement, NodeId max_id) = 0;
    // Revisit the {node} again later.
    virtual void Revisit(Node* node) = 0;
    // Replace value uses of {node} with {value} and effect uses of {node} with
    // {effect}. If {effect == nullptr}, then use the effect input to {node}.
    // All control uses will be relaxed assuming {node} cannot throw.
    virtual void ReplaceWithValue(Node* node, Node* value, Node* effect,
                                  Node* control) = 0;
  };

  explicit AdvancedReducer(Editor* editor) : editor_(editor) {}

 protected:
  // Helper functions for subclasses to produce reductions for a node.
  static Reduction Replace(Node* node) { return Reducer::Replace(node); }

  // Helper functions for subclasses to edit the graph.
  void Replace(Node* node, Node* replacement) {
    DCHECK_NOT_NULL(editor_);
    editor_->Replace(node, replacement);
  }
  void Replace(Node* node, Node* replacement, NodeId max_id) {
    return editor_->Replace(node, replacement, max_id);
  }
  void Revisit(Node* node) {
    DCHECK_NOT_NULL(editor_);
    editor_->Revisit(node);
  }
  void ReplaceWithValue(Node* node, Node* value, Node* effect = nullptr,
                        Node* control = nullptr) {
    DCHECK_NOT_NULL(editor_);
    editor_->ReplaceWithValue(node, value, effect, control);
  }

  // Relax the effects of {node} by immediately replacing effect and control
  // uses of {node} with the effect and control input to {node}.
  // TODO(turbofan): replace the effect input to {node} with {graph->start()}.
  void RelaxEffectsAndControls(Node* node) {
    ReplaceWithValue(node, node, nullptr, nullptr);
  }

  // Relax the control uses of {node} by immediately replacing them with the
  // either the given {control} node, or the control input to {node}.
  void RelaxControls(Node* node, Node* control = nullptr) {
    ReplaceWithValue(node, node, node, control);
  }

  void MergeControlToEnd(TFGraph* graph, CommonOperatorBuilder* common,
                         Node* node) {
    NodeProperties::MergeControlToEnd(graph, common, node);
    Revisit(graph->end());
  }

 private:
  Editor* const editor_;
};


// Performs an iterative reduction of a node graph.
class V8_EXPORT_PRIVATE GraphReducer
    : public NON_EXPORTED_BASE(AdvancedReducer::Editor) {
 public:
  GraphReducer(Zone* zone, TFGraph* graph, TickCounter* tick_counter,
               JSHeapBroker* broker, Node* dead = nullptr,
               ObserveNodeManager* observe_node_manager = nullptr);
  ~GraphReducer() override;

  GraphReducer(const GraphReducer&) = delete;
  GraphReducer& operator=(const GraphReducer&) = delete;

  TFGraph* graph() const { return graph_; }

  void AddReducer(Reducer* reducer);

  // Reduce a single node.
  void ReduceNode(Node* const);
  // Reduce the whole graph.
  void ReduceGraph();

 private:
  enum class State : uint8_t;
  struct NodeState {
    Node* node;
    int input_index;
  };
  • Reducer:只处理当前节点
  • AdvancedReducer:可编辑整张图
  • GraphReducer:负责遍历、revisit、替换与稳定收敛

Reduction 对象:最简单但极关键的协议

  • NoChange():当前 reducer 无法处理,让下一个 reducer 继续
  • Replace(node):当前节点可被替换
  • FollowedBy():组合多次规约结果

这让每个 reducer 都只关心局部变换,复杂度被框架吸收

ReduceGraph:从 graph end 开始反向吃全图

    }
  }
  DCHECK(revisit_.empty());
  DCHECK(stack_.empty());
}


void GraphReducer::ReduceGraph() { ReduceNode(graph()->end()); }


Reduction GraphReducer::Reduce(Node* const node) {
  auto skip = reducers_.end();
  for (auto i = reducers_.begin(); i != reducers_.end();) {
    if (i != skip) {
      tick_counter_->TickAndMaybeEnterSafepoint();
      Reduction reduction = (*i)->Reduce(node, observe_node_manager_);
      if (!reduction.Changed()) {
        // No change from this reducer.
      } else if (reduction.replacement() == node) {
        // {replacement} == {node} represents an in-place reduction. Rerun
        // all the other reducers for this node, as now there may be more
        // opportunities for reduction.
        if (v8_flags.trace_turbo_reduction) {
          UnparkedScopeIfNeeded unparked(broker_);
          // TODO(neis): Disallow racy handle dereference once we stop
          // supporting --no-local-heaps --no-concurrent-inlining.
          AllowHandleDereference allow_deref;
          StdoutStream{} << "- In-place update of #" << *node << " by reducer "
                         << (*i)->reducer_name() << std::endl;
        }
        skip = i;
        i = reducers_.begin();
        continue;
      } else {
        // {node} was replaced by another node.
        if (v8_flags.trace_turbo_reduction) {
          UnparkedScopeIfNeeded unparked(broker_);
          // TODO(neis): Disallow racy handle dereference once we stop
          // supporting --no-local-heaps --no-concurrent-inlining.
          AllowHandleDereference allow_deref;
          StdoutStream{} << "- Replacement of #" << *node << " with #"
                         << *(reduction.replacement()) << " by reducer "
                         << (*i)->reducer_name() << std::endl;
        }
        return reduction;
      }
    }
    ++i;
  }
  if (skip == reducers_.end()) {
    // No change from any reducer.
    return Reducer::NoChange();
  }
  // At least one reducer did some in-place reduction.
  return Reducer::Changed(node);
}


void GraphReducer::ReduceTop() {
  NodeState& entry = stack_.top();
  Node* node = entry.node;
  DCHECK_EQ(State::kOnStack, state_.Get(node));

  if (node->IsDead()) return Pop();  // Node was killed while on stack.

  Node::Inputs node_inputs = node->inputs();

  // Recurse on an input if necessary.
  int start = entry.input_index < node_inputs.count() ? entry.input_index : 0;
  for (int i = start; i < node_inputs.count(); ++i) {
    Node* input = node_inputs[i];
  • ReduceGraph 直接从 end 节点启动
  • 通过栈和状态机递归式处理依赖
  • 先保证输入可用,再尝试规约当前节点

ReduceTop:单节点规约状态机

void GraphReducer::ReduceTop() {
  NodeState& entry = stack_.top();
  Node* node = entry.node;
  DCHECK_EQ(State::kOnStack, state_.Get(node));

  if (node->IsDead()) return Pop();  // Node was killed while on stack.

  Node::Inputs node_inputs = node->inputs();

  // Recurse on an input if necessary.
  int start = entry.input_index < node_inputs.count() ? entry.input_index : 0;
  for (int i = start; i < node_inputs.count(); ++i) {
    Node* input = node_inputs[i];
    if (input != node && Recurse(input)) {
      entry.input_index = i + 1;
      return;
    }
  }
  for (int i = 0; i < start; ++i) {
    Node* input = node_inputs[i];
    if (input != node && Recurse(input)) {
      entry.input_index = i + 1;
      return;
    }
  }

  // Remember the max node id before reduction.
  NodeId const max_id = static_cast<NodeId>(graph()->NodeCount() - 1);

  // All inputs should be visited or on stack. Apply reductions to node.
  Reduction reduction = Reduce(node);

  // If there was no reduction, pop {node} and continue.
  if (!reduction.Changed()) return Pop();

  // Check if the reduction is an in-place update of the {node}.
  Node* const replacement = reduction.replacement();
  if (replacement == node) {
    for (Node* const user : node->uses()) {
      DCHECK_IMPLIES(user == node, state_.Get(node) != State::kVisited);
      Revisit(user);
    }

    // In-place update of {node}, may need to recurse on an input.
    node_inputs = node->inputs();
    for (int i = 0; i < node_inputs.count(); ++i) {
      Node* input = node_inputs[i];
      if (input != node && Recurse(input)) {
        entry.input_index = i + 1;
        return;
      }
    }
  }

  // After reducing the node, pop it off the stack.
  Pop();

  // Check if we have a new replacement.
  if (replacement != node) {
    Replace(node, replacement, max_id);
  }
}


void GraphReducer::Replace(Node* node, Node* replacement) {
  Replace(node, replacement, std::numeric_limits<NodeId>::max());
}


void GraphReducer::Replace(Node* node, Node* replacement, NodeId max_id) {
  if (node == graph()->start()) graph()->SetStart(replacement);
  if (node == graph()->end()) graph()->SetEnd(replacement);
  if (replacement->id() <= max_id) {
    // {replacement} is an old node, so unlink {node} and assume that
    // {replacement} was already reduced and finish.
    for (Edge edge : node->use_edges()) {
      Node* const user = edge.from();
      Verifier::VerifyEdgeInputReplacement(edge, replacement);
      edge.UpdateTo(replacement);
  • 区分未访问、正在访问、已处理等状态
  • 避免死循环,也避免在图变形时丢掉节点
  • 如果某个 reducer 改了图,相关节点会被 revisit

ReplaceWithValue:值/副作用/控制三路分发

void GraphReducer::ReplaceWithValue(Node* node, Node* value, Node* effect,
                                    Node* control) {
  if (effect == nullptr && node->op()->EffectInputCount() > 0) {
    effect = NodeProperties::GetEffectInput(node);
  }
  if (control == nullptr && node->op()->ControlInputCount() > 0) {
    control = NodeProperties::GetControlInput(node);
  }

  // Requires distinguishing between value, effect and control edges.
  for (Edge edge : node->use_edges()) {
    Node* const user = edge.from();
    DCHECK(!user->IsDead());
    if (NodeProperties::IsControlEdge(edge)) {
      if (user->opcode() == IrOpcode::kIfSuccess) {
        Replace(user, control);
      } else if (user->opcode() == IrOpcode::kIfException) {
        DCHECK_NOT_NULL(dead_);
        edge.UpdateTo(dead_);
        Revisit(user);
      } else {
        DCHECK_NOT_NULL(control);
        edge.UpdateTo(control);
        Revisit(user);
      }
    } else if (NodeProperties::IsEffectEdge(edge)) {
      DCHECK_NOT_NULL(effect);
      edge.UpdateTo(effect);
      Revisit(user);
    } else {
      DCHECK_NOT_NULL(value);
      edge.UpdateTo(value);
      Revisit(user);
    }
  }
}


void GraphReducer::Pop() {
  Node* node = stack_.top().node;
  state_.Set(node, State::kVisited);
  stack_.pop();
}


void GraphReducer::Push(Node* const node) {
  DCHECK_NE(State::kOnStack, state_.Get(node));
  state_.Set(node, State::kOnStack);
  stack_.push({node, 0});
}


bool GraphReducer::Recurse(Node* node) {
  if (state_.Get(node) > State::kRevisit) return false;
  Push(node);
  return true;
}


void GraphReducer::Revisit(Node* node) {
  if (state_.Get(node) == State::kVisited) {
    state_.Set(node, State::kRevisit);
    revisit_.push(node);
  }
}

}  // namespace compiler
}  // namespace internal
}  // namespace v8
  • V8 不只是“把 A 替换成 B”
  • 还要分别处理 value uses、effect uses、control uses
  • 这正是 JS 有异常、调用、副作用时仍能安全优化的关键

Reducer 链的设计思想

前面
  • 常量折叠
  • 类型驱动简化
  • 语义等价替换
后面
  • 控制流清理
  • 机器表示收敛
  • 死节点清扫

一串小而专的 reducer,比一个超级优化器更容易维护与组合。

案例:冗余检查如何被消掉

Load x ↓ CheckSmi(x) ↓ CheckSmi(x) ← 冗余 ↓ Add(x, 1)
如果类型反馈已经证明 x 为 Smi,后续重复检查就能被 reducer 改写为 value 直通,从而省掉分支与 deopt 点。

案例:控制流也能被优化

  • 已知条件恒真/恒假 → 分支折叠
  • 异常不再可能发生 → 放松 effect/control use
  • Dead 节点传播 → 大块子图被切掉

GraphReducer 的强项是让这些变化在 统一框架 中发生,而不是分散在几十处 ad-hoc 代码。

SimplifiedLowering:把“高级节点”压低到机器层


// Forward declarations.
class NodeOriginTable;
class ObserveNodeManager;
class RepresentationChanger;
class RepresentationSelector;
class SourcePositionTable;
class TypeCache;

class V8_EXPORT_PRIVATE SimplifiedLowering final {
 public:
  SimplifiedLowering(JSGraph* jsgraph, JSHeapBroker* broker, Zone* zone,
                     SourcePositionTable* source_position,
                     NodeOriginTable* node_origins, TickCounter* tick_counter,
                     Linkage* linkage, OptimizedCompilationInfo* info,
                     ObserveNodeManager* observe_node_manager = nullptr);
  ~SimplifiedLowering() = default;

  void LowerAllNodes();

  void DoMax(Node* node, Operator const* op, MachineRepresentation rep);
  void DoMin(Node* node, Operator const* op, MachineRepresentation rep);
  void DoJSToNumberOrNumericTruncatesToFloat64(
      Node* node, RepresentationSelector* selector);
  void DoJSToNumberOrNumericTruncatesToWord32(Node* node,
                                              RepresentationSelector* selector);
  void DoIntegral32ToBit(Node* node);
  void DoOrderedNumberToBit(Node* node);
  void DoNumberToBit(Node* node);
  void DoIntegerToUint8Clamped(Node* node);
  void DoNumberToUint8Clamped(Node* node);
  void DoSigned32ToUint8Clamped(Node* node);
  void DoUnsigned32ToUint8Clamped(Node* node);

 private:
  // The purpose of this nested class is to hide method
  // v8::internal::compiler::NodeProperties::ChangeOp which should not be
  // directly used by code in SimplifiedLowering.
  // SimplifiedLowering code should call SimplifiedLowering::ChangeOp instead,
  // in order to notify the changes to ObserveNodeManager and support the
  // %ObserveNode intrinsic.
  class NodeProperties : public compiler::NodeProperties {
    static void ChangeOp(Node* node, const Operator* new_op) { UNREACHABLE(); }
  };
  void ChangeOp(Node* node, const Operator* new_op);

  JSGraph* const jsgraph_;
  JSHeapBroker* broker_;
  Zone* const zone_;
  TypeCache const* type_cache_;
  SetOncePointer<Node> to_number_code_;
  SetOncePointer<Node> to_number_convert_big_int_code_;
  SetOncePointer<Node> to_numeric_code_;
  SetOncePointer<Operator const> to_number_operator_;
  SetOncePointer<Operator const> to_number_convert_big_int_operator_;
  SetOncePointer<Operator const> to_numeric_operator_;

  // TODO(danno): SimplifiedLowering shouldn't know anything about the source
  // positions table, but must for now since there currently is no other way to
  // pass down source position information to nodes created during
  // lowering. Once this phase becomes a vanilla reducer, it should get source
  // position information via the SourcePositionWrapper like all other reducers.
  SourcePositionTable* source_positions_;
  NodeOriginTable* node_origins_;

  TickCounter* const tick_counter_;
  Linkage* const linkage_;
  OptimizedCompilationInfo* info_;

  ObserveNodeManager* const observe_node_manager_;

  Node* Float64Round(Node* const node);
  Node* Float64Sign(Node* const node);
  Node* Int32Abs(Node* const node);
  Node* Int32Div(Node* const node);
  Node* Int32Mod(Node* const node);
  Node* Int32Sign(Node* const node);
  Node* Uint32Div(Node* const node);
  Node* Uint32Mod(Node* const node);

  Node* ToNumberCode();
  Node* ToNumberConvertBigIntCode();
  Node* ToNumericCode();
  Node* Ieee754Fp64ToFp16RawBitsCode();
  Operator const* ToNumberOperator();
  Operator const* ToNumberConvertBigIntOperator();
  Operator const* ToNumericOperator();
  Operator const* Ieee754Fp64ToFp16RawBitsOperator();

  friend class RepresentationSelector;

  Isolate* isolate() { return jsgraph_->isolate(); }
  Zone* zone() { return jsgraph_->zone(); }
  JSGraph* jsgraph() { return jsgraph_; }
  TFGraph* graph() { return jsgraph()->graph(); }
  CommonOperatorBuilder* common() { return jsgraph()->common(); }
  MachineOperatorBuilder* machine() { return jsgraph()->machine(); }
  SimplifiedOperatorBuilder* simplified() { return jsgraph()->simplified(); }
  Linkage* linkage() { return linkage_; }
};

}  // namespace compiler
}  // namespace internal
}  // namespace v8

#endif  // V8_COMPILER_SIMPLIFIED_LOWERING_H_
  • 输入:JSGraph + 类型信息 + Linkage
  • 输出:更明确的 MachineRepresentation
  • 核心任务:让后端不再猜“这到底是 tagged、word32 还是 float64”

LowerAllNodes:Lowering 总入口

void SimplifiedLowering::LowerAllNodes() {
  SimplifiedLoweringVerifier* verifier = nullptr;
  if (v8_flags.verify_simplified_lowering) {
    verifier = zone_->New<SimplifiedLoweringVerifier>(zone_, graph());
  }
  RepresentationChanger changer(jsgraph(), broker_, verifier);
  RepresentationSelector selector(
      jsgraph(), broker_, zone_, &changer, source_positions_, node_origins_,
      tick_counter_, linkage_, observe_node_manager_, verifier);
  selector.Run(this);
}

void SimplifiedLowering::DoJSToNumberOrNumericTruncatesToFloat64(
    Node* node, RepresentationSelector* selector) {
  DCHECK(node->opcode() == IrOpcode::kJSToNumber ||
         node->opcode() == IrOpcode::kJSToNumberConvertBigInt ||
         node->opcode() == IrOpcode::kJSToNumeric);
  Node* value = node->InputAt(0);
  Node* context = node->InputAt(1);
  Node* frame_state = node->InputAt(2);
  Node* effect = node->InputAt(3);
  Node* control = node->InputAt(4);

  Node* check0 = graph()->NewNode(simplified()->ObjectIsSmi(), value);
  Node* branch0 = graph()->NewNode(
      common()->Branch(BranchHint::kTrue, BranchSemantics::kMachine), check0,
      control);

  Node* if_true0 = graph()->NewNode(common()->IfTrue(), branch0);
  Node* etrue0 = effect;
  Node* vtrue0;
  {
    vtrue0 = graph()->NewNode(simplified()->ChangeTaggedSignedToInt32(), value);
    vtrue0 = graph()->NewNode(machine()->ChangeInt32ToFloat64(), vtrue0);
  }

  Node* if_false0 = graph()->NewNode(common()->IfFalse(), branch0);
  Node* efalse0 = effect;
  Node* vfalse0;
  {
    Operator const* op =
        node->opcode() == IrOpcode::kJSToNumber
            ? (node->opcode() == IrOpcode::kJSToNumberConvertBigInt
                   ? ToNumberConvertBigIntOperator()
                   : ToNumberOperator())
            : ToNumericOperator();
    Node* code = node->opcode() == IrOpcode::kJSToNumber
                     ? ToNumberCode()
                     : (node->opcode() == IrOpcode::kJSToNumberConvertBigInt
                            ? ToNumberConvertBigIntCode()
                            : ToNumericCode());
    vfalse0 = efalse0 = if_false0 = graph()->NewNode(
        op, code, value, context, frame_state, efalse0, if_false0);

    // Update potential {IfException} uses of {node} to point to the above
    // stub call node instead.
    Node* on_exception = nullptr;
    if (NodeProperties::IsExceptionalCall(node, &on_exception)) {
      NodeProperties::ReplaceControlInput(on_exception, vfalse0);
      NodeProperties::ReplaceEffectInput(on_exception, efalse0);
      if_false0 = graph()->NewNode(common()->IfSuccess(), vfalse0);
    }

    Node* check1 = graph()->NewNode(simplified()->ObjectIsSmi(), vfalse0);
    Node* branch1 = graph()->NewNode(
        common()->Branch(BranchHint::kNone, BranchSemantics::kMachine), check1,
        if_false0);

    Node* if_true1 = graph()->NewNode(common()->IfTrue(), branch1);
    Node* etrue1 = efalse0;
    Node* vtrue1;
    {
      vtrue1 =
          graph()->NewNode(simplified()->ChangeTaggedSignedToInt32(), vfalse0);
      vtrue1 = graph()->NewNode(machine()->ChangeInt32ToFloat64(), vtrue1);
    }

    Node* if_false1 = graph()->NewNode(common()->IfFalse(), branch1);
    Node* efalse1 = efalse0;
    Node* vfalse1;
    {
      vfalse1 = efalse1 = graph()->NewNode(
          simplified()->LoadField(AccessBuilder::ForHeapNumberValue()), efalse0,
          efalse1, if_false1);
    }

    if_false0 = graph()->NewNode(common()->Merge(2), if_true1, if_false1);
    efalse0 =
        graph()->NewNode(common()->EffectPhi(2), etrue1, efalse1, if_false0);
    vfalse0 =
  • 遍历图节点并执行表示选择
  • 根据 use-site 与 type 决定最合适表示
  • 必要时插入转换节点,避免错误复用寄存器表示

表示选择的核心矛盾

更高层表示更低层表示
保留 JS 语义完整性执行成本更低
tagged 容错高word32 / float64 更快
转换少时开发简单转换合理时整体更优
SimplifiedLowering 的价值就在于:何时值得脱掉 tagged 外衣

DoNumberToBit:数字到位表示的专门降级

}

void SimplifiedLowering::DoOrderedNumberToBit(Node* node) {
  Node* const input = node->InputAt(0);

  node->ReplaceInput(0, graph()->NewNode(machine()->Float64Equal(), input,
                                         jsgraph()->Float64Constant(0.0)));
  node->AppendInput(graph()->zone(), jsgraph()->Int32Constant(0));
  ChangeOp(node, machine()->Word32Equal());
}

void SimplifiedLowering::DoNumberToBit(Node* node) {
  Node* const input = node->InputAt(0);

  node->ReplaceInput(0, jsgraph()->Float64Constant(0.0));
  node->AppendInput(graph()->zone(),
                    graph()->NewNode(machine()->Float64Abs(), input));
  ChangeOp(node, machine()->Float64LessThan());
}

void SimplifiedLowering::DoIntegerToUint8Clamped(Node* node) {
  Node* const input = node->InputAt(0);
  Node* const min = jsgraph()->Float64Constant(0.0);
  Node* const max = jsgraph()->Float64Constant(255.0);

  node->ReplaceInput(
      0, graph()->NewNode(machine()->Float64LessThan(), min, input));
  node->AppendInput(
      graph()->zone(),
      graph()->NewNode(
          common()->Select(MachineRepresentation::kFloat64),
          graph()->NewNode(machine()->Float64LessThan(), input, max), input,
          max));
  node->AppendInput(graph()->zone(), min);
  ChangeOp(node, common()->Select(MachineRepresentation::kFloat64));
}

void SimplifiedLowering::DoNumberToUint8Clamped(Node* node) {
  Node* const input = node->InputAt(0);
  Node* const min = jsgraph()->Float64Constant(0.0);
  Node* const max = jsgraph()->Float64Constant(255.0);

  node->ReplaceInput(
      0, graph()->NewNode(
             common()->Select(MachineRepresentation::kFloat64),
             graph()->NewNode(machine()->Float64LessThan(), min, input),
             graph()->NewNode(
                 common()->Select(MachineRepresentation::kFloat64),
                 graph()->NewNode(machine()->Float64LessThan(), input, max),
                 input, max),
             min));
  ChangeOp(node, machine()->Float64RoundTiesEven().placeholder());
}

void SimplifiedLowering::DoSigned32ToUint8Clamped(Node* node) {
  Node* const input = node->InputAt(0);
  Node* const min = jsgraph()->Int32Constant(0);
  Node* const max = jsgraph()->Int32Constant(255);

  node->ReplaceInput(
      0, graph()->NewNode(machine()->Int32LessThanOrEqual(), input, max));
  node->AppendInput(
      graph()->zone(),
      graph()->NewNode(common()->Select(MachineRepresentation::kWord32),
                       graph()->NewNode(machine()->Int32LessThan(), input, min),
                       min, input));
  node->AppendInput(graph()->zone(), max);
  ChangeOp(node, common()->Select(MachineRepresentation::kWord32));
}

void SimplifiedLowering::DoUnsigned32ToUint8Clamped(Node* node) {
  Node* const input = node->InputAt(0);
  Node* const max = jsgraph()->Uint32Constant(255u);
  • 看似简单的 true/false 结果,背后要考虑 NaN、零值、整数/浮点路径
  • 不同目标表示对应不同机器操作组合

几个典型 Lowering 模式

数值类
  • JSToNumber → Float64/Word32
  • Max/Min → 目标机器 op
  • Uint8Clamped → 媒体/Canvas 热路径友好
布尔/比较类
  • NumberToBit
  • OrderedNumberToBit
  • Integral32ToBit

类型信息如何驱动优化?

  • 若 typer 证明节点范围很窄,lowering 可以直接选更便宜的表示
  • 若类型不稳,则保守保留 tagged 或插入 check/deopt
  • 优化质量取决于:反馈够不够准 + 推导够不够强

为什么 Lowering 跟 Deopt 紧密相关

  • 越激进地去掉动态性,越依赖“假设失败时能退回解释器”
  • 因此 lowering 不只是性能问题,也是正确性问题
  • 很多 check 节点不是坏事,它们是优化的保险丝
Optimistic, but recoverable

Scheduler:让图变成可执行顺序

Scheduler::Scheduler(Zone* zone, TFGraph* graph, Schedule* schedule,
                     Flags flags, size_t node_count_hint,
                     TickCounter* tick_counter,
                     const ProfileDataFromFile* profile_data)
    : zone_(zone),
      graph_(graph),
      schedule_(schedule),
      flags_(flags),
      scheduled_nodes_(zone),
      schedule_root_nodes_(zone),
      schedule_queue_(zone),
      node_data_(zone),
      tick_counter_(tick_counter),
      profile_data_(profile_data),
      common_dominator_cache_(zone) {
  node_data_.reserve(node_count_hint);
  node_data_.resize(graph->NodeCount(), DefaultSchedulerData());
}

Schedule* Scheduler::ComputeSchedule(Zone* zone, TFGraph* graph, Flags flags,
                                     TickCounter* tick_counter,
                                     const ProfileDataFromFile* profile_data) {
  Zone* schedule_zone =
      (flags & Scheduler::kTempSchedule) ? zone : graph->zone();

  // Reserve 10% more space for nodes if node splitting is enabled to try to
  // avoid resizing the vector since that would triple its zone memory usage.
  float node_hint_multiplier = (flags & Scheduler::kSplitNodes) ? 1.1 : 1;
  size_t node_count_hint = node_hint_multiplier * graph->NodeCount();

  Schedule* schedule =
      schedule_zone->New<Schedule>(schedule_zone, node_count_hint);
  Scheduler scheduler(zone, graph, schedule, flags, node_count_hint,
                      tick_counter, profile_data);

  scheduler.BuildCFG();
  scheduler.ComputeSpecialRPONumbering();
  scheduler.GenerateDominatorTree();

  scheduler.PrepareUses();
  scheduler.ScheduleEarly();
  scheduler.ScheduleLate();

  scheduler.SealFinalSchedule();

  return schedule;
}

Scheduler::SchedulerData Scheduler::DefaultSchedulerData() {
  SchedulerData def = {schedule_->start(), 0, kUnknown};
  return def;
}
  • ComputeSchedule 会创建 Schedule 并执行后续调度
  • 输入是图,输出是基本块 + 节点线性顺序
  • 既要合法,也要尽量优化依赖距离与分支布局

为什么有了图还要 Schedule?

  • 图只表达依赖,不表达“具体在第几条指令执行”
  • 后端需要基本块、RPO 顺序、Phi 放置、控制流边界
  • 寄存器分配和指令选择都依赖一个稳定的线性视图

SchedulerData:每个节点的调度元信息

Scheduler::SchedulerData Scheduler::DefaultSchedulerData() {
  SchedulerData def = {schedule_->start(), 0, kUnknown};
  return def;
}


Scheduler::SchedulerData* Scheduler::GetData(Node* node) {
  return &node_data_[node->id()];
}

Scheduler::Placement Scheduler::InitializePlacement(Node* node) {
  SchedulerData* data = GetData(node);
  if (data->placement_ == kFixed) {
    // Nothing to do for control nodes that have been already fixed in
    // the schedule.
    return data->placement_;
  }
  DCHECK_EQ(kUnknown, data->placement_);
  switch (node->opcode()) {
    case IrOpcode::kParameter:
    case IrOpcode::kOsrValue:
      // Parameters and OSR values are always fixed to the start block.
      data->placement_ = kFixed;
      break;
    case IrOpcode::kPhi:
    case IrOpcode::kEffectPhi: {
      // Phis and effect phis are fixed if their control inputs are, whereas
      // otherwise they are coupled to a floating control node.
      Placement p = GetPlacement(NodeProperties::GetControlInput(node));
      data->placement_ = (p == kFixed ? kFixed : kCoupled);
      break;
    }
    default:
      // Control nodes that were not control-reachable from end may float.
      data->placement_ = kSchedulable;
      break;
  }
  return data->placement_;
}

Scheduler::Placement Scheduler::GetPlacement(Node* node) {
  return GetData(node)->placement_;
}

bool Scheduler::IsLive(Node* node) { return GetPlacement(node) != kUnknown; }

void Scheduler::UpdatePlacement(Node* node, Placement placement) {
  SchedulerData* data = GetData(node);
  if (data->placement_ == kUnknown) {
    // We only update control nodes from {kUnknown} to {kFixed}.  Ideally, we
    // should check that {node} is a control node (including exceptional calls),
    // but that is expensive.
    DCHECK_EQ(Scheduler::kFixed, placement);
    data->placement_ = placement;
    return;
  }

  switch (node->opcode()) {
    case IrOpcode::kParameter:
      // Parameters are fixed once and for all.
      UNREACHABLE();
    case IrOpcode::kPhi:
    case IrOpcode::kEffectPhi: {
      // Phis and effect phis are coupled to their respective blocks.
      DCHECK_EQ(Scheduler::kCoupled, data->placement_);
      DCHECK_EQ(Scheduler::kFixed, placement);
      Node* control = NodeProperties::GetControlInput(node);
      BasicBlock* block = schedule_->block(control);
      schedule_->AddNode(block, node);
      break;
    }
#define DEFINE_CONTROL_CASE(V) case IrOpcode::k##V:
      CONTROL_OP_LIST(DEFINE_CONTROL_CASE)
#undef DEFINE_CONTROL_CASE
      {
        // Control nodes force coupled uses to be placed.
        for (auto use : node->uses()) {
          if (GetPlacement(use) == Scheduler::kCoupled) {
            DCHECK_EQ(node, NodeProperties::GetControlInput(use));
            UpdatePlacement(use, placement);
          }
      }
      break;
    }
    default:
      DCHECK_EQ(Scheduler::kSchedulable, data->placement_);
      DCHECK_EQ(Scheduler::kScheduled, placement);
      break;
  }
  // Reduce the use count of the node's inputs to potentially make them
  // schedulable. If all the uses of a node have been scheduled, then the node
  // itself can be scheduled.
  • 记录所在块、早/晚调度区间、访问状态等
  • 这是从“纯图世界”走向“块布局世界”的桥梁

Schedule 最终产出什么?

Block B0: Start, Parameters, Constants Block B1: Checks, Arithmetic Block B2: Branch True Path Block B3: Branch False Path Block B4: Merge, Phi, Return
有了 block 结构,指令选择器才能做 architecture-specific lowering,寄存器分配器才能分析 live range。

Pipeline 中多次调用 Scheduler::ComputeSchedule

    reducer.ReduceGraph();
    // TODO(turbofan): Turn this into a debug mode check once we have
    // confidence.
    escape_reducer.VerifyReplacement();
  }
};

struct TypeAssertionsPhase {
  DECL_PIPELINE_PHASE_CONSTANTS(TypeAssertions)

  void Run(TFPipelineData* data, Zone* temp_zone) {
    Schedule* schedule = Scheduler::ComputeSchedule(
        temp_zone, data->graph(), Scheduler::kTempSchedule,
        &data->info()->tick_counter(), data->profile_data());

    AddTypeAssertions(data->jsgraph(), schedule, temp_zone);
  }
};

struct SimplifiedLoweringPhase {
  DECL_PIPELINE_PHASE_CONSTANTS(SimplifiedLowering)

  void Run(TFPipelineData* data, Zone* temp_zone, Linkage* linkage) {
    SimplifiedLowering lowering(data->jsgraph(), data->broker(), temp_zone,
                                data->source_positions(), data->node_origins(),
                                &data->info()->tick_counter(), linkage,
                                data->info(), data->observe_node_manager());
    }
  }
};
#endif  // V8_ENABLE_WEBASSEMBLY

struct ComputeSchedulePhase {
  DECL_PIPELINE_PHASE_CONSTANTS(Scheduling)

  void Run(TFPipelineData* data, Zone* temp_zone) {
    Schedule* schedule = Scheduler::ComputeSchedule(
        temp_zone, data->graph(),
        data->info()->splitting() ? Scheduler::kSplitNodes
                                  : Scheduler::kNoFlags,
        &data->info()->tick_counter(), data->profile_data());
    data->set_schedule(schedule);
  }
};

#if V8_ENABLE_WASM_SIMD256_REVEC
struct RevectorizePhase {
  DECL_PIPELINE_PHASE_CONSTANTS(Revectorizer)

  void Run(TFPipelineData* data, Zone* temp_zone) {
    Revectorizer revec(temp_zone, data->graph(), data->mcgraph(),
                       data->source_positions());
    revec.TryRevectorize(data->info()->GetDebugName().get());
  }
};
#endif  // V8_ENABLE_WASM_SIMD256_REVEC

struct PrintGraphPhase {
  DECL_PIPELINE_PHASE_CONSTANTS(PrintGraph)

  void Run(TFPipelineData* data, Zone* temp_zone, const char* phase) {
    OptimizedCompilationInfo* info = data->info();
    TFGraph* graph = data->graph();
    if (info->trace_turbo_json()) {  // Print JSON.
      UnparkedScopeIfNeeded scope(data->broker());
      AllowHandleDereference allow_deref;

      TurboJsonFile json_of(info, std::ios_base::app);
      json_of << "{\"name\":\"" << phase << "\",\"type\":\"graph\",\"data\":"
              << AsJSON(*graph, data->source_positions(), data->node_origins())
              << "},\n";
    }

    if (info->trace_turbo_scheduled()) {
      Schedule* schedule = data->schedule();
      if (schedule == nullptr) {
        schedule = Scheduler::ComputeSchedule(
            temp_zone, data->graph(), Scheduler::kNoFlags,
            &info->tick_counter(), data->profile_data());
      }

      UnparkedScopeIfNeeded scope(data->broker());
      AllowHandleDereference allow_deref;
      CodeTracer::StreamScope tracing_scope(data->GetCodeTracer());
      tracing_scope.stream()
          << "----- Graph after " << phase << " ----- " << std::endl
          << AsScheduledGraph(schedule);
    } else if (info->trace_turbo_graph()) {  // Simple textual RPO.
说明调度不是一次性尾部动作,而是会在不同生成路径中反复出现。

Schedule 之后:进入真正的后端世界

Instruction Selection
  • 把节点映射为目标架构指令模式
  • 识别 addressing mode、融合指令
Register Allocation
  • 基于 live range 分配寄存器
  • 决定 spill / move / phi resolution

V8 优化器真正赚到的性能来自哪?

来源收益
消除动态检查减少分支与 deopt 开销
降低表示层级tagged → unboxed,寄存器利用更好
控制流规整后端更容易布局、预测更稳定
指令融合减少中间节点与搬运

为什么 JS 这种动态语言也能跑得很快

  • 热点代码只要类型稳定,就能被近似看成静态语言片段
  • Turbofan 用反馈 + 假设 + deopt 做“局部静态化”
  • 不是所有代码都快,而是 足够热、足够稳定 的代码会快

设计模式:Pipeline + Pass Manager 思想

  • Pipeline 本质上就是 pass orchestration
  • 每个阶段有清晰输入输出与可插拔点
  • 这种模式同样适用于数据库优化器、图计算框架、ML 编译器

设计模式:Visitor / Reducer / Worklist

  • Reducer 像局部 rewrite 规则
  • GraphReducer 用 worklist/revisit 维持收敛
  • Lowering 则像带类型信息的 visitor + transformer

核心对象 UML

Pipeline ├─ creates → TurbofanCompilationJob ├─ uses → JSGraph / TFGraph ├─ calls → GraphReducer ├─ calls → SimplifiedLowering └─ calls → Scheduler GraphReducer ├─ owns → reducers_ ├─ edits → Node graph └─ revisits → pending nodes SimplifiedLowering ├─ reads → Type / Linkage └─ rewrites → MachineRepresentation

一次热点函数优化的时序图

Ignition bytecode → Pipeline.NewCompilationJob → build graph → GraphReducer.ReduceGraph → SimplifiedLowering.LowerAllNodes → Scheduler.ComputeSchedule → Instruction Selection → Register Allocation → Code object installed

细节:为什么 ReplaceWithValue 如此复杂

  • JS 节点可能同时承担值、effect、control 三种角色
  • 单纯替换 value 可能会破坏异常边或顺序依赖
  • 所以 V8 把不同 use 类别拆开处理,避免“语义被优化丢了”

细节:为什么 Lowering 里有大量特化函数

  • 高层 JS 操作拥有太多 corner cases:NaN、-0、BigInt、clamp、溢出
  • 一个“大一统 lowering”很容易失控
  • 拆成 DoNumberToBit / DoMin / DoMax 等函数,可分别校验语义与目标表示

细节:Scheduler 不是简单拓扑排序

  • 要考虑控制流块边界、Phi、effect 链、延迟 materialization
  • 还要为后端生成更有利于寄存器分配的顺序
  • 因此它是“带约束的布局问题”,不是纯 DAG 排序

阅读 V8 优化源码的常见误区

  • 误区一:把每个节点都当 AST 节点理解
  • 误区二:忽略 effect/control,只看值边
  • 误区三:以为 lowering 只是“换个名字”

调优反模式:只盯单个函数

  • 性能瓶颈往往来自整条 pipeline 的协同失败
  • 类型推导差一点,后面所有阶段都会保守
  • 图归约没消掉冗余,调度和寄存器分配都会更差

如何设计自己的 Benchmark 去观察这些优化

  1. 构造类型稳定和类型不稳定两组函数
  2. 预热足够轮次触发优化编译
  3. 比较 deopt 次数、生成代码长度、吞吐量
  4. 配合 trace-turbo / trace-deopt 分析差异

调试 V8 优化器常用方法

方法作用
--trace-turbo观察 IR 与阶段输出
--trace-turbo-reduction看每个 reducer 如何改图
--trace-deopt定位乐观假设失败
Turbolizer可视化图、schedule、寄存器信息

对业务工程的启示:把优化做成管线

  • 复杂系统别靠一个“大而全函数”处理
  • 拆阶段、定义中间表示、允许局部重写
  • 可观测性很关键:没有 trace,优化很难维护

对编译器学习者的启示

  • 先掌握图 IR、数据流和控制流,再读具体语言优化
  • 理解“表示降低”比死记某个 pass 名字更重要
  • V8 的复杂度主要来自动态语言语义,不是代码风格问题

阅读这类源码的最佳实践

  1. 先看 .h 理清对象关系
  2. 再抓入口函数,建立阶段地图
  3. 最后挑 2~3 条典型路径深挖
  4. 边读边画图:Graph、Reducer 链、Schedule

一句话总结 Turbofan 的优化哲学

基于反馈构图 → 局部规约 → 逐级 lowering → 合法调度 → 机器码

它的本质不是“黑魔法优化”,而是大量工程化良好的中间表示转换。

本次源码解读带走什么

  • Pipeline 是调度中心
  • GraphReducer 是图重写引擎
  • SimplifiedLowering 是去动态化核心
  • Scheduler 是连接图世界和机器世界的桥

扩展阅读

  • V8 Turbofan / Turboshaft 设计文档
  • Turbolizer 可视化工具
  • Sea of Nodes 论文与 HotSpot C2 资料
  • LLVM SelectionDAG / GlobalISel 作为对照学习

如果你想继续深挖,下一步最适合读:

instruction-selectorregister-allocatordeoptimizer