[VPlan] Allow generating vectors with VPInstruction::ptradd. NFC #148273

Open
wants to merge 1 commit into
base: main

lukel97 (Contributor) commented Jul 11, 2025

Currently a ptradd can only generate a single scalar or a series of per-lane scalars.

An upcoming patch that expands VPWidenPointerInductionRecipe into smaller recipes needs to generate a vector ptradd, which is not currently possible.

This patch adds support for generating vectors by checking whether the offset operand is a vector. If it isn't, per-lane scalars are generated as usual.
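
For readers skimming the diff, the resulting three-way choice can be sketched as a small stand-alone model. This is only my reading of the patch, not LLVM code: `PtrAddMode` and `classifyPtrAdd` are hypothetical names, and the two booleans stand in for `vputils::onlyFirstLaneUsed(this)` and `State.hasVectorValue(getOperand(1))` from the diff.

```cpp
// Hypothetical stand-alone model of how a PtrAdd is emitted after this patch.
// None of these names exist in LLVM; they only mirror the conditions in the diff.
enum class PtrAddMode { FirstLaneScalar, PerLaneScalars, Vector };

// OnlyFirstLaneUsed models vputils::onlyFirstLaneUsed(this).
// OffsetHasVectorValue models State.hasVectorValue(getOperand(1)).
constexpr PtrAddMode classifyPtrAdd(bool OnlyFirstLaneUsed,
                                    bool OffsetHasVectorValue) {
  // If only lane 0 is used, a single scalar ptradd is enough.
  if (OnlyFirstLaneUsed)
    return PtrAddMode::FirstLaneScalar;
  // New in this patch: a vector offset yields one vector ptradd.
  if (OffsetHasVectorValue)
    return PtrAddMode::Vector;
  // Otherwise fall back to one scalar ptradd per lane, as before.
  return PtrAddMode::PerLaneScalars;
}
```

Under this model, `classifyPtrAdd(false, true)` is the only combination that selects the new vector path; every other combination keeps the existing scalar behaviour.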

llvmbot (Member) commented Jul 11, 2025

@llvm/pr-subscribers-vectorizers

@llvm/pr-subscribers-llvm-transforms

Author: Luke Lau (lukel97)

Full diff: https://github.com/llvm/llvm-project/pull/148273.diff

2 Files Affected:

  • (modified) llvm/lib/Transforms/Vectorize/VPlan.h (+5-3)
  • (modified) llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp (+5-9)
diff --git a/llvm/lib/Transforms/Vectorize/VPlan.h b/llvm/lib/Transforms/Vectorize/VPlan.h
index 9a6e4b36397b3..0d9af0210a393 100644
--- a/llvm/lib/Transforms/Vectorize/VPlan.h
+++ b/llvm/lib/Transforms/Vectorize/VPlan.h
@@ -958,8 +958,10 @@ class LLVM_ABI_FOR_TEST VPInstruction : public VPRecipeWithIRFlags,
     ExtractPenultimateElement,
     LogicalAnd, // Non-poison propagating logical And.
     // Add an offset in bytes (second operand) to a base pointer (first
-    // operand). Only generates scalar values (either for the first lane only or
-    // for all lanes, depending on its uses).
+    // operand). The base pointer must be scalar, but the offset can be a
+    // scalar, multiple scalars, or a vector. If the offset is multiple scalars
+    // then it will generate multiple scalar values (either for the first lane
+    // only or for all lanes, depending on its uses).
     PtrAdd,
     // Returns a scalar boolean value, which is true if any lane of its
     // (boolean) vector operands is true. It produces the reduced value across
@@ -998,7 +1000,7 @@ class LLVM_ABI_FOR_TEST VPInstruction : public VPRecipeWithIRFlags,
   /// values per all lanes, stemming from an original ingredient. This method
   /// identifies the (rare) cases of VPInstructions that do so as well, w/o an
   /// underlying ingredient.
-  bool doesGeneratePerAllLanes() const;
+  bool doesGeneratePerAllLanes(VPTransformState &State) const;
 
   /// Returns true if we can generate a scalar for the first lane only if
   /// needed.
diff --git a/llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp b/llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp
index 75ade13b09d9c..4b7d21edbb48a 100644
--- a/llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp
+++ b/llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp
@@ -494,8 +494,9 @@ unsigned VPInstruction::getNumOperandsForOpcode(unsigned Opcode) {
 }
 #endif
 
-bool VPInstruction::doesGeneratePerAllLanes() const {
-  return Opcode == VPInstruction::PtrAdd && !vputils::onlyFirstLaneUsed(this);
+bool VPInstruction::doesGeneratePerAllLanes(VPTransformState &State) const {
+  return Opcode == VPInstruction::PtrAdd && !vputils::onlyFirstLaneUsed(this) &&
+         !State.hasVectorValue(getOperand(1));
 }
 
 bool VPInstruction::canGenerateScalarForFirstLane() const {
@@ -848,10 +849,8 @@ Value *VPInstruction::generate(VPTransformState &State) {
     return Builder.CreateLogicalAnd(A, B, Name);
   }
   case VPInstruction::PtrAdd: {
-    assert(vputils::onlyFirstLaneUsed(this) &&
-           "can only generate first lane for PtrAdd");
     Value *Ptr = State.get(getOperand(0), VPLane(0));
-    Value *Addend = State.get(getOperand(1), VPLane(0));
+    Value *Addend = State.get(getOperand(1), vputils::onlyFirstLaneUsed(this));
     return Builder.CreatePtrAdd(Ptr, Addend, Name, getGEPNoWrapFlags());
   }
   case VPInstruction::AnyOf: {
@@ -911,9 +910,6 @@ InstructionCost VPInstruction::computeCost(ElementCount VF,
       }
     }
 
-    assert(!doesGeneratePerAllLanes() &&
-           "Should only generate a vector value or single scalar, not scalars "
-           "for all lanes.");
     return Ctx.TTI.getArithmeticInstrCost(getOpcode(), ResTy, Ctx.CostKind);
   }
 
@@ -1001,7 +997,7 @@ void VPInstruction::execute(VPTransformState &State) {
   bool GeneratesPerFirstLaneOnly = canGenerateScalarForFirstLane() &&
                                    (vputils::onlyFirstLaneUsed(this) ||
                                     isVectorToScalar() || isSingleScalar());
-  bool GeneratesPerAllLanes = doesGeneratePerAllLanes();
+  bool GeneratesPerAllLanes = doesGeneratePerAllLanes(State);
   if (GeneratesPerAllLanes) {
     for (unsigned Lane = 0, NumLanes = State.VF.getFixedValue();
          Lane != NumLanes; ++Lane) {
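
As a rough illustration of what the vector form of the `PtrAdd` case computes, the sketch below models a scalar base pointer combined with a vector of byte offsets, producing one pointer per lane. It is a plain C++ analogy under my own assumptions, not the IR that `Builder.CreatePtrAdd` emits; `vectorPtrAdd` is a hypothetical helper name.

```cpp
#include <array>
#include <cstddef>

// Hypothetical model: add a per-lane byte offset to a single scalar base
// pointer, yielding a "vector" of VF pointers. The base is a char pointer so
// that the offsets are counted in bytes, matching the PtrAdd description.
template <std::size_t VF>
std::array<const char *, VF>
vectorPtrAdd(const char *Base, const std::array<std::ptrdiff_t, VF> &Offsets) {
  std::array<const char *, VF> Result{};
  for (std::size_t Lane = 0; Lane != VF; ++Lane)
    Result[Lane] = Base + Offsets[Lane]; // Same base in every lane.
  return Result;
}
```

With the per-lane scalar path, each of these additions would instead be emitted as a separate scalar instruction.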

lukel97 added a commit to lukel97/llvm-project that referenced this pull request Jul 11, 2025
Stacked on llvm#148273 to be able to use VPInstruction::PtrAdd.

This is the VPWidenPointerInductionRecipe equivalent of llvm#118638, with the motivation of allowing us to use the EVL as the induction step.

Most of the new VPlan transformation is a straightforward translation of the existing execute code.

Unfortunately, VPUnrollPartAccessor doesn't work outside of VPlanRecipes.cpp, so here the operands are manually checked to see if they're unrolled.