@alexey-bataev
Member

@alexey-bataev alexey-bataev commented Dec 10, 2025

This patch disables memory intrinsics expansion, which was enabled by default in
#168622. That patch does the
same in clang, but not in flang.

The expansion causes massive performance regressions, up to 2x, in
Fortran code.
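
For readers unfamiliar with the term, the two shapes a copy can take are sketched below in C++. This is only an illustration under assumptions: the function names are hypothetical, and the byte loop is a simplified picture of an "expanded" copy, not the code LLVM actually emits (a real expansion may use wider loads and stores).

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Form 1: the copy stays a single library call. This is the shape the new
// test expects to find in the assembly as a call to memcpy.
void copyWithLibcall(float *dst, const float *src, std::size_t n) {
  std::memcpy(dst, src, n * sizeof(float));
}

// Form 2: a simplified, hypothetical picture of an expanded copy, where the
// call is replaced by an inline loop over the bytes.
void copyExpanded(float *dst, const float *src, std::size_t n) {
  auto *d = reinterpret_cast<unsigned char *>(dst);
  const auto *s = reinterpret_cast<const unsigned char *>(src);
  for (std::size_t i = 0; i < n * sizeof(float); ++i)
    d[i] = s[i];
}

// Mirrors the Fortran test program: fill a(1..n), copy it both ways, and
// report whether the two destinations hold identical bytes.
bool bothFormsAgree(std::size_t n) {
  std::vector<float> a(n), b1(n, 0.0f), b2(n, 0.0f);
  for (std::size_t i = 0; i < n; ++i)
    a[i] = static_cast<float>(i + 1);
  copyWithLibcall(b1.data(), a.data(), n);
  copyExpanded(b2.data(), a.data(), n);
  return std::memcmp(b1.data(), b2.data(), n * sizeof(float)) == 0;
}
```

Both forms produce the same result; the difference is purely in the generated code, which is where the reported slowdown comes from.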

Created using spr 1.3.7
Contributor

@vzakhari vzakhari left a comment

Thank you!

@llvmbot llvmbot added the flang:driver, flang, and flang:fir-hlfir labels Dec 10, 2025
@llvmbot
Member

llvmbot commented Dec 10, 2025

@llvm/pr-subscribers-flang-fir-hlfir

Author: Alexey Bataev (alexey-bataev)

Changes

This patch disables memory intrinsics expansion, which was enabled by default in
#168622. That patch does the
same in clang, but not in flang.

The expansion causes massive performance regressions, up to 2x, in
Fortran code.


Full diff: https://github.com/llvm/llvm-project/pull/171650.diff

2 Files Affected:

  • (modified) flang/lib/Frontend/FrontendActions.cpp (+13)
  • (added) flang/test/Lower/memory-intrinsics-expansion.F90 (+33)
diff --git a/flang/lib/Frontend/FrontendActions.cpp b/flang/lib/Frontend/FrontendActions.cpp
index ddf125f9bb216..d99c44d5aa1ea 100644
--- a/flang/lib/Frontend/FrontendActions.cpp
+++ b/flang/lib/Frontend/FrontendActions.cpp
@@ -42,9 +42,11 @@
 #include "clang/Driver/DriverDiagnostic.h"
 #include "llvm/ADT/SmallString.h"
 #include "llvm/ADT/StringRef.h"
+#include "llvm/Analysis/RuntimeLibcallInfo.h"
 #include "llvm/Analysis/TargetLibraryInfo.h"
 #include "llvm/Analysis/TargetTransformInfo.h"
 #include "llvm/Bitcode/BitcodeWriterPass.h"
+#include "llvm/CodeGen/LibcallLoweringInfo.h"
 #include "llvm/CodeGen/MachineOptimizationRemarkEmitter.h"
 #include "llvm/IR/LLVMRemarkStreamer.h"
 #include "llvm/IR/LegacyPassManager.h"
@@ -902,6 +904,9 @@ static void generateMachineCodeOrAssemblyImpl(clang::DiagnosticsEngine &diags,
   llvm::TargetLibraryInfoImpl *tlii =
       llvm::driver::createTLII(triple, codeGenOpts.getVecLib());
   codeGenPasses.add(new llvm::TargetLibraryInfoWrapperPass(*tlii));
+  codeGenPasses.add(new llvm::RuntimeLibraryInfoWrapper(
+      triple, tm.Options.ExceptionModel, tm.Options.FloatABIType,
+      tm.Options.EABIVersion, tm.Options.MCOptions.ABIName, tm.Options.VecLib));
 
   llvm::CodeGenFileType cgft = (act == BackendActionTy::Backend_EmitAssembly)
                                    ? llvm::CodeGenFileType::AssemblyFile
@@ -1009,6 +1014,14 @@ void CodeGenAction::runOptimizationPipeline(llvm::raw_pwrite_stream &os) {
   llvm::TargetLibraryInfoImpl *tlii =
       llvm::driver::createTLII(triple, opts.getVecLib());
   fam.registerPass([&] { return llvm::TargetLibraryAnalysis(*tlii); });
+  mam.registerPass([&] {
+    return llvm::RuntimeLibraryAnalysis(
+        triple, targetMachine->Options.ExceptionModel,
+        targetMachine->Options.FloatABIType, targetMachine->Options.EABIVersion,
+        targetMachine->Options.MCOptions.ABIName,
+        targetMachine->Options.VecLib);
+  });
+  mam.registerPass([&] { return llvm::LibcallLoweringModuleAnalysis(); });
 
   // Register all the basic analyses with the managers.
   pb.registerModuleAnalyses(mam);
diff --git a/flang/test/Lower/memory-intrinsics-expansion.F90 b/flang/test/Lower/memory-intrinsics-expansion.F90
new file mode 100644
index 0000000000000..6d9eee429f487
--- /dev/null
+++ b/flang/test/Lower/memory-intrinsics-expansion.F90
@@ -0,0 +1,33 @@
+! REQUIRES:  aarch64-registered-target
+
+! RUN: %flang_fc1 -S -O1 %s -triple aarch64-linux-gnu -mllvm -debug-pass=Structure -o %t_O0 2>&1 | FileCheck %s
+! RUN: %flang_fc1 -S -O2 %s -triple aarch64-linux-gnu -mllvm -debug-pass=Structure -o %t_O2 2>&1 | FileCheck %s
+! RUN: %flang_fc1 -S -O3 %s -triple aarch64-linux-gnu -mllvm -debug-pass=Structure -o %t_O3 2>&1 | FileCheck %s
+! RUN: FileCheck --input-file=%t_O0 --check-prefix=CALL %s
+! RUN: FileCheck --input-file=%t_O2 --check-prefix=CALL %s
+! RUN: FileCheck --input-file=%t_O3 --check-prefix=CALL %s
+
+! CHECK: Target Library Information
+! CHECK: Runtime Library Function Analysis
+! CHECK: Library Function Lowering Analysis
+
+! CALL: {{callq|bl}} memcpy
+program memcpy_test
+  implicit none
+  integer, parameter :: n = 100
+  real :: a(n), b(n)
+  integer :: i
+
+  ! Initialize array a
+  do i = 1, n
+    a(i) = real(i)
+  end do
+
+  ! Array assignment - this should generate memcpy
+  b = a
+
+  ! Use array b to prevent optimization
+  print *, b(1), b(n)
+
+end program memcpy_test
+
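
The `CALL` directive in the test above combines a FileCheck regex (`{{callq|bl}}`) with literal text. The C++ sketch below approximates what it accepts. It makes several assumptions: the helper name and the sample assembly lines in the usage note are hypothetical, `std::regex` stands in for FileCheck's own matcher, and `[ \t]+` models FileCheck's default behaviour of treating runs of spaces and tabs as equivalent.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Rough stand-in for the directive `CALL: {{callq|bl}} memcpy`.
// Like FileCheck, it looks for the pattern anywhere in the line rather than
// requiring the whole line to match.
bool matchesCallPattern(const std::string &line) {
  static const std::regex pattern(R"((callq|bl)[ \t]+memcpy)");
  return std::regex_search(line, pattern);
}
```

Under these assumptions, a line such as `"\tbl\tmemcpy"` is accepted, while a line with no call to `memcpy`, for example `"\tldp\tq0, q1, [x1]"`, is not.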

@llvmbot
Member

llvmbot commented Dec 10, 2025

@llvm/pr-subscribers-flang-driver

@alexey-bataev alexey-bataev merged commit deac791 into main Dec 10, 2025
14 checks passed
@alexey-bataev alexey-bataev deleted the users/alexey-bataev/spr/flangpassdisable-memory-intrinsics-expansions branch December 10, 2025 17:36
llvm-sync bot pushed a commit to arm/arm-toolchain that referenced this pull request Dec 10, 2025
This patch disables memory intrinsics expansion, which was enabled by default in
llvm/llvm-project#168622. That patch does the
same in clang, but not in flang.

The expansion causes massive performance regressions, up to 2x, in
Fortran code.

Reviewers: jeanPerier, vzakhari

Reviewed By: vzakhari

Pull Request: llvm/llvm-project#171650
@pawosm-arm
Contributor

It doesn't build :(

ld.lld: error: undefined symbol: llvm::LibcallLoweringModuleAnalysis::Key
>>> referenced by FrontendActions.cpp
>>>               tools/flang/lib/Frontend/CMakeFiles/flangFrontend.dir/FrontendActions.cpp.o:(Fortran::frontend::CodeGenAction::runOptimizationPipeline(llvm::raw_pwrite_stream&))
>>> referenced by FrontendActions.cpp
>>>               tools/flang/lib/Frontend/CMakeFiles/flangFrontend.dir/FrontendActions.cpp.o:(Fortran::frontend::CodeGenAction::runOptimizationPipeline(llvm::raw_pwrite_stream&))

ld.lld: error: undefined symbol: llvm::LibcallLoweringModuleAnalysis::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&)
>>> referenced by FrontendActions.cpp
>>>               tools/flang/lib/Frontend/CMakeFiles/flangFrontend.dir/FrontendActions.cpp.o:(llvm::detail::AnalysisPassModel<llvm::Module, llvm::LibcallLoweringModuleAnalysis, llvm::AnalysisManager<llvm::Module>::Invalidator>::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&))

ld.lld: error: undefined symbol: llvm::LibcallLoweringModuleAnalysisResult::invalidate(llvm::Module&, llvm::PreservedAnalyses const&, llvm::AnalysisManager<llvm::Module>::Invalidator&)
>>> referenced by FrontendActions.cpp
>>>               tools/flang/lib/Frontend/CMakeFiles/flangFrontend.dir/FrontendActions.cpp.o:(llvm::detail::AnalysisResultModel<llvm::Module, llvm::LibcallLoweringModuleAnalysis, llvm::LibcallLoweringModuleAnalysisResult, llvm::AnalysisManager<llvm::Module>::Invalidator, true>::invalidate(llvm::Module&, llvm::PreservedAnalyses const&, llvm::AnalysisManager<llvm::Module>::Invalidator&))

I suspect the majority don't build LLVM with shared libraries, but sadly, we need both shared and static builds for a complete toolchain product...

@alexey-bataev
Member Author

Should be fixed already

@pawosm-arm
Contributor

> Should be fixed already

It builds now, thanks

Labels

flang:driver, flang:fir-hlfir, flang

5 participants