Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

AOT compiled app throws System.PlatformNotSupportedException #5358

Closed
kedare opened this issue Feb 14, 2024 · 16 comments
Closed

AOT compiled app throws System.PlatformNotSupportedException #5358

kedare opened this issue Feb 14, 2024 · 16 comments
Labels
bug Something isn't working

Comments

@kedare
Copy link

kedare commented Feb 14, 2024

Bug Report

List of all OpenTelemetry NuGet
packages
and version that you are
using (e.g. OpenTelemetry 1.0.2):

  • OpenTelemetry 1.7.0
  • OpenTelemetry.Exporter.OpenTelemetryProtocol 1.7.0

Runtime version (e.g. net462, net48, netcoreapp3.1, net6.0 etc. You can
find this information from the *.csproj file): net8.0

Symptom

When compiling with PublishAot=true, I get an System.PlatformNotSupportedException from an unknown origin in the code

What is the expected behavior?

App should output without error

pre otel
post otel

What is the actual behavior?

The application would throw an exception at startup:

pre otel
Unhandled Exception: System.PlatformNotSupportedException: Operation is not supported on this platform.
   at System.Runtime.InteropServices.Marshal.GetExceptionPointers() + 0x28
   at OpenTelemetry.Trace.ExceptionProcessor.OnStart(Activity) + 0x20
   at System.Diagnostics.SynchronizedList`1.EnumWithAction(Action`2, Object) + 0xe4
   at System.Diagnostics.Activity.Start() + 0x188
   at System.Diagnostics.Activity.Create(ActivitySource, String, ActivityKind, String, ActivityContext, IEnumerable`1, IEnumerable`1, DateTimeOffset, ActivityTagsCollection, ActivitySamplingRe
sult, Boolean, ActivityIdFormat, String) + 0x328
   at System.Diagnostics.ActivitySource.CreateActivity(String, ActivityKind, ActivityContext, String, IEnumerable`1, IEnumerable`1, DateTimeOffset, Boolean, ActivityIdFormat) + 0x4d4
   at System.Diagnostics.ActivitySource.StartActivity(String, ActivityKind) + 0x40
   at Program.<Main>$(String[] args) + 0xfc
   at dotnet-otel-bug!<BaseAddress>+0x239d08
[1]    80113 abort

Reproduce

Code in : https://github.com/kedare/dotnet-otel-bug/

Try without AOT, it should work fine :

dotnet publish -c Release --sc --use-current-runtime

Try with AOT, it should throw the exception after the pre otel console output

dotnet publish -c Release --sc --use-current-runtime -p:PublishAot=true

Additional Context

Tested in MacOs (build to Mac binary)
Tested inside docker (build to Linux binary)

I first though the issue was from the AddSource("*") as then I remove it the issue is gone but I guess it's just because then it's not gathering any trace and not hitting the part of the opentelemetry code that triggers this issue)

@kedare kedare added the bug Something isn't working label Feb 14, 2024
@kedare kedare changed the title AOT compiled app triggers System.PlatformNotSupportedException AOT compiled app throws System.PlatformNotSupportedException Feb 14, 2024
@vishweshbankwar
Copy link
Member

@Yun-Ting Could you take a look?

@Yun-Ting
Copy link
Contributor

Hi @kedare, I cloned the code you shared: https://github.com/kedare/dotnet-otel-bug/tree/main and the issue did not repro in my machine.

But I do see this warning emitted with your code:
C:\repos\dotnet-otel-bug\Program.cs(36,1): warning CS8602: Dereference of a possibly null reference. [C:\repos\dotnet-otel-bug\dotnet-otel-bug.csproj]

Could you try whether updating L36 to be
activity?.Stop();
would fix the issue?

@kedare
Copy link
Author

kedare commented Feb 16, 2024

I fixed the null safety but the issue still happens 100% of the time, I will try to make a Dockerfile to make it easier to reproduce

@kedare
Copy link
Author

kedare commented Feb 16, 2024

Hmm somehow I can't reproduce it with this sample project with Docker but it does with our full project.

Is there a way to get more information from the stacktrace from the full project about what could cause this so I can try to reproduce it on the minimal project ?

Unhandled Exception: System.PlatformNotSupportedException: Operation is not supported on this platform.
   at System.Runtime.InteropServices.Marshal.GetExceptionPointers() + 0x28
   at OpenTelemetry.Trace.ExceptionProcessor.OnStart(Activity) + 0x20
   at OpenTelemetry.CompositeProcessor`1.OnStart(T) + 0x2c
   at System.Diagnostics.SynchronizedList`1.EnumWithAction(Action`2, Object) + 0xe4
   at System.Diagnostics.Activity.Start() + 0x188
   at System.Diagnostics.Activity.Create(ActivitySource, String, ActivityKind, String, ActivityContext, IEnumerable`1, IEnumerable`1, DateTimeOffset, ActivityTagsCollection, ActivitySamplingResult, Boolean, ActivityIdFormat, String) + 0x328
   at System.Diagnostics.ActivitySource.CreateActivity(String, ActivityKind, ActivityContext, String, IEnumerable`1, IEnumerable`1, DateTimeOffset, Boolean, ActivityIdFormat) + 0x53c
   at System.Diagnostics.ActivitySource.StartActivity(String, ActivityKind) + 0x40
   at Cardiologs.Pulse.ReplicaManager.Console.Program.<Main>d__6.MoveNext() + 0x338
--- End of stack trace from previous location ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw() + 0x24
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task) + 0x100
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task, ConfigureAwaitOptions) + 0x68
   at Cardiologs.Pulse.ReplicaManager.Console.Program.<Main>(String[] args) + 0x3c
   at replica-manager!<BaseAddress>+0xc8a268

It would be related to the activity start ? Or to something happening inside the activity ?

@Tesouro-Chris
Copy link

Hello there! We encountered this as well recently. I believe this is coming from the ExceptionProcessor when it calls the following

#if NET6_0_OR_GREATER || NETFRAMEWORK
        this.fnGetExceptionPointers = Marshal.GetExceptionPointers;
#else

For AOT, this currently is not supported. https://github.com/dotnet/runtime/blob/cd850233414468a8ffe84fc8fa05a36d0b3b1316/src/coreclr/nativeaot/System.Private.CoreLib/src/System/Runtime/InteropServices/Marshal.NativeAot.cs#L225

public static IntPtr GetExceptionPointers()
        {
            throw new PlatformNotSupportedException();
        }

I'm currently noodling on ways around this, but, honestly, not comfortable enough with the pointer accessors yet to recommend a solution or submit a PR. Hoping to be able to do either of those over the next few days though. Cheers!

@utpilla
Copy link
Contributor

utpilla commented Feb 17, 2024

@kedare As @Tesouro-Chris pointed, it looks like Marshal.GetExceptionPointers is not supported for AOT. SetErrorStatusOnException method which is used to report unhandled exceptions adds an activity processor to the SDK which calls the GetExceptionPointers method. If you don't care about reporting unhandled exceptions, then you could remove the call to SetErrorStatusOnException from your SDK setup to avoid this issue.

@eerhardt @vitek-karas Would you have any inputs for this issue? We probably need to annotate this method.

@eerhardt
Copy link
Contributor

We probably need to annotate this method.

I was thinking the same thing. But annotating it with "RequiresDynamicCode" seems a bit confusing, since it doesn't really require "dynamic code".

Are there other ways to know if an exception is currently being handled?

It would probably be best to guard this and not call it when RuntimeFeature.IsDynamicCodeSupported == false for now, so people aren't getting a PNSE.

cc @agocke @MichalStrehovsky

@MichalStrehovsky
Copy link

I was thinking the same thing. But annotating it with "RequiresDynamicCode" seems a bit confusing, since it doesn't really require "dynamic code".

Yeah, the right thing would be something along the lines of IsSehInteropSupported or something like that. This doesn't have to do with the ability to JIT new code at runtime, but how exceptions are implemented. CoreCLR implements exceptions on top of SEH because of C++/CLI.

I think we could make this return null based on the conversation in dotnet/runtime#26620.

Some additional conversation at dotnet/runtime#75669 (comment).

Cc-ing people involved in those: @AaronRobinsonMSFT @jkoritzinsky @jkotas

@jkotas
Copy link

jkotas commented Feb 20, 2024

I think we could make this return null

Returning null won't make

public override void OnStart(Activity activity)
{
var pointers = this.fnGetExceptionPointers();
if (pointers != IntPtr.Zero)
{
activity.SetTag(ExceptionPointersKey, pointers);
}
}
/// <inheritdoc />
public override void OnEnd(Activity activity)
{
var pointers = this.fnGetExceptionPointers();
if (pointers == IntPtr.Zero)
{
return;
}
var snapshot = activity.GetTagValue(ExceptionPointersKey) as IntPtr?;
if (snapshot != null)
{
activity.SetTag(ExceptionPointersKey, null);
}
if (snapshot != pointers)
{
// TODO: Remove this when SetStatus is deprecated
activity.SetStatus(Status.Error);
// For processors/exporters checking `Status` property.
activity.SetStatus(ActivityStatusCode.Error);
}
}
work correctly. I am not sure what this code is trying to do exactly, but returning null is going to make it misbehave.

I think the proper fix would be to introduce APIs that allow tracking exceptions in flight. We have AppDomain.FirstChanceException event that is one part of it, but we do not have event for when the exception is caught.

@martinjt
Copy link
Member

I'm going to remove the bug label here and close this as resolved. The issue isn't part of OpenTelemetry, or one we are going to fix. It requires upstream fixes in the runtime to make that happen.

Sorry we don't have a better answer for you.

If anyone thinks that there's something that the Otel .NET library can do, please do comment and we can open an issue for tracking that work.

@cijothomas
Copy link
Member

If anyone thinks that there's something that the Otel .NET library can do

What is this PR #5374 then?

@martinjt martinjt reopened this Feb 29, 2024
@martinjt
Copy link
Member

Reopened. My perception was that it wasn't a fix, just making a different, potentially better, error message. If you think it's worth keeping this open while we add a better error message, or I've misunderstood, then I don't mind it being open.

@cijothomas
Copy link
Member

I think we should keep this open, as there is an open PR trying to address this. If we conclude in the PR that this is not something OTel .NET can fix, then yes, we can close.

@kedare
Copy link
Author

kedare commented Mar 27, 2024

Sorry we don't have a better answer for you.

No worry, thank you for your help :)
I was only looking for a more explicit error message (throwing the error during setup instead of runtime is good now), I know AOT still have some limitation on this side. Hopefully this would be possible in a next version of .NET

@martinjt
Copy link
Member

Are you happy, with that PR merged, that we close this?

@kedare
Copy link
Author

kedare commented Sep 7, 2024

Yes that should be good, thank you for your help

@Kielek Kielek closed this as completed Sep 9, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests