Issues with the latest macOS SDK #49

Closed
opened 2026-03-31 11:54:15 -05:00 by wolfv · 13 comments
wolfv commented 2026-03-31 11:54:15 -05:00 (Migrated from github.com)

We ran into issues with the latest macOS SDK: https://github.com/conda-forge/tapi-feedstock/pull/17

We'd be happy about any pointers, e.g. whether the applied fix looks sensible.

tdejager commented 2026-04-01 03:34:48 -05:00 (Migrated from github.com)

Just for some context this was the actual change in the PR:

--- a/src/tapi/tools/libtapi/LinkerInterfaceFile.cpp
+++ b/src/tapi/tools/libtapi/LinkerInterfaceFile.cpp
@@ -179,6 +179,11 @@ static Architecture getArchForCPU(cpu_type_t cpuType, cpu_subtype_t cpuSubType,

   if (enforceCpuSubType)
     return AK_unknown;
+
+  // macOS 26+ SDK .tbd files only carry arm64e slices.
+  // arm64e is a superset of arm64, so use it as fallback.
+  if (arch == AK_arm64 && archs.has(AK_arm64e))
+    return AK_arm64e;
+
   return arch;
 }

Problem

The macOS 26.4 SDK dropped arm64-macos from the target lists in .tbd stub files across the SDK (at least in libSystem, possibly more). Only arm64e-macos remains. I assume the dylibs themselves have been compiled for arm64e for a while already.

When ld64 links for -arch arm64 (which we are doing throughout conda-forge):

  1. getArchForCPU() looks for arm64, doesn't find it, but returns AK_arm64 anyway.
  2. Impl::init() then filters symbols by arm64, matches nothing, and returns an empty LinkerInterfaceFile.

The linker sees 0 exports and fails with undefined symbols for basic libc functions (_pthread_create, _write, etc.). This caused compiler issues for people using the conda-forge compilation toolchains, but only with the newest SDK versions.
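
To make the failure mode concrete, here is a minimal, self-contained sketch of the filtering step. It is only a model: `Arch`, `StubSymbol`, `exportsFor` and `arm64eOnlyStub` are hypothetical stand-ins, not libtapi types or functions, and the stub contents are limited to the two symbols named above.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-ins for illustration; none of this is the real libtapi API.
enum class Arch { arm64, arm64e, x86_64 };

struct StubSymbol {
  std::string name;
  std::set<Arch> archs; // architectures the .tbd lists for this symbol
};

// Models the filtering step: keep only symbols listed for the selected arch.
std::vector<std::string> exportsFor(const std::vector<StubSymbol> &stub,
                                    Arch selected) {
  std::vector<std::string> out;
  for (const auto &sym : stub)
    if (sym.archs.count(selected))
      out.push_back(sym.name);
  return out;
}

// A stub shaped like the arm64e-only .tbd described in this comment.
std::vector<StubSymbol> arm64eOnlyStub() {
  return {StubSymbol{"_pthread_create", {Arch::arm64e}},
          StubSymbol{"_write", {Arch::arm64e}}};
}
```

In this model, `exportsFor(arm64eOnlyStub(), Arch::arm64)` comes back empty, which corresponds to the "0 exports" the linker sees, while selecting `Arch::arm64e` returns both symbols.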

Why we figured this might be okay

We figured that the patched code is only reachable when arm64 is absent from the .tbd file. For context, the patched function already returns early on an exact match:

auto arch = getArchitectureFromCpuType(cpuType, cpuSubType);
if (archs.has(arch))
    return arch;  // arm64 found, patch never reached

Without the fallback the result is always zero symbols, which was the issue. So hopefully this does not break more than was already broken.

Again, this assumes the actual dylib is compiled for arm64e, which I think it should be if they dropped arm64 from the .tbd.
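
The decision order argued above can be sketched as a small standalone model. This is a simplification under stated assumptions: `pickArch` and `ArchSet` are hypothetical names, the requested architecture is passed in directly instead of being derived from `cpu_type_t`/`cpu_subtype_t`, and any checks in the real function that are not quoted in this thread are left out.

```cpp
#include <cassert>
#include <set>

// Hypothetical model types; the AK_* names mirror the patch, but this enum
// is defined locally for illustration.
enum Arch { AK_unknown, AK_arm64, AK_arm64e, AK_x86_64 };
using ArchSet = std::set<Arch>;

// Simplified model of the patched decision order.
Arch pickArch(Arch requested, bool enforceCpuSubType, const ArchSet &archs) {
  if (archs.count(requested))
    return requested;   // exact match: the fallback is never reached
  if (enforceCpuSubType)
    return AK_unknown;  // caller insisted on an exact subtype
  if (requested == AK_arm64 && archs.count(AK_arm64e))
    return AK_arm64e;   // the patch: arm64 absent, arm64e present
  return requested;     // pre-patch behaviour, unchanged
}
```

The only input whose result changes compared to the unpatched order is "arm64 requested, arm64 absent, arm64e present, subtype not enforced", which is the case that previously produced zero symbols.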

Possible concern

arm64e adds pointer authentication (PAC), so an arm64e .tbd slice could in theory list symbols that don't exist in the arm64 runtime slice. If code referenced such a symbol, it would link successfully but dyld would fail at load time, since the arm64 slice of the real dylib wouldn't export it. Since (I think) only Apple uses the .tbd files, I figured this should be okay in practice.
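
As a rough illustration of this concern, the risky symbols are exactly the ones present in the stub's export list but absent from the runtime slice. The sketch below assumes both lists are already available as plain sets of names; `SymbolSet` and `missingAtLoadTime` are hypothetical helpers, and any symbol names fed to it are for the caller to supply.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <set>
#include <string>

using SymbolSet = std::set<std::string>;

// Symbols the linker would accept from the stub but that the runtime slice
// does not export, i.e. candidates for a load-time failure.
SymbolSet missingAtLoadTime(const SymbolSet &stubExports,
                            const SymbolSet &runtimeExports) {
  SymbolSet missing;
  // std::set iterates in sorted order, as std::set_difference requires.
  std::set_difference(stubExports.begin(), stubExports.end(),
                      runtimeExports.begin(), runtimeExports.end(),
                      std::inserter(missing, missing.end()));
  return missing;
}
```

If the result is empty for every library in the SDK, the fallback cannot introduce a link-succeeds-but-load-fails case; whether that holds for the real SDK is not something this thread establishes.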

Question

However, a concern was raised in https://github.com/conda-forge/tapi-feedstock/issues/16 and the issue was re-opened.

We would like your opinion on whether this is the right place to fix this, whether there is a better approach, or whether this is something we could upstream.

lucascolley commented 2026-04-01 04:15:17 -05:00 (Migrated from github.com)

Perhaps @Developer-Ecosystem-Engineering can help?

Un1q32 commented 2026-04-02 10:44:11 -05:00 (Migrated from github.com)

The real solution is for Apple to release the tapi 1700 and 2100 code. It's been a year since they released 1600 sources.
I'm still on 1300 because of issues with x86_64h.

tdejager commented 2026-04-02 12:06:42 -05:00 (Migrated from github.com)

You mean they probably have fixes for this and more?

Un1q32 commented 2026-04-02 12:21:07 -05:00 (Migrated from github.com)

I mean of course they do, building with the macOS 26 SDK for arm64 works on macOS with the Xcode utilities. They are just sitting on the solution.

tdejager commented 2026-04-02 13:15:56 -05:00 (Migrated from github.com)

Yeah that makes a lot of sense, so I guess this fix can be anyone's guess then 😅

Un1q32 commented 2026-04-02 15:56:45 -05:00 (Migrated from github.com)

They're probably doing the same thing if I were to guess. I can't think of any better way to do it. If the 1700/2100 sources get released and it turns out they're doing something different, the PR can just be reverted and we backport the real fix instead. Or you could decompile the 1700 version from Xcode 26.

tdejager commented 2026-04-02 23:54:29 -05:00 (Migrated from github.com)

Pretty good idea with the decompilation. I decompiled the 2100 version, though, with Claude.

As mentioned, I had Claude do the decompilation and grilled it a bit about the result. The Apple code seems to call a function that iterates over the architecture set to find a compatible architecture, something like:

--- a/src/tapi/tools/libtapi/LinkerInterfaceFile.cpp
+++ b/src/tapi/tools/libtapi/LinkerInterfaceFile.cpp
@@ -162,6 +162,19 @@ static Architecture getArchForCPU(cpu_type_t cpuType, cpu_subtype_t cpuSubType,
   return AK_unknown;
 }

+// Find the first architecture in the set that has the same cpu type as the
+// requested one. This should mirror the approach in tapi 2100 for handling macOS 26+
+// SDK .tbd files that only carry arm64e slices when arm64 is requested.
+static Architecture findCompatibleArch(ArchitectureSet archs,
+                                       Architecture requestedArch) {
+  auto requestedCPUType = getCPUTypeFromArchitecture(requestedArch).first;
+  for (auto arch : archs) {
+    if (getCPUTypeFromArchitecture(arch).first == requestedCPUType)
+      return arch;
+  }
+  return AK_unknown;
+}
+
 static Architecture getArchForCPU(cpu_type_t cpuType, cpu_subtype_t cpuSubType,
                                   bool enforceCpuSubType,
                                   ArchitectureSet archs) {
@@ -179,6 +192,7 @@ static Architecture getArchForCPU(cpu_type_t cpuType, cpu_subtype_t cpuSubType,

   if (enforceCpuSubType)
     return AK_unknown;
-  return arch;
+
+  return findCompatibleArch(archs, arch);
 }

Is this something you'd want to have a PR for?
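
For anyone who wants to poke at the fallback in isolation, here is a self-contained model of it. The assumptions: `CpuFamily` and `cpuFamilyOf` are hypothetical stand-ins for `getCPUTypeFromArchitecture(...).first`, the architecture set is modelled as a `std::vector` in iteration order, and the exact-match early return that precedes the fallback in the real function is not included.

```cpp
#include <cassert>
#include <vector>

// Hypothetical model types for illustration only.
enum Arch { AK_unknown, AK_x86_64, AK_x86_64h, AK_arm64, AK_arm64e };
enum CpuFamily { CPU_none, CPU_x86_64, CPU_arm64 };

// In Mach-O, arm64/arm64e share one CPU type and differ only by subtype;
// the same holds for x86_64/x86_64h.
CpuFamily cpuFamilyOf(Arch a) {
  switch (a) {
  case AK_x86_64:
  case AK_x86_64h:
    return CPU_x86_64;
  case AK_arm64:
  case AK_arm64e:
    return CPU_arm64;
  default:
    return CPU_none;
  }
}

// Returns the first architecture in the set with the same CPU type as the
// requested one, or AK_unknown if there is none.
Arch findCompatibleArch(const std::vector<Arch> &archs, Arch requested) {
  const CpuFamily wanted = cpuFamilyOf(requested);
  for (Arch a : archs)
    if (cpuFamilyOf(a) == wanted)
      return a;
  return AK_unknown;
}
```

Compared with the first patch, this version is not specific to arm64: in the model a request for x86_64h against a set that only has x86_64 resolves to x86_64, and a request with no same-family entry yields AK_unknown instead of the requested arch.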

tpoechtrager commented 2026-04-03 01:09:08 -05:00 (Migrated from github.com)

Sure - a PR would be more than welcome!

Developer-Ecosystem-Engineering commented 2026-04-03 11:11:46 -05:00 (Migrated from github.com)

Following the lead of what TAPI has done seems like a reasonable approach for this problem. Apologies that we cannot offer more, including answers to "when will a new drop be provided." We do read every @.

tdejager commented 2026-04-03 13:14:40 -05:00 (Migrated from github.com)

@tpoechtrager okay, I opened the PR! BTW @Un1q32, if you want me to look at the x86_64h issue, and it's something that decompilation with Claude could investigate (I have a session with context, where it can get going quickly), I'm happy to do so if you give me some pointers about the problem.

Un1q32 commented 2026-04-03 15:09:10 -05:00 (Migrated from github.com)

OOOH is this why x86_64h doesn't work on 1600?
For context, x86_64h fails to link with a tapi dylib built from the 1600 sources, but Xcode's 1600 dylib works (at least later versions). I've been waiting forever for the 1700 sources to be released, hopefully with the fix. In the meantime I've been stuck on 1300, since 1500 has major performance issues and I couldn't build 1400.

tdejager commented 2026-04-07 04:07:57 -05:00 (Migrated from github.com)

Let's close this now that it's merged :)


Reference
miles/apple-libtapi#49