Author: eddiecmchen, PCG client development engineer
Lead-in: Prop my iPhone XR back up; it can hold out a little longer.
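Background
Back in the iOS 11 days, Apple announced that every app submitted to the App Store must support 64-bit, and that 32-bit apps would no longer be accepted or allowed to run.
So what does an app gain from moving to 64-bit?
The point to note here: virtual memory is indeed larger than on pure 32-bit, but how much of it can an app actually use, and is it anywhere near the advertised 16 EB? We will dig into that below. First, let's look at a crash.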
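A long-standing ghost
Let's start with the following memory-related crash: when JSC tried to allocate memory through bmalloc, it reported OOM, which resulted in a SIGTRAP.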
Last Exception :
0  JavaScriptCore 0x000000018b777570 _pas_panic_on_out_of_memory_error
1  JavaScriptCore 0x000000018b72e918 _bmalloc_try_iso_allocate_impl_impl_slow
2  JavaScriptCore 0x000000018b73d3d8 _bmalloc_heap_config_specialized_local_allocator_try_allocate_small_segregated_slow + 5952
3  JavaScriptCore 0x000000018b7276f8 _bmalloc_allocate_impl_casual_case + 800
4  JavaScriptCore 0x000000018c60d494 JSC::PropertyTable::create + 244
5  JavaScriptCore 0x000000018c66ba74 JSC::Structure::materializePropertyTable(JSC::VM&, bool) + 324
6  JavaScriptCore 0x000000018c66dfac JSC::Structure::changePrototypeTransition(JSC::VM&, JSC::Structure*, JSC::JSValue, JSC::DeferredStructureTransitionWatchpointFire&) + 612
7  JavaScriptCore 0x000000018c559930 JSC::JSObject::setPrototypeDirect(JSC::VM&, JSC::JSValue) + 192
8  JavaScriptCore 0x000000018c559e40 JSC::JSObject::setPrototypeWithCycleCheck(JSC::VM&, JSC::JSGlobalObject*, JSC::JSValue, bool) + 316
9  JavaScriptCore 0x000000018c4f580c JSC::globalFuncProtoSetter(JSC::JSGlobalObject*, JSC::CallFrame*) + 192
10 JavaScriptCore 0x000000018ba1f7a8 _vmEntryToNative + 280
11 JavaScriptCore 0x000000018c1b0cd0 JSC::Interpreter::executeCall(JSC::JSGlobalObject*, JSC::JSObject*, JSC::CallData const&, JSC::JSValue, JSC::ArgList const&) + 616
12 JavaScriptCore 0x000000018c474ecc JSC::GetterSetter::callSetter(JSC::JSGlobalObject*, JSC::JSValue, JSC::JSValue, bool) + 212
13 JavaScriptCore 0x000000018c5b6264 JSC::JSGenericTypedArrayView<JSC::Uint8Adaptor>::put(JSC::JSCell*, JSC::JSGlobalObject*, JSC::PropertyName, JSC::JSValue, JSC::PutPropertySlot&) + 612
14 JavaScriptCore 0x000000018c2c2ecc _llint_slow_path_put_by_id + 3244
// redundant repeated frames omitted
37 JavaScriptCore 0x000000018ba1f5fc _vmEntryToJavaScript + 264
38 JavaScriptCore 0x000000018c1b0c7c JSC::Interpreter::executeCall(JSC::JSGlobalObject*, JSC::JSObject*, JSC::CallData const&, JSC::JSValue, JSC::ArgList const&) + 532
39 JavaScriptCore 0x000000018bac7ae4 _JSObjectCallAsFunction + 568
40 mttlite 0x0000000102a54914 hippy::napi::JSCCtx::CallFunction(std::__1::shared_ptr<hippy::napi::CtxValue> const&, unsigned long, std::__1::shared_ptr<hippy::napi::CtxValue> const*) (js_native_api_value_jsc.cc:406)
41 mttlite 0x0000000102a664e0 _ZNSt3__110__function6__funcIZN11TimerModule5StartERKN5hippy4napi12CallbackInfoEbE3$_4NS_9allocatorIS8_EEFvvEEclEv (memory:3237)
42 mttlite 0x0000000102a63018 hippy::base::TaskRunner::Run() (memory:3237)
43 mttlite 0x0000000102a64974 ThreadEntry (thread.cc:0)
44 libsystem_pthread.dylib 0x00000001dc129348 __pthread_start + 116
Exception Type: SIGTRAP
Exception Codes: fault addr: 0x000000018b777570
Crashed Thread: 48 hippy.js
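This OOM differs from the usual OOM on iOS. As commonly understood, when an app runs short of memory, the system's Jetsam mechanism kills it and leaves a Jetsam entry in the system log, so in theory nothing should show up in crash reporters such as Bugly. Yet crashes of this kind keep being reported, and the low-memory crash stacks take many different forms.
The JSC crash above has been around for a long time, and the crash stacks all fall within JSC executing JS code. The long-standing lack of JS monitoring and debugging tools meant the problem stayed unsolved.
Although the stack clearly names OOM as the cause, we observed that quite a few affected users actually still had plenty of physical memory available.
Two years ago, while browsing, I happened upon a case summary from colleagues on the 微视 (Weishi) team: 《OOM与内存》 ("OOM and Memory").
At the time I also discussed with colleagues on the hippy SDK whether we were hitting a similar out-of-memory condition. But JSC was a black box to all of us, and the crashing JS stacks were not precise. The advice back then was to avoid loading JSC in the background where possible. In the end the problem was not solved.
Two years later, once the browser integrated flutter, JS crashes of this kind doubled outright. There was no way around it: we had to look at how memory allocation works in VMs such as JSC and the Dart VM, and dig for a fix (or at least a mitigation).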
Virtual memory allocation in JSC and the Dart VM
Reading through the memory-management code of these VMs shows that their low-level allocation is built on mmap.
    ... PROT_WRITE, MAP_PRIVATE | MAP_ANON | BMALLOC_NORESERVE, static_cast<int>(usage), 0);
    if (result == MAP_FAILED)
        return nullptr;
    return result;
      ... PROT_WRITE | (is_executable ? PROT_EXEC : 0);
  int map_flags = MAP_PRIVATE | MAP_ANONYMOUS;
#if (defined(DART_HOST_OS_MACOS) && !defined(DART_HOST_OS_IOS))
  if (is_executable && IsAtLeastOS10_14()) {
    map_flags |= MAP_JIT;
  }
#endif  // defined(DART_HOST_OS_MACOS)

  // Some 64-bit microarchitectures store only the low 32-bits of targets as
  // part of indirect branch prediction, predicting that the target's upper bits
  // will be same as the call instruction's address. This leads to misprediction
  // for indirect calls crossing a 4GB boundary. We ask mmap to place our
  // generated code near the VM binary to avoid this.
  void* hint = is_executable ? reinterpret_cast<void*>(&Allocate) : nullptr;
  void* address = mmap(hint, size, prot, map_flags, -1, 0);
  if (address == MAP_FAILED) {
    return nullptr;
  }
  return new VirtualMemory(address, size);
}

VirtualMemory::~VirtualMemory() {
  if (address_ != nullptr) {
    if (munmap(address_, size_) != 0) {
      int error = errno;
      const int kBufferSize = 1024;
      char error_buf[kBufferSize];
      FATAL("munmap error: %d (%s)", error,
            Utils::StrError(error, error_buf, kBufferSize));
    }
  }
}
When map_flags contains MAP_ANON and fd is passed as -1, mmap allocates anonymous virtual memory directly and does not depend on a file descriptor.
How mmap is implemented in xnu
        ... ||
        (((flags & MAP_PRIVATE) != MAP_PRIVATE) &&
        ((flags & MAP_SHARED) != MAP_SHARED)) ||
        (len == 0)) {
        cerror_nocancel(EINVAL);
        return(MAP_FAILED);
    }

    void *ptr = __mmap(addr, len, prot, flags, fildes, off);

    if (__syscall_logger) {
        int stackLoggingFlags = stack_logging_type_vm_allocate;
        if (flags & MAP_ANON) {
            stackLoggingFlags |= (fildes & VM_FLAGS_ALIAS_MASK);
        } else {
            stackLoggingFlags |= stack_logging_type_mapped_file_or_shared_mem;
        }
        __syscall_logger(stackLoggingFlags, (uintptr_t)mach_task_self(), (uintptr_t)len, 0, (uintptr_t)ptr, 0);
    }

    return ptr;
The call above is passed down to the kernel, where it reaches the mmap implementation in kern_mman.c.
/*
 * XXX Internally, we use VM_PROT_* somewhat interchangeably, but the correct
 * XXX usage is PROT_* from an interface perspective.  Thus the values of
 * XXX VM_PROT_* and PROT_* need to correspond.
 */
int
mmap(...)
{
    /*
     * (part of the code above is omitted)
     */
    result = vm_map_enter_mem_object(user_map,
        &user_addr, user_size,
        0, alloc_flags, vmk_flags,
        tag,
        IPC_PORT_NULL, 0, FALSE,
        prot, maxprot,
        (flags & MAP_SHARED) ? VM_INHERIT_SHARE : VM_INHERIT_DEFAULT);

    /* If a non-binding address was specified for this anonymous
     * mapping, retry the mapping with a zero base
     * in the event the mapping operation failed due to
     * lack of space between the address and the map's maximum.
     */
    if ((result == KERN_NO_SPACE) && ((flags & MAP_FIXED) == 0) && user_addr && (num_retries++ == 0)) {
        user_addr = vm_map_page_size(user_map);
        goto map_anon_retry;
    }
    /*
     * (part of the code below is omitted)
     */
}
That function in turn calls vm_map_enter_mem_object in vm_map.c, which ultimately performs the allocation for the object in vm_map_enter:
// Only the header is quoted below to give a rough idea; I have not debugged this code myself.
/*
 *	Routine:	vm_map_enter
 *
 *	Description:
 *		Allocate a range in the specified virtual address map.
 *		The resulting range will refer to memory defined by
 *		the given memory object and offset into that object.
 *
 *		Arguments are as defined in the vm_map call.
 */
kern_return_t
vm_map_enter(...)
During allocation, vm_map_enter checks hole_entry->vme_end; vme_end marks the upper bound of the allocatable space.
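The range of virtual memory allocation on xnu
Originally I had only noticed that Apple added the com.apple.developer.kernel.increased-memory-limit entitlement in iOS 15. Treating it as a last-ditch attempt, I tried adding that entitlement in a new release to relieve part of the problem.
Then I happened to see some developers asking about it and learned that it can be combined with com.apple.developer.kernel.extended-virtual-addressing. That made things click, and a quick search turned up an investigation published in February this year by an expert overseas: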
Size Matters: An Exploration of Virtual Memory on iOS
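The article explains the iOS memory-management mechanism and shows that the virtual address space has an upper limit that differs by device model. The code is as follows: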
#define ARM64_MIN_MAX_ADDRESS ... // end of shared region + 512MB for various purposes
const vm_map_offset_t min_max_offset = ARM64_MIN_MAX_ADDRESS; // end of shared region + 512MB for various purposes
if (arm64_pmap_max_offset_default) {
    max_offset_ret = arm64_pmap_max_offset_default;
} else if (max_mem > 0xC0000000) {
    max_offset_ret = min_max_offset + 0x138000000; // Max offset is 13.375GB for devices with > 3GB of memory
} else if (max_mem > 0x40000000) {
    max_offset_ret = min_max_offset + 0x38000000; // Max offset is 9.375GB for devices with > 1GB and <= 3GB of memory
} else {
    max_offset_ret = min_max_offset;
}
It also summarizes the limits and the corresponding device models in a table:
| RAM | Address Space | Usable | Devices |
| --- | --- | --- | --- |
| > 3 GiB | 15.375 GiB | 7.375 GiB | iPhone XS – iPhone 13; iPad Air; iPad Pro (12.9-inch), (10.5-inch), (11-inch) |
| > 1 GiB | 11.375 GiB | 3.375 GiB | iPhone 6s – X, SE, XR; iPad – iPad (8th generation); iPad Air 2, iPad Air (3rd generation); iPad mini 4, iPad mini; iPad Pro (9.7-inch) |
| <= 1 GiB | 10.5 GiB | 2.5 GiB | iPhone 5s, iPhone 6; iPad Air; iPad mini 2, iPad mini 3 |
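The Address Space column can be checked against the branch logic of the xnu snippet with a little arithmetic. The sketch below is an illustration only: device_max_offset and to_gib are hypothetical names, the boot-arg override (arm64_pmap_max_offset_default) is left out, and min_max_offset is passed in by the caller because the macro's value is not shown in the excerpt. Taking 10.5 GiB for it (an assumption read off the table's bottom row, where the limit equals min_max_offset) reproduces the other two rows.

```c
#include <stdint.h>

/* Hypothetical re-statement of the per-device branches from the xnu
 * snippet. Assumption: the caller supplies min_max_offset, since the
 * value of ARM64_MIN_MAX_ADDRESS is not reproduced in the excerpt. */
static uint64_t device_max_offset(uint64_t max_mem, uint64_t min_max_offset) {
    if (max_mem > 0xC0000000ULL) {                 /* more than 3 GiB of RAM */
        return min_max_offset + 0x138000000ULL;    /* adds 4.875 GiB */
    }
    if (max_mem > 0x40000000ULL) {                 /* more than 1 GiB of RAM */
        return min_max_offset + 0x38000000ULL;     /* adds 0.875 GiB */
    }
    return min_max_offset;
}

/* Convert a byte count to GiB (2^30 bytes). */
static double to_gib(uint64_t bytes) {
    return (double)bytes / (1024.0 * 1024.0 * 1024.0);
}
```

With min_max_offset = 0x2A0000000 (10.5 GiB), a device with 4 GiB of RAM gets 10.5 + 4.875 = 15.375 GiB and a device with 3 GiB gets 10.5 + 0.875 = 11.375 GiB, matching the table. In each row the Usable figure is the Address Space figure minus 8 GiB.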
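The xnu source also reveals a jumbo mechanism in the kernel's memory allocation. When an iOS app carries the designated entitlement, the xnu kernel applies jumbo mode, and the virtual address space limit is set straight to the maximum of 64 GB: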
if (...) {
    if (arm64_pmap_max_offset_default) {
        // Allow the boot-arg to override jumbo size
        max_offset_ret = arm64_pmap_max_offset_default;
    } else {
        max_offset_ret = MACH_VM_MAX_ADDRESS; // Max offset is 64GB for pmaps with special "jumbo" blessing
    }
}
This limit is adjusted when the process starts; the code can be found in kern_exec.c:
/*
 * Apply the requested maximum address.
 */
if (...) {
    struct _posix_spawnattr *psa = (struct _posix_spawnattr *) imgp->ip_px_sa;

    if (psa->psa_max_addr) {
        vm_map_set_max_addr(get_task_map(new_task), (vm_map_offset_t)psa->psa_max_addr);
    }
}
A sparsely documented entitlement
com.apple.developer.kernel.extended-virtual-addressing
Apple's documentation gives only a brief description of this entitlement:
Use this entitlement if your app has specific needs that require a larger addressable space. For example, games that memory map assets to stream to the GPU may benefit from a larger address space.
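In other words, a game that memory-maps its assets in order to stream them to the GPU for rendering can run more efficiently with a larger address space.
Judging from the description, setting this option turns on the xnu jumbo mode discussed above, and the enlarged address space is just what is needed to address the crash above.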
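Running a stress test
To verify the limit of address allocation, here is a simple experiment:
Allocate memory repeatedly with malloc, with the block size pinned at 1009 bytes. (For why 1009 bytes, see "(ios 内核)源码解读(3) 详解ios是怎么malloc的(上)" on 钟路成的博客, luchengzhong.github.io.)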
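// The loop header was lost in the source; an unbounded counter is assumed here.
for (unsigned long i = 0; ; i++) {
    void *a = malloc(1009);
    if (a == NULL) {
        NSLog(@"error count: %lu", i);
        break;
    }
}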
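The results:
size = 1009 > SMALL_THRESHOLD
Before the address-space extension, malloc started failing at about 7065482 * 1009 bytes = 6.63 GB
After the address-space extension, malloc started failing at about 56753881 * 1009 bytes = 53.33 GB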
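The two figures can be re-derived from the failure counts. The following C sketch only reproduces the arithmetic; total_bytes and bytes_to_gib are hypothetical helper names, and the calculation counts requested bytes only, so allocator rounding and metadata (which consume additional address space) are ignored.

```c
#include <stdint.h>

/* Total number of bytes requested before malloc first returned NULL. */
static uint64_t total_bytes(uint64_t count, uint64_t block_size) {
    return count * block_size;
}

/* Convert a byte count to GiB (2^30 bytes). */
static double bytes_to_gib(uint64_t bytes) {
    return (double)bytes / (1024.0 * 1024.0 * 1024.0);
}
```

For the numbers above, 7065482 * 1009 = 7129071338 bytes, roughly 6.64 GiB, and 56753881 * 1009 = 57264665929 bytes, roughly 53.33 GiB. The "GB" in the results is therefore measured in units of 2^30 bytes.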
The xnu unit tests also contain test code for jumbo mode, and it is in line with the result above: about 53 GB can be allocated.
#define GB ...

/*
 * This test expects the entitlement to be the enabling factor for a process to
 * allocate at least this many GB of VA space. i.e. with the entitlement, n GB
 * must be allocatable; whereas without it, it must be less.
 * This value was determined experimentally to fit on applicable devices and to
 * be clearly distinguishable from the default VA limit.
 */
#define ALLOC_TEST_GB 53

T_DECL(TESTNAME,
    "Verify that a required entitlement is present in order to be granted an extra-large "
    "VA space on arm64",
    T_META_NAMESPACE("xnu.vm"),
    T_META_CHECK_LEAKS(false))
{
    int i;
    void *res;

    if (!dt_64_bit_kernel()) {
        T_SKIP("This test is only applicable to arm64");
    }

    T_LOG("Attemping to allocate VA space in 1 GB chunks.");

    for (i = 0; i < (ALLOC_TEST_GB * 2); i++) {
        res = mmap(NULL, 1 * GB, PROT_NONE, MAP_PRIVATE ...
So with com.apple.developer.kernel.extended-virtual-addressing enabled, the space the kernel can hand out does increase markedly.
Rollout results and conclusion
In brief:
The source-code analysis above reflects only my own reading; please point out any mistakes.