Currently most, if not all, versions of get_unmapped_area perform a linear search of the process virtual address space to find free space.
Some, like arch_get_unmapped_area_topdown for x86-64, even call find_vma at each step of the scan, which does a fresh walk down the rb-tree of vmas every time.
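
To make the cost concrete, here is a rough userspace sketch of the kind of first-fit scan I mean. It is not the kernel code: struct range and first_fit are hypothetical names standing in for the vma list and the search loop, and 0 is used as the "no space" return value on the assumption that lo is never 0.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a vma: a mapped range [start, end). */
struct range {
    uintptr_t start;
    uintptr_t end;
};

/*
 * First-fit scan over mappings sorted by start address.
 * Returns the lowest address in [lo, hi) followed by len free bytes,
 * or 0 if no such gap exists. Cost is O(n) in the number of mappings.
 */
static uintptr_t first_fit(const struct range *maps, size_t n,
                           uintptr_t lo, uintptr_t hi, size_t len)
{
    uintptr_t addr = lo;

    for (size_t i = 0; i < n; i++) {
        if (maps[i].end <= addr)
            continue;               /* mapping lies entirely below the candidate */
        if (maps[i].start >= addr && maps[i].start - addr >= len)
            break;                  /* the gap before this mapping is big enough */
        addr = maps[i].end;         /* otherwise skip past it and keep scanning */
    }
    /* Either we found a gap or we ran off the end; check the upper limit. */
    return (addr <= hi && hi - addr >= len) ? addr : 0;
}
```

Every request walks the mappings from the start of the search window, so a process with many small mappings pays for all of them on each mmap.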
Instead, they should use one of the following, from slowest to fastest:
- O(n) but faster: a linked list of virtual address space holes
- O(log(n)): an rb-tree of virtual address space holes indexed by size
- O(1): a buddy allocator of virtual address space holes, or another scheme with buckets
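
As an illustration of the last option, here is a minimal userspace sketch of a bucketed hole index. It is only one possible scheme, not a proposal for the actual kernel data structure: struct hole, struct hole_index, hole_insert and hole_alloc are all made-up names, sizes are counted in pages, holes are assumed to be smaller than 2^48 pages, and coalescing of adjacent holes on unmap is left out entirely.

```c
#include <stddef.h>
#include <stdint.h>

#define NCLASS 48                   /* one bucket per power-of-two size class */

struct hole {
    uint64_t start;                 /* first free page number */
    uint64_t pages;                 /* hole length in pages, 0 < pages < 2^48 */
    struct hole *next;
};

struct hole_index {
    struct hole *bucket[NCLASS];    /* bucket[k]: holes with 2^k <= pages < 2^(k+1) */
    uint64_t nonempty;              /* bit k is set while bucket[k] is non-empty */
};

/* floor(log2(n)) for n > 0 */
static unsigned log2_floor(uint64_t n)
{
    unsigned k = 0;
    while (n >>= 1)
        k++;
    return k;
}

static void hole_insert(struct hole_index *ix, struct hole *h)
{
    unsigned k = log2_floor(h->pages);
    h->next = ix->bucket[k];
    ix->bucket[k] = h;
    ix->nonempty |= UINT64_C(1) << k;
}

/*
 * Carve `pages` pages from the front of a hole that is guaranteed to fit.
 * Returns the starting page number, or UINT64_MAX if no bucket can serve it.
 * The work is bounded by NCLASS, independent of the number of holes.
 */
static uint64_t hole_alloc(struct hole_index *ix, uint64_t pages)
{
    if (pages == 0 || pages > (UINT64_C(1) << (NCLASS - 1)))
        return UINT64_MAX;

    /* Smallest class in which every hole is large enough: ceil(log2(pages)). */
    unsigned k = log2_floor(pages) + ((pages & (pages - 1)) != 0);
    uint64_t candidates = ix->nonempty & ~((UINT64_C(1) << k) - 1);
    if (candidates == 0)
        return UINT64_MAX;
    while (!((candidates >> k) & 1))
        k++;                        /* lowest non-empty class at or above k */

    struct hole *h = ix->bucket[k];
    ix->bucket[k] = h->next;
    if (!ix->bucket[k])
        ix->nonempty &= ~(UINT64_C(1) << k);

    uint64_t addr = h->start;
    h->start += pages;
    h->pages -= pages;
    if (h->pages)
        hole_insert(ix, h);         /* remainder goes back into its smaller class */
    return addr;                    /* a fully used hole is left to the caller */
}
```

The trade-off in this sketch is that rounding the request up to a size class can skip a hole that would have fit exactly (a 3-page request ignores a 3-page hole sitting in the 2..3 bucket), so it gives up some packing density in exchange for constant-time lookup. It also ignores address hints and top-down placement, which a real get_unmapped_area would have to honor.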
Is there any reason this issue hasn't been fixed yet (i.e., any reason why none of the proposed schemes is feasible)?
Workloads doing a lot of mmaps tend to suffer greatly, especially on the versions that do a find_vma for each step of the scan.
One example is OpenGL drivers using DRM/GEM/TTM that don't employ userspace caching and suballocation of TTM-allocated buffers.