I created a thread-safe memory allocator (hmalloc) optimized for small allocations. I used multiple techniques to increase the performance of my allocator. For example, I utilized a binning system so that I separate the way I handle small and large allocations. Each thread uses a different binning ‘arena’ to avoid resource locking as much as possible, but a global bins cache is available to all threads so that memory can be freed in a different thread than it was allocated in. I also coalesced memory blocks when helpful to reduce my overall space-complexity. I kept a list of all the free memory blocks separated by bin size. I implemented this as circular doubly-linked list in order to have constant time freeing and (amortized constant time) allocating. When tested on allocating and freeing linked-lists with 1000+ small-sized nodes and dynamically resizing vectors with 1000+ small-sized items in them, my implementation (hmalloc) could actually out-perform (in terms of time complexity) malloc. Here are results of the trial times below (three trials each):

Times Vectors Lists
malloc trial 1 0.046s 0.073s
malloc trial 2 0.042s 0.073s
malloc trial 3 0.04s 0.067s
hmalloc trial 1 0.039s 0.049s
hmalloc trial 2 0.043s 0.049s
hmalloc trial 3 0.029s 0.048s

Tech: C

Sorry, this project is in a private repo.

If you'd like to see the code, please feel free to email me.