igrunert 3 hours ago

I recently ported WebKit's libpas memory allocator[1] to Windows; it used pthreads on the Linux and Darwin ports. Depending on which pthreads features you're using, it's not that much code to shim them to Windows APIs. It's around 200 LOC[2] for WebKit's usage, which is a lot smaller than pthread-win32. A rough sketch of the mutex/condvar part of such a shim is below.

[1] https://github.com/WebKit/WebKit/pull/41945 [2] https://github.com/WebKit/WebKit/blob/main/Source/bmalloc/li...
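
To illustrate the kind of shim involved (this is not the WebKit code, just a sketch of the usual mapping, with error handling, attributes, and timed waits omitted): a plain pthread mutex and condition variable translate almost one-to-one onto Windows SRW locks and condition variables.

  #include <windows.h>

  /* Minimal pthread-style mutex/condvar shim over SRW locks.
     A sketch only: no error reporting, attributes, timed waits,
     or recursive mutexes; the destroy functions would be no-ops. */
  typedef SRWLOCK pthread_mutex_t;
  typedef CONDITION_VARIABLE pthread_cond_t;

  static int pthread_mutex_init(pthread_mutex_t *m, void *attr)
  {
      (void)attr;
      InitializeSRWLock(m);
      return 0;
  }

  static int pthread_mutex_lock(pthread_mutex_t *m)
  {
      AcquireSRWLockExclusive(m);
      return 0;
  }

  static int pthread_mutex_unlock(pthread_mutex_t *m)
  {
      ReleaseSRWLockExclusive(m);
      return 0;
  }

  static int pthread_cond_init(pthread_cond_t *c, void *attr)
  {
      (void)attr;
      InitializeConditionVariable(c);
      return 0;
  }

  static int pthread_cond_wait(pthread_cond_t *c, pthread_mutex_t *m)
  {
      /* Caller holds m exclusively, as with pthreads; flags 0 means
         the lock was taken in exclusive mode. */
      SleepConditionVariableSRW(c, m, INFINITE, 0);
      return 0;
  }

  static int pthread_cond_signal(pthread_cond_t *c)
  {
      WakeConditionVariable(c);
      return 0;
  }

  static int pthread_cond_broadcast(pthread_cond_t *c)
  {
      WakeAllConditionVariable(c);
      return 0;
  }

Thread creation, once-init, and thread-specific keys have similar, if slightly less direct, counterparts (CreateThread or _beginthreadex, InitOnceExecuteOnce, FlsAlloc/TlsAlloc), which is part of why a narrow shim can stay so small.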

andy99 2 hours ago

I'm a big fan of pigz. I discovered it 6 years ago when I had some massive files I needed to zip and a 48-core server I was underutilizing. It was very satisfying to open htop and watch all the cores max out.

Edit: found the screenshot https://imgur.com/a/w5fnXKS

kjksf 4 days ago

Worth mentioning that this is only of interest as technical info on the porting process.

The port itself is very old and therefore very outdated.

themadsens 4 hours ago

I wish Premake could gain more traction. It is the comprehensible alternative to CMake etc.

  • beagle3 4 hours ago

    Xmake[0] is as simple as Premake and, IIRC, does everything Premake does and a whole lot more.

    [0] https://xmake.io/

  • PeakKS 2 hours ago

    It's 2025, just use meson

    • nly an hour ago

      Completely useless in an airgapped environment

jqpabc123 4 days ago

This is clearly aimed at faster results in a single-user desktop environment.

In a threaded server-type app where the available processor cores are already being utilized, I don't see much real advantage in this --- if any.

  • GuinansEyebrows 4 hours ago

    Depends on the current load. I've worked places where we would create nightly Postgres dumps via pg_dumpall, then pipe them through pigz to compress (the pattern is sketched in C below). It's great if you run it when load is otherwise low and you want to squeeze every bit of performance out of the box during that quiet window.

    This predates the maturation of pg_dump/pg_restore's concurrency features :)
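
    For what it's worth, the same pattern is easy to drive from inside a program as well as from a shell pipeline. A minimal C sketch, assuming pigz is on the PATH; the output name and the -p 16 thread count are placeholders:

      /* Stream stdin through pigz and write the compressed result to a
         file; roughly what "... | pigz -p 16 > dump.sql.gz" does in a
         shell pipeline. */
      #include <stdio.h>

      int main(void)
      {
          FILE *gz = popen("pigz -p 16 > dump.sql.gz", "w");
          if (!gz) { perror("popen"); return 1; }

          char buf[1 << 16];
          size_t n;
          while ((n = fread(buf, 1, sizeof buf, stdin)) > 0) {
              if (fwrite(buf, 1, n, gz) != n) { perror("fwrite"); break; }
          }

          /* pclose waits for pigz and reports its exit status. */
          return pclose(gz) == 0 ? 0 : 1;
      }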

    • ggm 2 hours ago

      Not to overstate it: embedding the parallelism in the application follows the logic "the application is where we know we can do it", whereas embedding the parallelism in a discrete lower layer and using pipes follows "this is the generic UNIX model of how to process data".

      The thing with "and pipe to <thing>" is that you then reduce to a serial buffer, with a delay decoding the pipe input. I do this because it's often logically simple and the serial-to-parallel deblocking delay on a pipe is low.

      Which is where xargs and the prefork model come in: instead you segment/shard the work, and either have no re-unification burden at all or just a simple serialise over the outputs.

      When I know I can shard, and I don't know how to tell the application to be parallel, this is my path out. A rough sketch of the shard approach is below.
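
      To make that concrete, here is a minimal C sketch of the shard/prefork idea (hand-rolled rather than via xargs -P, and with an arbitrary worker count): each worker execs a plain single-threaded gzip over its slice of the input files, so the parallelism lives entirely in the sharding layer and every output is an independent <file>.gz with nothing to re-unify.

        /* Shard argv[1..] across N worker processes; each worker runs one
           single-threaded "gzip" over its share of the files. A sketch of
           the idea, not what xargs -P actually does. */
        #include <stdlib.h>
        #include <sys/types.h>
        #include <sys/wait.h>
        #include <unistd.h>

        int main(int argc, char **argv)
        {
            const int nworkers = 4;                 /* arbitrary shard count */

            for (int w = 0; w < nworkers; w++) {
                if (fork() == 0) {                  /* worker w */
                    /* Collect every nworkers-th file into this worker's argv. */
                    char **args = calloc((size_t)argc + 2, sizeof *args);
                    int n = 0;
                    args[n++] = "gzip";
                    for (int i = 1 + w; i < argc; i += nworkers)
                        args[n++] = argv[i];
                    args[n] = NULL;
                    if (n > 1)
                        execvp("gzip", args);       /* compress the shard serially */
                    _exit(n > 1 ? 127 : 0);         /* exec failed, or empty shard */
                }
            }

            while (wait(NULL) > 0) {}               /* reap all the workers */
            return 0;
        }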