We are indeed fetching a bunch of resources from the same server across different streams. Libp2p lets us multiplex, so we can open as many bi-directional streams as we want over a single connection; it's awesome. We use it for Airlock (the permission system for the decentralized GitHub). I understand your point about the OS handling TCP instead of each app handling networking individually, which does make a lot of sense. I wish there were a plug-and-play TCP+TPO+Noise library that could handle multiplexing! It would be a nice addition to include in libp2p.
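To make the multiplexing idea concrete, here's a toy sketch of what a stream muxer does under the hood: interleave length-prefixed frames from many logical streams over one byte pipe, and reassemble per-stream on the other end. The framing here (4-byte stream id + 4-byte length) is invented for illustration; real libp2p muxers like yamux and mplex have their own wire formats, but the principle is the same.

```python
import struct
from collections import defaultdict

# Hypothetical framing for illustration: 4-byte stream id, 4-byte payload
# length, then the payload bytes. Not the yamux/mplex wire format.

def encode_frame(stream_id: int, payload: bytes) -> bytes:
    """Wrap one chunk of one stream's data as a frame on the shared pipe."""
    return struct.pack(">II", stream_id, len(payload)) + payload

def demux(wire: bytes) -> dict[int, bytes]:
    """Walk the interleaved frames and reassemble each stream's bytes."""
    streams: dict[int, bytes] = defaultdict(bytes)
    offset = 0
    while offset < len(wire):
        stream_id, length = struct.unpack_from(">II", wire, offset)
        offset += 8
        streams[stream_id] += wire[offset:offset + length]
        offset += length
    return dict(streams)

# Two logical streams interleaved over one "connection":
wire = (encode_frame(1, b"GET /repo/a") +
        encode_frame(2, b"GET /repo/b") +
        encode_frame(1, b" part-2"))
# demux(wire) -> {1: b"GET /repo/a part-2", 2: b"GET /repo/b"}
```

The point is that both streams share one connection (one handshake, one congestion-control context) while staying logically independent.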
TFO*
I mean, if you drop the TFO requirement it's easy - just open many connections. But fetching many resources isn't by itself a reason to want QUIC. You have to be doing so immediately after opening the connection(s), the resources have to be non-trivial in size (think tens of packets, so the text of a note generally doesn't qualify), and you have to need them not to block one another - which is generally not the case in a mobile app, where the server can send the first three things you need to paint the full window first and then more later.
It’s a desktop app for decentralized GitHub on Nostr. The data is sometimes non-trivial in size - repos can be large. That’s also why we’re using merkle tree chunking for large files. I want the reduced RTT.
It’s just a head-of-line-blocking question, though… I imagine you’re mostly not downloading lots of CSS/JS/images, which is the big head-of-line issue for HTTP clients - they can render the page partially in response to each new item they get off the wire. I assume you don’t, really? You presumably get events back from the server in time order and populate the page in time order, so one event stalling the next mostly won’t make all that much difference in UX?
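To spell out the head-of-line-blocking trade-off in numbers, here's a toy model (all figures hypothetical): five responses arrive 20 ms apart, and one of them loses a packet and pays a 300 ms retransmit penalty. On a single ordered stream, everything queued behind it also waits; with independent streams (QUIC-style), only the unlucky response is delayed.

```python
LOSS_PENALTY_MS = 300                      # hypothetical retransmit delay
arrival_ms = [100, 120, 140, 160, 180]     # when each response's data arrives

def delivery_one_stream(arrivals, stalled_idx):
    """In-order delivery: an item is usable only after all earlier items."""
    done, latest = [], 0
    for i, t in enumerate(arrivals):
        t = t + LOSS_PENALTY_MS if i == stalled_idx else t
        latest = max(latest, t)            # can't overtake a stalled item
        done.append(latest)
    return done

def delivery_independent_streams(arrivals, stalled_idx):
    """Per-stream delivery: only the stalled item pays the penalty."""
    return [t + LOSS_PENALTY_MS if i == stalled_idx else t
            for i, t in enumerate(arrivals)]

one = delivery_one_stream(arrival_ms, 1)             # [100, 420, 420, 420, 420]
indep = delivery_independent_streams(arrival_ms, 1)  # [100, 420, 140, 160, 180]
```

Which matters only if the UI can actually use items 3-5 before item 2; if the page is populated strictly in time order anyway, both schedules feel the same.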