It's not even some Midjourney shit, at least their model is somewhat decent for realistic stuff.
Going off a complete lack of knowledge about how they did it, I'd say it's someone fucking around in SDXL, since the default setting is 1024x1024 like you pointed out.
Not SDXL for sure, even that produces better results than this. You CAN train 1.5/2.x models at high resolutions, but it takes much longer than the average 768x tho.
Either way it's bad. Could whip up a prompt for this if I tried.