• 2 Posts
  • 123 Comments
Joined 1 year ago
Cake day: June 8th, 2023


  • Audalin@lemmy.world (OP) to Pixel Dungeon@lemmy.world · Did 6 challenges
    13 days ago

    Oh, forgot about healing wells, thanks for the reminder. You should probably be able to throw the ankh directly too? But I don’t encounter them every run (e.g. didn’t have any this one) so they aren’t reliable.

    I know ascending is easy (I’ve done it many times, though only with 0-1 challenges, none of them Swarm Intelligence) and that it adds a 1.25 multiplier, and I’ll do it when I go for that badge. But I didn’t plan for it this time (I thought 6 challenges would be 2-3x harder than it turned out), so I wasn’t prepared to ascend this run. I’d probably have died in the 21-24 zone.

    So you think it should be On Diet? Hmm, maybe. But exploration with both On Diet and Into Darkness will be challenging.



  • My intuition:

    • There are “genuine” instances of hapax legomena which probably carry some semantic sense, e.g. a rare concept, a wordplay, an artistic invention, an ancient inside joke.
    • There is also all kinds of noise: because somebody let their cat on the keyboard, because OCR software failed in one small spot, because somebody was copying data over a noisy channel without error correction, because somebody had a headache and couldn’t be bothered, because whatever.
    • Once a dataset is too big to be manually reviewed by experts, the amount of general noise is far far far larger than what you’re looking for. At the same time you can’t differentiate between the two using statistics alone. And if it was manually reviewed, the experts have probably published their findings, or at least told a few colleagues.
    • Transformers are VERY data-hungry. They need enormous datasets.

    So I don’t think this approach will help you a lot even for finding words and phrases. And everything I’ve said can be extended to semantic noise too, so your extended question also seems a hopeless endeavour when approached specifically with LLMs or big data analysis of text.
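
    A minimal sketch of the point about statistics, assuming a tiny made-up corpus and a hypothetical helper `hapax_legomena` (neither comes from a real dataset or library). It lists words that occur exactly once; a deliberate coinage ("slithy") and a plain typo ("teh") both come out with a count of 1, so frequency alone gives no way to tell them apart.

```python
from collections import Counter
import re


def hapax_legomena(text: str) -> list[str]:
    """Return the words that occur exactly once, in order of first appearance."""
    # Crude tokenizer (an assumption): lowercase runs of letters and apostrophes.
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    # Counter keeps insertion order, so this preserves first-appearance order.
    return [word for word in counts if counts[word] == 1]


# Made-up corpus: "slithy" and "tove" stand in for artistic inventions,
# "teh" stands in for noise (a typo of "the").
corpus = (
    "the cat sat on the mat and the cat slept on the mat "
    "a slithy tove and a teh cat"
)

hapaxes = hapax_legomena(corpus)
print(hapaxes)
```

    Here the list contains "sat", "slept", "slithy", "tove" and "teh": ordinary rare words, inventions and noise all land in the same bucket, and separating them takes context or a human, not the count.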



  • Because we have tons of ground-level sensors, but not a lot in the upper layers of the atmosphere, I think?

    Why is this important? Weather processes are usually modelled as a set of differential equations, and you need to know the boundary conditions in order to solve them and obtain the state of the entire atmosphere. The atmosphere has two boundaries: the lower one, which is the planet’s surface, and the upper one, which is where the atmosphere ends. And since we don’t seem to have a lot of data from the upper layers, that reduces the quality of all predictions.
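
    A toy illustration of why both boundaries matter, and nothing like a real weather model: it assumes a single 1D equation u'' = 0 on [0, 1], where u(0) plays the role of the well-measured surface value and u(1) the poorly known upper value. The function `solve_bvp` and all the numbers are hypothetical. It uses finite differences and the Thomas algorithm for the tridiagonal system.

```python
def solve_bvp(lower: float, upper: float, n: int = 9, source: float = 0.0) -> list[float]:
    """Solve u'' = source on [0, 1] with u(0) = lower and u(1) = upper.

    Uses second-order finite differences on n interior points and the
    Thomas algorithm. Returns the n interior values.
    """
    h = 1.0 / (n + 1)
    # Discretised equation: u[i-1] - 2*u[i] + u[i+1] = h^2 * source
    sub = [1.0] * n     # sub-diagonal
    diag = [-2.0] * n   # main diagonal
    sup = [1.0] * n     # super-diagonal
    rhs = [h * h * source] * n
    # Known boundary values move to the right-hand side.
    rhs[0] -= lower
    rhs[-1] -= upper

    # Forward elimination.
    for i in range(1, n):
        m = sub[i] / diag[i - 1]
        diag[i] -= m * sup[i - 1]
        rhs[i] -= m * rhs[i - 1]

    # Back substitution.
    u = [0.0] * n
    u[-1] = rhs[-1] / diag[-1]
    for i in range(n - 2, -1, -1):
        u[i] = (rhs[i] - sup[i] * u[i + 1]) / diag[i]
    return u


# Same surface value, two different guesses for the upper boundary.
guess_a = solve_bvp(lower=0.0, upper=1.0)
guess_b = solve_bvp(lower=0.0, upper=2.0)
print(guess_a[4], guess_b[4])  # values at the midpoint x = 0.5
```

    Even with the surface value fixed, changing the upper boundary shifts every interior point, not just the ones near the top. In this toy case the midpoint moves from about 0.5 to about 1.0.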