• smeenz@lemmy.nz
    English
    arrow-up
    5
    ·
    15 days ago

    If sites start blocking Googlebot en masse, then Googlebot will just start ignoring robots.txt.

    • ℍ𝕂-𝟞𝟝@sopuli.xyz
      English
      arrow-up
      4
      ·
      14 days ago

      Can they just put an EULA on the site and then sue Google for unauthorized access?

      Not in the US, of course, but in the EU or something.

      • smeenz@lemmy.nz
        English
        arrow-up
        2
        ·
        13 days ago

        Then the user agent string will just quietly become randomised so you can’t match it reliably, because it turns out that honouring robots.txt was always little more than a “gentleman’s agreement”.

          • Evinceo@awful.systems
            English
            arrow-up
            2
            ·
            10 days ago

            Yeah, an adversary like Google isn’t something you can easily block without really annoying legitimate users, unfortunately. Nothing is stopping them from turning every Chrome instance into a botnet node except for the angry article that would run in Ars Technica.