The best answer on StackOverflow: Using RegEx to parse HTML (stackoverflow.com)
BaumGeist@lemmy.ml to Programmer Humor@lemmy.ml · 2 months ago · 36 comments
moriquende@lemmy.world · 1 month ago
It can’t be done, as an opening tag in HTML can contain anything in its attributes, even JavaScript (e.g. an onclick handler).
moriquende@lemmy.world · 1 month ago
You can’t parse every HTML opening tag with regex, because an HTML opening tag doesn’t have a set structure. How would you match, with regex, this opening tag?
<mytag myattribute="<value of \"myattribute\">" >
schnurrito@discuss.tchncs.de · 1 month ago (edited)
Is this valid HTML? My understanding is that that attribute value needs to be escaped, i.e. <value of \"myattribute\">.
moriquende@lemmy.world · 1 month ago
The double quote doesn’t need to be escaped when the attribute value is delimited by single quotes, and the rest doesn’t need escaping either. This is valid and tested:
<img alt='my "<img>"'>
It can’t be done, as an opening tag in html can contain anything in its attributes, even JavaScript (e.g. onclick handler).
??? Non sequitur
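To make the tag from the thread concrete, here is a minimal Python sketch. It assumes a naive pattern, `<[^>]+>`, which is only a hypothetical first attempt and not something anyone in the thread proposed, and compares it with the standard library's `html.parser`. The names `NAIVE_TAG` and `TagCollector` are made up for this illustration.

```python
import re
from html.parser import HTMLParser

# The tag quoted in the thread: the attribute value contains both < and >.
TAG = """<img alt='my "<img>"'>"""

# Hypothetical naive pattern: "<", then anything that is not ">", then ">".
NAIVE_TAG = re.compile(r"<[^>]+>")


class TagCollector(HTMLParser):
    """Records every start tag together with its attributes."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        # attrs arrives as a list of (name, value) pairs.
        self.tags.append((tag, dict(attrs)))


# The naive pattern stops at the first ">", which sits inside the quotes.
naive_match = NAIVE_TAG.match(TAG).group(0)

# The parser tracks the quotes and reads the whole tag.
collector = TagCollector()
collector.feed(TAG)
collector.close()

print(naive_match)     # <img alt='my "<img>
print(collector.tags)  # [('img', {'alt': 'my "<img>"'})]
```

This only shows that a pattern which ignores quoting cuts the tag short. It says nothing either way about whether a more careful, quote-aware pattern could handle this particular tag.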