just wrote an ugly regex...

/(<[^\/>]+>)|(<\s*\/[^>]*>)|(<[^\/>]+\/\s*>)|(\${[^}]+})/g

.\ /. -___- X___X

@g pretty much! it's a regex for html5 tags with 4 capture groups - open tags (e.g. <div>), close tags (e.g. </div>), self closing tags (e.g. <br />), and a substitution variables (${xxx}).

@fourlegs ah, but will it capture

```
<body data-id="93" data-encodedsecret="dGhpcyBpcyBhIHN1cGVyIHNlY3JldCBzdHJpbmc" data-tag="<p>paragraph</p>">
</body>
```

Follow

@g hmm you are right my regex would balk at your html string inside data-tag. i need to fix it by also matching attr=value pairs inside any tags...

· · Web · 0 · 0 · 0

@g how about this:
<\s*[\w-]+(\s*[\w-]+\s*=\s*"[^"]+")*\s*>

@fourlegs perfect! 😃

I've had this exact problem for Nazdeeq.com, I needed to inject a header and scripts right after the opening body tag in a reverse proxy. Eventually I changed the whole stack but attributes gave me a lot of trouble initially.

@g yeah.. it's really hard to get all use cases covered so the better route, if possible, is to urlencode any attribute values.

Sign in to participate in the conversation
Mastodon for Tech Folks

This Mastodon instance is for people interested in technology. Discussions aren't limited to technology, because tech folks shouldn't be limited to technology either!