We use it for triaging test failures (running tens of thousands of tests for CPU design verification).
That use is acceptable because it is purely informational. In general you should avoid regexes at all costs. They’re difficult to read, and easy to get wrong. Generally they are a very big red flag.
Unfortunately they tend to get used where they shouldn’t due to lazy developers not parsing things properly.
regexes are a well established solution for parsing strings. what exactly is the “proper” alternative you propose?
There are some tools/libraries that act as a front-layer over regex.
They basically follow the same logic as ORMs for databases:
- Get rid of the bottom layer to make some hidden footguns harder to trigger
- Make the used layer closer to the way the surrounding language is used.
But there’s no common standard, and it’s always language specific.
Personally I think using linters is the best option since they will highlight the footguns and recommend simpler regexes (e.g. swapping [0-9] for \d).
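One caveat on that particular swap, sketched in Python as an assumption (the thread doesn't name a language or a linter): in Python's re module, \d on text strings also matches non-ASCII decimal digits, so it is only equivalent to [0-9] when the ASCII flag is set. The helper name `matches` is made up for the illustration.

```python
import re

# The digits 1, 2, 3 written in Arabic-Indic script (non-ASCII decimal digits).
ARABIC_INDIC = "\u0661\u0662\u0663"


def matches(pattern, text, flags=0):
    """Return True if the whole of `text` matches `pattern`."""
    return re.fullmatch(pattern, text, flags) is not None


# Both patterns accept plain ASCII digits.
print(matches(r"\d+", "123"), matches(r"[0-9]+", "123"))

# On str patterns, \d also accepts other Unicode decimal digits; [0-9] does not.
print(matches(r"\d+", ARABIC_INDIC), matches(r"[0-9]+", ARABIC_INDIC))

# With re.ASCII, \d is restricted to [0-9] again.
print(matches(r"\d+", ARABIC_INDIC, re.ASCII))
```

So whether the linter's suggestion is a pure simplification depends on the regex engine and its flags.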
Writing the script that got me fired
Please explain more! What happened?
Did you destroy a database? Expose credentials? Nuke the company intentionally?
I hope you are joking
On average I’ve probably had to work with them or write one from scratch only a handful of times per year over my career. Not often enough to be an expert or anything but I’m not so afraid of them as I used to be.
Yesterday. I had a file with a list of JSON objects and wanted to move the date field from the end to the beginning, so I used regex find and replace to do it. Something like \{(.*?), ("date": ".*?") in Search, and then {$2, $1 in Replace (or something close to it).
Yes, I refactor code and data using regex. I can’t be arsed to learn AWK (even though I should).
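For anyone who wants to try that find and replace outside an editor, here is a rough Python sketch. It assumes one flat JSON object per line with the "date" field last; the function names and the sample line are invented for the example. The second function does the same reordering with the json module, for comparison with the "parse it properly" camp.

```python
import json
import re

# Same idea as the editor search: capture everything before the date field,
# then the date field itself. Assumes a flat object with "date" at the end.
DATE_LAST = re.compile(r'\{(.*?), ("date": ".*?")')


def move_date_regex(line):
    """Move the "date" field to the front using regex substitution."""
    # \2 and \1 are Python's spelling of the editor's $2 and $1.
    return DATE_LAST.sub(r"{\2, \1", line, count=1)


def move_date_parsed(line):
    """Same reordering, but by parsing the JSON instead of pattern matching."""
    obj = json.loads(line)
    reordered = {"date": obj["date"]}
    reordered.update((k, v) for k, v in obj.items() if k != "date")
    return json.dumps(reordered)


line = '{"name": "a", "value": 1, "date": "2024-01-01"}'
print(move_date_regex(line))
print(move_date_parsed(line))
```

The regex version breaks as soon as a value contains nested objects or escaped quotes, which is the usual argument for the parsed version; for a quick one-off on regular data, either works.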
AWK doesn’t work with json IIRC. You have to use jq to deal with json.
While yes, the way I had it structured looked like a CSV if you squinted a little, I do fully agree AWK can’t be used for just any old JSON.
jq is dope, but that language still feels pretty confusing IMO.
Yesterday. Gotta grep those logs.
I used it to check a user input format.
Asking this question is like asking when was the last time you had to search through text.
Usually many times a day… Even today, which has been mostly meetings.
Yesterday, for capturing URLs.
https?://[a-zA-Z0-9_-]*
I am kinda learning RE right now 😅
What about ftp? 🤔
If we want to include every protocol then the RE could be complex.
Depending on the use case, maybe it should. On the other hand, some things are better left to library implementations rather than custom regex, e.g. email validation.
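A possible middle ground for the URL case, sketched in Python under some assumptions: a deliberately loose regex finds candidates for a short list of schemes, and the standard library's urlsplit does the structural check. The scheme list, the function name `find_urls`, and the trailing-punctuation trimming are all choices made for this illustration, not anything from the posts above.

```python
import re
from urllib.parse import urlsplit

# Loose candidate finder: a known scheme, "://", then anything that is not
# whitespace, an angle bracket, or a quote character.
CANDIDATE = re.compile(r"\b(?:https?|ftp)://[^\s<>\"']+")


def find_urls(text, allowed_schemes=("http", "https", "ftp")):
    """Return URL-looking substrings whose scheme and host parse cleanly."""
    urls = []
    for candidate in CANDIDATE.findall(text):
        # Drop punctuation that usually belongs to the sentence, not the URL.
        candidate = candidate.rstrip(".,;)")
        parts = urlsplit(candidate)
        if parts.scheme in allowed_schemes and parts.netloc:
            urls.append(candidate)
    return urls


sample = "see https://example.com/a and ftp://files.example.org/x.txt, ok"
print(find_urls(sample))
```

Adding another protocol then means extending the alternation and the allowed list rather than growing one large pattern.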
Earlier this week for a character range.
/edit: Now I remember. For setting up a new entry in Jenkins CI build failure analysis - identifying the build failure cause in the log.
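For readers who haven't seen this kind of log triage, a minimal Python sketch of the general idea: an ordered list of (cause, pattern) rules and the first match wins. The rule names and patterns are hypothetical examples, and this is not the configuration format of any Jenkins plugin.

```python
import re

# Hypothetical rules, checked in order; the first pattern found in the log wins.
RULES = [
    ("out of memory", re.compile(r"java\.lang\.OutOfMemoryError|Cannot allocate memory")),
    ("compile error", re.compile(r"COMPILATION ERROR|error: .+")),
    ("test failure", re.compile(r"Tests run: \d+, Failures: [1-9]\d*")),
]


def classify(log_text):
    """Return the first matching failure cause, or "unknown" if none match."""
    for cause, pattern in RULES:
        if pattern.search(log_text):
            return cause
    return "unknown"


print(classify("Tests run: 12, Failures: 2, Errors: 0"))
print(classify("java.lang.OutOfMemoryError: Java heap space"))
print(classify("BUILD SUCCESS"))
```

Since the result is only a hint for whoever looks at the build, a wrong or missing match costs little, which is what makes regex a reasonable fit here.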
This sentence is the uncanny valley for structure.
Every day pretty much with Unix tools. Vim, awk, sed, etc.
A few hours ago.
I just wanted to turn a list of AD group names into a PowerShell array.
Today, to configure fail2ban. Before that, yesterday to select which tests to run.
Usually I use glob patterns for test selection.
But I did use regex yesterday to find something else. A Java security file definition.
Today.