If you’re a Rails developer, you’ve probably seen code that looks like this:
What usually happens is this: when you don’t know the regex for a common attribute like email, you’ll most likely rely on one of the first few results from Google: REGEXP_FROM_GOOGLE. For example, this came up as my first search result for the term ‘email regex’. Here’s a sample taken from that page:
This works in most languages, but not in Ruby. Let’s see why:
Two things are at play here:
A lot of people use the line anchors ^ and $ to enclose their regex, a pattern that appears in a lot of Rails tutorials, blogs, and code samples. Usually, in other languages, the same email_regex will not accept "email@example.com\n<script>dangerous script</script>". However…
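Ruby regexes behave as if multiline mode were always on: ^ and $ match at the start and end of every line, not just at the start and end of the whole string.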
And the latter reason is what trips most people up, including me.
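Here is a minimal sketch of the two points above. The pattern is a deliberately simplified stand-in, not the regex from the search result, and the name LINE_ANCHORED_EMAIL is made up for this illustration:

```ruby
# Hypothetical, simplified email pattern enclosed in line anchors.
LINE_ANCHORED_EMAIL = /^[^@\s]+@[^@\s]+\.[a-z]+$/i

payload = "email@example.com\n<script>dangerous script</script>"

# In Ruby, ^ and $ match at the start and end of any line, so the
# first line alone satisfies the pattern and the payload is accepted.
accepted = payload.match?(LINE_ANCHORED_EMAIL)
puts accepted  # prints true
```

The script tag on the second line is never inspected: the match succeeds as soon as one line of the input looks like an email.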
To avoid this, just replace the line anchors ^ and $ with the string anchors \A and \z.
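As a rough sketch of the fix, here is a simplified pattern anchored with \A and \z. The pattern and the name STRING_ANCHORED_EMAIL are assumptions for illustration only, not a production-grade email validator:

```ruby
# Hypothetical, simplified email pattern enclosed in string anchors.
# \A matches only at the very start of the string and \z only at the
# very end (\Z, by contrast, still tolerates one trailing newline).
STRING_ANCHORED_EMAIL = /\A[^@\s]+@[^@\s]+\.[a-z]+\z/i

payload = "email@example.com\n<script>dangerous script</script>"

# The newline and everything after it now count against the match.
payload_accepted = payload.match?(STRING_ANCHORED_EMAIL)
plain_accepted   = "email@example.com".match?(STRING_ANCHORED_EMAIL)

puts payload_accepted  # prints false
puts plain_accepted    # prints true
```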
Fortunately, according to the most recent Rails security guide on regexes, Rails will now raise an exception when you use ^ or $ in ActiveRecord’s format validator (validates_format_of).
Unfortunately, not everyone has read that part of the Rails guide yet, and not everyone is using the latest Rails. Other ORMs may not have caught up with ActiveRecord yet either. It will take some time before people notice this issue.
It’s good to know that some good books, like Hartl’s Rails Tutorial, have already adopted the new regex examples. Hopefully, fewer incoming Rails developers will end up using the less secure regexps.