
How to validate an Email address with Regex & PHP -update

Just because everyone is allowed on the Internet doesn’t mean everyone uses it correctly; some people simply make mistakes. Programmers have to account for that, so validating external input is essential for a secure website. Email addresses are among the hardest things to validate because they can take so many forms.

Validating an email address usually begins with a good regular expression that matches every possible email address. A Google search led me to Ian Dunn, who compiled a list of the forms an email address can take, gathered the regex attempts he could find, and tested them against that list. The most accurate, near-perfect regular expression came from Alexandre De Dommelin, and that is the one I used in my script.

I wasn’t fully satisfied with the regular expression alone, so I wanted to validate the domain as well. PHP 4+ has a function, checkdnsrr(), that checks whether MX records exist for a given domain, which is just what I needed. Now the validation is perfect, I hope. The function is shown below; it should be reusable in any project using PHP 4 or newer. Please include credits. Code released under the Creative Commons Attribution-Share Alike 2.0 Belgium License.

function check_email($email) {
	//Function written by Jeroen Op 't Eynde - XprsYrslf.be
	//Creative Commons Attribution-Share Alike 2.0 Belgium License
	//Pattern from: http://fightingforalostcause.net/misc/2006/compare-email-regex.php
	$pattern = "/^[-a-z0-9~!$%^&*_=+}{\'?]+(\.[-a-z0-9~!$%^&*_=+}{\'?]+)*@([a-z0-9_][-a-z0-9_]*(\.[-a-z0-9_]+)*\.(aero|arpa|biz|com|coop|edu|gov|info|int|mil|museum|name|net|org|pro|travel|mobi|asia|cat|jobs|tel|[a-z][a-z])|([0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}))(:[0-9]{1,5})?$/i";
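	//The address has to match the pattern before anything else is checked
	if (!preg_match($pattern,$email)) return false;
	if (function_exists('checkdnsrr')){ //Linux: PHP 4.3.0 & Windows: PHP 5.3.0
		//strstr() returns the '@' as well, so skip that first character
		$domain=substr(strstr($email,'@'),1);
		//The pattern allows an optional :port, which is not part of the DNS name
		$domain=preg_replace('/:[0-9]{1,5}$/','',$domain);
		if(!checkdnsrr($domain,"MX")) return false;
	}
	return $email; //Without checkdnsrr() (PHP 4 or 5) only the regex is checked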
}

Please report any bugs/comments here or via the contact form.

Update:
While debugging a project, PHP threw some notices about the split() function, which turns out to be deprecated. I simply replaced it with the strstr() function. Below is the line I took out.

list($user,$domain) = split('@',$email);
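
For reference, here is a minimal sketch of two ways to get the domain without split(). Keep in mind that strstr() returns the '@' together with everything after it, so the leading character has to be dropped before the value goes to a DNS lookup. The helper name email_domain() is hypothetical and not part of the original script.

```php
<?php
// Hypothetical helper (not from the original post): returns the part after
// the first '@', or false when the address contains no '@' at all.
function email_domain($email) {
	$tail = strstr($email, '@');   // e.g. "@example.com", including the '@'
	if ($tail === false) {
		return false;
	}
	return substr($tail, 1);       // drop the leading '@'
}

// explode() with a limit of 2 is the closest drop-in for the removed line.
// array_pad() keeps list() from raising a notice when there is no '@'.
list($user, $domain) = array_pad(explode('@', 'jeroen@example.com', 2), 2, '');
```

Either form works here because the regex only lets through addresses with a single '@'.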

7 Responses to “How to validate an Email address with Regex & PHP -update”


ivo says:

Also check PHP’s own filter functions for validating an email address:

http://nl.php.net/manual/en/filter.filters.validate.php
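
A minimal sketch of what that looks like in practice, assuming PHP 5.2 or newer, where the filter extension is available. The wrapper name check_email_filter() is made up for illustration; filter_var() and FILTER_VALIDATE_EMAIL are the built-in parts.

```php
<?php
// Hypothetical wrapper name; it mirrors check_email() by returning the
// address on success and false on failure.
function check_email_filter($email) {
	// filter_var() itself returns the filtered value or false
	return filter_var($email, FILTER_VALIDATE_EMAIL);
}

$ok  = check_email_filter('user@example.com'); // the address itself
$bad = check_email_filter('not an address');   // false
```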

Jeroen says:

Interesting, but it doesn’t validate quite as much as the regex above does. A built-in function like that may perform better, but that is not the point here. Combining both functions could even improve performance after some testing (shorter regex, etc.).

I did a quick test with De Dommelin’s script and replaced preg_match() with filter_var($v, FILTER_VALIDATE_EMAIL). This is the output, compared with the output of his regex here (or here).

Should be VALID :
IPAndPort@127.0.0.1:25 : NOT VALID
&*=?^+{}'~@validCharsInLocal.net : NOT VALID
(These two come out as not valid where the regex reports valid; I can understand that PHP does not allow them.)

Should NOT be VALID :
missingDot@com : VALID
localEndsWithDot.@domain.com : VALID
two..consecutiveDots@domain.com : VALID
TLDDoesntExist@domain.moc : VALID
numbersInTLD@domain.c0m : VALID
local@SecondLevelDomainNamesAreInvalidIfTheyAreLongerThan64Charactersss.org : VALID
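
A comparison like this can be reproduced with a small harness. The sketch below is not De Dommelin’s script; run_email_tests() and filter_validator() are hypothetical names, and the validator is passed in as a callable so the regex and filter_var() can be swapped. Results for the edge cases depend on the PHP version, so no output is listed here.

```php
<?php
// Hypothetical harness: $validator is any callable that returns a truthy
// value for an accepted address, e.g. a regex check or filter_var().
function run_email_tests(array $addresses, $validator) {
	$report = array();
	foreach ($addresses as $address) {
		$report[$address] = call_user_func($validator, $address) ? 'VALID' : 'NOT VALID';
	}
	return $report;
}

// Validator built on filter_var(), as in the quick test described above
function filter_validator($address) {
	return filter_var($address, FILTER_VALIDATE_EMAIL) !== false;
}

$addresses = array('missingDot@com', 'two..consecutiveDots@domain.com', 'user@example.com');
foreach (run_email_tests($addresses, 'filter_validator') as $address => $result) {
	echo $address . ' : ' . $result . "\n";
}
```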

Jeff Dickey says:

Just make sure you leave this nailed up on the wall as *the* definitive anti-pattern for this domain: http://www.ex-parrot.com/~pdw/Mail-RFC822-Address/Mail-RFC822-Address.html

More proof, if any was needed, that Perl encourages maintenance-impossible code and reliance on schedulable miracles…

Jeroen says:

Interesting extension, but I think it is not always available, or you simply don’t have the choice, as with shared hosting.
If you still want to use this validation when the extension is not available, you have to load a regex of 6400+ characters, which is sometimes longer than the code where it is needed. (http://ex-parrot.com/~pdw/Mail-RFC822-Address.html)
So if you take out the TLDs, the regex above is not maintenance-impossible, and checkdnsrr() still checks whether an MX record exists for the domain.
I guess it depends on the possibilities you have and what you prefer.
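
A rough sketch of what taking out the TLDs could look like, reading it as replacing the fixed TLD list with a generic letters-only label of two or more characters. That reading is an assumption, and the function name check_email_syntax() is hypothetical. The rest of the pattern follows the one in the post.

```php
<?php
// Hypothetical variant of the pattern: the TLD alternation is replaced by
// [a-z]{2,}, so new TLDs need no maintenance. Whether the domain really
// exists is left to the DNS check.
function check_email_syntax($email) {
	$local  = '[-a-z0-9~!$%^&*_=+}{\'?]+';
	$domain = '[a-z0-9_][-a-z0-9_]*(\.[-a-z0-9_]+)*\.[a-z]{2,}';
	$ip     = '([0-9]{1,3}\.){3}[0-9]{1,3}';
	$pattern = '/^' . $local . '(\.' . $local . ')*@(' . $domain . '|' . $ip . ')(:[0-9]{1,5})?$/i';
	return preg_match($pattern, $email) === 1;
}
```

The trade-off is that a nonexistent TLD such as .moc passes the syntax check and is only caught if the DNS lookup runs.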

Jeroen says:

A little update on the function so PHP doesn’t throw an E_NOTICE.

Richard Lynch says:

MX servers will lie. A lot. It’s pretty pointless to ask them.
If you want to be sure it’s valid, send an email and make the user click through.

Your Regex is wrong. I haven’t read it, but the *right* regex is THREE PAGES LONG. So yours is wrong.

Try using the imap_rfc822_parse_adrlist() function to parse the emails, and then check that you got valid fields back from that.

Jeroen says:

I don’t think you fully understand the code: checkdnsrr() doesn’t ask the MX servers anything, it just checks whether they exist.
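
To make that concrete, here is a minimal sketch of a DNS-only check. No mail server is contacted; only DNS records are looked up. The helper name domain_has_mail_dns() and the injectable $lookup parameter are hypothetical additions for illustration. The fallback to an A record is also an assumption on top of the function in the post; it follows the mail convention that a domain without MX records can still receive mail at its own address.

```php
<?php
// Hypothetical helper: $lookup defaults to PHP's checkdnsrr() and can be
// swapped for a stub so the logic is testable without network access.
function domain_has_mail_dns($domain, $lookup = 'checkdnsrr') {
	// An MX record means the domain advertises a mail server
	if (call_user_func($lookup, $domain, 'MX')) {
		return true;
	}
	// Without MX records, mail falls back to the host's address record
	return (bool) call_user_func($lookup, $domain, 'A');
}
```

Calling domain_has_mail_dns('example.com') uses the real resolver; passing a closure as the second argument replaces it.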

The regex is tested against all kinds of existing and nonexistent email addresses; please check the references. ‘I haven’t read it’ is not an excuse. I know about the 6400+ character regex, as stated two comments above yours, including my opinion on the matter.

Indeed, checking the email with a click-through routine is the only way to see if it is valid, but do you want to do that every time you fill in your email somewhere, like for this comment?
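
For the cases where a click-through is worth it, the core of such a routine might look like the sketch below. All function names and the URL layout are hypothetical, sending the mail and storing the token are left out, and it assumes PHP 7 or newer for random_bytes().

```php
<?php
// Hypothetical helper names; random_bytes() requires PHP 7 or newer.
function make_confirmation_token() {
	// 16 random bytes, hex-encoded into a 32-character token
	return bin2hex(random_bytes(16));
}

// Builds the link the user has to click; $baseUrl is an assumed endpoint
function confirmation_link($baseUrl, $token) {
	return $baseUrl . '?token=' . urlencode($token);
}

// Compares the stored token with the submitted one in constant time
function token_matches($stored, $submitted) {
	return hash_equals($stored, $submitted);
}
```

The token would be stored next to the address, mailed as part of the link, and the address marked as confirmed once token_matches() succeeds.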
