Regex to Split Single Line Address
A need came up in a project to split an address that is entered on a single line with no validation. Apart from the obvious fix of requiring validation and saving the fields in separate columns, I had to come up with the best way to split that address so that we could wrap it in a rich snippet. This is a Regex I found, but I cannot find it again, so if someone knows the original source/author please let me know so I can give the correct credit.
Since I am having such a hard time finding anything close to this, I figured it warranted a blog post to get the information out there.
// Splits an address formatted like this: 1045 E Test Lane, Gilbert, AZ 85296
// Requires: using System.Text.RegularExpressions;
Regex splitAddressRegex =
new Regex(@"(?(^[^,]*,[^,]*,[\w\s]*$) # Condition: check for two commas; if so, match below
(?<Line1>[^,]*) # Place into capture group Line1
(?:,\s) # Match but don't place into capture.
(?<City>[^,]*) # Place into capture group City
(?:,\s) # Match but don't place into capture.
(?<State>\w\w) # Place into capture group State
(?:\s*) # Ignore spaces
(?<Zip>[\d\-]*) # Place into capture group Zip
(?:$|\r\n) # Match, without capturing, either end of input or \r\n
| # Else it's a bigger address
(?<Line1>[^,]*) # Place into capture group Line1
(?:,\s) # Match but don't place into capture.
(?<Line2>[^,]*) # Place into capture group Line2, an assumed name for the second address line
(?:,\s) # Match but don't place into capture.
(?<City>[^,]*) # Place into capture group City
(?:,\s) # Match but don't place into capture.
(?<State>\w\w) # Place into capture group State
(?:\s*) # Ignore spaces
(?<Zip>[\d\-]*) # Place into capture group Zip
(?:$|\r\n) # Match, without capturing, either end of input or \r\n
)", RegexOptions.IgnorePatternWhitespace);
It’s a pretty in-depth and ugly Regex, but you end up with the address split into named groups. So if you run it against the address 1045 E Test Lane, Gilbert, AZ 85296
you end up with groups such as:
matchedAddress.Groups["Line1"].Value //1045 E Test Lane
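For anyone who wants to try it end to end, here is a minimal sketch of how the match might be run and every group read back. The `AddressSplitter` class and `Split` method are names made up for this example, and `Line2` is an assumed group name for the extra address line in the longer format; only `Line1`, `City`, `State` and `Zip` come from the comments in the Regex.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical wrapper class, for illustration only.
public static class AddressSplitter
{
    private static readonly Regex SplitAddressRegex = new Regex(@"
        (?(^[^,]*,[^,]*,[\w\s]*$)    # Two commas: Line1, City, State Zip
            (?<Line1>[^,]*) (?:,\s)
            (?<City>[^,]*)  (?:,\s)
            (?<State>\w\w)  (?:\s*)
            (?<Zip>[\d\-]*) (?:$|\r\n)
        |                            # Otherwise expect one extra address line
            (?<Line1>[^,]*) (?:,\s)
            (?<Line2>[^,]*) (?:,\s)  # Line2 is an assumed group name
            (?<City>[^,]*)  (?:,\s)
            (?<State>\w\w)  (?:\s*)
            (?<Zip>[\d\-]*) (?:$|\r\n)
        )", RegexOptions.IgnorePatternWhitespace);

    // Runs the pattern against a single-line address and returns the Match.
    public static Match Split(string address)
    {
        return SplitAddressRegex.Match(address);
    }

    public static void Main()
    {
        Match matchedAddress = Split("1045 E Test Lane, Gilbert, AZ 85296");

        // Always check Success first: unvalidated input may not match at all.
        if (!matchedAddress.Success)
        {
            Console.WriteLine("Address did not match the expected format.");
            return;
        }

        Console.WriteLine(matchedAddress.Groups["Line1"].Value); // 1045 E Test Lane
        Console.WriteLine(matchedAddress.Groups["City"].Value);  // Gilbert
        Console.WriteLine(matchedAddress.Groups["State"].Value); // AZ
        Console.WriteLine(matchedAddress.Groups["Zip"].Value);   // 85296
    }
}
```

Checking `Success` before reading the groups matters here because the input has no validation, so some addresses will simply not fit either format.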
The comments in the Regex snippet show you the group names. Again, I did not write this and would love to give credit, but I can’t find the original source, and I definitely wanted to have it published as it saved me a lot of time.