I'm trying to parse some funky address data into separate fields, using either ArcMap's Field Calculator or a Python script. Currently the street number, street name, and unit/apt/suite number are all stored in one field. Some examples of the variety in my data:
1111 2ND ST
1753 21ST ST 2ND FLOOR
2125 ARIZONA AVE #303
2424 12TH ST SUITE 100
Thankfully, there are only a handful of street suffixes:
AV | AVE | BL | CT | DR | LANE | PKWY | PL | RD | ST | TER | WAY
Which Python methods could I use to find these suffixes and then split off whatever follows them (e.g., "2ND FLOOR", "#303", "SUITE 100")?
Answer
You could use the string .split method for this.
inStr = "1753 21ST ST 2ND FLOOR"
result = inStr.split(' ST ')
print(result)
>>>['1753 21ST', '2ND FLOOR']
Keep in mind that result is a list, so you need to index or iterate over it to get at the individual parts of the address. If you need to work with the address parts in a more advanced way, you will sooner or later end up using regular expressions in ArcGIS or Python.
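As a rough sketch of the regular-expression route, the pattern below tries to pull out the street number, the street name with its suffix, and whatever follows. The names SUFFIXES, ADDRESS_RE, and parse_address are made up for this illustration, and the pattern assumes every address starts with a numeric street number and contains one of the suffixes from the question as a whole word.

```python
import re

# Street suffixes from the question (illustrative constant name).
SUFFIXES = ["AVE", "AV", "BL", "CT", "DR", "LANE", "PKWY", "PL", "RD", "ST", "TER", "WAY"]

# number, street name (matched lazily), suffix, optional trailing unit text
ADDRESS_RE = re.compile(
    r"^(\d+)\s+(.+?)\s+(" + "|".join(SUFFIXES) + r")(?:\s+(.*))?$"
)

def parse_address(address):
    """Return (number, street, unit), or None if the pattern does not match."""
    match = ADDRESS_RE.match(address.strip())
    if match is None:
        return None
    number, name, suffix, unit = match.groups()
    # unit is None when nothing follows the suffix
    return number, name + " " + suffix, unit or ""

for text in ["1111 2ND ST", "1753 21ST ST 2ND FLOOR", "2125 ARIZONA AVE #303"]:
    print(parse_address(text))
```

Unlike a plain split on ' ST ', this also handles an address that ends with the suffix, such as "1111 2ND ST". Addresses that do not fit the assumed layout return None and would need separate handling.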
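To cover all of the suffixes, I've provided a list of separators. The assumption is that each separator is always surrounded by a space on both sides (' ST ', ' AVE ', and so forth). An address that ends with the suffix, such as "1111 2ND ST", has no trailing space and is therefore left out of the results: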
inStrings = ["1753 21ST ST 2ND FLOOR","2125 ARIZONA AVE #303","2424 12TH TER SUITE 100"]
outAddresses = []
separators = ["AV","AVE","BL","CT","DR","LANE","PKWY","PL","RD","ST","TER","WAY"]
for address in inStrings:
    for separator in separators:
        # split on the suffix surrounded by spaces, e.g. ' ST '
        parts = address.split(' {0} '.format(separator))
        # more than one part means the separator was found
        if len(parts) > 1:
            outAddresses.append(parts)
print(outAddresses)
>>>[['1753 21ST', '2ND FLOOR'], ['2125 ARIZONA', '#303'], ['2424 12TH', 'SUITE 100']]
The same output can also be produced with a list comprehension (more compact and sometimes faster, but harder to read). The address loop comes first so that the results keep the order of the input:
outAddresses = [address.split(' {0} '.format(separator))
                for address in inStrings for separator in separators
                if len(address.split(' {0} '.format(separator))) > 1]
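Since the question mentions the Field Calculator, here is a minimal sketch of how the same split could be wrapped in a function that returns only the unit part. The function name get_unit and the field name ADDRESS are assumptions for illustration. With the Python parser selected, the function would go in the code block and the expression would be something like get_unit(!ADDRESS!).

```python
# Same separators as above (illustrative constant name).
SEPARATORS = ["AV", "AVE", "BL", "CT", "DR", "LANE", "PKWY", "PL", "RD", "ST", "TER", "WAY"]

def get_unit(address):
    """Return the text after the first separator that matches, or '' if none does."""
    if not address:  # null or empty field value
        return ""
    for separator in SEPARATORS:
        # partition returns (head, separator, tail); the middle item is empty when not found
        head, found, tail = address.partition(" {0} ".format(separator))
        if found:
            return tail.strip()
    return ""

print(get_unit("2424 12TH ST SUITE 100"))
```

Returning an empty string when nothing matches keeps the calculation from failing on rows such as "1111 2ND ST". Separators are tried in list order rather than by position in the string, so an address containing two suffix-like words may need extra care.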