One of the best things you can do with your contact list is to create a standard for capitalization.
Whether your data is spread across multiple fields or jammed into one (shame on you!), a parsing function can make your reports look more professional.
proper(”John D. ROCKWELL, S. Kinnickinnic Av and Lake Pkwy, milwaukee, Wi”)
becomes
John D. Rockwell, S. Kinnickinnic Av & Lake Pkwy, Milwaukee, WI
Way back in 1999, I modified a function that I found in the Microsoft ® Access Solutions Database.
If you look closely, you’ll see several items relevant to Philadelphia and law enforcement.
This function is written for Access but, if you know how to add functions in Excel, it should work with spreadsheet data, too.
The basic idea is to read each character and look at what comes before and after it. Based on what it finds, the function will capitalize the character and move on to the next. The default behavior is to capitalize the first letter of each word.
In addition to the basic functionality, you can add all kinds of rules for capitalizing location indicators (Rd., Ln, Drive), states (VA instead of Va) and organizations (FBI, CIA, NBC).
For example, I added a rule to change and to & - making sure not to change Anderson to &erson!
There is one limitation in my version:
Names such as O’Brien and Wilson-Smythe are properly capitalized, but MacDonald is changed to Macdonald, and van Buren to Van Buren
If your data has many similar names that are not properly capitalized, just add a new rule to look for them!
This tip is published below for educational purposes only. If you use it, make sure you only work with sample data!
I didn’t doctor it for this blog: what you see is how I used it. It doesn’t win any clarity awards, but it gets the job done!
‘ *** ProperNouns
‘
‘ *** This Module can be exported to MODULES.MDB
‘
‘ The ProperNouns Module contains functions which clean up names and addresses.
‘
‘ ———————————————————————————–
Function Proper(x)
‘ Capitalize first letter of every word in a field.
‘ Use in an event procedure in AfterUpdate of control;
‘ for example, [Last Name] = Proper([Last Name]).
‘ Names such as O’Brien and Wilson-Smythe are properly capitalized,
‘ but MacDonald is changed to Macdonald, and van Buren to Van Buren.
‘ Note: For this function to work correctly, you must specify
‘ Option Compare Database in the Declarations section of this module.
‘ *** This routine was copied from the Solution Database and modified
‘ *** to ignore ordinal numbers (1st, 2nd, 3rd, 4th).
‘ *** Additionally, the abbreviation for states are capitalized.
‘ ***
‘ *** Revision History
‘ *** 09/09/1999 - apostrophes are checked to prevent Upper case ”s”
‘ *** 11/20/1999 - spaces are added between most symbols - fixed ordinal bug
‘ *** 11/29/1999 - ”And” converted to ”&”
‘ *** 01/10/2000 - fixed ”And” bug, which changed Anderson to &erson
‘ *** 09/12/2000 - double quotes converted to single quotes (fixes John”s)
‘ *** 09/12/2000 - Capitalized specific three-letter strings (see org)
‘ *** 02/22/2002 - Fixed bug in three-letter capitalization. Added misc. two-letter strings
‘ *** 04/08/2002 - Created a ”Scrub” function to eliminate extraneous characters
Dim raw As String, temp As String, c As String, OldC As String
Dim aa As String, ab As String
Dim i As Integer, j As Integer, k As Integer
If IsNull(x) Then
Exit Function
End If
Const symbols = ”#&/()” ’ these should have spaces around them
Const digits = ”0123456789″
Const ordinal = ”StNdRdTh”
Const state = ”AL AS AR AZ CA CO CT DC DE FL GA HI ID IL IN IA KS KT LA ME ” _
& ”MD MA MI MN MS MO MT NE NV NH NJ NM NY NC ND OH OK OR PA RI ” _
& ”SC SD TN TX UT VT VA WA WV WI WY ”
Const org = ”ATF CJC DAO DEA FBI IAD III IRS JFK PHA PNC PPL UPS VII ”
Const misc2 = ”II IV IX VI GP”
’ Add missing spaces (e.g.: change Phila.,PA to Phila., PA) and (P/O to P / O)
raw = Scrub(x)
For i = 1 To Len(raw)
aa = Mid$(raw, i, 1)
ab = Mid$(raw, i + 1, 1)
If InStr(symbols, aa) Then
temp = temp & aa
If ab ” ” Then
temp = temp & ” ”
End If
ElseIf InStr(symbols, ab) Then
If aa ” ” Then
temp = temp & aa & ” ”
Else
temp = temp & aa
End If
Else
temp = temp & aa
If InStr(1, ”0123456789′ -”, aa) = 0 Then
If (aa ”z”) And ab ” ” Then
If Not (aa = ”.” And ab = ”,”) Then
temp = temp & ” ”
End If
End If
End If
End If
Next i
’ Initialize OldC to a single space because first
’ letter needs to be capitalized but has no preceding letter.
OldC = ” ”
For i = 1 To Len(temp)
c = Mid(temp, i, 1)
If c >= ”a” And c <= ”z” And (OldC ”z”) _
And Not (OldC = ”‘” And c = ”s”) Then
Mid(temp, i, 1) = UCase(c)
End If
OldC = c
Next i
raw = Trim(temp) & ” ”
temp = ”"
For j = 1 To Len(raw) - 2 ’ Check ordinals
aa = Mid$(raw, j, 1)
ab = Mid$(raw, j + 1, 2)
temp = temp & aa
If InStr(1, digits, aa) > 0 Then
x = InStr(1, ordinal, ab)
If x > 0 Then
temp = temp & LCase(Mid$(ordinal, x, 2))
j = j + 2 ’ skip ordinal in Temp
End If
End If
Next j
For k = 1 To Len(state) Step 3 ’ Check States
x = InStr(1, temp, Mid(state, k, 2))
If x > 2 Then
If Mid(temp, x - 2, 2) = ”, ” Then
If x + 2 > Len(temp) Or _
(Mid(temp, x + 2, 1) = ” ”) Then
temp = Left(temp, x - 1) & UCase(Mid(temp, x, 2)) & Mid(temp, x + 2)
End If
End If
End If
Next k
For k = 1 To Len(org) Step 4 ’ Check Organizations and other 3-letter initials
x = InStr(1, temp, Mid(org, k, 3))
Select Case x
Case Is = 1 ’ check for space BEFORE 3-letter initials
If Len(temp) = 3 Then
temp = UCase(Mid(temp, x, 3)) & Mid(temp, x + 3)
ElseIf Mid(temp, x + 3, 1) = ” ” Then
temp = UCase(Mid(temp, x, 3)) & Mid(temp, x + 3)
End If
Case Is > 1 ’ check for space BEFORE and AFTER 3-letter initials
If Mid(temp, x - 1, 1) = ” ” And _
(x + 3 > Len(temp) Or _
Mid(temp, x + 3, 1) = ” ”) Then
temp = Left(temp, x - 1) & UCase(Mid(temp, x, 3)) & Mid(temp, x + 3)
End If
End Select
Next k
For k = 1 To Len(misc2) Step 3 ’ Check miscellaneous 2-letter initials
x = InStr(1, temp, Mid(misc2, k, 2))
Select Case x
Case Is = 1 ’ check for space BEFORE 2-letter initials
If Len(temp) = 2 Then
temp = UCase(Mid(temp, x, 2)) & Mid(temp, x + 2)
ElseIf Mid(temp, x + 2, 1) = ” ” Then
temp = UCase(Mid(temp, x, 2)) & Mid(temp, x + 2)
End If
Case Is > 1 ’ check for space BEFORE and AFTER 2-letter initials
If Mid(temp, x - 1, 1) = ” ” And _
(x + 2 > Len(temp) Or _
Mid(temp, x + 2, 1) = ” ”) Then
temp = Left(temp, x - 1) & UCase(Mid(temp, x, 2)) & Mid(temp, x + 2)
End If
End Select
Next k
’ Convert And to ampersand
x = InStr(1, temp, ” And ”)
If x > 0 Then
temp = Left$(temp, x) & ”& ” & Mid$(temp, x + 5)
End If
’ process names beginning with Mc or Mac
x = InStr(temp, ” mc”)
If x Then
aa = UCase(Mid$(temp, x + 3, 1))
i = 4
If aa = ” ” Then
aa = UCase(Mid$(temp, x + 4, 1))
i = 5 ’ skip the space
End If
temp = Left(temp, x) & ”Mc” & aa & Mid$(temp, x + i)
End If
Proper = temp
End Function
After years of manually processing documents, I wrote a program that greatly simplifies - and speeds up - tedious tasks such as this one.
Visit www.parsermonster.com for more information.
|