Digital Tinker

 
 
 
  • About

    Some days you get the data, some days the data gets you …
 
Middle Class Bailout - Fixed-Priced Services October 3rd, 2008

There is a time to cut costs and a time to ensure quality at any price.

Right now is not the right time for either service extreme.

Like many small business owners, my company is experiencing a crunch. While looking for ways to remain relevant and competitive, I joined oDesk.com,which is a marketplace for freelance programmers and writers.

oDesk has a very structured hourly-centric system that exposes the service extreme:

Offshore freelancers are willing to work for $4.00 per hour

New freelancers and those with lower feedback scores

Freelancers with high feedback scores are charging a premium for their services.

The group of freelancers in the middle constitutes a thick cluster of fierce bidding competition.

All of this is good for the freelance marketplace. The cream rises, the crud curdles and the best of the middle joins the elite after building some feedback.

Take a look at the freelance question from the viewpoint of your business. Can you afford to cut costs, taking a chance on offshore talent? On the other hand, can you really afford to pay a premium for the elite freelancer? oDesk clearly states that you are paying for time, not a product. If you get poor results, you’ll just have to hire someone else to fix the mess. If you went for the low-end solution, chances are, you won’t want another bargain basement solution. If you were disappointed with the boutique brand, you’ll wish you had gone to the middle. Either way, the fix means yet another expense.

What if you could present your problem without risk? Instead of paying for time, why not pay a fixed price for quality?
My Digital Tinker service initial consultation will ensure that I can even do the job. The proposed solution is one of a menu of fixed cost services. You won’t pay a dime until the job is done.

How does this benefit you?

You get to pre-qualify the service.

You control the cost.

You control your cash flow.

Your business needs every dollar to be well-spent. Contact me for help with your data management challenges.

Download the Digital Tinker Data Cleansing Packet.

Testimonial Interview with Rachel Tolbert September 19th, 2008

Background Investigations Database

The following testimonial was solicited from Rachel Tolbert, a Fingerprint Identification Supervisor at the Philadelphia Police Department.


1. What one thing has the database(s) helped you accomplish that was more difficult before?

The databases that were created by Mitchell Allen have helped me accomplish a lot in a shorter amount of time. The one I use the most often creates an alphabetical listing of various pieces of information that are needed to fax to other departments. It creates a neat, orderly, professional account of the information that is required.

Also an added advantage of one of the databases is it does the math for you. Occasionally, the tallied numbers may be incorrectly totaled. As long as you verify the information was input correctly, the program is going to give you the necessary correct totals.
These figures are used biweekly and monthly, and decisions are made based on these accurate figures, so this database is producing very important information.

2. How has your work day benefited from using the database?

I can do a preview on it prior to printing, to determine if all my information was input correctly. This has also helped me to send the most accurate information. It has been an invaluable tool over the years. I use it every single day.

3. Would you recommend Mitchell to others?

Definitely. I hate to think about how long it would take us everyday to organize the information that has been done for us- as we input it onto the page he created just for our purposes. It really makes this one aspect of our job so much easier.

4. Why would you recommend him?

I would recommend him because he is thorough, competent and very professional.

Read more testimonials

Forum Analysis September 15th, 2008

Some software ideas are best left in the ad hoc bin. This is another way of saying that, just because I can create something, I don’t necessarily want to.

My friend Erin was looking for a quick way to determine which forums on her At Home Resources website were performing well and which topics should be abandoned.

I did an unsolicited analysis - just for fun.

Once I had completed the project, Erin was so happy with the reports, I started thinking about ways to make this service available to others. Just as quickly, I turned away from the idea. There were two main reasons for not forging ahead:

1. I’m quite sure that forum software already provides this information
2. I didn’t think it would be sustainable

So, the software sits in my archive folder, a dim reminder of a concept that did its job and doesn’t need to be developed further.

Four Microsoft ® Access queries, one report and no front end form. A classic example of a quick-and-dirty solution.

Check out the ad hoc Forum Analysis.

How I Created the Ambient Orb Blog Index September 12th, 2008

Here’s a change of pace. In this article, I’ll show you how I used Microsoft ® Word, Microsoft ® Excel and Notepad to clean up an old blog post for republishing.

Project Purpose

For my personal blog, I decided to do a links posts, which is used to highlight older posts that new visitors may enjoy and long-time readers can revisit. There are some SEO benefits, so that incentive was there as well.

On my old blog, I regularly maintained one such post. It was called the Ambient Orb Blog Index and, rather than just having a few selected posts, I put all of them in this one post! I wouldn’t recommend that today, but back then, the website had no other way of effectively categorizing and showcasing blog posts.

Anyway, to give a sense of context for my new links post, I wanted to share the original Ambient Orb Blog Index. Of course, that meant that I would have to remove all of the dead links.

Examining the Raw Data

The original post was saved in Microsoft ® Word format.
Here are a few lines from the file:

<li><font size=2 color=blue><b><a href="http://www.writingup.com/mitchell_allen/am_i_an_artist_or_an_engineer">OOO</a>: Am I an Artist or an Engineer?</b></font>

<li><font size=2 color=blue><b><a href="http://www.writingup.com/mitchell_allen/beethoven_hearing_aids_and_suicide">OOO</a>: Beethoven, Hearing Aids and Suicide</b></font>

<li><font size=2 color=blue><b><a href="http://www.writingup.com/mitchell_allen/a_tortured_history_of_mass_storage">OOO</a>: A Tortured History of Mass Storage</b></font>

As each link was unique, I couldn’t do a mass search and replace. My first idea was to “bracket” the data that I wanted to delete. I figured that would allow me to use Microsoft ® Word’s wildcard search. Bracketing is a term I use to describe using a weird character to enclose or delimit (separate) substrings within a longer string. Usually, bracketing involves searching for a common substring and either appending the weird character to the end of the common substring or, if the common substring is unneeded, replacing that common substring altogether.

First Cleansing

For this project, I wanted to get rid of bits of unneeded text while bracketing the links. So, the first thing I did was to change every occurrence of

<a href="http://www.writingup.com/mitchell_allen/

to
~

To complete the bracket and get rid of even more useless text, I changed every occurrence of

">OOO</a>:

to the same ~.

Here are the modified lines at this point:

<li><font size=2 color=blue><b&gt~am_i_an_artist_or_an_engineer~Am I an Artist or an Engineer?</b></font>

<li><font size=2 color=blue><b&gt~beethoven_hearing_aids_and_suicide~Beethoven, Hearing Aids and Suicide</b></font>

<li><font size=2 color=blue><b&gt~a_tortured_history_of_mass_storage~A Tortured History of Mass Storage</b></font>

Now, I had to get rid of the bracketed text.

Weak Soap

In order to understand the wildcard search and replace, let me digress a moment. Look at this nonsense text:


frogface

frogs in the field

swollen river flows peacefully

Let’s perform a wildcard search and replace. To change every occurrence of text bracketed by the letter f to the (equally nonsensical) string consisting of fishf, we do the following:

1. Click Edit on the menu and select Replace … (or press Ctrl+H)

2. In the Find what: box, type f*f

3. Press the More button

4. Check Use wildcards

5. In the Replace with: box, type fishf

6. Press the Replace All button

This is the result of the wildcard search and replace:

fishface

fishfield

swollen river fishfully

The asterisk (*) is a wildcard operator that takes the place of zero or more arbitrary chararacters. When you don’t know or care what’s being matched, asterisk is your friend.

On the first line of nonsense text, the asterisk matches rog. On the second line, it matches rogs in the . Finally, on the third line, it matches lows peace.

Keep in mind, the search and replace function does things one at a time, so we don’t wind up with our asterisk matching ace and ield swollen river.

By the way, if you’d like to see all of the wildcard search operators, check the wildcard box and click the Special button.

Now, back to the project. I had fully expected to be able to replace ~*~ with a character string of my choice (actually, just deleting it would have been best, but I didn’t want to confuse you). However, no matches were found! I never figured out what I did wrong, but I must have introduced a spurious space character to the end of the search string because, when I recreated the files for this article, the wildcard search worked perfectly. So, instead of being done with this project, I had to try another tack.

After muttering a few choice curses, I went to my industrial-strength solution.

Strong Soap

Microsoft ® Excel has powerful conversion utilities. Once I had bracketed my data, I could literally show Microsoft ® Excel how to read it and convert it into a spreadsheet.

First, I pasted the slightly scrubbed data into Notepad and saved the file. I didn’t want any formatting characters to be imported into the spreadsheet. Next, I opened Microsoft ® Excel, clicked Data and selected Get External Data.

This invoked a powerful wizard that walked me through choosing the Notepad file, selecting the ~ as a delimiter and choosing the destination on the spreadsheet.

In seconds, I had a worksheet that looked like this:
worksheet 1

Click to view full size

Now that the dead links were separated, all I had to do was concatenate column A with column C and place the result in column D.

=A1 & C1

Here is a screen shot, with column B hidden:
worksheet 2

Click to view full size

Squeaky Clean

The only thing I had to do after all that, was to paste column D into a new Microsoft ® Word document and fix the syntax of the font tag by adding quotes (which weren’t needed on the old blog software).

The result can be viewed at my blog: Ambient Orb - Blog Index.

Parsing Strings in Excel September 9th, 2008

Jumbalaya Jam

Before I learned about HTML in 2003, I learned how to get rid of the tags that define it.
One of my passions is designing games and I wanted to recreate Boggle® on the computer.
My wife loved to play an online game called Jumbalaya, which is similar to Boggle®.
I decided to borrow her word lists for my Boggle® dictionary, so that it would reject nonsense words.
Because she played so many games in one sitting, I wrote a program in Visual Basic to read through the Internet Explorer Cache, locate the pages, read each one and extract just the words.

Real Estate Ads

A more recent project involved sifting through online advertising for real estate. I wanted a neat list that I could store in my own database.

This time, I wrote a program to copy the actual web page (just like selecting a portion of text to paste into another location).
I felt guilty because I thought I was doing something called scraping a website. However, when I finally learned the true meaning of scraping, which is to republish other people’s website content as if it were your own, I realized that I wasn’t doing anything unethical.

Parsing Power

The real power behind both mini-projects was the parsing engine.
To parse means to separate a string of data into parts.
It is very useful to parse an HTML file into its tags and values. Your web browser does this in order to display the web page on your screen.
It is also useful to be able to parse a string of characters, in order to find specific patterns. Google does this when it separates your search into keywords, by finding the spaces and punctuation between the words.

You don’t always have to use a programming language: Microsoft® Access and Microsoft® Excel provide string manipulation functions that can help you parse strings.
Keep in mind, though, these two programs don’t always agree on the names or even how to use the function!

For example, suppose you want to see if a string contains a comma. If it does, you want to know where it starts (this is called the string position).

Here is the way each program expects you to use their function:

Access: InStr([start, ]string1, string2)

string1 is the string being searched

string2 is the smaller string being sought

start is an optional parameter which tells the function where in string1 to begin looking for
string2

Excel: =SEARCH (find_text,within_text,start_num)

within_text is the string being searched

find_text is the smaller string being sought

start_num is an optional parameter which tells the function where in within_text to begin looking for find_text

If the search function does not find the string being sought, Access returns a 0, but Excel gives you a #VALUE error.

This means that you have to go through hoops to get Excel to parse arbitrary strings.

To understand my suggested solution below, open up Excel and type Digital,Tinker into cell A1.
In cell A2, type or paste the following formula:

=SEARCH(",", A1)

When you press Enter, A2 will display 8.

If you remove the comma from A1, A2 will display #VALUE.

Experts Only

I am always learning and never claim to know all the answers. I hope that, if you consider yourself an expert, you’ll do two things:

  • Share your techniques with us
  • Be opened-minded about unusual techniques you find by other readers and by me

Now, consider the reason the Excel function fails and think of the easiest way to avoid it.

Here’s what I think: just add the smaller string to the end of the string being searched!

Adding one string to another string is called concatenating and is symbolized by the ampersand ( & )
In cell A3, type or paste the following formula:

=SEARCH(",", A1 & ",")*(SEARCH(",", A1 & ",") <LEN(A1))

When you press Enter, A3 displays a 0.
Add the comma to A1, press Enter and both A2 and A3 will display the string position correctly.

This works because of a little-known feature of nearly all programming languages. Normally, when you evaluate an equation, the result is TRUE or FALSE.
However, if you use that result in another equation, Excel treats it as a 1 if TRUE or a 0 if FALSE.
To see this, type or paste just the evaluation into A4:

=(SEARCH(",", A1 & ",") <LEN(A1))

A4 will display either TRUE or FALSE, depending on whether a comma is in A1.

Now, change A4 to multiply that evaluation by 1:

=(SEARCH(",", A1 & ",") <LEN(A1))*1

Now you’ll see a 1 or a 0 in A4, depending on the presence of a comma in A1.

Looking again at cell A3, you can now see that by multiplying the result of the search by the result of the evaluation, we’re basically saying, “multiply the search result by either 0 or 1″.
As you may recall from elementary school mathematics, any number multiplied by 0 equals 0, while any number multiplied by 1 equals that number.

(In some programming languages, such as Visual Basic and Access, -1 is used. You could negate the final result or use the ABS( ) function to change it into a positive number.)

As for the mysterious LEN( ), Excel returns the length of the string of text in A1. So, the evaluation does the SEARCH, compares the result to the length of the original string and returns 1 (TRUE) if the string position of the comma is less than the length of the string in A1. That would mean that a comma was found before the one we concatenated to the end!

By the way, in Excel, you are always working with a copy of something in your formulas. We never really concatenate a comma to the end of A1. We are using a copy of A1 in our other cells.

If you come up with anything shorter than this suggestion, I’d love to hear it!