
Delimiters

February 4, 2010

In case you have never heard of a delimiter, or otherwise don’t understand what one is, I’ll explain briefly.

When a file is sent in plain text, you need to know how to split each record into its individual fields. So, for example, suppose we wanted to automatically process a file of client information and it had a line like this:

Joe Smith 43 012345678 718-555-4321 4-1-66 11-87 11th St. Queens OH 23123 1-11-2001 9-13-2009

How should we process it? One solution would be to split the line wherever there is a space. We would end up with this data set:

First name: Joe

Last name: Smith

Age: 43

Client ID: 012345678

Phone #: 718-555-4321

DOB: 4-1-66

Address 1: 11-87

Address 2: 11th

Address 3: St.

City: Queens

State: OH

Zip Code: 23123

Client Join date: 1-11-2001

Client’s last visit: 9-13-2009

You can see from the three address entries above that in this case the address is going to cause problems: a single street address has been split into three separate fields. Also, consider what would have happened if there had been an apartment number.
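To make the problem concrete, here is a minimal Python sketch of the space-splitting approach described above. The variable names are only illustrative, and the apartment number is appended by hand to show how the later fields shift.

```python
# Naive approach: treat every space as a field boundary.
line = ("Joe Smith 43 012345678 718-555-4321 4-1-66 "
        "11-87 11th St. Queens OH 23123 1-11-2001 9-13-2009")

fields = line.split(" ")

# The street address is one value, but it comes back as three fields.
print(len(fields))   # 14
print(fields[6:9])   # ['11-87', '11th', 'St.']

# Add an apartment number and every later field shifts by two positions.
line_with_apt = line.replace("St.", "St. Apt C12")
fields_with_apt = line_with_apt.split(" ")
print(fields[9])           # Queens
print(fields_with_apt[9])  # Apt
```

A program that expects the city at position 9 would read "Apt" instead, which is exactly why splitting on spaces is unreliable here.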

When using plain old text files there are two main solutions to this problem. One is to use fixed-length fields. This requires the sender and receiver to agree on a designated size for each datum. So if we agreed that a first name would never be longer than fifty characters, then even though Joe has only three characters, the sender would still have to include 47 unused spaces (or other “filler” material). For example, if we agreed that the first name is a 10 character field and ‘%’ would be our filler, Joe would become:

Joe%%%%%%%
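As a rough sketch of how fixed-length fields might be written and read back, consider the Python below. The layout (10 characters for each name, 3 for the age) and the helper names `pack` and `unpack` are assumptions made up for this illustration; only the 10 character first name and the ‘%’ filler come from the example above.

```python
# Hypothetical layout agreed on by sender and receiver: (field name, width).
LAYOUT = [("first_name", 10), ("last_name", 10), ("age", 3)]
FILLER = "%"

def pack(record):
    """Pad each value to its agreed width with the filler character."""
    parts = []
    for name, width in LAYOUT:
        value = record[name]
        if len(value) > width:
            raise ValueError(f"{name!r} is longer than {width} characters")
        parts.append(value.ljust(width, FILLER))
    return "".join(parts)

def unpack(line):
    """Slice the line at the agreed offsets and strip the trailing filler."""
    record = {}
    offset = 0
    for name, width in LAYOUT:
        record[name] = line[offset:offset + width].rstrip(FILLER)
        offset += width
    return record

packed = pack({"first_name": "Joe", "last_name": "Smith", "age": "43"})
print(packed)          # Joe%%%%%%%Smith%%%%%43%
print(unpack(packed))
```

Note that the receiver never searches for anything; it just counts characters. The price is wasted space, a hard upper limit on each value, and the fact that a value which really ends in the filler character would lose it when unpacked.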

The other popular way of dealing with this problem is to agree on a character that will never appear in the data and use it as the “flag” that marks the end of each piece of data. For example, if we used ‘%’ as our delimiter, the line (this time with an apartment number added to the address) becomes:

Joe%Smith%43%012345678%718-555-4321%4-1-66%11-87 11th St. Apt C12%Queens%OH%23123%1-11-2001%9-13-2009%

First name: Joe

Last name: Smith

Age: 43

Client ID: 012345678

Phone #: 718-555-4321

DOB: 4-1-66

Address: 11-87 11th St. Apt C12

City: Queens

State: OH

Zip Code: 23123

Client Join date: 1-11-2001

Client’s last visit: 9-13-2009
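The same split can be sketched in a few lines of Python. This is only an illustration, and it assumes, as the example does, that ‘%’ never appears inside the data itself; the `FIELD_NAMES` list is simply the labels used above.

```python
FIELD_NAMES = [
    "First name", "Last name", "Age", "Client ID", "Phone #", "DOB",
    "Address", "City", "State", "Zip Code",
    "Client Join date", "Client's last visit",
]

line = ("Joe%Smith%43%012345678%718-555-4321%4-1-66%"
        "11-87 11th St. Apt C12%Queens%OH%23123%1-11-2001%9-13-2009%")

# Every field ends with '%', so splitting leaves one empty string at the end.
values = line.split("%")
if values and values[-1] == "":
    values.pop()

record = dict(zip(FIELD_NAMES, values))
print(record["Address"])  # 11-87 11th St. Apt C12
print(record["City"])     # Queens
```

The address, spaces and apartment number included, now arrives as a single value, and the fields after it stay in their expected positions.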

For more information about this, you can try Wikipedia.