csv.Sniffer gets the sniffles

Since I regularly work with both comma delimited and tab delimited files, I’d like for my Quiffer project to be able to automatically detect if a csv file is tab delimited, and if it is, specify tab (‘\t’) as the delimiter when reading it using Python’s csv.reader class.

Here’s a sample of tab delimited data — an Amazon transaction file:

Transaction Summary for December 1, 2016 to December 8, 2016
Transactions: 3

Date Order ID SKU Transaction type Payment Type Payment Detail Amount Quantity Product Title
Dec 3, 2017 114-5190104 DI-2QQ3-4X5H Order Payment Amazon fees Shipping commission $-0.50 Disney's 2012 Lady and the Tramp Sketchbook Ornament
Dec 3, 2017 114-5190104 DI-2QQ3-4X5H Order Payment Product charges $29.86 1 Disney's 2012 Lady and the Tramp Sketchbook Ornament
Dec 3, 2017 114-5190104 DI-2QQ3-4X5H Order Payment Other Shipping $3.99 Disney's 2012 Lady and the Tramp Sketchbook Ornament

I need Quiffer to be able to parse each row into a list, like so:


['Date', 'Order ID', 'SKU', 'Transaction type', 'Payment Type', 'Payment Detail', 'Amount', 'Quantity', 'Product Title']
['Dec 3, 2017', '114-5190104', 'DI-2QQ3-4X5H', 'Order Payment', 'Amazon fees', 'Shipping commission', '$-0.50', '', "Disney's 2012 Lady and the Tramp Sketchbook Ornament"]
['Dec 3, 2017', '114-5190104', 'DI-2QQ3-4X5H', 'Order Payment', 'Product charges', '', '$29.86', '1', "Disney's 2012 Lady and the Tramp Sketchbook Ornament"]
['Dec 3, 2017', '114-5190104', 'DI-2QQ3-4X5H', 'Order Payment', 'Other', 'Shipping', '$3.99', '', "Disney's 2012 Lady and the Tramp Sketchbook Ornament"]

Of course, the above is the default behavior when a comma delimited file is fed to csv.reader without a delimiter specified, as in the following code:


import csv

src_file = 'sample_data/comma_sample.csv'

with open(src_file, newline='', encoding='utf-8') as csv_file:
    data = list(csv.reader(csv_file))

When a tab-delimited file is opened with this code, however, things get messy:


[['Transaction Summary for December 1', ' 2016 to December 8', ' 2016'],
['Transactions: 3'],
[],
['Date\tOrder ID\tSKU\tTransaction type\tPayment Type\tPayment Detail\tAmount\tQuantity\tProduct Title'],
['Dec 3', " 2017\t114-5190104\tDI-2QQ3-4X5H\tOrder Payment\tAmazon fees\tShipping commission\t$-0.50\t\tDisney's 2012 Lady and the Tramp Sketchbook Ornament"],
['Dec 3', " 2017\t114-5190104\tDI-2QQ3-4X5H\tOrder Payment\tProduct charges\t\t$29.86\t1\tDisney's 2012 Lady and the Tramp Sketchbook Ornament"],
['Dec 3', " 2017\t114-5190104\tDI-2QQ3-4X5H\tOrder Payment\tOther\tShipping\t$3.99\t\tDisney's 2012 Lady and the Tramp Sketchbook Ornament"]]

Since csv.reader is expecting a comma delimited file, it splits each row in two at the comma after the month in the Date column.  The rest of the columns are mashed together in a single string, with ‘\t’ in between.

I’d rather not code two different versions of Quiffer for different types of files, with the only difference being the ‘delimiter’ specification in csv.reader.  Surely, I thought, there must be a way for Python to examine a file and determine how its fields are delimited, and parse it accordingly.

A while back I discovered a class in the Python csv module called csv.Sniffer.  My understanding is that Sniffer could figure out if a file was tab delimited or comma delimited, and provide this information to csv.reader in the ‘dialect’ parameters.  I wrote the following to test it with the Amazon sample file:

import csv

def sniff_test(file):
    with open(file, newline='') as csvfile:
        dialect = csv.Sniffer().sniff(csvfile.read(1024))
    
        csvfile.seek(0)

        reader = csv.reader(csvfile, dialect)

        return [row for row in reader]

This was the output:

['Transaction Summary for December 1, 2016 to December 8, 2016']
['Transactions: 3']
[]
['Date\tOrder ID\tSKU\tTransaction type\tPayment Type\tPayment Detail\tAmount\tQuantity\tProduct Title']
["Dec 3, 2017\t114-5190104\tDI-2QQ3-4X5H\tOrder Payment\tAmazon fees\tShipping commission\t$-0.50\t\tDisney's 2012 Lady and the Tramp Sketchbook Ornament"]
["Dec 3, 2017\t114-5190104\tDI-2QQ3-4X5H\tOrder Payment\tProduct charges\t\t$29.86\t1\tDisney's 2012 Lady and the Tramp Sketchbook Ornament"]
["Dec 3, 2017\t114-5190104\tDI-2QQ3-4X5H\tOrder Payment\tOther\tShipping\t$3.99\t\tDisney's 2012 Lady and the Tramp Sketchbook Ornament"]

I tried again, using a list of delimiters, as suggested in How to use csv.Sniffer for 2 different CSV-types? and the code threw a Could not determine parameter error. I did some more experimenting, and to make a long story short, I found that what was giving Sniffer a stuffy nose was the inconsistent layout of the Amazon file.  Note that the first three lines are like a report header, and do not contain any delimiters.  I removed these three lines and made a copy of the file without them, and Sniffer worked.  Nevertheless, I don’t want to have to physically alter my files before parsing them with Quiffer. Solution: have Quiffer read the file a line at a time and look for tabs, and if tabs are present, open as a tab delimited file.  Here’s the latest, tested version:

import csv

def read_data(src_file):
    """
    Import rows of data from a csv file into a list of rows, where each row is
     a list containing column values.
    """

    with open(src_file, newline='', encoding='utf-8') as csv_file:

        # Check the file for tabs.
        line = csv_file.readline()

        while line != "":
            if '\t' in line:

                # tab was found, so read it as csv with delimiter specified
                csv_file.seek(0)

                return list(csv.reader(csv_file, delimiter='\t'))
                
            line = csv_file.readline()

        # if no tabs present, read as a normal csv
        csv_file.seek(0)

        return list(csv.reader(csv_file))

Reading CSV data

Last week I ran across this article on Real Python on how to code a simple CSV file reader, using test driven development, and work with data from the files.  For a long time I’ve understood the value of TDD but wasn’t quite sure how to apply it meaningfully to writing code involving data from external files.  This article helped tremendously — and just in time for me to write code for the very first step of my Quiffer project:  reading financial transactions from a CSV file.

The sample data I’m using for this project is from a PayPal transaction file download.  Here are my first tests, for which I have written code that passes:

import unittest
from quiffer import read_data

class QuifferTest(unittest.TestCase):

    def setUp(self):
        self.data = 'sample_data/paypal_sample.csv'

    def test_read_data_headers(self):
        self.assertEqual(
            read_data(self.data)[0],
            ['Date', 
             ' Time', 
             ' Time Zone', 
             ' Name', 
             ' Type', 
             ' Status', 
             ' Currency', 
             ' Gross', 
             ' Fee', 
             ' Net', 
             ' From Email Address', 
             ' To Email Address', 
             ' Counterparty Status', 
             ' Address Status', 
             ' Item Title', 
             ' Item ID', 
             ' Shipping and Handling Amount', 
             ' Insurance Amount', 
             ' Sales Tax', 
             ' Option 1 Name', 
             ' Option 1 Value', 
             ' Option 2 Name', 
             ' Option 2 Value', 
             ' Auction Site', 
             ' Buyer ID', 
             ' Item URL', 
             ' Closing Date', 
             ' Reference Txn ID', 
             ' Invoice Number', 
             ' Custom Number', 
             ' Receipt ID', 
             ' Balance']
            )

    def test_read_data_name(self):
        self.assertEqual(read_data(self.data)[1][3], 'Almech Devices, LLC')

    def test_read_data_name_chinese(self):
        self.assertEqual(read_data(self.data)[3][3], '广州满翼易有限公司')

    def test_read_data_gross(self):
        self.assertEqual(read_data(self.data)[1][7], '-2.75')

if __name__ == '__main__':
    unittest.main()

As you can see, I’m following right along with the RealPython article’s methods. Here’s the Quiffer code so far:


import csv

def read_data(data):
    with open(data, encoding='utf-8') as csv_file:
        data = list(csv.reader(csv_file))
       
    return data

data = 'sample_data/paypal_sample.csv'

paypal_data = read_data(data)

I added the “encoding=’utf-8′” to the ‘open’ statement, because otherwise it threw a UnicodeDecode error. The original article had line 5 as:

data = [row for row in csv.reader(f.read().splitlines())]

..but thanks to what I learned from Trey Hunner’s CSV Python chat, I simplified this to:

data = list(csv.reader(csv_file))

I’m sure there are more tests I can add to be more thorough, so that’s the next step.  If anyone has any suggestions, hit me up on Twitter.

‘Quiffer’: Getting CSV Data into GnuCash via QIF Import

I use the excellent open source accounting software GnuCash to manage my personal finances.  I discovered it earlier this year and found it to be just as good as Quicken and without the evil.   I’m old-fashioned when it comes to managing finances.  I don’t need to have access to my financial records on all my devices; I am content to have a single app running on my laptop (backed up using Dropbox, of course) and to enter my receipts into GnuCash when I get home each day.  I prefer apps I can easily get my data into and out of.  Finally, I don’t believe handing my bank account login information to a third party is a good idea, preferring to download financial transactions from my accounts, in CSV format, manually.

GnuCash has the option to import transactions from QIF format, which allows for “split” (also known as double entry) transactions.   For example, this is what a QIF file would look like for my monthly electric payment:

!Account
NAssets:Checking Account
^
!Type:Bank
D2017-11-28
T-50.00
PFlorida Power and Light
SExpenses:Electric
$50.00
^

This transaction has two entries: the deduction of $50.00 from the bank account to the power company, and the increase of $50.00 in the GnuCash account “Expenses:Electric.”  The same transaction as downloaded in a CSV file from my bank would be something like this:

11/28/17,Florida Power and Light,-50.00

Obviously, this is a radically different file format than QIF.  Also note that the bank only has one “side” of the transaction, the deduction from the bank account.  Getting transactions like these into GnuCash, then, requires the additional step of adding the correct GnuCash expense account to each transaction — not just converting from CSV to QIF.

Fortunately, I’ve discovered Python is very good at converting between very different text file formats.

In my transition to GnuCash, the biggest challenge was converting four years of financial transactions from Microsoft Access 2010 to QIF.  I started by exporting from Access to CSV.   The resulting CSV file had over 20,000 lines.  I was able to throw together some messy Python code to convert the 20,000 lines of CSV to QIF, and then get all the data into Gnu Cash.  There were so many lines that I worked with them in batches so I could check the account balances as I went.

I still use a slightly cleaner version of this code on a monthly basis to import transactions from PayPal, my Amazon seller account and eBay into QIF format, for import into GnuCash.  This makes it easy to keep detailed sales and expense records.  Of all the Python code I’ve written, this QIF creator has been by far the most useful to me.  Now I hope to make it useful to others.

The more I learn about Python, the more I see how much my code can be improved.   My existing code for CSV to QIF conversion is neither user-friendly nor GitHub worthy, so I plan to clean up, refactor and redesign it.  From this point forward, I’ll call this project “Quiffer.”

To illustrate the challenges faced by this project, here’s a sample PayPal CSV file, which is the most complicated file type I deal with regularly:

 
Date, Time, Time Zone, Name, Type, Status, Currency, Gross, Fee, Net, From Email Address, To Email Address, Counterparty Status, Address Status, Item Title, Item ID, Shipping and Handling Amount, Insurance Amount, Sales Tax, Option 1 Name, Option 1 Value, Option 2 Name, Option 2 Value, Auction Site, Buyer ID, Item URL, Closing Date, Reference Txn ID, Invoice Number, Custom Number, Receipt ID, Balance
11/28/2017,21:26:38,EST,"Almech Devices, LLC",Express Checkout Payment Sent,Completed,USD,-2.75,0,-2.75,bxx4@gmail.com,bxxxk@gmail.com,Verified,Confirmed,Tempered Glass Protective Screen Protector Film for Samsung Galaxy S5 Active,131511092340,0,0,0,,,,,Ebay,rxx04,http://cgi.ebay.com/ws/eBayISAPI.dll?ViewItem&item=131511092340,,,,,,58.28
11/28/2017,21:26:36,EST,rosie castle,Express Checkout Payment Sent,Completed,USD,-7.98,0,-7.98,bxx4@gmail.com,ixx976@gmail.com,Verified,Confirmed,"The Good Dinosaur (DVD, 2016) NEW MOVIE! NOW SHIPPING ! ",222735118701,0,0,0,,,,,Ebay,rxx4,http://cgi.ebay.com/ws/eBayISAPI.dll?ViewItem&item=222735118701,,,,,,61.03
11/28/2017,21:26:34,EST,广州满翼易有限公司,Express Checkout Payment Sent,Completed,USD,-5.5,0,-5.5,bxx4@gmail.com,vxxxl@48express.com,Non-U.S. - Verified,Confirmed,Cotton Linen Harry Potter Pillow Case Sofa Waist Throw Cushion Cover Home Decor [S1],282606610348,0,0,0,,,,,Ebay,rxx04,http://cgi.ebay.com/ws/eBayISAPI.dll?ViewItem&item=282606610348&var=581788364837,,,,,,69.01

Note there are a few difficulties here:

  1. There are numerous columns I don’t need.
  2. The date format will need to be changed to YYYY-MM-DD for the QIF file.
  3. There are multiple columns (Name, Type, Status, Item Title) containing data that I want to appear in GnuCash’s “Description” field.  These columns will therefore need to be combined into a single column.
  4. Transactions that are payments for products I’ve sold include an amount in separate column, “Fee”, which is the fee charged by PayPal to receive funds.  Such transactions will actually need FOUR entries in GnuCash: two each for the payment and fee.
  5. My code needs to include a means of determining, automatically, the correct GnuCash accounts where the income and expense will be posted.

The first step for Quiffer will be to import the data from the CSV file into a Python data structure.  That will be the topic of my next post.  Stay tuned.

A recap of the adventure so far

Since it’s been so long since I’ve posted, here’s a recap of all the twists, turns and detours of my Python Adventure so far….

It all began in the summer of 2013 with a Codecademy Python course.

I read that web development is an important and possibly profitable skill to learn, so I tried a tutorial  but found it to be just a little over my head. After deciding to learn desktop GUI building, an attempt to complete an RSS reader project taught me how to cope with and learn from unexpected problems in coding. Then along came my grandiose idea of designing my own Microsoft Access-like application for managing my finances, using SQLite and a Python GUI, not stopping to ask myself why, if web development was over my head, coming up with my own graphical database app was not.

Soon, real-world responsibilities intruded into my Python adventure, so I took a detour to learn how to use Python to wrestle text data into Microsoft Access at my job.  To further hone my practical skills, I worked on the Real Python course , learned about GitHub and virtual environments, started using PyCharm on a Linux virtual machine as my development setup, and committed to Python 3 over Python 2. Soon, along came a quick look at unit testing courtesy of Dive Into Python.

I wrestled with the learning process..how much time should I spend improving my existing code as I learn how to accomplish tasks more cleanly and efficiently, versus learning new things? Inspired by a little Tkinter-based contact database I ran across, I thought up what I thought was a simple yet powerful idea for my own Access-like graphical database. I learned about imports and how to set a location for Python to look for modules to import , then came a detour to brush up on my SQL knowledge. After much work, I found Tkinter to be frustrating to use for table-like GUI objects, so I switched to Pyside Qt for my GUI. The necessity of actually tracking my finances while waiting to finish my Python GUI app for tracking my finances led me to put my financial data into LibreOffice Base because it was open source, available, and quick, though with some annoying drawbacks.

Months went by.  I then came to realize it didn’t make much sense to spend so much time, in effect, trying to re-invent Microsoft Access, so I decided to give web development another try, with the Real Python web development course but it soon fell by the wayside because what I was learning, though fascinating, was time consuming and not of immediate practical use to me.

In April 2017, I discovered GnuCash, a professional-quality, well-established, open source financial management tool, in the same class as Quickbooks.  This was a fantastic discovery for my finances, as it made tracking income, expenses and bank accounts a breeze.  On the other hand, it was the final nail in the coffin of the central quest chronicled in My Python Adventure: my own financial management application.  It was also the end my of emphasis on learning experiences over practicality.  As is obvious from reading through this blog, there are simply too many demands on my time for me to be able to devote the necessary hours to learning how to build applications that have already been built, and way better than my present skills could achieve.  Going forward, my Python projects will be smaller, simpler and useful to me personally.  I believe this philosophy will prove the best way to keep me inspired and growing in my skill level, and holds out the best hope of providing me the building blocks for something greater in the future.

Fun with app.config

In my pre-Python days, whenever I heard the word “Flask”, this is the first image that came to mind:

images

Very appropriate now, since experimenting plays a central role in learning to code.  Even more appropriate for me, since the first Python web framework I’m performing experiments with is a “flask” of a different type.   Here’s a script I wrote to help me learn how app.config works in a Flask app, the blog I learned to create from Real Python Part 2, Chapter 7.  See comments for explanation.

# config fun

from flask import Flask

# configuration
DATABASE = 'blog.db'
USERNAME = 'admin'
PASSWORD = 'admin'
SECRET_KEY = 'wouldnt_you_like_to_know_ha_ha'

app = Flask(__name__)

# pulls in app configuration by looking for UPPERCASE variables
app.config.from_object(__name__)

def print_cfg():
 """
 A little function to print app.config in alphabetical order
 """
 print()
 for key in sorted(app.config):
 print(key,app.config[key])

# show the starting configuration
print("Starting config:")
print_cfg()

# now to test different ways of updating app.config - first I just add a new
# value in like we did above
app.config['FAVORITE FOOD'] = 'spam'

# show me some spam!
print("Added a new one")
print_cfg()

# now I'll add a new value using a file 'cfg.py'
app.config.from_object('cfg')

# 'foo' + 'bar != 'fubar'
print("And another one")
print_cfg() 

The file ‘cfg.py’ referenced in line 37 above contains the following line only:

FAVORITE_FOO = 'bar'

Results from first call of print_cfg. Note all configuration items are listed in nice alphabetical order.

APPLICATION_ROOT None
DATABASE blog.db
DEBUG False
EXPLAIN_TEMPLATE_LOADING False
JSONIFY_MIMETYPE application/json
JSONIFY_PRETTYPRINT_REGULAR True
JSON_AS_ASCII True
JSON_SORT_KEYS True
LOGGER_HANDLER_POLICY always
LOGGER_NAME __main__
MAX_CONTENT_LENGTH None
PASSWORD admin
PERMANENT_SESSION_LIFETIME 31 days, 0:00:00
PREFERRED_URL_SCHEME http
PRESERVE_CONTEXT_ON_EXCEPTION None
PROPAGATE_EXCEPTIONS None
SECRET_KEY wouldnt_you_like_to_know_ha_ha
SEND_FILE_MAX_AGE_DEFAULT 12:00:00
SERVER_NAME None
SESSION_COOKIE_DOMAIN None
SESSION_COOKIE_HTTPONLY True
SESSION_COOKIE_NAME session
SESSION_COOKIE_PATH None
SESSION_COOKIE_SECURE False
SESSION_REFRESH_EACH_REQUEST True
TEMPLATES_AUTO_RELOAD None
TESTING False
TRAP_BAD_REQUEST_ERRORS False
TRAP_HTTP_EXCEPTIONS False
USERNAME admin
USE_X_SENDFILE False

Here are the abbreviated results after adding a very important, and delicious, value from the module using UPPERCASE:

APPLICATION_ROOT None
DATABASE blog.db
DEBUG False
EXPLAIN_TEMPLATE_LOADING False
FAVORITE FOOD spam
JSONIFY_MIMETYPE application/json
.
.
.
USE_X_SENDFILE False

Finally, another value added from the file ‘cfg.py’:

APPLICATION_ROOT None
DATABASE blog.db
DEBUG False
EXPLAIN_TEMPLATE_LOADING False
FAVORITE FOOD spam
FAVORITE_FOO bar
JSONIFY_MIMETYPE application/json
.
.
.
USE_X_SENDFILE False

So..basically app.config is a dictionary.

Blogtastic

monty-python

Hopefully, in my latest quest, I won’t end up like these guys…

So I’ve completed the Flask Blog app in Chapter 7  of Real Python part 2.   This simple little beginner app has all the basic functions of a web site: login, entering data, logging out.  I’ve learned a ton so far…not the least of which includes finally getting a grasp of Git and virtualenv and even a decent (though far from perfect) understanding of Python decorators.  At the same time, I see I have a long way to go in my latest quest.

I don’t have a lifestyle in which I can regularly devote long hours of concentration to my web development learning, so I’m taking a different approach than in my previous Python adventures. I’m taking notes as I go through Real Python, to help me remember what I learned, and also remember where I left off.  I’m going to make sure I have a thorough grasp of each chapter’s material before I move on to the next.  I’ll do this by making notes of concepts I need to learn more about, and set goals to work on to do so.   I’ve found Evernote to be helpful in this regard, along with regular updates to my README. Stay tuned…

 

Overwhelmed by Practicality

Capture

Once upon a time, I took a glance at web development and found it was “killer” — in more ways than one.  I then threw my heart and soul into….reinventing the wheel, basically.  Then time, age, distractions and responsibilities crept up on me. I have had to make some decisions about how to make the best use of my time: how much am I able and willing to devote to learning new skills, given my current responsibilities and priorities?

How much sense does it make to create an application that duplicates what not one, but two existing applications (Access and Base) already can do for me perfectly adequately?  Such an effort could be justified if it helped me learn skills that could be useful, and even profitable, in the future.   How useful would the ability to create desktop-only software actually be, in a 21st century world that is rapidly turning to the web as its platform of choice for ALL computing, not just browsing, social networking and online shopping?  Answer: Not very.

Now, two years almost to the day after I took the HTML course at Codecademy, I am embarking on a new quest: learn web development with Python.  The tools are right in front of me:  good old RealPython has not one, but two huge ebooks on web development, that have been gathering digital dust on my hard drive.   My goal is to work through it slowly, so that I understand the concepts.  I have already made progress in grasping virtualenv and command-line Git.  At the moment my brain is tied in knots by Python decorators, but then again, it was once tied in knots by Python importing, so I’m confident I’ll wrestle through it.  At least I was able to complete the Flask Blog exercise.  I even typed out all the code by hand instead of copying and pasting, just like Michael told me to!

I would like to eventually create a web-based finance application, just like I did in Access and Base.  That will take time, however, and in the meantime, I still need to pay bills and track expenses, so Base will do, for now. I would say one of the most useful accomplishments during this blog’s lifetime was to get my financial data from my Access *.accdb file into SQLite, making the live data 100% portable.

As I (slowly, I expect) progress through the two RealPython web development courses, feel free to stop by GitHub to check on my progress!

Oh, and one more thing I’ve learned since my last post: Cities: Skylines is an even cooler game than SimCity4.

20160618213529_1

Making Dates Behave

[UPDATE: 2016-07-05: Kudos to me for re-inventing the wheel..again!  Someone pointed out to me that there is already a module that does this, and it does it much better than me. So, ignore all the code below, and go get dateutil instead! ]

My financial management SQLite database, for which I hope to eventually design a Python/Qt GUI front end, relies on data downloaded from various financial institutions. My current focus is developing a Python application to automatically import and correctly format this data so that it can be appended to my database.

A huge annoyance in this process is that different institutions export dates in different formats. For example, I deal regularly with PayPal, Ebay and Amazon. PayPal’s downloaded transactions have their dates like “7/29/2015.” Ebay’s are “Jul-29-15” (why on earth is ANYONE still using two digit years???) Amazon’s are “Dec 9, 2015.” Appending dates to my database, however, requires them to be in the increasingly standard, reasonable, logical format, “2015-12-17”.

Python to the rescue! Here is my solution: a function called “date_fixer.” Give it a date in any of the above three formats, and it spits out the date as “YYYY-MM-DD”. Detailed documentation and explanation are in the comments. I’m sure it’s not coded as efficiently as it could be, but at this point, I only care that it works! Hope others find it useful as well. Cheers!

def date_fixer(datestring):
    """
    Converts a date, 'datestring', to YYYY-MM-DD (ISO 8601) format.
    'datestring' must be in month, day, year order. The month, day and year
    can be separated by a space, "/" or "-".  The month can be either a number 
    from 1 to 12, the full month name or the firt 3 letters of the month name.   
    Day must be in the range 1 - 31. Year can be either two or four digits. 
    Examples: "12/17/2015", "December 17, 2015", "12-17-15", "12-17-2015",
    "1-2-12", "Dec 17 15", "Dec 17, 2015".  
    """

    from datetime import date
    
    # dict for looking up month number by abbreviation
    months = {"JAN":1,"FEB":2,"MAR":3,"APR":4,"MAY":5,"JUN":6,
              "JUL":7,"AUG":8,"SEP":9,"OCT":10,"NOV":11,"DEC":12}
    
    try:
    
        # determine which character separates m, d, y
        if datestring.find(" ") != -1:
            delim = " "

        elif datestring.find("/") != -1:
            delim = "/"
            
        elif datestring.find("-") != -1:
            delim = "-"
        
        else:
            return "Error: '{}' is not a valid date string.".format(datestring)

        # Get the position of the first occurance of the delimiter    
        delim1 = datestring.find(delim)
        delim1 = datestring.find(delim)

        # Get the position of the 2nd occurance of the delimiter
        delim2 = datestring[delim1+1:].find(delim)
        delim2 = delim1 + delim2 + 1

        # the month is the segment of datestring that is from 
        # 0 to the 1st delimiter occurance.
        month = datestring[:delim1]

        # month is all or part of the month name. 
        if not month.isnumeric():
            
            # get the 1st three letters of 'month' & look up in 'months' dict.
            month_abr = month[0:3]
            
            try:
            
                # look up month abbreviation in months dict
                month = months[month_abr.upper()]
                
            except KeyError:
                
                return "Error: '{}' is not a valid month.".format(month)
        
        month = int(month)
        
        if month > 12 or month < 1:
        
            return "Error: '{}' is not a valid month.".format(str(month))
            
        # the day is the segment between the two occurances of the delimiter.
        day = datestring[delim1+1:delim2]

        # strip out the comma if there's one after the day.
        if day[len(day)-1] == ",":
            day = day[:len(day)-1]

        day = int(day)
        
        if day > 31:
        
            return "Error: '{}' is not a valid day.".format(str(day))
            
        year = ""
        
        # check for consective numerals in the segment after the 2nd delimiter.
        for char in range(delim2+1,len(datestring)):
        
            # adds each consective numeral to 'year.'
            # stops when a non-numeric character is encountred. 
            if datestring[char].isnumeric:  
                year = year + datestring[char]
           
        if int(year) < 100:  # year is 2 digits
        
            if int(year) <50:
                
                year = int(year) + 2000  # assume year is in 21st century
                
            else:
                year = int(year) + 1900 # assume year is in 20th century
        
        year = int(year)
        
        return date(year, month, day).isoformat()
    
    except ValueError:
        
        return ("Error: Unable to process datestring '{}'.".format(datestring))

LibreOffice Base as a Transition and Learning Tool

In the year that has flown past me since my last post, here’s what has transpired…

I became more preoccupied with transitioning all of my essential computing tasks to open source, especially financial management, than with continuing my Python GUI project. As I wrote in my last post, LibreOffice Base has drawbacks that, for me, make it a deal-breaker as a permanent solution for my database needs. After further research and experimentation, I found it workable as a transition phase in my ever-lengthening goal of a GUI-based Python database application. Here’s what I’ve accomplished with Base so far: 1) I moved nearly three years of financial data from tables in Microsoft Access to a SQLite database. 2) I learned how to connect LibreOffice to my SQLite database, courtesy of this precious little ODBC driver. 3) I learned how to design a rudimentary GUI using Base’s forms and widgets that does almost everything my MS Access database did, enabling me to stop using Access altogether. 4) I set up an experimental desktop PC with Linux Mint, to confirm my Base database functions just as well in Linux as in Windows (it does). 5) I got better at SQL, since LibreOffice Base depends on it much more than Access for basic tasks. Example: Base doesn’t allow the construction of INSERT, DELETE or UPDATE queries like Access does; these functions must be done by issuing SQL commands directly. For repeated tasks, such as recurring transactions (example: payroll), I took to saving the SQL commands in text files and pasting them into Base’s SQL window. This showed me, among other things, that I needed to get much more proficient with SQL before I could hope to take on the complicated task of making my own graphical database front end.

Once I got my finances into Base, my motivation to continue with my Python database project waned. I love learning new skills, but unless they have an immediate usefulness to me, they can easily fall by the wayside. Last year Python was new and cool and fun, but this year it became clear how much I still have to learn in order to make it do the kinds of things I REALLY want to do, and that there are perfectly acceptable ways to do those things right now (such as using Base for my finances). I’m not a young man anymore, so time is getting shorter with each passing year. I have a wife, a social life and a full time job that all, rightfully, remain my priorities. There’s also…Minecraft!(along with two nephews and a couple of friends’ kids who love to play it with me).

However…

I’ve always believed that the reason a cliché becomes a cliché is because it’s true. It’s long been said that necessity is the mother of invention. In my case, exasperation with a gaping hole in LibreOffice Base’s functionality is the mother of a Python-based solution! Details (and some Python code!) to follow…in my next post. Stay tuned!

Off “Base”

Several years ago I took a look at Base, the database component of OpenOffice and now, LibreOffice. It seemed extremely rudimentary compared to MS Access. For some reason, it came into my head to check it out again recently. Base seems to have come a long way since my first encounter. I noted that it has the ability to construct complicated SQL queries graphically, and build forms that include drop-down boxes and lookups. It meets my main qualifications for writing essential apps such as my current finance management project: it’s open source and cross platform. For a few days, it appeared so suited to what I’m trying to accomplish that it seemed this blog’s title would have to change to “My LibreOffice Base Adventure.” But it wasn’t meant to be. The main reason is that as I continued to experiment with Base, it soon became apparent that, in order to do what I want, I would need to go beyond using only tables, forms and queries, and learn to code in LibreOffice Basic — LibreOffice’s equivalent to Microsoft’s Visual Basic that allows more sophisticated programming of the behavior of Base’s database objects. Why, I asked myself, after coming as far as I have with Python, then PySide, would I want to set all that aside to learn yet ANOTHER language? To be honest, what I really needed to do was get back on track with what I’ve already built, and take it to the next level. Amazing how the thrill of learning something new can sometimes distract from persevering with what I already have began learning. So now, back to work.  I found this PySide tutorial this past week and hope to use it to brush up on what I’ve already learned, and then continue forward.