Making Dates Behave

[UPDATE: 2016-07-05: Kudos to me for re-inventing the wheel..again!  Someone pointed out to me that there is already a module that does this, and it does it much better than me. So, ignore all the code below, and go get dateutil instead! ]

My financial management SQLite database, for which I hope to eventually design a Python/Qt GUI front end, relies on data downloaded from various financial institutions. My current focus is developing a Python application to automatically import and correctly format this data so that it can be appended to my database.

A huge annoyance in this process is that different institutions export dates in different formats. For example, I deal regularly with PayPal, Ebay and Amazon. PayPal’s downloaded transactions have their dates like “7/29/2015.” Ebay’s are “Jul-29-15” (why on earth is ANYONE still using two digit years???) Amazon’s are “Dec 9, 2015.” Appending dates to my database, however, requires them to be in the increasingly standard, reasonable, logical format, “2015-12-17”.

Python to the rescue! Here is my solution: a function called “date_fixer.” Give it a date in any of the above three formats, and it spits out the date as “YYYY-MM-DD”. Detailed documentation and explanation are in the comments. I’m sure it’s not coded as efficiently as it could be, but at this point, I only care that it works! Hope others find it useful as well. Cheers!

def date_fixer(datestring):
    """
    Converts a date, 'datestring', to YYYY-MM-DD (ISO 8601) format.
    'datestring' must be in month, day, year order. The month, day and year
    can be separated by a space, "/" or "-".  The month can be either a number 
    from 1 to 12, the full month name or the firt 3 letters of the month name.   
    Day must be in the range 1 - 31. Year can be either two or four digits. 
    Examples: "12/17/2015", "December 17, 2015", "12-17-15", "12-17-2015",
    "1-2-12", "Dec 17 15", "Dec 17, 2015".  
    """

    from datetime import date
    
    # dict for looking up month number by abbreviation
    months = {"JAN":1,"FEB":2,"MAR":3,"APR":4,"MAY":5,"JUN":6,
              "JUL":7,"AUG":8,"SEP":9,"OCT":10,"NOV":11,"DEC":12}
    
    try:
    
        # determine which character separates m, d, y
        if datestring.find(" ") != -1:
            delim = " "

        elif datestring.find("/") != -1:
            delim = "/"
            
        elif datestring.find("-") != -1:
            delim = "-"
        
        else:
            return "Error: '{}' is not a valid date string.".format(datestring)

        # Get the position of the first occurance of the delimiter    
        delim1 = datestring.find(delim)
        delim1 = datestring.find(delim)

        # Get the position of the 2nd occurance of the delimiter
        delim2 = datestring[delim1+1:].find(delim)
        delim2 = delim1 + delim2 + 1

        # the month is the segment of datestring that is from 
        # 0 to the 1st delimiter occurance.
        month = datestring[:delim1]

        # month is all or part of the month name. 
        if not month.isnumeric():
            
            # get the 1st three letters of 'month' & look up in 'months' dict.
            month_abr = month[0:3]
            
            try:
            
                # look up month abbreviation in months dict
                month = months[month_abr.upper()]
                
            except KeyError:
                
                return "Error: '{}' is not a valid month.".format(month)
        
        month = int(month)
        
        if month > 12 or month < 1:
        
            return "Error: '{}' is not a valid month.".format(str(month))
            
        # the day is the segment between the two occurances of the delimiter.
        day = datestring[delim1+1:delim2]

        # strip out the comma if there's one after the day.
        if day[len(day)-1] == ",":
            day = day[:len(day)-1]

        day = int(day)
        
        if day > 31:
        
            return "Error: '{}' is not a valid day.".format(str(day))
            
        year = ""
        
        # check for consective numerals in the segment after the 2nd delimiter.
        for char in range(delim2+1,len(datestring)):
        
            # adds each consective numeral to 'year.'
            # stops when a non-numeric character is encountred. 
            if datestring[char].isnumeric:  
                year = year + datestring[char]
           
        if int(year) < 100:  # year is 2 digits
        
            if int(year) <50:
                
                year = int(year) + 2000  # assume year is in 21st century
                
            else:
                year = int(year) + 1900 # assume year is in 20th century
        
        year = int(year)
        
        return date(year, month, day).isoformat()
    
    except ValueError:
        
        return ("Error: Unable to process datestring '{}'.".format(datestring))