Exercise 5.1

Objectives:

Explore a few definitional aspects of functions/methods
Making functions more flexible
Type hints

In Exercise 2.6 you wrote a reader.py module that had a function for reading a CSV into a list of dictionaries. For example:

>>> import reader
>>> port = reader.read_csv_as_dicts('Data/portfolio.csv', [str,int,float])
>>>

We later expanded that code to work with instances in Exercise 3.3:

>>> import reader
>>> from stock import Stock
>>> port = reader.read_csv_as_instances('Data/portfolio.csv', Stock)
>>>

Eventually the code was refactored into a collection of classes involving inheritance in Exercise 3.7. However, the code has become rather complex and convoluted.

(a) Back to Basics

Start by reverting the changes related to class definitions. Rewrite the reader.py file so that it contains the two basic functions that you had before you messed it up with classes:

# reader.py

import csv

def read_csv_as_dicts(filename, types):
    '''
    Read CSV data into a list of dictionaries with optional type conversion
    '''
    records = []
    with open(filename) as file:
        rows = csv.reader(file)
        headers = next(rows)
        for row in rows:
            record = { name: func(val) 
                       for name, func, val in zip(headers, types, row) }
            records.append(record)
    return records

def read_csv_as_instances(filename, cls):
    '''
    Read CSV data into a list of instances
    '''
    records = []
    with open(filename) as file:
        rows = csv.reader(file)
        headers = next(rows)
        for row in rows:
            record = cls.from_row(row)
            records.append(record)
    return records

Make sure the code still works as it did before:

>>> import reader
>>> port = reader.read_csv_as_dicts('Data/portfolio.csv', [str, int, float])
>>> port
[{'name': 'AA', 'shares': 100, 'price': 32.2}, {'name': 'IBM', 'shares': 50, 'price': 91.1}, 
 {'name': 'CAT', 'shares': 150, 'price': 83.44}, {'name': 'MSFT', 'shares': 200, 'price': 51.23}, 
 {'name': 'GE', 'shares': 95, 'price': 40.37}, {'name': 'MSFT', 'shares': 50, 'price': 65.1}, 
 {'name': 'IBM', 'shares': 100, 'price': 70.44}]
>>> import stock
>>> port = reader.read_csv_as_instances('Data/portfolio.csv', stock.Stock)
>>> port
[Stock('AA', 100, 32.2), Stock('IBM', 50, 91.1), Stock('CAT', 150, 83.44), 
 Stock('MSFT', 200, 51.23), Stock('GE', 95, 40.37), Stock('MSFT', 50, 65.1), 
 Stock('IBM', 100, 70.44)]
>>>

(b) Thinking about Flexibility

Right now, the two functions in reader.py are hard-wired to work with filenames that are passed directly to open(). Refactor the code so that it works with any iterable object that produces lines. To do this, create two new functions csv_as_dicts(lines, types) and csv_as_instances(lines, cls) that convert any iterable sequence of lines. For example:

>>> file = open('Data/portfolio.csv')
>>> port = reader.csv_as_dicts(file, [str, int, float])
>>> port
[{'name': 'AA', 'shares': 100, 'price': 32.2}, {'name': 'IBM', 'shares': 50, 'price': 91.1}, 
 {'name': 'CAT', 'shares': 150, 'price': 83.44}, {'name': 'MSFT', 'shares': 200, 'price': 51.23}, 
 {'name': 'GE', 'shares': 95, 'price': 40.37}, {'name': 'MSFT', 'shares': 50, 'price': 65.1}, 
 {'name': 'IBM', 'shares': 100, 'price': 70.44}]
>>>
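
The reason this refactoring is possible is that csv.reader() accepts any iterable that yields strings, not only file objects. A minimal sketch of that idea, using a made-up in-memory list of lines in place of a file (the sample data below is an assumption for illustration, not the contents of Data/portfolio.csv):

```python
import csv

# Hypothetical sample data: a plain list of strings standing in for a file.
lines = [
    'name,shares,price',
    'AA,100,32.20',
    'IBM,50,91.10',
]

rows = csv.reader(lines)      # csv.reader only needs an iterable of strings
headers = next(rows)          # the first row holds the column names
types = [str, int, float]

# Same conversion logic as read_csv_as_dicts(), minus the call to open()
records = [
    {name: func(val) for name, func, val in zip(headers, types, row)}
    for row in rows
]
print(records)
```

Nothing in the conversion loop depends on where the lines came from, which is what makes it reasonable to split the parsing from the file handling.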

The whole point of doing this is to make it possible to work with different kinds of input sources. For example:

>>> import gzip
>>> import stock
>>> file = gzip.open('Data/portfolio.csv.gz', 'rt')
>>> port = reader.csv_as_instances(file, stock.Stock)
>>> port
[Stock('AA', 100, 32.2), Stock('IBM', 50, 91.1), Stock('CAT', 150, 83.44), 
 Stock('MSFT', 200, 51.23), Stock('GE', 95, 40.37), Stock('MSFT', 50, 65.1), 
 Stock('IBM', 100, 70.44)]
>>>

To maintain backwards compatibility with older code, write functions read_csv_as_dicts() and read_csv_as_instances() that take a filename as before. These functions should call open() on the supplied filename and use the new csv_as_dicts() or csv_as_instances() functions on the resulting file.
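
One possible shape for this refactoring is sketched below. It is only a sketch under the assumptions already stated in this exercise (a header row is present, and cls provides a from_row() class method); your own solution may be organized differently. The sample lines at the bottom are made up for illustration.

```python
import csv

def csv_as_dicts(lines, types):
    '''
    Convert lines of CSV data into a list of dictionaries
    '''
    records = []
    rows = csv.reader(lines)
    headers = next(rows)
    for row in rows:
        record = {name: func(val)
                  for name, func, val in zip(headers, types, row)}
        records.append(record)
    return records

def csv_as_instances(lines, cls):
    '''
    Convert lines of CSV data into a list of instances
    '''
    records = []
    rows = csv.reader(lines)
    next(rows)                      # skip the header row
    for row in rows:
        records.append(cls.from_row(row))
    return records

def read_csv_as_dicts(filename, types):
    '''
    Read CSV data from a file into a list of dictionaries
    '''
    with open(filename) as file:
        return csv_as_dicts(file, types)

def read_csv_as_instances(filename, cls):
    '''
    Read CSV data from a file into a list of instances
    '''
    with open(filename) as file:
        return csv_as_instances(file, cls)

# Hypothetical in-memory input; any iterable of lines works the same way
lines = ['name,shares,price', 'AA,100,32.20', 'IBM,50,91.10']
port = csv_as_dicts(lines, [str, int, float])
```

The filename-based functions become thin wrappers: they own the open() call and nothing else, so older code that passes a filename keeps working unchanged.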

(c) Design Challenge: CSV Headers

The code assumes that the first line of CSV data always contains column headers. However, this isn't always the case. For example, the file Data/portfolio_noheader.csv contains data, but no column headers.

How would you refactor the code to accommodate missing column headers, having them supplied manually by the caller instead?
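
There is more than one reasonable answer here. One option, sketched below, is an optional headers argument: when it is omitted, the first row is consumed as the column names as before; when it is supplied, every row is treated as data. Making the argument keyword-only is a design choice assumed for this sketch, and the sample lines are made up for illustration.

```python
import csv

def csv_as_dicts(lines, types, *, headers=None):
    '''
    Convert lines of CSV data into a list of dictionaries.
    If headers is None, the first row supplies the column names.
    '''
    records = []
    rows = csv.reader(lines)
    if headers is None:
        headers = next(rows)        # fall back to the header row in the data
    for row in rows:
        record = {name: func(val)
                  for name, func, val in zip(headers, types, row)}
        records.append(record)
    return records

# Hypothetical data with no header row
lines = ['AA,100,32.20', 'IBM,50,91.10']
port = csv_as_dicts(lines, [str, int, float],
                    headers=['name', 'shares', 'price'])
```

Because headers defaults to None, existing calls that rely on a header row continue to behave the same way.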

(d) API Challenge: Type hints

Functions can have optional type hints attached to arguments and return values. For example:

def add(x: int, y: int) -> int:
    return x + y

The typing module has additional classes for expressing more complex kinds of types including containers. For example:

from typing import List

def sum_squares(nums: List[int]) -> int:
    total = 0
    for n in nums:
        total += n*n
    return total

Your challenge: Modify the code in reader.py so that all functions have type hints. Try to make the type hints as accurate as possible. To do this, you may need to consult the documentation for the typing module.
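
As a starting point, here is one way the hints might look for the line-based functions. Treat it as a sketch rather than the definitive answer: the choice of Iterable, Sequence, and a TypeVar named T is an assumption made here, and other annotations are equally defensible.

```python
import csv
from typing import Any, Callable, Dict, Iterable, List, Sequence, Type, TypeVar

# T stands for whatever class the caller passes in (hypothetical name)
T = TypeVar('T')

def csv_as_dicts(lines: Iterable[str],
                 types: Sequence[Callable[[str], Any]]) -> List[Dict[str, Any]]:
    '''
    Convert lines of CSV data into a list of dictionaries
    '''
    records: List[Dict[str, Any]] = []
    rows = csv.reader(lines)
    headers = next(rows)
    for row in rows:
        record = {name: func(val)
                  for name, func, val in zip(headers, types, row)}
        records.append(record)
    return records

def csv_as_instances(lines: Iterable[str], cls: Type[T]) -> List[T]:
    '''
    Convert lines of CSV data into a list of instances
    '''
    records: List[T] = []
    rows = csv.reader(lines)
    next(rows)                      # skip the header row
    for row in rows:
        # A strict type checker cannot know that cls has from_row();
        # expressing that would need something like typing.Protocol.
        records.append(cls.from_row(row))
    return records
```

Note that the hints are not enforced at runtime. They document intent and let external tools check calls, so how precise to make them is a judgment call.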

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
