Tuesday, 27 March, 2018 UTC


Summary

Python programmers make intensive use of data structures such as arrays, lists, and dictionaries. Storing these data structures persistently requires either a file or a database. This article describes how to write a list to a file, and how to read that list back into memory.
To write data to a file, and to read data from a file, the Python programming language offers the standard methods write() and read() for dealing with a single string, as well as writelines() and readlines() for dealing with multiple lines. Furthermore, both the pickle and the json modules offer convenient ways of dealing with serialized data.

Using the read and write Methods

To deal with characters (strings), the basic methods work very well. A list of strings can be saved line by line into the file listfile.txt as follows:
# define list of places
places = ['Berlin', 'Cape Town', 'Sydney', 'Moscow']

with open('listfile.txt', 'w') as filehandle:  
    for listitem in places:
        filehandle.write('%s\n' % listitem)
In line 6 a linebreak "\n" is first appended to listitem, and the result is then written to the output file. To read the entire list from the file listfile.txt back into memory, the following Python code shows how it works:
# define an empty list
places = []

# open file and read the content in a list
with open('listfile.txt', 'r') as filehandle:  
    for line in filehandle:
        # remove linebreak which is the last character of the string
        currentPlace = line[:-1]

        # add item to the list
        places.append(currentPlace)
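Keep in mind that you'll need to remove the linebreak from the end of each string. Here it helps that Python supports slicing on strings, just as it does on lists. In line 8 of the code above the removal is done with the slice line[:-1], which keeps everything but the last character. That character is the linebreak "\n"; when a file is opened in text mode, Python translates the platform's line endings to "\n" on reading. Note that the slice assumes every line, including the last one, ends with a linebreak; line.rstrip('\n') is a safer alternative because it leaves a line without a trailing linebreak untouched.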
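A small sketch of what happens when the last line of the file has no trailing linebreak, comparing the slice with rstrip('\n'). The temporary directory and the names sliced and stripped are assumptions made for this illustration; they are not part of the listings above.

```python
import os
import tempfile

places = ['Berlin', 'Cape Town', 'Sydney', 'Moscow']

# work in a throwaway directory so the example leaves no files behind
with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, 'listfile.txt')

    # deliberately write NO linebreak after the last item
    with open(path, 'w') as filehandle:
        filehandle.write('\n'.join(places))

    # slicing always drops the last character, linebreak or not
    with open(path, 'r') as filehandle:
        sliced = [line[:-1] for line in filehandle]

    # rstrip('\n') only removes a linebreak if one is present
    with open(path, 'r') as filehandle:
        stripped = [line.rstrip('\n') for line in filehandle]

print(sliced)    # the last entry has lost its final letter
print(stripped)  # identical to the original list
```

Files written by other tools often lack the final linebreak, which is the case this sketch simulates.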

Using the writelines and readlines Methods

As mentioned at the beginning of this article, Python also provides the two methods writelines() and readlines() to write and read multiple lines in one step, respectively. Note that writelines() does not add linebreaks itself, so each item has to carry its own. To write the entire list to a file on disk, the Python code is as follows:
# define list of places
places_list = ['Berlin', 'Cape Town', 'Sydney', 'Moscow']

with open('listfile.txt', 'w') as filehandle:  
    filehandle.writelines("%s\n" % place for place in places_list)
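To see why the listing appends "\n" to every item, the following sketch passes the bare list to writelines() and reads the result back. The temporary directory and the variable content are assumptions introduced for this illustration.

```python
import os
import tempfile

places_list = ['Berlin', 'Cape Town', 'Sydney', 'Moscow']

# work in a throwaway directory so the example leaves no files behind
with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, 'listfile.txt')

    # writelines() writes the strings exactly as given, without separators
    with open(path, 'w') as filehandle:
        filehandle.writelines(places_list)

    with open(path, 'r') as filehandle:
        content = filehandle.read()

print(content)  # all place names run together on a single line
```

Without the linebreaks the item boundaries are lost, so the list could not be reconstructed from the file.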
To read the entire list from a file on disk, the Python code is as follows:
# define empty list
places = []

# open file and read the content in a list
with open('listfile.txt', 'r') as filehandle:  
    filecontents = filehandle.readlines()

    for line in filecontents:
        # remove linebreak which is the last character of the string
        current_place = line[:-1]

        # add item to the list
        places.append(current_place)
The listing above follows a more traditional approach borrowed from other programming languages. To write it in a more Pythonic way, have a look at the code below:
# define empty list
places = []

# open file and read the content in a list
with open('listfile.txt', 'r') as filehandle:  
    places = [current_place.rstrip() for current_place in filehandle.readlines()]
Having opened the file listfile.txt in line 5, re-establishing the list takes place entirely in line 6. Firstly, the file content is read via readlines(). Secondly, a list comprehension loops over the lines and removes the linebreak from each one using the rstrip() method. Thirdly, the resulting strings are collected into the new list of places. Note that rstrip() without arguments removes all trailing whitespace, not only the linebreak; use rstrip('\n') if trailing spaces need to be preserved. Compared with the previous listing the code is much more compact, but may be more difficult to read for beginner Python programmers.
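As a possible alternative, the whole file can be read with read() and split into lines with the string method splitlines(), which drops the linebreaks in the same step. This is only a sketch; the temporary directory is an assumption made to keep the example self-contained.

```python
import os
import tempfile

places_list = ['Berlin', 'Cape Town', 'Sydney', 'Moscow']

# work in a throwaway directory so the example leaves no files behind
with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, 'listfile.txt')

    # write one place per line, as in the listings above
    with open(path, 'w') as filehandle:
        filehandle.writelines('%s\n' % place for place in places_list)

    # read the entire file and split it at the linebreaks
    with open(path, 'r') as filehandle:
        places = filehandle.read().splitlines()

print(places)
```

Unlike rstrip() without arguments, splitlines() keeps trailing spaces inside a line. It does, however, load the whole file into memory at once, so iterating over the file handle may be preferable for very large files.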

Using the pickle Module

The methods explained so far store the list in a way that humans can still read. If this is not needed, the pickle module may come in handy. Its dump() method stores the list efficiently as a binary data stream. Firstly, in line 7 (in the code below) the output file listfile.data is opened for binary writing ("wb"). Secondly, in line 9 the list is stored in the opened file using the dump() method.
# load additional module
import pickle

# define a list of places
placesList = ['Berlin', 'Cape Town', 'Sydney', 'Moscow']

with open('listfile.data', 'wb') as filehandle:  
    # store the data as binary data stream
    pickle.dump(placesList, filehandle)
As the next step we read the list back from the file as follows. Firstly, the file listfile.data is opened for binary reading ("rb") in line 4. Secondly, the list of places is loaded from the file using the load() method.
# load additional module
import pickle

with open('listfile.data', 'rb') as filehandle:  
    # read the data as binary data stream
    placesList = pickle.load(filehandle)
The two examples here use a list of strings. However, pickle works with most kinds of Python objects, such as strings, numbers, instances of self-defined classes, and the built-in data structures Python provides. Since loading a pickle file can execute arbitrary code, only unpickle data from sources you trust.
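The following sketch stores a nested structure instead of a flat list of strings. The dictionary record and its keys are purely illustrative assumptions, as is the temporary directory.

```python
import os
import pickle
import tempfile

# illustrative record mixing several built-in types (values are made up)
record = {
    'id': 1,
    'name': 'Cape Town',
    'score': 4.6,
    'neighbours': ('Berlin', 'Sydney'),  # a tuple
}

# work in a throwaway directory so the example leaves no files behind
with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, 'listfile.data')

    # store the dictionary as a binary data stream
    with open(path, 'wb') as filehandle:
        pickle.dump(record, filehandle)

    # load it back into a new object
    with open(path, 'rb') as filehandle:
        restored = pickle.load(filehandle)

print(restored == record)            # the content survives the round trip
print(type(restored['neighbours']))  # the tuple is still a tuple
```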

Using the JSON Format

The binary data format pickle uses is specific to Python. To improve the interoperability between different programs, the JavaScript Object Notation (JSON) provides an easy-to-use and human-readable format, and has thus become very popular.
The following example demonstrates how to write a list of values of mixed types to an output file using the json module. In line 4 the basic list is defined. Having opened the output file for writing in line 7, the dump() method stores the basic list in the file using the JSON notation.
import json

# define list with values
basicList = [1, "Cape Town", 4.6]

# open output file for writing
with open('listfile.txt', 'w') as filehandle:  
    json.dump(basicList, filehandle)
Reading the contents of the output file back into memory is as simple as writing the data. The corresponding method to dump() is named load(), and works as follows:
import json

# open output file for reading
with open('listfile.txt', 'r') as filehandle:  
    basicList = json.load(filehandle)
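One point worth checking is which Python types survive the JSON round trip. The sketch below uses dumps() and loads(), the string-based counterparts of dump() and load(), so that no file is needed; the list basic_list and its extra tuple element are assumptions added for illustration.

```python
import json

# the list from above, extended by a tuple for illustration
basic_list = [1, "Cape Town", 4.6, ("Berlin", "Sydney")]

# encode to a JSON string, then decode it again
encoded = json.dumps(basic_list)
decoded = json.loads(encoded)

print(encoded)
print(decoded)

# integers, strings, and floats keep their types ...
print([type(item).__name__ for item in decoded[:3]])
# ... but JSON has no tuple type, so the tuple comes back as a list
print(type(decoded[3]).__name__)
```

If the distinction between tuples and lists matters, it has to be restored by the reading program, or pickle can be used instead.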

Conclusion

The methods shown above range from simply writing and reading lines of text to dumping and loading data with pickle, which uses a binary stream, and JSON, which uses a text format. They make it straightforward to store a list persistently and to read it back into memory.

Acknowledgements

The author would like to thank Zoleka Hatitongwe for her support while preparing the article.