BYU Computer Science

Lab 9 — Parsing files

In this lab you will practice parsing files. Download the lab 9 files to get started.


Write a function called create_pylib(filename). This function reads a PyLib template (like a Madlib, but completed with help from you and your Python skills). It takes one parameter:

  • filename: the name of a file with a PyLib template

The function reads this file and prints out a completed story! For each line in the file, the function replaces the bracketed category (like [noun]) with a word from that category. We’ve provided you with a get_word(category) function that takes a category and returns a random word from that category. There will be exactly one bracketed category per line. Print out the completed PyLib line by line.

Here is an example story.txt:

When covid ends the first place I am traveling to is [location].
I cannot wait to [verb] when I get there!
I will make sure to bring [noun].
Until then, I will [adverb] await the trip.
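
The bracket replacement described above can be sketched as follows. This is only one possible approach, not the official solution. The get_word() defined here is a hypothetical stand-in with made-up word lists, since the real one ships with the lab files, and fill_line() is a helper name invented for this sketch.

```python
import random


def get_word(category):
    """Hypothetical stand-in for the get_word() provided with the lab files."""
    words = {
        "location": ["Provo", "Paris"],
        "verb": ["dance", "nap"],
        "noun": ["a towel", "a kazoo"],
        "adverb": ["patiently", "loudly"],
    }
    # Fall back to the category name itself if it is not in the sample lists.
    return random.choice(words.get(category, [category]))


def fill_line(line):
    """Replace the single [category] in line with a word from that category."""
    start = line.find("[")
    end = line.find("]", start)
    if start == -1 or end == -1:
        return line  # no bracketed category on this line
    category = line[start + 1:end]
    return line[:start] + get_word(category) + line[end + 1:]


def create_pylib(filename):
    """Print the completed story, one line at a time."""
    with open(filename) as file:
        for line in file:
            # Strip the trailing newline so print() does not double-space.
            print(fill_line(line.rstrip("\n")))
```

Splitting the work into a per-line helper keeps the file loop short and makes the replacement logic easy to try out on a single string.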

Counting Orders

In this problem, you’re going to explore how your new string parsing skills can help reveal insights into large amounts of data: in this case, stock data for Gamestop trades. You’ll also consider how these sorts of techniques can be used unfairly to limit equal access to financial resources.

File contents

In this problem, you are going to be working with files that have trading data in them. Each line of the file is a trade, consisting of the following:

  • a trade ID
  • stock symbol
  • trade type — located between || and |
  • number of shares transacted — located between $$ and $
  • the entity conducting the transaction — located between && and &

The only two trade types are “BUY” and “SELL.” Here is an example of some lines from a file with trades:

4965 GME ||SELL| $$8$ &&KAREL_CO&
2725 GME ||SELL| $$13$ &&KAREL_CO&
9543 GME ||SELL| $$4$ &&J_DOE&
8390 GME ||BUY| $$3$ &&KAREL_CO&
9114 GME ||SELL| $$5$ &&NEMO&

Find a value

First, write a helper function called find_value(line, start, end) that returns the text located between the start and end markers, such as a trade type or a number of shares. For example:

>>> find_value('4965 GME ||SELL| $$8$ &&KAREL_CO&', '||', '|')
'SELL'
>>> find_value('4965 GME ||SELL| $$8$ &&KAREL_CO&', '$$', '$')
'8'

You should use s.find() to get the left and right indexes, then use a slice to get the value, as we have done in lecture.
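
A minimal sketch of the find-and-slice approach, assuming both markers always appear in the line (no error handling is attempted). The variable names left and right are choices made for this sketch.

```python
def find_value(line, start, end):
    """Return the text in line between the start marker and the end marker."""
    # Index of the first character after the start marker.
    left = line.find(start) + len(start)
    # Search for the end marker only after the start marker,
    # so the '|' inside '||' is not mistaken for the end.
    right = line.find(end, left)
    return line[left:right]
```

Passing left as the second argument to find() matters here because each end marker is also a piece of its start marker.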

Total trades

Next, write a function called total_trades(filename). It takes one parameter:

  • filename: the name of a file containing trades for the NYSE stock Gamestop (GME)

The function determines the total number of shares bought and sold.

To parse the trades, loop through every line in the file, and get the trade type and the number of shares. You can use your find_value() function to get these values. Then use the accumulator pattern to total the number of shares sold and the number of shares bought.

Reminder: values read from a file are strings, so convert them to integers with int() before summing them.

For example:

>>> total_trades('gamestop_trades.txt')
3 shares bought
30 shares sold
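
The accumulator pattern described above might look like the sketch below. It is one possible approach under stated assumptions: find_value() is repeated so the block runs on its own, and count_shares() is a helper name invented here so the counting can be tried on a list of lines without a file.

```python
def find_value(line, start, end):
    """Return the text in line between the start and end markers."""
    left = line.find(start) + len(start)
    right = line.find(end, left)
    return line[left:right]


def count_shares(lines):
    """Return (bought, sold) share totals for an iterable of trade lines."""
    bought = 0
    sold = 0
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        trade_type = find_value(line, '||', '|')
        shares = int(find_value(line, '$$', '$'))  # string to integer
        if trade_type == 'BUY':
            bought += shares
        elif trade_type == 'SELL':
            sold += shares
    return bought, sold


def total_trades(filename):
    """Print the total number of shares bought and sold in the file."""
    with open(filename) as file:
        bought, sold = count_shares(file)
    print(f'{bought} shares bought')
    print(f'{sold} shares sold')
```

For the five example trades shown earlier, the totals are 3 shares bought and 8 + 13 + 4 + 5 = 30 shares sold.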

Market Manipulators

Often, when a single entity or group of entities decides to sell a large amount of stock at once, the price of that stock falls. With the rise in automated trading, concerns have arisen that large groups could take advantage of this trend and dump stock to deliberately tank its price.

Write a function called find_percent_seller(filename, entity_name). It takes two parameters:

  • filename: the name of a file containing trades for the NYSE stock Gamestop (GME)
  • entity_name: the name of an entity in this data set

The function calculates what percentage of all shares sold were sold by the named entity.

You will find it helpful to re-use the helper function suggested in the previous section.

For example:

>>> find_percent_seller('gamestop_trades.txt', 'KAREL_CO')
KAREL_CO sold 70 percent of GME stock sold
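
A sketch of the percentage calculation, reusing the same helper. The name percent_sold_by() is invented for this sketch, and two details are assumptions rather than requirements from the lab: the result is rounded to a whole number before printing, and a file with no SELL trades reports 0 percent.

```python
def find_value(line, start, end):
    """Return the text in line between the start and end markers."""
    left = line.find(start) + len(start)
    right = line.find(end, left)
    return line[left:right]


def percent_sold_by(lines, entity_name):
    """Return the percentage of all sold shares that entity_name sold."""
    total_sold = 0
    entity_sold = 0
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        if find_value(line, '||', '|') != 'SELL':
            continue  # only SELL trades count toward either total
        shares = int(find_value(line, '$$', '$'))
        total_sold += shares
        if find_value(line, '&&', '&') == entity_name:
            entity_sold += shares
    if total_sold == 0:
        return 0.0  # assumption: avoid dividing by zero when nothing was sold
    return 100 * entity_sold / total_sold


def find_percent_seller(filename, entity_name):
    """Print the percentage of GME stock sold by entity_name."""
    with open(filename) as file:
        percent = percent_sold_by(file, entity_name)
    # Assumption: round to the nearest whole percent for display.
    print(f'{entity_name} sold {round(percent)} percent of GME stock sold')
```

With the five example trades, KAREL_CO sold 8 + 13 = 21 of the 30 shares sold, which is 70 percent.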

Market Manipulators Ethical Discussion

With the rise in automated trading, the gap between the capabilities of large corporations and individual retail investors in access to the financial markets has grown considerably. The distance to trading centers, computational power, and priority access all have significant impacts on trading ability within the world of automated trading. In what ways has automation widened this gap? In what ways can the same automation that has grown the gap help narrow it (think about what you just implemented!)? And if the gap is a problem, who is responsible for addressing it?


What we want you to get from this lab:

  • You can parse the lines in a file, one line at a time

  • You can convert between string and integer types as needed

  • You are comfortable using the accumulator pattern when writing a function

  • You can clearly document your functions with docstrings

  • You can figure out what went wrong when something unexpected happens

  • Hopefully you had fun!


Turn in a zip file that has your code.

Every function should have a docstring, and the functions in the second and third problems should also have doctests.
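
As a reminder of the general shape, here is a small function with a docstring and doctests. The function shares_as_int() is hypothetical and not part of the lab; it only illustrates the format.

```python
import doctest


def shares_as_int(text):
    """Convert a share count read from a file into an integer.

    >>> shares_as_int('13')
    13
    >>> shares_as_int(' 8 ') + shares_as_int('5')
    13
    """
    return int(text)


if __name__ == '__main__':
    # Runs every >>> example in this file and reports any mismatches.
    doctest.testmod()
```

Doctests can also be run from the command line with python -m doctest followed by the file name; adding -v prints each example as it is checked.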

  • Py-libs: Your solution works (2 points)
  • Counting orders: Your solution works (2 points)
  • Market manipulators: Your solution works (2 points)
  • Documentation: All functions have good docstrings (2 points)
  • Doctests: The find_value(), total_trades(), and find_percent_seller() functions have doctests for useful test cases (2 points)


This lab is based on one offered in CS 106A at Stanford University.