Lab 9 — Parsing files
In this lab you will practice parsing files. Download the lab 9 files to get started.
Py-Libs
Write a function in pylib.py called create_pylib(filename). This function
reads a PyLib (like a Madlib but with help from you and your Python skills).
This function takes one parameter:
- filename: the name of a file with a PyLib template
The function reads this file and prints out a completed story! For each line in
the file, the function replaces the bracketed category (like [noun]) with a
word from that category. We've provided you with a get_word(category) function
that, given a category, gives you back a random word of that category. There will
be exactly one bracketed category per line. Print out the completed PyLib line
by line.
Here is an example story.txt:
When covid ends the first place I am traveling to is [location].
I cannot wait to [verb] when I get there!
I will make sure to bring [noun].
Until then, I will [adverb] await the trip.
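One possible way to approach the replacement step is sketched below. The get_word() here is only a hypothetical stand-in with made-up word lists so the sketch runs on its own; the real get_word() ships with the lab files. The helper name fill_line() is also an assumption, not something the lab requires.

```python
import random

# Hypothetical stand-in for the provided get_word(); the real word lists
# come with the lab files and will differ.
SAMPLE_WORDS = {
    "location": ["Paris", "the moon"],
    "verb": ["dance", "nap"],
    "noun": ["a towel", "snacks"],
    "adverb": ["patiently", "eagerly"],
}


def get_word(category):
    """Return a random word from the given category (stand-in version)."""
    return random.choice(SAMPLE_WORDS[category])


def fill_line(line):
    """Replace the single [category] in line with a random word."""
    left = line.find("[")
    right = line.find("]", left + 1)
    if left == -1 or right == -1:
        return line  # no bracketed category on this line
    category = line[left + 1:right]
    # Keep everything before "[" and after "]", and drop the brackets.
    return line[:left] + get_word(category) + line[right + 1:]


def create_pylib(filename):
    """Print the completed story, one line at a time."""
    with open(filename) as f:
        for line in f:
            print(fill_line(line.rstrip("\n")))
```

Splitting the work into a per-line helper keeps the file loop short and makes the replacement logic easy to test without a file.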
Counting Orders
In this problem, you’re going to explore how your new string parsing skills can help reveal insights into large amounts of data: in this case, stock data for Gamestop trades. You’ll also consider how these sorts of techniques can be used unfairly to limit equal access to financial resources.
File contents
In this problem, you are going to be working with files that have trading data in them. Each line of the file is a trade, consisting of the following:
- a trade ID
- a stock symbol
- the trade type, located between || and |
- the number of shares transacted, located between $$ and $
- the entity conducting the transaction, located between && and &
The only two trade types are “BUY” and “SELL.” Here is an example of some lines from a file with trades:
4965 GME ||SELL| $$8$ &&KAREL_CO&
2725 GME ||SELL| $$13$ &&KAREL_CO&
9543 GME ||SELL| $$4$ &&J_DOE&
8390 GME ||BUY| $$3$ &&KAREL_CO&
9114 GME ||SELL| $$5$ &&NEMO&
Find a value
First, write a helper function in trades.py called find_value(line, start, end)
that returns the text between the start and end markers on a line, such as a
trade type or number of shares. For example:
>>> find_value('4965 GME ||SELL| $$8$ &&KAREL_CO&', '||', '|')
'SELL'
>>> find_value('4965 GME ||SELL| $$8$ &&KAREL_CO&', '$$', '$')
'8'
You should use s.find() to get the left and right indexes and a slice to get
the value, like we have done in lecture.
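The find-then-slice idea can be sketched as follows. This is one possible shape, not the required solution. Returning None when a marker is missing is an assumption made here for safety, since the lab does not say what should happen in that case.

```python
def find_value(line, start, end):
    """Return the text between the start and end markers in line.

    Returns None if either marker is missing (an assumption; the lab
    does not specify this case).
    """
    left = line.find(start)
    if left == -1:
        return None
    left += len(start)  # move past the start marker itself
    # Search for the end marker only after the start marker, so that
    # the '|' inside '||' is not mistaken for the closing '|'.
    right = line.find(end, left)
    if right == -1:
        return None
    return line[left:right]


trade = '4965 GME ||SELL| $$8$ &&KAREL_CO&'
print(find_value(trade, '||', '|'))
print(find_value(trade, '$$', '$'))
```

Passing the second argument to find() matters here because each end marker is also a prefix of its start marker.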
Total trades
Next, write a function in trades.py called total_trades(filename). It takes
one parameter:
- filename: the name of a file containing trades for the NYSE stock Gamestop (GME)
The function determines the total number of shares bought and sold.
To parse the trades, loop through every line in the file, and get the trade type
and the number of shares. You can use your find_value() function to get these
values. Then use the accumulator pattern to total the number of shares sold and
the number of shares bought.
Reminder: you will need to convert string values to integers before you can sum them.
For example:
>>> total_trades('gamestop_trades.txt')
3 shares bought
30 shares sold
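The accumulator pattern described above might look like the sketch below. The count_shares() helper is a hypothetical name introduced here so the counting logic can be tried on a list of lines without a file. A compact find_value() is repeated so the block runs on its own. Skipping blank lines is an assumption.

```python
def find_value(line, start, end):
    """Return the text between the start and end markers in line."""
    left = line.find(start) + len(start)
    right = line.find(end, left)
    return line[left:right]


def count_shares(lines):
    """Return (bought, sold) share totals for an iterable of trade lines."""
    bought = 0  # accumulator for BUY trades
    sold = 0    # accumulator for SELL trades
    for line in lines:
        if not line.strip():
            continue  # assumption: ignore blank lines
        trade_type = find_value(line, "||", "|")
        shares = int(find_value(line, "$$", "$"))  # string -> integer
        if trade_type == "BUY":
            bought += shares
        elif trade_type == "SELL":
            sold += shares
    return bought, sold


def total_trades(filename):
    """Print the total number of shares bought and sold in the file."""
    with open(filename) as f:
        bought, sold = count_shares(f)
    print(f"{bought} shares bought")
    print(f"{sold} shares sold")
```

Applied to the five example lines shown earlier, count_shares() gives 3 shares bought and 30 shares sold, matching the expected output.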
Market Manipulators
Often, when a single entity or group of entities decides to sell a large amount of stock at once, the price of that stock falls. With the rise in automated trading, concerns have grown that large groups could take advantage of this trend and dump stock to deliberately tank a stock price.
Write a function in trades.py called find_percent_seller(filename, entity_name).
It takes two parameters:
- filename: the name of a file containing trades for the NYSE stock Gamestop (GME)
- entity_name: the name of an entity in this data set
The function calculates what percentage of all shares sold were sold by the named entity, and prints the result.
You will find it helpful to re-use the find_value() helper function from the Find a value section.
For example:
>>> find_percent_seller('gamestop_trades.txt', 'KAREL_CO')
KAREL_CO sold 70 percent of GME stock sold
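A sketch of the percentage calculation follows, again working on a list of lines so it can be checked without a file. The helper name percent_sold_by(), the choice to return 0 when nothing was sold, and rounding to a whole number when printing are all assumptions. A compact find_value() is repeated so the block runs on its own.

```python
def find_value(line, start, end):
    """Return the text between the start and end markers in line."""
    left = line.find(start) + len(start)
    right = line.find(end, left)
    return line[left:right]


def percent_sold_by(lines, entity_name):
    """Return the percent of all sold shares that entity_name sold."""
    total_sold = 0
    entity_sold = 0
    for line in lines:
        if not line.strip():
            continue  # assumption: ignore blank lines
        if find_value(line, "||", "|") != "SELL":
            continue  # only SELL trades count toward either total
        shares = int(find_value(line, "$$", "$"))
        total_sold += shares
        if find_value(line, "&&", "&") == entity_name:
            entity_sold += shares
    if total_sold == 0:
        return 0  # assumption: avoid dividing by zero
    return 100 * entity_sold / total_sold


def find_percent_seller(filename, entity_name):
    """Print the share of GME sales made by entity_name."""
    with open(filename) as f:
        percent = percent_sold_by(f, entity_name)
    # Rounding to a whole number is an assumption based on the example.
    print(f"{entity_name} sold {round(percent)} percent of GME stock sold")
```

For the five example lines shown earlier, KAREL_CO sold 8 + 13 = 21 of the 30 shares sold, which is 70 percent.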
Market Manipulators Ethical Discussion
With the rise in automated trading, the gap between the capabilities of large corporations and individual retail investors in access to the financial markets has grown considerably. The distance to trading centers, computational power, and priority access all have significant impacts on trading ability within the world of automated trading. In what ways has automation widened this gap? In what ways can the same automation that has grown the gap help narrow it (think about what you just implemented!)? And if the gap is a problem, who is responsible for addressing it?
Additional reading:
- The World of High-Frequency Algorithmic Trading
- Man Vs. Machine: Pros and Cons of High-Speed Trading
- Has High Frequency Trading Ruined the Stock Market for the Rest of Us?
- Market manipulation on Wikipedia
- High frequency trading on Wikipedia
- Information Inequality: How High Frequency Traders Use Premier Access To Information To Prey on Institutional Investors
- Has Regulation Affected the High Frequency Trading Market?
Lessons
What we want you to get from this lab:
- You can parse the lines in a file, one line at a time
- You can convert between string and integer types as needed
- You are comfortable using the accumulator pattern when writing a function
- You can clearly document your functions with docstrings
- You can figure out what went wrong when something unexpected happens
- Hopefully you had fun!
Points
Turn in a zip file that has your code.
Every function should have a docstring and the second and third problems should have doctests.
Task | Description | Points |
---|---|---|
Py-libs | Your solution works | 2 |
Counting orders | Your solution works | 2 |
Market manipulators | Your solution works | 2 |
Documentation | All functions have good docstrings | 2 |
Doctests | find_value(), total_trades(), and find_percent_seller() functions have doctests for useful test cases | 2 |
Credits
This lab is based on one offered in CS 106A at Stanford University.