BYU Computer Science

Lab 23 - Write a File-processing Script

For this lab, you will write a script that processes files.

The script iterates through each word in the specified file. If a word matches one of the known words, it is replaced with the corresponding emoji.

A file containing known words and their corresponding emojis should be provided by the user.

Requirements

  • The script should be named add_emojis.py
  • Require an option named --file that takes a file to be processed
    • The text will be read from this file
  • Require an option named --output that specifies the output file
    • The processed text will be written to this file
  • Require an option named --emojis that takes a file containing word: emoji pairs
    • The emoji file should be in .json format. See an example below.
  • The casing of each word should be ignored when matching, but if a word is not replaced with an emoji, it should be written to the output with its original casing.
    • e.g. “Dog”, “DOG”, and “dog” should all match “dog”.
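One way to meet the casing requirement is to lowercase a word only for the lookup and fall back to the untouched word on a miss. The sketch below is a minimal illustration, not the required solution; the function name replace_word is hypothetical, and it assumes the keys in the emoji file are already lowercase, as in the example file.

```python
def replace_word(word, emojis):
    """Return the emoji for word, or word itself if there is no match."""
    # Lowercase only for the lookup, so a word that is not replaced
    # keeps its original casing.
    return emojis.get(word.lower(), word)


# Same mapping as the example emoji file in this lab
emojis = {"dog": "🐶", "cat": "🐱", "bird": "🐦"}
```

With this mapping, replace_word("DOG", emojis) gives the dog emoji, while replace_word("Horse", emojis) returns "Horse" unchanged.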

Example of a required argument

An argument is required if you specify required=True.

import argparse
parser = argparse.ArgumentParser()
parser.add_argument(
    "--emojis",
    type=str,
    help="The emoji mapping file (json format)",
    required=True
    )
args = parser.parse_args()

If the argument isn’t required, you should add a default. Otherwise the value will be None when the option is left off the command line.

import argparse
parser = argparse.ArgumentParser()
parser.add_argument(
    "--emojis",
    type=str,
    default="emojis.json",
    help="The emoji mapping file (json format)"
    )
args = parser.parse_args()
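Applying the required=True pattern to all three options in the requirements might look like the sketch below. The help strings and the file names story.txt and story_emojis.txt are made up for illustration. A list is passed to parse_args() only so the example runs without a real command line.

```python
import argparse

parser = argparse.ArgumentParser(description="Replace known words with emojis")
parser.add_argument("--file", type=str, required=True,
                    help="The text file to process")
parser.add_argument("--output", type=str, required=True,
                    help="Where to write the processed text")
parser.add_argument("--emojis", type=str, required=True,
                    help="The emoji mapping file (json format)")

# Passing a list stands in for the real command line here; the actual
# script would call parser.parse_args() with no arguments.
args = parser.parse_args(
    ["--file", "story.txt", "--output", "story_emojis.txt", "--emojis", "emojis.json"]
)
print(args.file, args.output, args.emojis)
```

Each value is then available as an attribute named after the option: args.file, args.output, and args.emojis.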

Example emoji file

{
  "dog": "🐶",
  "cat": "🐱",
  "bird": "🐦"
}

How to load a JSON file in Python

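import json
filename = "emojis.json"
# Emojis are not ASCII, so state the encoding instead of relying on the platform default
with open(filename, encoding="utf-8") as file:
    data = json.load(file)  # data is a dict such as {"dog": "🐶", ...}
print(data)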
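Reading the input text and writing the processed text follow the same open() pattern. The sketch below is one possible shape, with hypothetical helper names read_text and write_text; it assumes the files are UTF-8. The temporary directory is there only so the example can run without touching your own files.

```python
import os
import tempfile


def read_text(path):
    """Return the entire contents of a UTF-8 text file."""
    with open(path, encoding="utf-8") as file:
        return file.read()


def write_text(path, text):
    """Write text to a file as UTF-8, replacing any existing contents."""
    with open(path, "w", encoding="utf-8") as file:
        file.write(text)


# Round trip through a temporary directory to show the helpers working.
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "output.txt")
    write_text(path, "I have a 🐶\n")
    round_trip = read_text(path)
```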

Edge cases to watch out for

  • Make sure you only replace a word if the whole word matches.
    • If you were using the example emoji file above, you shouldn’t turn “birds” into “🐦s” or “scatter” into “s🐱ter”.
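The edge case above usually shows up when str.replace() is applied to the whole text, because it matches anywhere inside a word. A minimal sketch of one alternative is to split the text into whitespace-separated pieces and compare each piece as a whole. The function name add_emojis is hypothetical, and this version does not handle punctuation.

```python
import re


def add_emojis(text, emojis):
    """Replace whole words that appear in emojis; leave everything else alone."""
    # Split on runs of whitespace. The capturing group keeps the whitespace
    # pieces in the result, so spacing and line breaks survive the round trip.
    pieces = re.split(r"(\s+)", text)
    # A piece is replaced only if the entire piece (lowercased) is a key.
    return "".join(emojis.get(piece.lower(), piece) for piece in pieces)


emojis = {"dog": "🐶", "cat": "🐱", "bird": "🐦"}

# str.replace() matches inside words, which is the behavior to avoid.
naive = "scatter".replace("cat", "🐱")

result = add_emojis("The birds scatter when my Cat sees a BIRD\n", emojis)
```

Here naive ends up as “s🐱ter”, while result leaves “birds” and “scatter” untouched and replaces only “Cat” and “BIRD”.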

Extra Practice

If you finish the above requirements and would like more practice, take a stab at the following features, or add some of your own!

Just be sure to copy add_emojis.py to more_emojis.py before adding more features.

If you do the extra practice, turn in both add_emojis.py and more_emojis.py.

  • Handle punctuation

    • If a word ends with punctuation, process only the alphanumeric part at the beginning.
    • e.g. For the example emoji mapping above, “I have a bird.” will become “I have a 🐦.”
  • Support an option named --word that indicates which word should be substituted

    • If --word is specified, only that word will be replaced with its emoji.
    • If the specified word is not in the emojis file, the program should not crash, and the output file should still be written as expected.
    • If --word is not specified, ALL the emojis should be processed.
  • Support an option named --extra that takes a string of the format word:emoji (e.g. “banana:🍌”).

    • Parse the word and emoji from this option and include them in the emoji dictionary when processing the file.
  • Support as many instances of --extra as the user wants.

    • e.g. --extra "banana:🍌" --extra "bananas:🍌🍌" --extra "duck:🦆"
    • Use add_argument("--extra", ..., action="append") to support multiple instances of that argument
  • Make the --output argument optional

    • If --output is not specified, print the processed results to STDOUT
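Two of the features above can be sketched briefly: separating trailing punctuation from a word, and collecting repeated --extra options. This is one possible approach under stated assumptions, not the expected solution. The helper names split_word and parse_extras are hypothetical, and lowercasing the extra words is an assumption made so they match the case-insensitive lookup.

```python
import argparse


def split_word(token):
    """Split a token into its leading alphanumeric part and the rest."""
    end = 0
    while end < len(token) and token[end].isalnum():
        end += 1
    return token[:end], token[end:]


def parse_extras(extras):
    """Turn a list like ["banana:🍌"] into {"banana": "🍌"}."""
    mapping = {}
    # With action="append", argparse leaves the value as None
    # when --extra is never given, so fall back to an empty list.
    for item in extras or []:
        # Split on the first colon only; an item with no colon raises
        # ValueError here, which a fuller version would report to the user.
        word, emoji = item.split(":", 1)
        mapping[word.lower()] = emoji
    return mapping


parser = argparse.ArgumentParser()
parser.add_argument("--extra", action="append", help="An extra word:emoji pair")

# A list stands in for the real command line in this example.
args = parser.parse_args(["--extra", "banana:🍌", "--extra", "duck:🦆"])
extras = parse_extras(args.extra)
```

For “bird.”, split_word returns the pair (“bird”, “.”), so the word can be looked up on its own and the punctuation appended afterward. The extras dictionary can be merged into the one loaded from the emoji file, for example with the dict update() method.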

Lessons

What we want you to get from this lab:

  • You understand how to write an entire Python program from scratch

  • You can figure out what went wrong when something unexpected happens

  • Hopefully you had fun!

Points

Turn in add_emojis.py and (optionally) more_emojis.py (even if you implement only some of it).

Task                 Description          Points
Add Emojis           Your program works   10

Bonus Points
Handle punctuation   Your solution works  2
Word option          Your solution works  2
Extra option         Your solution works  2
Many extra options   Your solution works  2
Output is optional   Your solution works  2