BYU Computer Science

Types and Files

Types

  • every variable has a type
  • types are figured out for you by Python
student = "Emma"
age = 7
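
One way to check which type Python picked is the built-in type() function. A minimal sketch, reusing the two variables above:

```python
student = "Emma"
age = 7

# type() reports the type Python figured out for each value
print(type(student))  # <class 'str'>
print(type(age))      # <class 'int'>
```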

variable types

Operations depend on the variable type

age = 7
student = 'Emma'

print(age + age)
print(student + student)
    14
    EmmaEmma

If you think you have an integer but actually have a string…

age = '7'
print(age + age)
    77

If you have a string but wish you had an integer…

age = '7'
print(age + age)
    77

Reminder: convert the string to an integer using int()

age_converted_to_integer = int(age)
print(age_converted_to_integer + age_converted_to_integer)
    14
age = int(age)
print(age + age)
    14

If you have an integer but wish you had a string…

score = 10
output = "score: "
output += score
    ---------------------------------------------------------------------------

    TypeError                                 Traceback (most recent call last)

    /var/folders/9x/cb134v3d2nb22_rksynbspqm0000gn/T/ipykernel_12650/4122253352.py in <module>
          1 score = 10
          2 output = "score: "
    ----> 3 output += score


    TypeError: can only concatenate str (not "int") to str

Reminder: convert the integer to a string using str()

score = str(score)
output += score
print(output)
    score: 10
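
An f-string does the same conversion for you inside the braces. A small sketch comparing the two approaches (output_f is just an illustrative variable name):

```python
score = 10

# str() converts the integer so that + can concatenate it
output = "score: " + str(score)

# an f-string converts the value inside the braces to text
output_f = f"score: {score}"

print(output)    # score: 10
print(output_f)  # score: 10
```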

example: sum_digits(s)

  • sum all the digits in a string
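def sum_digits(s):
    # use the accumulator pattern
    # (named total so it does not hide Python's built-in sum function)
    total = 0
    # loop over all the characters, adding up the ones that are digits
    for i in range(len(s)):
        if s[i].isdigit():
            total += int(s[i])
    return total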


sum_digits('abc4d5e10')
    10

example: sum_contiguous_digits(s)

  • sum the numbers formed by runs of adjacent digits in a string
def sum_contiguous_digits(s):
    # use the accumulator pattern
    total = 0
    substring = ''
    # loop over all the characters
    for i in range(len(s)):
        if s[i].isdigit():
            substring += s[i]
            print(i,s[i],"this is a digit, so add it to substring")
        elif len(substring) > 0:
            print(i,s[i],"this is not a digit, so convert substring to an int", substring)
            total += int(substring)
            substring = ''
    # a run of digits at the very end of the string has not been added yet
    if len(substring) > 0:
        total += int(substring)
    return total


sum_contiguous_digits('abc45e10')
    3 4 this is a digit, so add it to substring
    4 5 this is a digit, so add it to substring
    5 e this is not a digit, so convert substring to an int 45
    6 1 this is a digit, so add it to substring
    7 0 this is a digit, so add it to substring

    55
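
Why the extra check after the loop? A sketch of what could happen without it; sum_without_final_check is a hypothetical variant written only for this illustration, with the print statements left out:

```python
def sum_without_final_check(s):
    # hypothetical variant: same loop as above, but no check after the loop
    total = 0
    substring = ''
    for i in range(len(s)):
        if s[i].isdigit():
            substring += s[i]
        elif len(substring) > 0:
            total += int(substring)
            substring = ''
    return total


print(sum_without_final_check('abc45e10'))   # 45, the trailing 10 is lost
print(sum_without_final_check('abc45e10x'))  # 55, the x triggers the conversion
```

A substring is only converted when a non-digit follows it, so digits at the very end of the string need the extra check.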

for character in

  • all this time we have been using
for i in range(len(s)):
    # do something with the character s[i]
  • we could be using
for character in s:
    # do something with character
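
Both loops visit the same characters in the same order. A small sketch to confirm this (by_index and by_character are illustrative names):

```python
s = 'Emma'

# loop using an index
by_index = ''
for i in range(len(s)):
    by_index += s[i] + '-'

# loop directly over the characters
by_character = ''
for character in s:
    by_character += character + '-'

print(by_index)      # E-m-m-a-
print(by_character)  # E-m-m-a-
```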

example: sum_digits(s) — using for character in

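def sum_digits(s):
    # use the accumulator pattern
    # (named total so it does not hide Python's built-in sum function)
    total = 0
    # loop over all the characters, adding up the ones that are digits
    for character in s:
        if character.isdigit():
            total += int(character)
    return total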


sum_digits('abc4d5e10')
    10

example: right_left(s)

  • take a string, cut it in half (right half and left half)

  • make a new string that has “rightrightleftleft”

  • ‘aabb’ -> ‘bbbbaaaa’

finding the middle of a string

s = 'aabb'
middle = len(s) / 2
print(middle)
s[middle]
    2.0

    ---------------------------------------------------------------------------

    TypeError                                 Traceback (most recent call last)

    /var/folders/9x/cb134v3d2nb22_rksynbspqm0000gn/T/ipykernel_12650/1260919677.py in <module>
          2 middle = len(s) / 2
          3 print(middle)
    ----> 4 s[middle]
          5
          6


    TypeError: string indices must be integers

integer division

s = 'aabb'
middle = len(s) // 2
print(middle)
s[middle]
    2

    'b'
def right_left(s):
    middle = len(s) // 2
    left = s[:middle]
    right = s[middle:]
    return right + right + left + left

print(right_left('aabb'))
print(right_left('aabbb'))
    bbbbaaaa
    bbbbbbaaaa
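
What happens when the length is odd? A short sketch of the pieces for 'aabbb', using the same variable names as right_left:

```python
s = 'aabbb'

print(len(s) / 2)      # 2.5, a float, which cannot be used as an index

middle = len(s) // 2   # 5 // 2 is 2, an integer
left = s[:middle]      # everything before index 2
right = s[middle:]     # everything from index 2 on

print(middle)  # 2
print(left)    # aa
print(right)   # bbb
```

With integer division the right half gets the extra character.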

modular division (%): the remainder after integer division

23 % 10
    3
36 % 10
    6
17 % 5
    2
45 % 5
    0
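
Integer division and % fit together: one gives the whole part, the other gives what is left over. A small sketch (quotient and remainder are illustrative names):

```python
n = 23

quotient = n // 10    # how many whole tens fit in 23
remainder = n % 10    # what is left over

print(quotient)   # 2
print(remainder)  # 3

# putting the two pieces back together gives the original number
print(quotient * 10 + remainder == n)  # True

# % 2 tells you whether a number is even (0) or odd (1)
print(8 % 2)  # 0
print(7 % 2)  # 1
```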

example: crazy_str(s)

  • first letter lowercase, second uppercase, and then continue alternating lowercase and uppercase
def crazy_str(s):
    # accumulator pattern
    result = ''
    for i in range(len(s)):
        if i % 2 == 0:
            # even index: lowercase
            result += s[i].lower()
        else:
            # odd index: uppercase
            result += s[i].upper()
    return result

crazy_str('WHAT is going on?')
    'wHaT Is gOiNg oN?'

Files

  • each file is a sequence of lines

  • each line is a sequence of characters

  • every line ends with a newline character ‘\n’ (the same character you get by hitting Return or Enter on the keyboard)

  • create a file called hibye.txt:

Hi and↩️
Bye↩️

A file is just a long string

  • if you could read the entire file into a single string, it would look like this:
file_contents = "Hi and\nBye\n"
  • each \n represents a newline character

Actually, this is easy to do

# open a file for reading
file = open('hibye.txt')
# the variable "file" is a file, just like "bit" was a bit and "image" was an image
# file.read() reads the entire file and returns a string
contents = file.read()
# close the file when you are done with it
file.close()
contents
    'Hi and\nBye\n'
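
To try this without creating the file by hand, here is a self-contained sketch. It assumes you are happy to write hibye.txt into a temporary folder (made with the standard tempfile module) so that no existing file is touched; folder and path are illustrative names.

```python
import os
import tempfile

# assumption: use a temporary folder so no existing file is overwritten
folder = tempfile.mkdtemp()
path = os.path.join(folder, 'hibye.txt')

# write the two lines, each ending with a newline character
with open(path, 'w') as file:
    file.write("Hi and\nBye\n")

# read the entire file back into a single string
with open(path) as file:
    contents = file.read()

# repr() shows the newline characters instead of printing them
print(repr(contents))  # 'Hi and\nBye\n'
```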

Reading a file line by line

# in this block of code, set the variable 'file' to the file that is named 'hibye.txt'
with open('hibye.txt') as file:
    # loop over all lines in a file
    # looks a lot like "for pixel in image" and "for character in string"
    count = 0
    for line in file:
        # use line in here
        if count == 0:
            print(line)
        count += 1
    print(f"I read {count} lines")
    Hi and

    I read 2 lines

why are there extra newlines?

  • the first line is Hi and\n
  • when you print('Hi and\n'), you get a newline from the string, and then print adds an extra newline
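
A small sketch that makes the extra newline visible, using repr() to show the hidden character and end='' to stop print from adding its own:

```python
line = 'Hi and\n'

# repr() shows the newline character stored in the string
print(repr(line))

# newline from the string, plus the newline that print adds: a blank line appears
print(line)

# end='' tells print not to add anything, so only the string's newline is printed
print(line, end='')
```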

strip()

  • returns a copy of the string with leading and trailing whitespace characters (spaces, tabs, newlines) removed
  • one of the most helpful string methods you will use with files!
s = "Apples and\n"
print(s.strip())

word = " \t  bananas   \n "
print(word.strip())
    Apples and
    bananas

with open('hibye.txt') as file:
    for line in file:
        # remove whitespace
        print(line.strip())
    Hi and
    Bye
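
strip() only touches the two ends of a string; whitespace in the middle stays. A short sketch, which also shows the related rstrip() and lstrip() methods for trimming just one end:

```python
s = "  Apples and bananas \n"

# repr() makes the remaining whitespace visible
print(repr(s.strip()))   # 'Apples and bananas'
print(repr(s.rstrip()))  # '  Apples and bananas'
print(repr(s.lstrip()))  # 'Apples and bananas \n'
```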

carriage returns and newlines

  • macOS and Linux end lines with \n (the newline character)

  • Windows ends lines with \r\n (a carriage return character followed by a newline character)

  • newline is also called line feed

  • \r\n is called CRLF (carriage return, line feed)

  • carriage return goes back to the beginning of the current line; line feed moves to the next line

Python handles this for you

  • when you use for line in file, Python automatically recognizes both \n and \r\n and gives you each line ending in a plain \n
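
A sketch that checks this on any operating system. It assumes a temporary folder from the standard tempfile module and a made-up file name, windows.txt; the file is written as raw bytes so it really contains \r\n endings.

```python
import os
import tempfile

# assumption: a temporary folder and a hypothetical file name for this demo
folder = tempfile.mkdtemp()
path = os.path.join(folder, 'windows.txt')

# 'wb' writes raw bytes, so the \r\n endings are stored exactly as given
with open(path, 'wb') as file:
    file.write(b"Hi and\r\nBye\r\n")

# reading in the default text mode turns each \r\n into \n
seen = ''
with open(path) as file:
    for line in file:
        print(repr(line))
        seen += line
```

Each line printed ends in \n only, with no \r left over.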

remember crazy_str(s)?

with open('hibye.txt') as file:
    for line in file:
        # change the line into a crazy string
        crazy_line = crazy_str(line)
        # don't print anything at the end of the line
        print(crazy_line, end='')
    hI AnD
    bYe