References
References are an important topic that we’ve been using all along. Clearly understanding references will give you a better understanding of how Python programs work.
Variables are references to a value
When you create a variable, you are creating a reference to a value.
We are used to having multiple names for the same thing. A mom is also a daughter, possibly a sister, or an aunt. She may be called Elizabeth or Liz or Libby or Dr. Peterson.
If you ever went on a mission or met a missionary, this should be a familiar concept! We call a missionary Sister Peterson, not Elizabeth.
Setting a variable equal to another variable makes a second reference to the same value.
Changing the value means both references see the change
The variables fruits and basket are references to the same list. Changing fruits will change basket, and changing basket will change fruits.
fruits = ['apple', 'banana', 'pear']
basket = fruits
fruits[0] = 'strawberry'
print(f"Fruits: {fruits}")
print(f"Basket: {basket}")
print("\n")
basket.append('cherry')
print(f"Fruits: {fruits}")
print(f"Basket: {basket}")
Fruits: ['strawberry', 'banana', 'pear']
Basket: ['strawberry', 'banana', 'pear']
Fruits: ['strawberry', 'banana', 'pear', 'cherry']
Basket: ['strawberry', 'banana', 'pear', 'cherry']
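One way to check whether two names refer to the same list is the is operator, which compares identity rather than contents. The sketch below reuses the fruits example; the name other_basket is not part of the original example and is introduced here only for illustration.

```python
fruits = ['apple', 'banana', 'pear']
basket = fruits                 # second reference to the same list
other_basket = list(fruits)     # hypothetical name: a separate list with equal contents

print(basket is fruits)         # True: same object
print(other_basket is fruits)   # False: a different object
print(other_basket == fruits)   # True: the contents are equal (for now)

fruits.append('cherry')
print(basket)                   # ['apple', 'banana', 'pear', 'cherry']
print(other_basket)             # ['apple', 'banana', 'pear']
```

Only basket follows the change, because it is the only name that shares the list with fruits.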
Likewise, fruit_prices and sale_items are references to the same dictionary.
fruit_prices = {'banana': 0.5, 'apple': 0.75, 'pear': 1.00, 'peach': 1.50, 'apricot': 0.25, 'pineapple': 3.00}
sale_items = fruit_prices
banana = fruit_prices['banana']
sale_items['banana'] = 0.25
sale_items['apple'] = 0.5
sale_items['pear'] = 0.5
print(f"Sale items: {sale_items}")
print(f"Regularly-priced items: {fruit_prices}")
print(banana)
Sale items: {'banana': 0.25, 'apple': 0.5, 'pear': 0.5, 'peach': 1.5, 'apricot': 0.25, 'pineapple': 3.0}
Regularly-priced items: {'banana': 0.25, 'apple': 0.5, 'pear': 0.5, 'peach': 1.5, 'apricot': 0.25, 'pineapple': 3.0}
0.5
Here is another example. my_list points to the list [1, 2, 3] in the dictionary, and can still access that list even after the reference for a['first'] changes to 'hello'.
a = {'first': [1, 2, 3], 'second': [4, 5, 6]}
my_list = a['first']
my_list.append(10)
print(a)
a['first'] = "hello"
print(a)
print(my_list)
{'first': [1, 2, 3, 10], 'second': [4, 5, 6]}
{'first': 'hello', 'second': [4, 5, 6]}
[1, 2, 3, 10]
Which values work this way? It depends on whether the type of the value is mutable or immutable. If a type is mutable, I can change the value in place, and every reference to it sees the new contents.
Type | Shorthand | Category | Mutable? |
---|---|---|---|
String | str | Text | ❌ |
Integer | int | Numeric | ❌ |
Float | float | Numeric | ❌ |
Boolean | bool | Boolean | ❌ |
Tuple | tuple | Sequence | ❌ |
Range | range | Sequence | ❌ |
List | list | Sequence | ✔️ |
Dictionary | dict | Mapping | ✔️ |
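A short sketch of what the table means in practice, using made-up variable names (scores, point, name): a list can be changed in place, while trying to change a tuple or a string in place raises a TypeError.

```python
# Mutable: a list can be changed in place, and every reference sees the change.
scores = [90, 85]
same_scores = scores
scores[0] = 100
print(same_scores)          # [100, 85]

# Immutable: a tuple cannot be changed in place.
point = (1, 2)
try:
    point[0] = 10
except TypeError as error:
    print(f"Tuple: {error}")

# Immutable: a string cannot be changed in place either.
name = 'Emma'
try:
    name[0] = 'e'
except TypeError as error:
    print(f"String: {error}")
```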
Changing a variable means changing its reference
student = 'Emma'
sister = student
print(f'{student} and {sister}')
sister = 'Sarah'
print(f'{student} and {sister}')
Emma and Emma
Emma and Sarah
For any immutable type, the only way to change the variable is to have it reference something else.
Here is another example of changing a reference:
number = 100
a = number
print(f'{number} and {a}')
a = 50
print(f'{number} and {a}')
print('\n')
100 and 100
100 and 50
If I change the variable to point to an entirely different type, Python keeps track of this automatically.
number = 100
a = number
print(f'{number} and {a}')
print('\n')
a = 'hello'
print(f'{number} and {a}')
100 and 100
100 and hello
So what happens when I use the addition operator (+)? It creates a new value; it does not change the original. Notice we made this work by redoing the assignment of student, so that it references the new value!
The same thing happens when concatenating strings.
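Here is a minimal sketch of both cases. The specific values (adding 1, appending ' Smith') are assumptions chosen for illustration, not taken from the original example.

```python
number = 100
a = number
a = a + 1                      # builds a new int and points a at it
print(f'{number} and {a}')     # 100 and 101

student = 'Emma'
sister = student
student = student + ' Smith'   # builds a new string and reassigns student
print(f'{student} and {sister}')   # Emma Smith and Emma
```

In both cases the other name (number, sister) still references the original value, because integers and strings are immutable.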
Function parameters create new references
This is why PyCharm complains if you have a variable outside a function with the same name as a function parameter. You can’t use the global student variable inside the add_student() function because a function parameter is also called student.
The parts of your code where a variable is valid are called its “scope”. A function parameter has its scope limited to inside the function. This is good! We want limited scopes. A global variable has its scope cover the entire file. This is bad! We try to avoid this when we can.
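To see what it means for a parameter to be a new reference, here is a small sketch. The text names add_student() but does not show its body, so the body below is an assumption; rename() and roster are hypothetical names added for illustration.

```python
def rename(student):
    # Reassigning the parameter only changes the local reference.
    student = 'Sarah'
    return student


def add_student(roster, student):
    # Assumed body: changing the list that roster refers to is visible to the caller.
    roster.append(student)


name = 'Emma'
roster = []

rename(name)
print(name)      # Emma

add_student(roster, name)
print(roster)    # ['Emma']
```

Reassigning a parameter never affects the caller's variable, but modifying a mutable value through a parameter does.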
Function Pattern #1: return new data
- don’t change any parameters
- create and return new data
Example: sale prices
fruit_prices = {'banana': 0.5, 'apple': 0.75, 'pear': 1.00, 'peach': 1.50, 'apricot': 0.25}

def sale_items(fruit_prices, discount):
    sale_prices = {}
    for fruit, price in fruit_prices.items():
        sale_prices[fruit] = price*discount
    return sale_prices

sales = sale_items(fruit_prices, 0.5)
print(sales)
# original price dictionary unchanged
print(fruit_prices)
{'banana': 0.25, 'apple': 0.375, 'pear': 0.5, 'peach': 0.75, 'apricot': 0.125}
{'banana': 0.5, 'apple': 0.75, 'pear': 1.0, 'peach': 1.5, 'apricot': 0.25}
Function Pattern #2: modify parameter
- modify one or more of the parameters
- don’t return anything
Example: add a new item
def add_item(fruit_prices, fruit, price):
    fruit_prices[fruit] = price

# we passed in fruit_prices and the function changed it
add_item(fruit_prices, 'plum', 0.10)
print(fruit_prices)
# sales is unchanged
print(sales)
{'banana': 0.5, 'apple': 0.75, 'pear': 1.0, 'peach': 1.5, 'apricot': 0.25, 'plum': 0.1}
{'banana': 0.25, 'apple': 0.375, 'pear': 0.5, 'peach': 0.75, 'apricot': 0.125}
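One consequence of Pattern #2 worth seeing once: a function with no return statement gives back None, so its result should not be assigned to a variable. A small sketch, using a shortened prices dictionary as a stand-in for fruit_prices:

```python
def add_item(fruit_prices, fruit, price):
    fruit_prices[fruit] = price   # modifies the dictionary in place; nothing is returned


prices = {'banana': 0.5}          # shortened stand-in for fruit_prices
result = add_item(prices, 'plum', 0.10)

print(result)   # None
print(prices)   # {'banana': 0.5, 'plum': 0.1}
```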
Document your function behavior!
Make clear whether the function is modifying parameters it is passed.
def sale_items(fruit_prices, discount):
    """
    This function applies a discount to all fruit prices. It leaves the original
    dictionary unchanged and returns a new one with sale prices.
    :param fruit_prices: a dictionary of fruits and their prices
    :param discount: a discount to be applied (float)
    :return: a new dictionary of fruits and their sale prices
    """

def add_item(fruit_prices, fruit, price):
    """
    This function adds a fruit and its price to a dictionary of fruit prices.
    It modifies the dictionary it is given to include this new fruit.
    Pre-condition: a fruit_prices dictionary that maps fruits to prices; it may be empty;
    the fruit we are adding may already exist in the dictionary
    Post-condition: the given fruit and price are in the dictionary, overwriting any entry for that fruit
    that may have previously existed
    :param fruit_prices: a dictionary of fruits and their prices
    :param fruit: the name of a fruit
    :param price: the price of the fruit
    """
Making a deep copy
In some cases, it may be convenient to make a copy of a value instead of a second reference to it. You can do this with copy.deepcopy().
import copy
new_prices = copy.deepcopy(fruit_prices)
import copy

fruit_prices = {'banana': 0.5, 'apple': 0.75, 'pear': 1.00, 'peach': 1.50, 'apricot': 0.25}

def sale_items(fruit_prices, discount, letter):
    # make a deep copy
    sale_prices = copy.deepcopy(fruit_prices)
    for fruit, price in sale_prices.items():
        if fruit.startswith(letter):
            sale_prices[fruit] = price*discount
    return sale_prices

sales = sale_items(fruit_prices, 0.5, 'a')
print(sales)
# original price dictionary unchanged
print(fruit_prices)
{'banana': 0.5, 'apple': 0.375, 'pear': 1.0, 'peach': 1.5, 'apricot': 0.125}
{'banana': 0.5, 'apple': 0.75, 'pear': 1.0, 'peach': 1.5, 'apricot': 0.25}
Let’s go back to our census example. Imagine we want to uppercase all the last names in the census, but we want to keep the original dictionary unchanged.
Here are our census functions:
def round_to_nearest_10(number):
    # despite the name, this rounds down to a multiple of 10 (e.g. 37 -> 30)
    remainder = number % 10
    return number - remainder

def people_by_age(filename):
    people = {}
    with open(filename) as file:
        for line in file:
            last, first, relationship, gender, race, age, marital_status = line.strip().split(',')
            age = int(age)
            # group by decade: round the age down to a multiple of 10
            age_group = round_to_nearest_10(age)
            # initialize a new entry
            if age_group not in people:
                people[age_group] = []
            # append a new person
            people[age_group].append([last, first])
    return people
people_by_age('census.txt')
census_people = people_by_age('census.txt')
census_people
{50: [['Baer', 'William'], ['Sposato', 'Carolina']],
30: [['Baer', 'Ruth']],
10: [['Baer', 'Robert'],
['Baer', 'William'],
['Sposato', 'Antonio'],
['Sposato', 'Ralph']],
20: [['Sposato', 'Albert'],
['Sposato', 'Carlo'],
['Sposato', 'Frances'],
['Zappala', 'Mariano'],
['Zappala', 'Anna']]}
import copy

def uppercase_last_names(census_data):
    # make a deep copy
    new_census_data = copy.deepcopy(census_data)
    for age_group, people in new_census_data.items():
        for person in people:
            person[0] = person[0].upper()
    return new_census_data

uppercase_people = uppercase_last_names(census_people)
print(uppercase_people)
print('\n')
print(census_people)
{50: [['BAER', 'William'], ['SPOSATO', 'Carolina']], 30: [['BAER', 'Ruth']], 10: [['BAER', 'Robert'], ['BAER', 'William'], ['SPOSATO', 'Antonio'], ['SPOSATO', 'Ralph']], 20: [['SPOSATO', 'Albert'], ['SPOSATO', 'Carlo'], ['SPOSATO', 'Frances'], ['ZAPPALA', 'Mariano'], ['ZAPPALA', 'Anna']]}
{50: [['Baer', 'William'], ['Sposato', 'Carolina']], 30: [['Baer', 'Ruth']], 10: [['Baer', 'Robert'], ['Baer', 'William'], ['Sposato', 'Antonio'], ['Sposato', 'Ralph']], 20: [['Sposato', 'Albert'], ['Sposato', 'Carlo'], ['Sposato', 'Frances'], ['Zappala', 'Mariano'], ['Zappala', 'Anna']]}
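Why a deep copy and not just any copy? A sketch comparing copy.copy() (a shallow copy) with copy.deepcopy() on a tiny, made-up slice of the census data. The variable names census, shallow, and deep are assumptions for illustration.

```python
import copy

census = {50: [['Baer', 'William']], 30: [['Baer', 'Ruth']]}

shallow = copy.copy(census)       # new outer dictionary, but the same inner lists
deep = copy.deepcopy(census)      # new dictionary and new inner lists

shallow[50][0][0] = 'BAER'        # also changes census: the inner list is shared
deep[30][0][0] = 'BAER'           # census is unaffected

print(census)
# {50: [['BAER', 'William']], 30: [['Baer', 'Ruth']]}
```

Because the census dictionary holds lists inside lists, only the deep copy keeps the original fully unchanged.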