BYU logo Computer Science

Using the Internet

This is what the Arpanet (the precurser to the Internet) looked like in December 1969. You can think of it like a graph — nodes and edges, communicating with each other.

the Arpanet, December 1969

Here is what the network of a worldwide Internet Service provider looks like. This is what just what one company operates. The Internet is a bunch of these connected in many places.

Level 3 network diagram

When you zoom out, this is kind of what the Internet looks like. This is really only part of it, and the image pre-dates our world of smart toasters and smart doorbells and smart speakers.

Internet visualization

From your perspective, the Internet is a system that lets your devices connect to various sites on the Internet:

diagram of the Internet

When you use a web browser, you’re connecting from your machine to some other machine on the Internet.

Netflix screenshot

An important part of this is the URL shown in the URL bar:

url bar in the web browser

What if I told you that you could write a one-line Python program to download anything you want from the Internet?

stranger things mind blown

You will need to install the requests library:

pip install requests

Now you can download anything:

import requests

response = requests.get('https://www.google.com/')

OK, but what’s in the response?

One important thing is the status code:

response.status_code
  • 200: OK - Successfully downloaded the item
  • 301: Moved - That item is at a different URL now
  • 403: Forbidden - You are not authorized to access that item
  • 404: Not Found - Couldn’t find anything at that URL
  • 500: Server Error - The server hosting that item has crashed
response = requests.get('https://amazon.com/something/that/probably/doesnt/exist.html')
response.status_code

Another important thing is the item itself. This is Google’s home page:

response = requests.get('https://www.google.com/')
response.text

Let’s do something more useful:

import requests
from byuimage import Image

# download the comic
response = requests.get('https://imgs.xkcd.com/comics/galaxies_2x.png')

# write the comic to a file -- use 'wb' because we are 'writing binary'
# note that we want to use response.content instead of response.text because this is not a text file
with open('galaxies_2x.png', 'wb') as file:
    file.write(response.content)

# load the image with the image library and show it
image = Image('galaxies_2x.png')
image.show()

APIs

(Not important) API stands for Application Programming Interface

(Important) If a web site supports an open API, it is providing a way for you to access its resources

Let’s look at an example: cat facts

/facts/random is also called an “endpoint”

hostname = 'https://cat-fact.herokuapp.com'

response = requests.get(hostname + '/facts/random')
if response.status_code == 200:
    print(response.text)
else:
    print(f"error: {response.status_code}")

We got back a dictionary! In string form!

This is JSON. It is one of a handful of common formats used to exchange structured information on the web. We can convert it into a Python dictionary using:

cat_fact = response.json()
print(cat_fact)

Let’s print it in a more nicely formatted way:

import json

formatted = json.dumps(response.json(), indent=4)
print(formatted)

We can get the fact with:

cat_fact['text']

So now we can do this:

def get_random_fact():
    hostname = 'https://cat-fact.herokuapp.com'

    response = requests.get(hostname + '/facts/random')
    if response.status_code == 200:
        cat_fact = response.json()
        return cat_fact['text']
    else:
        return f"error: {response.status_code}"



print(get_random_fact())
print(get_random_fact())
print(get_random_fact())

There is lots more we can do with this API

  • /facts/random/?animal_type=horse — get a random fact about a horse (or substitute some other animal type)
  • /facts/random/?amount=2 — get two random facts (or substitute some other number)
  • /facts/random/?animal_type=cat&amount=3 — get three random facts about a cat

Everything after a ? is called a parameter. They have the format keyword=value. You combine them with &.

hostname = 'https://cat-fact.herokuapp.com'

url = f"{hostname}/facts/random/?animal_type=cat&amount=3"
response = requests.get(url)
print(response.json())

Notice that we get back a list of cat facts!

def get_random_cat_facts(number):
    # initialize a result list
    results = []

    # setup the url
    hostname = 'https://cat-fact.herokuapp.com'
    url = f"{hostname}/facts/random/?animal_type=cat&amount={number}"

    # make the request
    response = requests.get(url)
    if response.status_code == 200:
        cat_facts = response.json()
        for fact in cat_facts:
            results.append(fact['text'])
        return results
    else:
        return f"error: {response.status_code}"

print(get_random_cat_facts(5))

If you know the ID of a fact you can get it directly:

  • /facts/:factID
def get_fact(fact_id):
    hostname = 'https://cat-fact.herokuapp.com'
    url = f"{hostname}/facts/{fact_id}"
    response = requests.get(url)
    if response.status_code == 200:
        cat_fact = response.json()
        return cat_fact['text']
    else:
        print(f"error: {response.status_code}")

fact_id = '621563361cf2c5b445eb4a6c'
print(get_fact(fact_id))

The Star Wars API

base = 'https://www.swapi.tech/api'
response = requests.get(base + '/films')
print(response.json())

Let’s make this look prettier:

import json

base = 'https://www.swapi.tech/api'
response = requests.get(base + '/films/1')
formatted = json.dumps(response.json(), indent=4)
print(formatted)

What is the path to get the characters in the film?

What is the path to get the planets in the film?

What is the path to get the opening crawl of the film?

So now we can print out some of the information about a Star Wars film

base = 'https://www.swapi.tech/api'
response = requests.get(base + '/films/1')
film = response.json()
print(film['result']['properties']['title'])
print('\n')
print(film['result']['properties']['opening_crawl'])

Notice that the film has a list of URLs for characters, planets, spaceships, etc. Let’s look at the characters.

base = 'https://www.swapi.tech/api'
response = requests.get(base + '/films/1')
film = response.json()
characters = film['result']['properties']['characters']
print(characters)

We can use these URLs to find out information about each of the people in the film.

base = 'https://www.swapi.tech/api'
response = requests.get(base + '/people/1')
print(response.json())

And likewise we can find out about the planets:

base = 'https://www.swapi.tech/api'
response = requests.get(base + '/planets/1')
print(response.json())

Let’s put it all together!

def get_character_and_home_world(url):
    character = requests.get(url).json()
    name = character['result']['properties']['name']
    homeworld_url = character['result']['properties']['homeworld']
    homeworld = requests.get(homeworld_url).json()
    world = homeworld['result']['properties']['name']
    return (name, world)

def print_film_and_characters(film_id):
    base = 'https://www.swapi.tech/api'
    response = requests.get(base + '/films/1')
    film = response.json()
    print(film['result']['properties']['title'])
    print('\n')
    character_urls = film['result']['properties']['characters']
    for url in character_urls:
        (name, world) = get_character_and_home_world(url)
        print(f"{name}: {world}")

print_film_and_characters(1)

If you want to check out other APIs, see Public APIs.

You are ready to use any API that lists “No” for “Auth”, meaning they don’t require authentication (an account). If you want to do a little more learning, you could try one that requires an API Key for authentication. You will need to learn how to get a key and how to send it in your requests. Those requiring OAuth will require significantly more work to understand (not recommended at this point).