IS4010: AI-Enhanced Application Development

Week 7: Working with External Data

Brandon M. Greenwell

Working with External Data 🌐

From local files to the entire web

  • The progression: Your app → Files → JSON → APIs → The world
  • Why this matters: No app exists in isolation - data flows between systems constantly
  • Real-world reality: Twitter, Stripe, GitHub, OpenAI - all expose APIs you can use
  • Career skill: API integration is one of the most in-demand programming capabilities
  • This week: Build apps that persist data locally AND fetch live data from the internet

Part 1: Files & JSON

🤔 The problem: Programs have amnesia

When the script ends, everything disappears

  • Every program so far: Variables, lists, dictionaries - all gone when the program exits
  • Real apps need memory: Contact lists, game progress, user preferences, shopping carts
  • The solution: Persistence - saving data to permanent storage (files, databases)
  • This enables: Apps that remember state between runs, data analysis on large datasets, sharing data between programs

Reading and writing files in Python

The standard pattern: with open() context manager

# Write to a file (overwrites existing content)
with open("my_note.txt", "w") as f:
    f.write("Hello from our application!")

# Read the content back
with open("my_note.txt", "r") as f:
    content = f.read()
    print(content)  # Output: Hello from our application!
  • Why with?: Automatically closes the file even if errors occur
  • File modes: "r" (read), "w" (write/overwrite), "a" (append)
  • Best practice: Always use with - it helps prevent lost writes and resource leaks
  • 📚 Python file I/O documentation
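
The file modes listed above include append, which the example does not exercise. A minimal sketch of the difference between "w" and "a", assuming a hypothetical file name (app_log.txt) placed in a temporary directory so nothing existing is overwritten:

```python
import tempfile
from pathlib import Path

# Hypothetical file name; the temporary directory keeps the demo side-effect free
log_path = Path(tempfile.mkdtemp()) / "app_log.txt"

# "w" creates the file, or truncates it if it already exists
with open(log_path, "w", encoding="utf-8") as f:
    f.write("first run\n")

# "a" keeps the existing content and adds to the end
with open(log_path, "a", encoding="utf-8") as f:
    f.write("second run\n")

# Iterating over the file object yields one line at a time
with open(log_path, "r", encoding="utf-8") as f:
    lines = [line.rstrip("\n") for line in f]

print(lines)  # ['first run', 'second run']
```

Opening the same path with "w" a second time would have left only the second line.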

🚧 The problem with plain text files

Text is unstructured and hard to parse

# Saving a contact list to a text file - messy!
with open("contacts.txt", "w") as f:
    f.write("Alice|alice@example.com|555-1234\n")
    f.write("Bob|bob@example.com|555-5678\n")

# Reading it back - lots of manual parsing
with open("contacts.txt", "r") as f:
    for line in f:
        parts = line.strip().split("|")
        name, email, phone = parts
        print(f"{name}: {email}")
  • Problems: Custom delimiters (|), fragile parsing, no nested data, no type info
  • What if: Email contains |? What about optional fields? Nested addresses?
  • The real solution: We need a standard data format everyone agrees on
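
To make the "what if" above concrete, here is a small sketch of the same pipe-delimited parsing failing as soon as a field contains the delimiter. The function name parse_contact and the sample records are made up for illustration:

```python
def parse_contact(line):
    """Split one pipe-delimited record into (name, email, phone)."""
    name, email, phone = line.strip().split("|")
    return name, email, phone

good = "Alice|alice@example.com|555-1234\n"
bad = "Bob | Sons|bob@example.com|555-5678\n"  # the name itself contains "|"

print(parse_contact(good))

try:
    parse_contact(bad)
except ValueError as error:
    # split() returned four fields instead of three, so unpacking fails
    print(f"Could not parse record: {error}")
```

A standard format avoids this class of bug because the delimiter rules and escaping are handled by the library, not by hand.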

Enter JSON: The universal data format 🌍

JavaScript Object Notation - The language of the web

  • JSON is a simple, human-readable text format for representing structured data
  • Created: Early 2000s by Douglas Crockford - became the web’s standard (JSON.org)
  • Why it won: Simple, language-agnostic, maps cleanly onto the data structures of most programming languages
  • Ubiquity: APIs, config files, databases, logs - JSON is everywhere
  • Python advantage: JSON maps directly to Python’s built-in types

JSON structure maps to Python

Almost identical syntax

JSON Type    Python Type    Example
Object {}    Dictionary     {"name": "Alice", "age": 30}
Array []     List           [1, 2, 3, 4, 5]
String       String         "hello"
Number       int/float      42, 3.14
Boolean      Boolean        true → True, false → False
Null         None           null → None
  • Key insight: If you know Python dicts/lists, you already know JSON
  • This makes: Reading/writing JSON in Python incredibly easy
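
The type mapping in the table can be checked with a quick round trip through a JSON string. This sketch uses json.dumps() and json.loads(), the string-based counterparts of the file functions; the record itself is invented for the demo:

```python
import json

record = {
    "name": "Alice",
    "age": 30,
    "score": 3.14,
    "active": True,
    "nickname": None,
    "tags": ["a", "b"],
}

text = json.dumps(record)    # Python object -> JSON string
print(text)                  # note true and null in the JSON text

restored = json.loads(text)  # JSON string -> Python object
print(restored == record)    # True

# "Almost identical" has limits: JSON object keys are always strings
print(json.loads(json.dumps({1: "one"})))  # {'1': 'one'}
```

The last line is a reminder that a round trip is not always lossless: non-string keys come back as strings, and tuples come back as lists.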

Using Python’s json library

Two key functions: dump() and load()

import json

# Python dictionary (complex, nested structure)
contacts = {
    "people": [
        {"name": "Alice", "email": "alice@example.com", "age": 30},
        {"name": "Bob", "email": "bob@example.com", "age": 25}
    ],
    "count": 2
}

# Write Python object to JSON file
with open("contacts.json", "w") as f:
    json.dump(contacts, f, indent=4)  # indent=4 makes it readable

# Read JSON file back into Python object
with open("contacts.json", "r") as f:
    data = json.load(f)
    print(data["people"][0]["name"])  # Output: Alice
  • json.dump(obj, file): Python object → JSON file
  • json.load(file): JSON file → Python object
  • indent=4: Makes JSON human-readable (use in development, skip in production)
  • 📚 Python json module docs
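
On a program's first run there is no file to load yet, so json.load() needs a fallback. One possible way to handle that, sketched with a hypothetical helper name (load_json_or_default) and a made-up settings file in a temporary directory:

```python
import json
import tempfile
from pathlib import Path

def load_json_or_default(path, default):
    """Load JSON from path, falling back to default if missing or invalid."""
    try:
        with open(path, "r", encoding="utf-8") as f:
            return json.load(f)
    except FileNotFoundError:
        return default  # first run: nothing has been saved yet
    except json.JSONDecodeError:
        return default  # file exists but does not contain valid JSON

# Hypothetical file used only for this demo
data_path = Path(tempfile.mkdtemp()) / "settings.json"

before = load_json_or_default(data_path, {"theme": "light"})
print(before)  # {'theme': 'light'}

with open(data_path, "w", encoding="utf-8") as f:
    json.dump({"theme": "dark"}, f, indent=4)

after = load_json_or_default(data_path, {"theme": "light"})
print(after)  # {'theme': 'dark'}
```

Silently falling back on invalid JSON is a design choice that suits a demo; a real app might prefer to warn the user before discarding a corrupted file.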

🎯 What the generated JSON looks like

contacts.json

{
    "people": [
        {
            "name": "Alice",
            "email": "alice@example.com",
            "age": 30
        },
        {
            "name": "Bob",
            "email": "bob@example.com",
            "age": 25
        }
    ],
    "count": 2
}
  • Human-readable: You can open it in any text editor
  • Standard format: Any language can read this (JavaScript, Java, C#, etc.)
  • Portable: Email this file to a teammate and they can load it instantly

💡 Why JSON matters for APIs (preview)

The connection to Part 2

  • Files are local: Your computer reads/writes JSON files
  • APIs are remote: Other computers send/receive JSON over the internet
  • Same format: The JSON you just learned works for BOTH
  • This means: Once you can work with JSON files, you can work with web APIs
  • Coming up: How to fetch JSON from remote servers using HTTP requests

Part 2: Working with APIs 🔌

What is an API?

Application Programming Interface

  • An API is a set of rules that allows different software applications to communicate
  • Simple analogy: A restaurant menu
    • Menu (API) tells you what you can order (available operations)
    • You don’t need to know how the kitchen works (implementation details)
    • You just make a request, and get food back (data)
  • APIs define: What operations are available, what data to send, what you get back
  • 📚 What is an API? (MDN)

🌐 Web APIs: Apps talking over the internet

HTTP as the communication protocol

  • Web APIs use HTTP (HyperText Transfer Protocol) to communicate over the internet
  • Same protocol: Your web browser uses HTTP to load websites
  • The exchange:
    1. Your app sends an HTTP request to a URL
    2. Remote server processes the request
    3. Server sends back an HTTP response with data (usually JSON!)
  • Key insight: Reading API data is just like reading a JSON file, but from a remote server

Anatomy of a web API request

URL structure and HTTP methods

https://api.github.com/users/octocat/repos
└─┬─┘   └─────┬──────┘└────────┬─────────┘
  │       base URL       endpoint path
protocol
  • Base URL: https://api.github.com - the server you’re talking to
  • Endpoint: /users/octocat/repos - the specific resource you want
  • HTTP method: GET (retrieve data), POST (send data), PUT (update), DELETE (remove)
  • For now: We’ll focus on GET requests to retrieve data
  • 📚 HTTP methods explained
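
The pieces of the URL above can also be pulled apart in code. A small sketch using the standard library's urllib.parse, which needs no network access; the variable names are illustrative:

```python
from urllib.parse import urlsplit

url = "https://api.github.com/users/octocat/repos"
parts = urlsplit(url)

print(parts.scheme)  # https
print(parts.netloc)  # api.github.com
print(parts.path)    # /users/octocat/repos

# Keeping the base URL and the endpoint separate makes it easy to
# call several endpoints on the same server
base_url = f"{parts.scheme}://{parts.netloc}"
endpoint = "/users/octocat/repos"
rebuilt = base_url + endpoint
print(rebuilt == url)  # True
```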

Installing the requests library

pip install requests
  • Not built-in: Unlike json, we need to install requests
  • Why requests?: Clean, simple API - much easier than Python’s built-in urllib
  • Industry standard: Used by millions of Python developers
  • Alternative: The standard library’s urllib.request needs no install (urllib3 is a separate third-party package), but requests is still preferred for simplicity

Making your first API request

GET request to a public API

import requests

# Make a GET request to the PokeAPI
url = "https://pokeapi.co/api/v2/pokemon/pikachu"
response = requests.get(url)

# Check if the request was successful
if response.status_code == 200:
    # Parse the JSON response into a Python dictionary
    data = response.json()
    print(f"Name: {data['name'].title()}")
    print(f"Height: {data['height']} decimetres")
    print(f"Weight: {data['weight']} hectograms")
else:
    print(f"Error: Received status code {response.status_code}")
  • requests.get(url): Makes HTTP GET request, returns response object
  • response.status_code: HTTP status code (200 = success)
  • response.json(): Parses JSON response → Python dict (same result as json.loads() on the response text)
  • 📚 PokeAPI documentation
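
Once response.json() has returned a dictionary, the rest is ordinary Python. As a sketch, the function below converts the units printed above into metres and kilograms. The sample dictionary is hand-written to mimic only the three fields used here, so the code runs without a network call, and its values are illustrative:

```python
def summarize_pokemon(data):
    """Return a small dict with the name and metric measurements."""
    return {
        "name": data["name"].title(),
        "height_m": data["height"] / 10,   # decimetres -> metres
        "weight_kg": data["weight"] / 10,  # hectograms -> kilograms
    }

# Stand-in for response.json(); a real response contains many more fields
sample = {"name": "pikachu", "height": 4, "weight": 60}

summary = summarize_pokemon(sample)
print(f"{summary['name']}: {summary['height_m']} m, {summary['weight_kg']} kg")
# Pikachu: 0.4 m, 6.0 kg
```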

📊 HTTP status codes

The server’s way of telling you what happened

Code    Meaning         Example
200     OK - Success    Data retrieved successfully
201     Created         New resource created
400     Bad Request     Invalid data sent
401     Unauthorized    Need authentication
404     Not Found       Resource doesn’t exist
500     Server Error    Something broke on server
  • Always check: response.status_code before processing data
  • Error handling: Different codes need different responses
  • 📚 HTTP status codes reference
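
"Different codes need different responses" usually starts with sorting a code into its family. A minimal sketch using the standard library's http.HTTPStatus for the official phrases; the helper name describe_status and the category labels are choices made for this example:

```python
from http import HTTPStatus

def describe_status(code):
    """Return a short human-readable description of an HTTP status code."""
    phrase = HTTPStatus(code).phrase  # raises ValueError for unknown codes
    if 200 <= code < 300:
        category = "success"
    elif 400 <= code < 500:
        category = "client error"
    elif 500 <= code < 600:
        category = "server error"
    else:
        category = "other"
    return f"{code} {phrase} ({category})"

for code in (200, 201, 400, 401, 404, 500):
    print(describe_status(code))
```

A 4xx code generally means the request itself needs fixing, while a 5xx code means the same request may succeed later.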

🔗 The JSON connection: Same format, different source

Comparison: Files vs APIs

Reading JSON from a file:

import json

with open("data.json", "r") as f:
    data = json.load(f)
    print(data["name"])

Reading JSON from an API:

import requests

url = "https://pokeapi.co/api/v2/pokemon/pikachu"
response = requests.get(url)
data = response.json()
print(data["name"])
  • Same result: Both give you a Python dictionary
  • Same skills: Working with dicts, lists, accessing nested data
  • Different source: One is local, one is remote
  • Key takeaway: JSON knowledge transfers directly between files and APIs

🚀 Real-world API examples

APIs you can use right now (no API key needed)

🛡️ Error handling with APIs

Networks are unreliable - plan for failure

import requests

url = "https://api.example.com/data"

try:
    response = requests.get(url, timeout=5)  # 5 second timeout
    response.raise_for_status()  # Raises exception for 4xx/5xx codes

    data = response.json()
    print(f"Success: {data}")

except requests.exceptions.Timeout:
    print("Error: Request timed out")
except requests.exceptions.ConnectionError:
    print("Error: Could not connect to server")
except requests.exceptions.HTTPError as e:
    print(f"Error: HTTP {e.response.status_code}")
except requests.exceptions.JSONDecodeError:
    print("Error: Response was not valid JSON")
  • Timeouts: Set with timeout= parameter (seconds)
  • raise_for_status(): Converts error codes into exceptions
  • Why this matters: APIs can be down, slow, or change without notice
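
When an API is only briefly down or slow, trying again after a short pause is often enough. The helper below is one possible sketch of that idea, not part of requests: fetch_with_retry and its parameters are hypothetical, and fetch stands in for any function that performs the request. It uses only the standard library, so the demo simulates an outage without touching the network.

```python
import time

def fetch_with_retry(fetch, attempts=3, delay_seconds=1.0,
                     retry_on=(ConnectionError, TimeoutError)):
    """Call fetch() until it succeeds or the attempts run out."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return fetch()
        except retry_on as error:
            last_error = error
            if attempt < attempts:
                time.sleep(delay_seconds)  # pause before the next try
    raise last_error  # every attempt failed

# Simulated API call that fails twice, then succeeds
calls = {"count": 0}

def flaky_fetch():
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("simulated outage")
    return {"status": "ok"}

result = fetch_with_retry(flaky_fetch, attempts=3, delay_seconds=0)
print(result, "after", calls["count"], "calls")
```

With requests, the exceptions to retry on would be passed explicitly, for example retry_on=(requests.exceptions.ConnectionError, requests.exceptions.Timeout), since those classes do not inherit from the built-in ConnectionError and TimeoutError.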

⚡ API best practices

Being a good API citizen

  • Read the docs first: Every API has different rules and endpoints
  • Respect rate limits: Most free APIs limit requests (e.g., 1000/day)
  • Cache responses: Don’t request the same data repeatedly
  • Use timeouts: Don’t let your app hang forever waiting for response
  • Check status codes: Handle errors gracefully
  • API keys: Keep them secret (use .env files, never commit to git)
  • 📚 API development best practices
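
Keeping an API key secret usually means reading it from the environment instead of writing it in the source file. A minimal sketch, assuming a hypothetical variable name (EXAMPLE_API_KEY) and a Bearer-style header; the real header format depends on the API, so check its documentation:

```python
import os

def build_auth_headers(env_var="EXAMPLE_API_KEY"):
    """Read an API key from the environment and return request headers."""
    api_key = os.environ.get(env_var)
    if not api_key:
        raise RuntimeError(f"Set the {env_var} environment variable first")
    # The header scheme is an assumption; APIs differ on how keys are sent
    return {"Authorization": f"Bearer {api_key}"}

# Demo only: a real key is set outside the program (shell, .env loader)
os.environ["EXAMPLE_API_KEY"] = "not-a-real-key"

headers = build_auth_headers()
print(sorted(headers))  # ['Authorization']
```

The resulting dictionary can be passed to a request, for example requests.get(url, headers=headers, timeout=5).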

🎯 Putting it all together: Files + APIs

A complete data workflow

import json
import requests

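# 1. Fetch live data from API (with a timeout and a status check)
response = requests.get("https://pokeapi.co/api/v2/pokemon/ditto", timeout=5)
response.raise_for_status()  # Stop here on 4xx/5xx instead of parsing an error body
pokemon_data = response.json()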

# 2. Process the data (extract what we need)
simplified = {
    "name": pokemon_data["name"],
    "height": pokemon_data["height"],
    "weight": pokemon_data["weight"],
    "types": [t["type"]["name"] for t in pokemon_data["types"]]
}

# 3. Save to local JSON file for offline use
with open("pokemon_cache.json", "w") as f:
    json.dump(simplified, f, indent=4)

print("Data fetched from API and saved locally!")
  • The pattern: Fetch → Process → Store
  • Why cache?: Faster loading, works offline, reduces API calls
  • Real apps do this: Weather apps, news readers, social media feeds
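
The workflow above only writes the cache. The read side could look like the sketch below, which checks for a fresh cache file before fetching. The names load_cached and max_age_seconds are hypothetical, and fetch stands in for any function that returns JSON-serializable data, such as one wrapping requests.get(...).json(). The demo uses a fake fetch function so it runs offline.

```python
import json
import tempfile
import time
from pathlib import Path

def load_cached(cache_path, fetch, max_age_seconds=3600):
    """Return cached data if it is fresh enough, otherwise fetch and store it."""
    path = Path(cache_path)
    if path.exists():
        age = time.time() - path.stat().st_mtime  # seconds since last write
        if age < max_age_seconds:
            with open(path, "r", encoding="utf-8") as f:
                return json.load(f)
    data = fetch()
    with open(path, "w", encoding="utf-8") as f:
        json.dump(data, f, indent=4)
    return data

# Demo with a stand-in for the API call and a temporary cache location
cache_file = Path(tempfile.mkdtemp()) / "pokemon_cache.json"
fetch_calls = []

def fake_fetch():
    fetch_calls.append(1)
    return {"name": "ditto", "types": ["normal"]}

first = load_cached(cache_file, fake_fetch)   # cache miss: calls fake_fetch
second = load_cached(cache_file, fake_fetch)  # cache hit: reads the file
print(first == second, len(fetch_calls))      # True 1
```

Using the file's modification time as the freshness check keeps the cache format identical to the plain JSON file saved earlier.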

🧪 Lab 07 overview

Two-part lab: Local persistence + Live API integration

  • Part 1: Build a contact book application
    • Store contacts as JSON file
    • Load contacts when app starts
    • Add, view, and search contacts
    • Data persists between runs
  • Part 2: Choose a public API and build a data viewer
    • Pick an API that interests you
    • Fetch live data with requests
    • Parse JSON response
    • Display formatted output to user
  • Bonus challenge: Combine both - cache API responses to JSON files!

🚀 Career relevance: Why this matters

API integration is a core professional skill

  • Every modern app: Integrates with external services (payments, auth, maps, notifications)
  • Job requirements: “Experience with RESTful APIs” appears in 80%+ of backend/full-stack job postings
  • Interview questions: “How would you integrate Stripe payment processing?” “Design a weather dashboard”
  • Real companies using APIs:
    • Stripe: Payment processing API ($95B valuation)
    • Twilio: SMS/Voice API ($12B valuation)
    • Mapbox: Maps API (powers Snapchat, Instacart)
  • Your portfolio: API projects demonstrate practical, real-world skills

📚 Key takeaways

  • Persistence: Files let your programs remember state between runs
  • JSON: Universal data format - works for files AND APIs
  • APIs: Let your apps communicate with other systems over the internet
  • HTTP: The protocol that powers the web and most APIs
  • Requests library: Makes HTTP requests simple in Python
  • Error handling: Always check status codes and handle failures gracefully
  • The connection: json.load() (files) and response.json() (APIs) both give you Python dicts

🔗 Essential resources

Documentation and learning materials

💬 Questions?

Next steps:

  • Review Lab 07 instructions in labs/lab07/README.md
  • Choose an API that interests you
  • Start with simple examples, then build complexity

Office hours: Available for API selection guidance and debugging!