Module 2 Assessment — Python for Data Work#
Assessment Instructions#
This notebook is submitted for automated assessment.
Context: You will acquire, load, and analyse real financial data from authoritative sources:
Dow Jones Industrial Average (DJIA) from Wall Street Journal
USD/GBP Exchange Rate from the Federal Reserve (FRED)
Federal Funds Rate from the Federal Reserve (FRED)
Learning Objectives:
Download real financial data from professional sources
Load CSV data into pandas DataFrames
Clean and transform real-world data
Perform analysis on actual market data
Rules:
Do not rename required variables (e.g.,
djia_df,fx_df,rates_df)Ensure the notebook runs top-to-bottom without errors
Include visualisations where requested
Write clear analysis in Task 5
Total Points: 100
Assessment is performed after submission using automated tests on the server.
Part 0: Data Acquisition (Complete Before Starting)#
Before you begin the assessment tasks, you must download three datasets from authoritative financial sources. This is a critical skill for any data professional working with financial data.
Note: When clicking links below, you may see a “Redirect Notice” page - this is normal, just click to continue.
Dataset 1: Dow Jones Industrial Average (DJIA)#
Source: Wall Street Journal Markets
Link: https://www.wsj.com/market-data/quotes/index/DJIA/historical-prices
Steps:
Click the link above
You may need to create a free WSJ account or sign in
Set the date range:
Start Date: 01/01/2023
End Date: 31/12/2024 (or most recent available)
Click Download a Spreadsheet
Save the file as
djia_data.csvOpen the CSV to verify it contains columns for: Date, Open, High, Low, Close
Dataset 2: USD/GBP Exchange Rate#
Source: Federal Reserve Economic Data (FRED) - St. Louis Fed
Link: https://fred.stlouisfed.org/series/DEXUSUK
Steps:
Click the link above
You’ll see a chart of the “U.S. / U.K. Foreign Exchange Rate”
Above the chart, click Edit Graph (pencil icon)
In the panel that opens, find Customize data range:
Start: 2023-01-01
End: 2024-12-31
Click Apply to update the chart
Click the Download button (top right, looks like a download arrow)
Select CSV (data)
Save the file as
fx_usd_gbp.csv
What this data shows: How many US dollars ($) you need to buy 1 British Pound (£)
Dataset 3: Federal Funds Effective Rate#
Source: Federal Reserve Economic Data (FRED) - St. Louis Fed
Link: https://fred.stlouisfed.org/series/FEDFUNDS
Steps:
Click the link above
You’ll see a chart of the “Federal Funds Effective Rate”
Above the chart, click Edit Graph (pencil icon)
In the panel that opens, find Customize data range:
Start: 2020-01-01 (to see the full rate cycle)
End: 2024-12-31
Click Apply to update the chart
Click the Download button (top right)
Select CSV (data)
Save the file as
fed_funds_rate.csv
What this data shows: The interest rate at which banks lend to each other overnight — this is the key rate controlled by the Federal Reserve.
⚠️ Important: Working with Real-World Data#
Real-world data from different sources often has inconsistencies that you’ll need to handle. When working with these CSV files, be aware that:
Column names may not be exactly what you expect — they might have extra whitespace, different capitalisation, or use codes instead of descriptive names
Date formats vary between sources and may need conversion
Missing values are represented differently (FRED uses “.” for missing data)
Part of this assessment is demonstrating your ability to inspect, clean, and standardise data from multiple sources. Always check your column names with df.columns and inspect the first few rows before writing your analysis code.
Uploading Files to Google Colab#
Once you have downloaded all three CSV files to your computer, you need to upload them to Google Colab.
Run the cell below to upload your files:
# Upload your CSV files to Google Colab
# Run this cell and select all 3 files when prompted
from google.colab import files
print("Please upload your 3 CSV files:")
print(" 1. djia_data.csv")
print(" 2. fx_usd_gbp.csv")
print(" 3. fed_funds_rate.csv")
print()
print("A file picker will open - select all 3 files and click 'Open'")
print("-" * 50)
uploaded = files.upload()
print("-" * 50)
print(f"Successfully uploaded {len(uploaded)} file(s):")
for filename in uploaded.keys():
print(f" ✓ {filename}")
---------------------------------------------------------------------------
ModuleNotFoundError Traceback (most recent call last)
Cell In[1], line 4
1 # Upload your CSV files to Google Colab
2 # Run this cell and select all 3 files when prompted
----> 4 from google.colab import files
6 print("Please upload your 3 CSV files:")
7 print(" 1. djia_data.csv")
ModuleNotFoundError: No module named 'google'
Verify Your Uploads#
Run the cell below to confirm all files are ready:
# Verify all required files are present
import os
files_required = {
'djia_data.csv': 'DJIA stock market data',
'fx_usd_gbp.csv': 'USD/GBP exchange rate from FRED',
'fed_funds_rate.csv': 'Federal Funds Rate from FRED'
}
print("Checking for required data files...")
print("=" * 50)
all_present = True
for filename, description in files_required.items():
if os.path.exists(filename):
size = os.path.getsize(filename)
print(f"✓ {filename}")
print(f" Size: {size:,} bytes | {description}")
else:
print(f"✗ {filename} - MISSING!")
print(f" Needed for: {description}")
all_present = False
print()
print("=" * 50)
if all_present:
print("All files present! You're ready to begin the assessment.")
else:
print("WARNING: Some files are missing!")
print("Please re-run the upload cell above and select the missing files.")
Setup#
Run this cell to import required libraries.
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
# Set display options
pd.set_option('display.max_columns', 10)
sns.set_style('whitegrid')
print("Libraries loaded successfully!")
Task 1 — Load & Inspect DJIA Data (20 points)#
Load the Dow Jones Industrial Average (DJIA) historical data from your downloaded CSV file.
Requirements:
Load
djia_data.csvinto a DataFrame calleddjia_dfusingpd.read_csv()Parse the date column properly (hint: use
parse_datesparameter)The data should have columns:
Date,Open,High,Low,CloseSort the data by date (oldest first)
Display the first 5 rows and use
.info()to inspect the data
Hints:
Always inspect column names first with
df.columns— real-world CSV files often have formatting quirksColumn names might need cleaning (e.g.,
df.columns.str.strip()removes whitespace)You may need to rename columns to match the required names
Use
parse_dates=['Date']inpd.read_csv()to automatically convert dates
# First, let's peek at the raw CSV to understand its structure
# This helps you see what columns exist and how data is formatted
# Uncomment and run this to see raw file contents:
# !head -5 djia_data.csv
# YOUR CODE HERE
# Step 1: Load djia_data.csv into djia_df
# djia_df = pd.read_csv('djia_data.csv', parse_dates=['Date'])
# Step 2: Check column names - you may need to rename them
# print(djia_df.columns)
# Step 3: If needed, rename columns to match required format
# djia_df = djia_df.rename(columns={'old_name': 'new_name'})
# Step 4: Sort by Date (oldest first)
# djia_df = djia_df.sort_values('Date')
# Step 5: Display the data
# djia_df.head()
# djia_df.info()
Task 2 — Cleaning & Feature Engineering (20 points)#
Prepare the DJIA data for analysis.
Requirements:
Ensure the
Datecolumn is datetime typeCheck for and handle any missing values
Create a new column called
Daily_Returnthat calculates the percentage change in Close price:Daily_Return = ((Close - Previous Close) / Previous Close) * 100Hint: Use
.pct_change() * 100
# YOUR CODE HERE
# 1. Convert Date to datetime if needed
# 2. Check for missing values with .isna().sum()
# 3. Create Daily_Return column
Task 3 — Visual Analysis (20 points)#
Create visualisations to understand the DJIA data.
Requirements:
Create a time-series line plot of DJIA Close price over time
Create a histogram of Daily_Return values
Both charts must have appropriate titles and axis labels
# YOUR CODE HERE
# Create time-series plot of Close price
# YOUR CODE HERE
# Create histogram of Daily_Return
Task 4 — Multi-Dataset Analysis (20 points)#
Load and analyse foreign exchange (FX) data from your downloaded FRED CSV file alongside the DJIA.
Requirements:
Load
fx_usd_gbp.csvinto a DataFrame calledfx_dfusingpd.read_csv()Rename columns as needed to have:
Date,USD_GBPHandle any missing or invalid values (FRED uses “.” for missing data)
Create a column called
FX_Return(percentage change in USD_GBP)Merge or align the DJIA and FX data by date
Create a comparative plot showing both DJIA Close and USD_GBP over time
FRED Data Notes:
FRED CSV files typically have columns:
DATE,DEXUSUK(the series code)Missing values may appear as “.” - you’ll need to handle these
Daily FX data may have different dates than DJIA (different trading calendars)
Hints:
Use
na_values=['.']inpd.read_csv()to handle FRED’s missing value formatFor dual-axis plots, use
ax.twinx()or plot normalized values
# First, peek at the raw CSV to understand FRED's format
# Uncomment and run this to see raw file contents:
# !head -5 fx_usd_gbp.csv
# YOUR CODE HERE
# Step 1: Load fx_usd_gbp.csv into fx_df
# Note: FRED uses "." for missing values
# fx_df = pd.read_csv('fx_usd_gbp.csv', parse_dates=['DATE'], na_values=['.'])
# Step 2: Rename columns to required format
# FRED columns are typically: DATE, DEXUSUK
# fx_df = fx_df.rename(columns={'DATE': 'Date', 'DEXUSUK': 'USD_GBP'})
# Step 3: Handle missing values
# fx_df = fx_df.dropna() # or use .fillna() if appropriate
# Step 4: Create FX_Return column (percentage change)
# fx_df['FX_Return'] = fx_df['USD_GBP'].pct_change() * 100
# Step 5: Display the data
# fx_df.head()
# fx_df.info()
# Step 6: Create comparative visualisation
# Option A: Dual-axis plot
# fig, ax1 = plt.subplots(figsize=(12, 6))
# ax2 = ax1.twinx()
# ax1.plot(djia_df['Date'], djia_df['Close'], 'b-', label='DJIA')
# ax2.plot(fx_df['Date'], fx_df['USD_GBP'], 'r-', label='USD/GBP')
# plt.title('DJIA vs USD/GBP Exchange Rate')
# plt.show()
Task 5 — Macro Insight (20 points)#
Analyse the relationship between interest rates and market performance using real Federal Reserve data.
Requirements:
Load
fed_funds_rate.csvinto a DataFrame calledrates_dfusingpd.read_csv()Rename columns as needed to have:
Date(datetime),FEDFUNDSHandle any missing values (FRED uses “.” for missing data)
Create a plot showing the FEDFUNDS rate over time (2020-2024)
Write a short analysis (5-8 sentences) in the variable
analysis_textexplaining:What patterns you observe in interest rates (especially the 2022-2023 hiking cycle)
How changes in interest rates might affect stock markets
Any relationship you notice between interest rates, FX, and DJIA
FRED Data Notes:
FRED columns are typically:
DATE,FEDFUNDSMonthly data - each row represents the average rate for that month
The Fed raised rates aggressively in 2022-2023 - your data should show this!
Context for Analysis:
The Fed began raising rates in March 2022 to combat inflation
Higher rates typically pressure stock valuations
Rate differentials between countries affect exchange rates
# First, peek at the raw CSV to understand FRED's format
# Uncomment and run this to see raw file contents:
# !head -5 fed_funds_rate.csv
# YOUR CODE HERE
# Step 1: Load fed_funds_rate.csv into rates_df
# rates_df = pd.read_csv('fed_funds_rate.csv', parse_dates=['DATE'], na_values=['.'])
# Step 2: Rename columns to required format
# rates_df = rates_df.rename(columns={'DATE': 'Date'})
# Step 3: Handle missing values if any
# rates_df = rates_df.dropna()
# Step 4: Display the data
# rates_df.head()
# rates_df.tail()
# rates_df.info()
# Step 5: Create visualisation of FEDFUNDS over time
# plt.figure(figsize=(12, 6))
# plt.plot(rates_df['Date'], rates_df['FEDFUNDS'], 'g-', linewidth=2)
# plt.title('Federal Funds Rate (2020-2024)')
# plt.xlabel('Date')
# plt.ylabel('Rate (%)')
# plt.grid(True)
# plt.show()
# YOUR ANALYSIS HERE
# Write 5-8 sentences analysing the REAL data you've loaded and visualised
analysis_text = """
YOUR ANALYSIS HERE - Replace this text with your analysis based on the real data.
Consider these questions as you write:
1. What pattern do you see in the Federal Funds Rate from 2020-2024?
- When did rates start rising? How high did they go?
- What was happening in the economy that caused the Fed to act?
2. How did the DJIA behave during the rate hiking cycle?
- Did stocks fall when rates rose rapidly?
- What happened to stocks as rates stabilised?
3. What relationship do you notice between USD/GBP and interest rates?
- Higher US rates typically strengthen the dollar
- Did you observe this in your data?
4. What insights can you draw from analysing these three datasets together?
"""
print(analysis_text)
Submission#
To submit your completed assessment:
Save your notebook: File → Download → Download .ipynb
Upload here: Submit Assessment
Note: Make sure your notebook runs without errors before submitting!