Assignment 1
CSci 39579: Introduction to Data Visualization
Department of Computer Science, Hunter College, City University of New York
Spring 2026
All students must join the course’s Gradescope using the provided access code: WN433V. Students must verify that their Gradescope account has their full name and CUNY email address in the account settings (Account > Edit Account).
Unless otherwise noted, programming assignments are submitted on the course’s Gradescope site and are written in Python. Code must be submitted as a .ipynb notebook with the output of every code cell included. Also, to receive full credit, the code should be compatible with Python 3.6 and written in good style.
To get full credit for a program, the file must include in the opening comment:
Your name, as it appears in your Gradescope registration.
The email you are using for Gradescope.
A list of any resources you used for the program. Include classmates and tutors that you worked with, along with any websites or tutorials that you used. If you used no resources (other than the class notes and textbooks), then you should include the line: “No resources used.”
For example, for the student, Thomas Hunter, the opening comment of his first program might be:
"""
Name: Thomas Hunter
Email: thomas.hunter1870@hunter.cuny.edu
Resources: Used python.org as a reminder of Python 3 print statements.
"""
and then followed by his Python program.
Program 1: School Counts
Due date: February 26, 2026
Learning Objective: Altair is a declarative statistical visualization library for Python, based on Vega and Vega-Lite. In this assignment, you are going to create basic visualizations using Altair.
Available Libraries: Core Python 3.6+, Altair, Kaggle, Matplotlib, Plotly, and Pandas.
Data Sources: Kaggle:
Sample Datasets:
Kaggle
Kaggle is an awesome resource for aspiring data scientists or anyone looking to improve their visualization skills. It provides:
Interesting data sets
Feedback on how you’re doing
A leaderboard to see what’s good, what’s possible, and what’s state of the art

The easiest way to get a Kaggle dataset is to download it as a ZIP file. Navigate to the relevant dataset, click Download, then Download as a ZIP file. Once you have the ZIP file, extract its contents and place them in the same directory as your current .ipynb file.
If you’d prefer, you can set up kagglehub or the Kaggle API to import the dataset yourself.
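If you would rather script the ZIP step than extract by hand, the following is a minimal sketch using only the standard library. The function name `extract_csvs` and the file name `archive.zip` are placeholders chosen for illustration, not something Kaggle or this course prescribes.

```python
import zipfile
from pathlib import Path


def extract_csvs(zip_path, dest_dir="."):
    """Extract a downloaded ZIP into dest_dir and return the CSV paths it contained."""
    dest = Path(dest_dir)
    with zipfile.ZipFile(str(zip_path)) as archive:
        archive.extractall(str(dest))
        names = archive.namelist()
    # Keep only the CSV members; other files (readme, license) are extracted but not returned.
    return sorted(dest / name for name in names if name.lower().endswith(".csv"))


# Example usage (hypothetical file name; adjust to whatever you downloaded):
# csv_paths = extract_csvs("archive.zip")
# df = pandas.read_csv(csv_paths[0])
```

Extracting into the notebook's own directory (the default `dest_dir="."`) keeps the relative paths in your notebook short.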
Create Interesting Visualizations
In this assignment, you must pick two of the provided sample datasets, along with a third dataset from Kaggle that is not linked above.
For each of the relevant datasets you should:
Create two visualizations using Altair.
Each visualization must be accompanied by a Markdown cell that explains the visualization, your choices, and the point you are trying to convey (one paragraph per visualization).
You are free to create any visualization that you like. However, you must explain the reasoning behind your choices and the visualization must be appropriate for the information you are attempting to learn or convey.
There must be a total of 6 visualizations with justifications/insights.
Submission instructions
Ensure that your notebook contains:
all necessary code,
the visualizations,
relevant explanations.
We will grade based on what is available in the submitted notebook.
Submit the file to Gradescope and make sure the preview shows your notebook.
If you’d like an early review of your submission, email me at rv846@hunter
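Before uploading, you may want to confirm that the notebook was saved with its outputs. The sketch below reads the .ipynb file as JSON and lists code cells that have source but no stored output. The function name and the file name `assignment1.ipynb` are hypothetical; cells that legitimately print nothing (imports, assignments) will also be listed, so treat the result as a prompt to double-check rather than an error.

```python
import json


def code_cells_without_output(notebook_path):
    """Return 0-based indices of non-empty code cells that have no stored output."""
    with open(notebook_path, encoding="utf-8") as handle:
        notebook = json.load(handle)

    missing = []
    for index, cell in enumerate(notebook.get("cells", [])):
        if cell.get("cell_type") != "code":
            continue  # Markdown and raw cells never carry outputs
        source = cell.get("source", "")
        if isinstance(source, list):  # source may be stored as a list of lines
            source = "".join(source)
        if source.strip() and not cell.get("outputs"):
            missing.append(index)
    return missing


# Example usage (hypothetical file name):
# print(code_cells_without_output("assignment1.ipynb"))
```

If chart cells appear in the list, re-run the notebook from top to bottom and save it again before submitting.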
Recommendations
Before trying to make a visualization, ask yourself a key question about the dataset that you’d need a visualization to show.
Think about your prior knowledge of the data, and apply it to justify (or disprove) your own beliefs.
For the personal dataset, choose something that interests you; it does not necessarily have to be a concrete finding.
Focus on making visualizations that efficiently communicate the idea you want to send.
This means that your explanation will sound very repetitive next to the image, since it restates what the image already shows.
Use the code from Lectures 3 and 4 for reference.