Syllabus

CSci 39542: Introduction to Data Science

Department of Computer Science, Hunter College, City University of New York
Spring 2026


Description

3 hours, 3 credits: This topics course focuses on computational methods and statistical techniques to analyze data and make inferences. Topics include data collection and cleaning, exploratory data analysis and visualization, and statistical inference and prediction. Students will acquire a working knowledge of data science through hands-on projects with real-world data. Basic proficiency in statistics and Python programming is assumed, as well as experience with abstract data structures.

Prerequisites: CSci 127, Stat 213, and one of: CSci 133 or CSci 235.

Instructor: Ryan Vaz

Office hours: 4:00 PM - 5:00 PM TuTh, North 1008, or by appointment


Grading Policy

Course Format:

Expectations:

Completing homework is an essential part of the learning experience. Students are expected to learn both the material covered in class and the material in the textbook and other assigned reading.

Honor Code:

You are encouraged to work together on the overall design of the programs and homework. However, for specific programs and homework assignments, all work must be your own. As a general rule, do your own typing. Submitting work of others, or not safeguarding your work from copying, are academic integrity violations. You are responsible for knowing and following Hunter College’s Academic Integrity Policy:

Hunter College regards acts of academic dishonesty (e.g., plagiarism, cheating on examinations, obtaining unfair advantage, and falsification of records and official documents) as serious offenses against the values of intellectual honesty. The College is committed to enforcing the CUNY Policy on Academic Integrity and will pursue cases of academic dishonesty according to the Hunter College Academic Integrity Procedures. All incidents of cheating will be reported to the Office of Student Conduct in the Vice President for Student Affairs and Dean of Students office.

Lecture Participation:

Participation in lecture is measured by collected classwork. If you miss a classwork assignment or do poorly on one, your grade on the final exam will replace the missing or low grade.

Programming Assignments:

Midterm Examination:

The midterm covers material from the lecture notes, code demonstrations, classwork, and submitted programs. There is no make-up midterm examination. Instead, your score on the final exam will replace a missing midterm grade. The final exam grade will also replace the midterm grade if you earn a higher grade on the final than on the midterm.

Final Exam: The final is a two-part exam consisting of written and coding questions:

Coding exam: The coding exam will be given either during finals week or during the last week of class. It is a case-study-style exam in which you will be given a dataset and asked to perform various data science tasks on it within 24 hours. Expectations and a rubric will be provided closer to the exam date.

Written exam: more details to come ...

Project: A final project is optional for this course. The grade for the project is a combination of the grades earned on the milestones (i.e., deadlines during the semester that keep the projects on track) and the grade for the overall submitted program. If you choose not to complete the project, your final exam grade will replace its portion of the overall grade.

Grades:

Emergencies:

To respect your privacy, there is no need to provide documentation to take advantage of the dropping/replacing grades policies. It is done automatically. See individual sections above for details. If you are going to miss more than 2 weeks of class and associated work, contact us, so we can make arrangements for you to take the course in a future term.
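The replacement rules described in the sections above can be sketched in Python. This is only an illustration, not the official grade calculation. The function names are hypothetical, all scores are assumed to be on the same 0-100 scale, a missing score is represented as None, and category weights are left out because they are not stated here.

```python
from typing import Dict, List, Optional, Sequence, Union


def replace_if_better(score: Optional[float], final_exam: float) -> float:
    """Return the final exam score when a score is missing (None) or lower."""
    if score is None:
        return final_exam
    return max(score, final_exam)


def effective_scores(
    classwork: Sequence[Optional[float]],
    midterm: Optional[float],
    project: Optional[float],
    final_exam: float,
) -> Dict[str, Union[float, List[float]]]:
    """Apply the replacement policies (assumed 0-100 scale, no weights)."""
    return {
        # Missing or low classwork grades are replaced by the final exam grade.
        "classwork": [replace_if_better(s, final_exam) for s in classwork],
        # A missing or lower midterm grade is replaced by the final exam grade.
        "midterm": replace_if_better(midterm, final_exam),
        # The project is replaced only if it was not completed.
        "project": final_exam if project is None else project,
        "final_exam": final_exam,
    }
```

For example, with a final exam score of 75, a missed midterm counts as 75 and a classwork score of 40 is raised to 75, while a classwork score of 90 is unchanged.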


Materials, Resources and Accommodating Disabilities

This is a zero-cost course. All textbook materials are freely available to enrolled students.

Textbook & Readings:

Additional readings and tutorials are available on the resources page.

Technology:

This is a programming-intensive course in the Python programming language. See the resource page on Brightspace for instructions on obtaining Python and the packages used, as well as links for submitting assignments and assessments. All software used is freely available.

Computer Access:

A computer capable of running Python 3 is needed to complete the online assessments, programming assignments, and projects. Hunter College is committed to all students having the technology needed for their courses. If you are in need of technology, see Student Life’s Support & Resources Page.

Accommodating Disabilities:

In compliance with the Americans with Disabilities Act of 1990 (ADA) and with Section 504 of the Rehabilitation Act of 1973, Hunter College is committed to ensuring educational parity and accommodations for all students with documented disabilities and/or medical conditions. It is recommended that all students with documented disabilities (Emotional, Medical, Physical, and/or Learning) consult the Office of AccessABILITY. For further information and assistance, see their contact page.

Hunter College Policy on Sexual Misconduct:

In compliance with the CUNY Policy on Sexual Misconduct, Hunter College reaffirms the prohibition of any sexual misconduct, which includes sexual violence, sexual harassment, and gender-based harassment retaliation against students, employees, or visitors, as well as certain intimate relationships. Students who have experienced any form of sexual violence on or off campus (including CUNY-sponsored trips and events) are entitled to the rights outlined in the Bill of Rights for Hunter College.

See the CUNY Policy on Sexual Misconduct.


Course Objectives

At the end of the course, students should be able to:

  1. Acquire data sets from multiple sources and write programs that can extract, transform, and load the data into a usable form.

  2. Use exploratory data analysis and visualization techniques as well as linear algebra and statistical inference to extract new insights from the data.

  3. Apply predictive modeling and machine learning techniques to medium and large datasets.

  4. Understand the theory and interpret the results of predictive models and machine learning models.


Important Notes: