EX08 - Analysis for Continuous Improvement


Overview

Courses such as COMP110, just like organizations, services, and products, can be improved through intentional iteration on their design in order to create more value for their stakeholders. This practice is generally known as continuous improvement.

What is creating value? This idea can manifest in many ways, such as: serving unmet needs, reframing problems as opportunities, identifying authentic demand, testing novel ideas, leveraging underutilized resources, extending existing solutions, and so on.

In a course, the stakeholders are typically:

  1. Students Enrolled
  2. Instructional Staff
  3. Academic Institution
  4. Societal Workforce

In this open-ended exercise, you will reflect on your personal experiences and observations in COMP110 and brainstorm modifications to the course that create value beyond its current design.

Next, you will consider your brainstormed ideas in the context of the anonymized course survey data submitted by you and your peers earlier in the semester. You will identify the one idea that has no data, or is least likely to have data, to support or refute it, and suggest how we might collect that data in the future. Then, from the ideas that do have data to analyze, you will choose your most promising one.

You will then perform an analysis of the data to explore the degree to which it supports your idea. It is expected that many ideas will not have conclusive support in the data, and some may even be rejected by it. This is OK!

Your project will need to satisfy many specifications, so before you begin programming be sure to read this project’s write-up and the provided template notebook in full.

Rubric

Part 1. Creative Ideation (15pts)

  • Good - 5 pts - One brainstormed idea identifies value created for a specific stakeholder group.
  • Better - 10 pts - Three brainstormed ideas which each identify value created for a specific stakeholder group.
  • Best - 15 pts - Five brainstormed ideas which each identify value created for a specific stakeholder group.

Part 2. Identifying Missing Data (10pts)

  • Good - 5 pts - Identifies an idea that has no applicable data to support it or is unlikely to have enough data to support it.
  • Better - 10 pts - Additionally is able to suggest a plausible and realistic way to collect data to support the idea in the future.

Part 3. Choosing Your Analysis (10pts)

  • Good - 5 pts - Chooses an idea that has data from the survey which could be analyzed in support of it.
  • Better - 10 pts - Articulates reason for choosing this idea to analyze over others based on its potential to create value.

Part 4. Analysis (50pts) - Each rubric item is independent of the others.

  • 5pts - Each code cell in your analysis is preceded by a markdown cell explaining what you are attempting to do. Walk us through your thought process.
  • 5pts - The code cells are executed and the notebook has saved outputs of code cells submitted.
  • 5pts - Import and make use of read_csv_rows function
  • 5pts - Import and make use of head function
  • 5pts - Import and make use of columnar function
  • 5pts - Import and make use of select function
  • 5pts - Import and make use of count function
  • 5pts - Define and use at least one helper function of your own design. One idea is a function that filters some data based on some criteria. For example, all values in a column that are greater than some threshold.
  • 5pts - Carries out a logical analysis given the stated idea being analyzed.
  • 5pts - Produces a chart or visualization of the relevant data being analyzed.

Part 5. Conclusion (15pts)

  • Good - 5 pts - Summarizes findings of analysis and the degree to which it supports, refutes, or is inconclusive regarding the idea put forth.
  • Better - 10 pts - Puts forth potential costs, downsides, or trade-offs of adopting the idea.
  • Best - 15 pts - Identifies extensions or refinements to this idea to consider as future work.

These are just the baseline requirements. In order to completely analyze the idea you choose to explore, more intermediate steps may be necessary!
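
The helper-function item in Part 4 suggests a function that filters values in a column against a threshold. A minimal sketch of that idea follows. The name values_above and the sample ratings are hypothetical, and the sketch assumes the column has already been converted from str to int.

```python
def values_above(values: list[int], threshold: int) -> list[int]:
    """Return the values strictly greater than threshold, in original order."""
    result: list[int] = []
    for value in values:
        if value > threshold:
            result.append(value)
    return result


# Hypothetical ratings for illustration only; not taken from survey.csv.
ratings: list[int] = [2, 7, 5, 4, 6]
print(values_above(ratings, 4))  # [7, 5, 6]
```

Your own helper does not need to look like this one. Any function of your own design that supports your analysis satisfies the rubric item.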

Getting Started

You will get the data needed by “pulling” from the course workspace repository. Steps to do this:

  1. Be sure you are in your course workspace. Open the file explorer and you should see your work for the course. If you do not, open your course workspace through File > Open Recent.
  2. Open the Source Control View by clicking the graph icon (three circles connected by lines) in your sidebar, or by opening the command palette and searching for Source Control.
  3. Click the Ellipses in the Source Control pane and select “Pull, Push” from the drop-down menu, then select “Pull from…” A box will appear and you should select either “origin” or “upstream”, but not “backup”. This will begin the pulling process from the course repository. It should silently succeed. (If you are on macOS and do not see anything in source control anymore, it may be because of a macOS update. You can typically resolve this by opening a new Terminal, typing the command xcode-select --install, pressing enter and following its instructions. You will also need to restart VSCode after doing this.)
  4. Return to the File Explorer pane and open the data directory. You should see it now contains the csv file with the survey results called survey.csv.
  5. In your workspace’s exercises directory, you will see a folder named ex08. Inside that folder, there is a file named analysis.ipynb for this assignment.
  6. Additionally, create another file data_utils.py and copy in your functions from the previous exercise’s file with the same name!

Tour of the Data

row – Row number. Unique for each row of the CSV.

year – Expected graduation year. Possible values: 21, 22, 23, 24, 25. (Note: This list is not exhaustive; it is simplified for the sake of the project.)

unc_status – UNC status. Possible values: Returning UNC Student, Incoming Transfer Student, Incoming First-year Student

comp_major – Intention to major in CS. Possible values: Yes - BS, Yes - BA, Yes - Minor, No.

primary_major – Primary Major. Possible values: Advertising and PR, Asian Studies, Biology, Business, Chemistry, Clinical Lab Science, Communication, Communications, Computer Science, Cultural Anthropology, Earth Science, Economics, English, Environmental Science/Studies, Exercise and Sports Science, Geology, HPM, Information Science, Interdisciplinary Studies, Linguistics, Mathematics, Media and Journalism, Medical Anthropology, Music Preformance, Neuroscience, Nursing, Nutrition, Peace, War, and Defense, Philosophy, Physics, Political Science, Psychology, Radiology, Sports Administration, Statistics and Analytics, Studio Art, Undecided

prereqs – Prerequisites satisfied. Possible values are any combination of the following: MATH 129P, MATH 130, MATH 152, MATH 210, MATH 231, MATH 232, MATH 233, MATH 347, MATH 381, PHIL 155, PSYC 210, PSYC 215, STOR 112, STOR 113, STOR 120, STOR 151, STOR 155

prior_exp – Prior experience. Possible values: None to less than one month!, 2-6 months, 7-12 months, 1-2 years, Over 2 years

ap_principles – Completed AP Computer Science Principles. Possible values: Yes, No

ap_a – Completed AP Computer Science A. Possible values: Yes, No

other_comp – Completed a different, formal programming class. Possible values: UNC, Another college or community college, High school course (IB or other), On-line course, Other, None

prior_time – Amount of time spent self-directed programming learning. Possible values: None to less than one month!, 1 month or so, 2-6 months, 7-12 months, 1-2 years, > 2 years

languages – Programming languages student can identify by reading w/o reference material. Possible values are any combination of the following: Python, Java / C#, C / C++, JavaScript / TypeScript, Go, LISP / Scheme / Racket, Haskell, R / Matlab / SAS, BASIC, HTML / CSS, SQL, Bash, Other

hours_online_social – Number of hours a day spent interacting with digital technology for personal uses (e.g. social media, entertainment, personal communication)? Possible values: None, 0 to 2 hours, 3 to 5 hours, 5 to 10 hours, 10+ hours.

hours_online_work – Number of hours a day spent interacting with digital technology for work/school uses. Possible values: None, 0 to 2 hours, 3 to 5 hours, 5 to 10 hours, 10+ hours.

lesson_time – Student completes each lecture’s lessons during the hours of the day registered for the course. In other words, if in Section 1, lessons are completed between 9:30am-10:45am on Tu/Th, and if in Section 2, lessons are completed during class time. Possible values (1 being Never and 7 being Always): 1, 2, 3, 4, 5, 6, 7

sync_perf – Student’s performance in this course would improve if every lecture were synchronous with required attendance during the regularly scheduled meeting time. Possible values (1 being Strongly Disagree and 7 being Strongly Agree): 1, 2, 3, 4, 5, 6, 7

all_sync – Student would prefer this course to require every lecture be synchronous with required attendance during the regularly scheduled meeting time. Possible values (1 being Strongly Disagree and 7 being Strongly Agree): 1, 2, 3, 4, 5, 6, 7

flipped_class – If Tuesdays also became required in-person synchronous days, student would be willing to watch videos and complete lessons as homework over the weekend and Mondays to prepare for Tuesday lectures. Possible values (1 being Strongly Disagree and 7 being Strongly Agree): 1, 2, 3, 4, 5, 6, 7

no_hybrid – Moving forward in the semester, student believes in-person lectures should not be live streamed so that everyone is required to attend in-person. Possible values (1 being Strongly Disagree and 7 being Strongly Agree): 1, 2, 3, 4, 5, 6, 7

own_notes – Student keeps own notes for topics covered in lecture. Possible values (1 being Never and 7 being Always): 1, 2, 3, 4, 5, 6, 7

own_examples – When uncertain of how a concept works, student tries to come up with own examples in code. Possible values (1 being Never and 7 being Always): 1, 2, 3, 4, 5, 6, 7

oh_visits – On average, for a single programming exercise or project in this course, student typically needs to seek help in office hours about this many times. Possible values (0 being Zero and 5 being Five or More): 0, 1, 2, 3, 4, 5

ls_effective – Lesson videos are effective in helping student learn the topics of the course. Possible values (1 being Strongly Disagree and 7 being Strongly Agree): 1, 2, 3, 4, 5, 6, 7

lsqs_effective – Post-lesson questions on Gradescope are effective in helping student learn the topics of the course. Possible values (1 being Strongly Disagree and 7 being Strongly Agree): 1, 2, 3, 4, 5, 6, 7

programming_effective – Programming assignments are effective in helping student learn the topics of the course. Possible values (1 being Strongly Disagree and 7 being Strongly Agree): 1, 2, 3, 4, 5, 6, 7

qz_effective – Preparing for quizzes is effective in helping student learn the topics of the course. Possible values (1 being Strongly Disagree and 7 being Strongly Agree): 1, 2, 3, 4, 5, 6, 7

oh_effective – Office hours 1:1 appointments are effective in helping student learn the topics of the course. Possible values (1 being Strongly Disagree and 7 being Strongly Agree, Empty string if student has not attended OH): 1, 2, 3, 4, 5, 6, 7, ""

tutoring_effective – Tutoring is effective in helping student learn the topics of the course. Possible values (1 being Strongly Disagree and 7 being Strongly Agree, Empty string if student has not attended tutoring): 1, 2, 3, 4, 5, 6, 7, ""

pace – Student finds the pace of COMP110 to be moving… Possible values (1 being Very Slowly and 7 being Very Quickly): 1, 2, 3, 4, 5, 6, 7

difficulty – Student is finding COMP110 to be… Possible values (1 being Very Easy and 7 being Very Difficult): 1, 2, 3, 4, 5, 6, 7

understanding – So far, student is feeling like they typically… Possible values (1 being Are Lost and 7 being Understand Everything): 1, 2, 3, 4, 5, 6, 7

interesting – Student believes the topics they are learning in this course are intellectually interesting. Possible values (1 being Strongly Disagree and 7 being Strongly Agree): 1, 2, 3, 4, 5, 6, 7

valuable – Student believes the skills they are learning in this course will be valuable to them in the future. Possible values (1 being Strongly Disagree and 7 being Strongly Agree): 1, 2, 3, 4, 5, 6, 7

would_recommend – Student would recommend this course to other students in the Fall. Possible values (1 being Strongly Disagree and 7 being Strongly Agree): 1, 2, 3, 4, 5, 6, 7

Some notes before you begin

  • Some of the survey questions were optional, so there will not be a data value for every column in every row. This is expected. Where a response is missing, the value will be the empty str, "".
  • When you read in the CSV as a list[dict[str, str]] with your read_csv_rows function, every value is interpreted as a str, including numerical ones! Values in columns that hold Likert data (ratings 1-7), for example, will need to be converted to a numeric type before numeric analysis.
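
The two notes above can be seen together in a small sketch. It uses csv.DictReader from the standard library as a stand-in for your own read_csv_rows, and the helper name to_ints and the two-row sample are hypothetical. Treat it as one possible approach, not the required one.

```python
import csv
import io

# Hypothetical two-row sample in the shape of survey.csv; not real survey data.
SAMPLE_CSV = "row,difficulty,oh_effective\n0,5,6\n1,3,\n"


def to_ints(values: list[str]) -> list[int]:
    """Convert numeric strings to ints, skipping empty (unanswered) values."""
    result: list[int] = []
    for value in values:
        if value != "":
            result.append(int(value))
    return result


# csv.DictReader yields every field as a str, much like a list[dict[str, str]].
rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
oh_raw: list[str] = [row["oh_effective"] for row in rows]
print(oh_raw)           # ['6', '']
print(to_ints(oh_raw))  # [6]
```

Skipping empty strings changes how many values remain, so an average over the converted list describes only the students who answered that question.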

Some notes if you are on an M1 Mac!

If you have the new M1 MacBook, the Seaborn plotting package is not yet supported. An alternative is using Matplotlib, and an example is provided below. Feel free to use Google and any online resources to customize your plot to your liking!

import matplotlib.pyplot as plt

labels = ['Python', 'C++', 'Java', 'Perl', 'Scala', 'Lisp']
y_pos = [0, 1, 2, 3, 4, 5]  # x positions of the bars, in the same order as labels
performance = [10, 8, 6, 4, 2, 1]  # height of each bar

# Draw one bar per position, then label each position on the x-axis
plt.bar(y_pos, performance, align='center', alpha=0.5)
plt.xticks(y_pos, labels)
plt.ylabel('Usage')
plt.title('Programming language usage')

plt.show()
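
To chart survey responses instead of the made-up numbers above, one option is to tally a column first and then turn the tallies into the two lists a bar chart needs. The sketch below is a rough illustration: the function name tally and the sample responses are hypothetical, and your own count function from data_utils.py may already do this job.

```python
def tally(values: list[str]) -> dict[str, int]:
    """Count how many times each distinct value appears."""
    counts: dict[str, int] = {}
    for value in values:
        if value in counts:
            counts[value] += 1
        else:
            counts[value] = 1
    return counts


# Hypothetical responses for illustration only; not taken from survey.csv.
responses: list[str] = ["5", "7", "5", "3", "7", "5"]
counts = tally(responses)

# Sorting str labels is alphabetical, which matches numeric order
# only because every rating here is a single digit.
labels = sorted(counts)
heights = [counts[label] for label in labels]
print(labels)   # ['3', '5', '7']
print(heights)  # [1, 3, 2]
```

Here labels and heights play the same roles as labels and performance in the Matplotlib example above.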

Submission Instructions

Save your notebook!

Then run python -m tools.submission exercises/ex08 to build your submission zip for upload to Gradescope. Don’t forget to back up your work by creating a commit and pushing it to GitHub. For a reminder of this process, see the previous exercises.

All of the points for this project will be hand-graded, so your autograder score should be 0/0. This blank screen is expected!

Contributor(s): Kris Jordan, Kaki Ryan, Izzy Ford